Publishers Always Innovating
fossilesque@mander.xyz to Science Memes@mander.xyz · English · 2 months ago · 34 comments
keepthepace@slrpnk.net · English · 2 months ago
Yes, PDFs are much more permissive and may not have any semantic information at all. Hell, some old publications are just scanned images!
PDF -> semantic seems to be a hard problem that basically requires OCR, like these people are doing
JackbyDev@programming.dev · English · 2 months ago
Oh nice, thanks for sharing that project. I haven’t heard of it before!