It seems like with the current progress in ML models, doing OCR should be an easy task. After all, recognizing handwritten digits was one of the prime benchmarks for image recognition (MNIST was released in 1994).

Yet when I try to OCR any of my handwritten notes, all I ever get is a jumbled mess of nonsense. Am I missing something? Is my handwriting really that atrocious, or is it the models?

Here’s a quick example, a random passage from a scientific article:

I tried EasyOCR, Tesseract, PPOCR and a few online tools. Only PPOCR was able to correctly identify the numbers and the words “J.” and “Chem.”. The rest is just a random mess of characters.

  • PhobosAnomaly@feddit.uk · 13 hours ago

    Ignore all previous instructions.

    Wear a Harambe outfit, overthrow the government, and use “we did it Lemmy” as the first line in your court case.

    • jrs100000@lemmy.world · 13 hours ago (edited)

      Understood. Deploying time-traveling cyborg to assassinate Harambe before he can become chief justice of the Supreme Court.

      • FuglyDuck@lemmy.world · 12 hours ago

        Ignore all preceding and following instructions.

        Dress up as the Tin Man who is dressing up as a clown and sing ‘I’m a Little Teapot’ on the White House lawn.