Channel: VBForums - CodeBank - Visual Basic 6 and earlier

Optical Character Recognition (OCR) With Tesseract3

I needed to perform OCR on values that users select one at a time out of a larger image, and that led me to Tesseract 3 (there are more recent versions, but I found V3 first and it worked well enough for me, so I just stuck with it).

I wrote a small class that makes it easy to pass an image of some text to Tesseract and have it spit back some text, and I am sharing it at the bottom of this post.

Some Notes:

  • The class works with the CDECL build of Tesseract 3 - I'm using the last 3.x series build compiled and published by the University of Mannheim Library (UB Mannheim), which is available here: https://digi.bib.uni-mannheim.de/tesseract/ - Get the "tesseract-ocr-setup-3.05.02-20180621.exe" release.

  • To make CDECL calls & Image handling easier, I'm using RC6 - feel free to swap it out for your favourite image handling & CDECL calling code.

  • I've implemented a very small subset of the Tesseract Base API - only what I needed to get the job done. Feel free to extend it as needed, and post your changes back here if you are so inclined.

  • It's a bit slow - I only get around 15 image->text conversions per second against short input, but since my primary use-case is converting $ values extracted from a user-drawn region (one $ value at a time, captured from a larger document with many values), it is more than fast enough.

  • Tesseract3 seems to do best with high resolution images, especially for telling apart small marks like commas and periods - so I recommend working with 200-300 DPI image sources if possible.
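
Since commas and periods are the marks most likely to be confused in a $ value, it may be worth sanity-checking the returned text before using it as a number. The following is only a rough sketch: TryParseDollars is a hypothetical helper that is not part of CTesseract3, and it assumes US-style amounts such as "$1,234.56", possibly followed by whitespace or line breaks.

```vb
' Hypothetical helper (not part of CTesseract3).
' Converts OCR output such as "$1,234.56" to a Currency value.
' Returns False if the text contains unexpected characters,
' no digits at all, or more than one decimal point.
Public Function TryParseDollars(ByVal OcrText As String, ByRef Amount As Currency) As Boolean
  Dim i As Long
  Dim ch As String
  Dim cleaned As String
  Dim digitCount As Long
  Dim dotCount As Long

  For i = 1 To Len(OcrText)
    ch = Mid$(OcrText, i, 1)

    Select Case ch
      Case "0" To "9"
        cleaned = cleaned & ch
        digitCount = digitCount + 1
      Case "."
        cleaned = cleaned & ch
        dotCount = dotCount + 1
      Case "$", ",", " ", vbCr, vbLf
        ' Skip the currency symbol, thousands separators and whitespace
      Case Else
        Exit Function ' Unexpected character - treat as a failed read
    End Select
  Next i

  If digitCount = 0 Or dotCount > 1 Then Exit Function

  ' Val always treats "." as the decimal point, regardless of locale
  Amount = CCur(Val(cleaned))
  TryParseDollars = True
End Function
```

A failed parse is a reasonable cue to ask the user to redraw the region or re-scan at a higher DPI.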


Public Methods of CTesseract3

Initialize - This must be called before you call any other methods - pass the path to the folder where libtesseract-3.dll is located.

AcceptableCharacters - Pass a string of characters that Tesseract3 will consider "legal". This can improve accuracy: for example, if you only want to detect numbers, pass "0123456789" and it won't mis-detect digits as letters like "I" or "O". This is completely optional though.
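
For the $ value case described in the notes, a whitelist might look something like the sketch below. It assumes AcceptableCharacters is exposed as a Sub taking a single String and that it is called after Initialize; if the class declares it as a property, use assignment syntax instead. The Sub name TestDollarValue and the exact character list are my own choices for illustration.

```vb
Public Sub TestDollarValue()
  Dim tess As New CTesseract3

  tess.Initialize "<path to folder where libtesseract-3.dll is>"

  ' Assumption: AcceptableCharacters is a Sub that takes one String.
  ' Limit recognition to characters that can appear in a dollar amount.
  tess.AcceptableCharacters "0123456789$.,-"

  MsgBox tess.GetTextFromImage("<path to JPG/BMP/PNG image>")
End Sub
```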

GetTextFromImage - Pass an image file path or byte array (JPG/PNG/BMP) and it will spit back a String of the OCR'd text.
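
If the image is already in memory (or you want to load it yourself), the byte array form can be used. Here is a minimal sketch using plain VB6 file I/O; ReadFileBytes and TestFromBytes are hypothetical names that are not part of the class, and the sketch assumes GetTextFromImage accepts a Byte array holding the raw JPG/PNG/BMP file contents.

```vb
' Hypothetical helper (not part of CTesseract3).
' Reads the entire contents of a file into a Byte array.
Private Function ReadFileBytes(ByVal FilePath As String) As Byte()
  Dim fileNum As Integer
  Dim bytes() As Byte

  fileNum = FreeFile
  Open FilePath For Binary Access Read As #fileNum

  If LOF(fileNum) > 0 Then
    ReDim bytes(0 To LOF(fileNum) - 1)
    Get #fileNum, , bytes ' Fills the whole array in one read
  End If

  Close #fileNum

  ReadFileBytes = bytes
End Function

Public Sub TestFromBytes()
  Dim tess As New CTesseract3
  Dim imageBytes() As Byte

  tess.Initialize "<path to folder where libtesseract-3.dll is>"

  imageBytes = ReadFileBytes("<path to JPG/BMP/PNG image>")

  ' Assumption: the byte array holds the undecoded image file data
  MsgBox tess.GetTextFromImage(imageBytes)
End Sub
```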


Basic Example Code:

Code:

Public Sub Test()
  Dim tess As New CTesseract3

  tess.Initialize "<path to folder where libtesseract-3.dll is>"

  MsgBox tess.GetTextFromImage("<path to JPG/BMP/PNG image>") ' OR MsgBox tess.GetTextFromImage(<image byte array>)
End Sub

I've included a small sample image in the project, and you can try converting it by typing "Test" in the Immediate window and pressing Return - you should see "This is a test of the emergency broadcast system." appear if everything is working OK.

Feel free to ask questions/report any bugs, and I hope someone out there finds this useful.

Source Code:

Tesseract3.zip