I needed to perform OCR on values that users select one at a time out of a larger image, and that led me to Tesseract 3 (there are more recent versions, but I found V3 first and it worked well enough for me, so I just stuck with it).
I wrote a small class that makes it easy to pass an image of some text to Tesseract and have it spit back some text, and I am sharing it at the bottom of this post.
I've included a small sample image in the project. To try converting it, type "Test" in the Immediate window and press Return - you should see "This is a test of the emergency broadcast system." appear if everything is working OK.
Some Notes:
- The class works with the CDECL build of Tesseract 3. I'm using the last 3.x series build compiled and published by the University of Mannheim Library (UB Mannheim), which is available here: https://digi.bib.uni-mannheim.de/tesseract/ - get the "tesseract-ocr-setup-3.05.02-20180621.exe" release.
- To make CDECL calls & Image handling easier, I'm using RC6 - feel free to swap it out for your favourite image handling & CDECL calling code.
- I've implemented a very small subset of the Tesseract Base API - only what I needed to get the job done. Feel free to extend it as needed, and post your changes back here if you are so inclined.
- It's a bit slow - I only get around 15 image->text conversions per second against short input. Since my primary use case is converting $ values extracted from a user-drawn region (one $ value at a time, captured from a larger document with many values), that is more than fast enough.
- Tesseract3 seems to do best with high resolution images, especially for telling apart small marks like commas and periods, so I recommend working with 200-300 DPI image sources if possible.
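If you want to check the conversion speed on your own machine and images, here is a rough timing sketch. It only uses the Initialize and GetTextFromImage methods described in this post plus VB6's built-in Timer function. The Sub name TimeOcr and the run count are made up for illustration, and the placeholder paths need filling in.

```vb
' Hypothetical helper (not part of the class): rough conversions-per-second check.
Public Sub TimeOcr()
   Const RUNS As Long = 50          ' arbitrary number of repetitions
   Dim tess As New CTesseract3
   Dim i As Long
   Dim started As Single
   Dim elapsed As Single
   Dim result As String

   tess.Initialize "<path to folder where libtesseract-3.dll is>"

   started = Timer                  ' seconds elapsed since midnight
   For i = 1 To RUNS
      result = tess.GetTextFromImage("<path to JPG/BMP/PNG image>")
   Next i
   elapsed = Timer - started
   If elapsed < 0 Then elapsed = elapsed + 86400   ' run crossed midnight

   If elapsed > 0 Then
      Debug.Print Format$(RUNS / elapsed, "0.0") & " conversions per second"
   End If
End Sub
```

Timer has limited resolution, so use enough runs for the total to take at least a second or two.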
Public Methods of CTesseract3
Initialize - This must be called before you call any other methods - pass the path to the folder where libtesseract-3.dll is located.
AcceptableCharacters - Pass a string of characters that Tesseract3 will consider "legal". This can improve accuracy: for example, if you want to detect only numbers, you can pass "0123456789" and it won't mis-detect things like "I" or "O". This is completely optional though.
GetTextFromImage - Pass an image file path or byte array (JPG/PNG/BMP) and it will spit back a String of the OCR'd text.
Basic Example Code:
Code:
Public Sub Test()
   Dim tess As New CTesseract3

   ' Initialize must be called before any other method
   tess.Initialize "<path to folder where libtesseract-3.dll is>"

   ' Pass a file path, OR pass an image byte array instead:
   '   MsgBox tess.GetTextFromImage(<image byte array>)
   MsgBox tess.GetTextFromImage("<path to JPG/BMP/PNG image>")
End Sub
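For the byte array route combined with a restricted character set, something along these lines should work. Treat it as a sketch rather than tested code: the function name ReadDollarValue and its parameters are made up for illustration, the character list is just one guess at what a $ value needs, and I'm assuming AcceptableCharacters is called like a Sub (if it is a property in your copy of the class, assign to it with = instead).

```vb
' Hypothetical helper: OCR a $ value from an image file loaded into a byte array.
Public Function ReadDollarValue(ByVal tessFolder As String, _
                                ByVal imagePath As String) As String
   Dim tess As New CTesseract3
   Dim fileNum As Integer
   Dim imageBytes() As Byte

   ' Read the whole image file into a byte array
   fileNum = FreeFile
   Open imagePath For Binary Access Read As #fileNum
   If LOF(fileNum) = 0 Then         ' empty or missing file: nothing to OCR
      Close #fileNum
      Exit Function
   End If
   ReDim imageBytes(0 To LOF(fileNum) - 1)
   Get #fileNum, , imageBytes
   Close #fileNum

   tess.Initialize tessFolder       ' must come before any other call

   ' Assumption: method-style call; the character list is only an example
   tess.AcceptableCharacters "$0123456789.,"

   ReadDollarValue = tess.GetTextFromImage(imageBytes)
End Function
```

You could call it from the Immediate window with ? ReadDollarValue("<path to folder where libtesseract-3.dll is>", "<path to JPG/BMP/PNG image>").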
Feel free to ask questions/report any bugs, and I hope someone out there finds this useful.
Source Code:
Tesseract3.zip