Quantcast
Channel: VBForums - CodeBank - Visual Basic 6 and earlier
Viewing all articles
Browse latest Browse all 1448

[VB6] MS Office (MODI) OCR - OCR "for free"

$
0
0
This may already have been posted, but a search turned nothing up here in the CodeBank.


If you have an appropriate version of Microsoft Office you can use MODI to do OCR on images. The obvious candidates are 32-bit Office 2003 and 2007, but supposedly this can be made to work in 32-bit Office 2010 as well.

As far as I can tell there is no way to feed images to MODI.Document except by having it load them from disk. But you could always "dump" images to a temporary folder as required, so that isn't a nasty restriction.


Requirements

VB6 development tools. Of course the logic can be trivially ported to Office VBA or even a WSH or MSHTA script written in VBScript.

A version of 32-bit Office supporting MODI.


Notes

This program uses early binding against Microsoft Office Document Imaging 11.0 Type Library (Office 2003). This is used to give us easy access to MODI.Document's OnOCRProgress event and the predefined constant miLANG_ENGLISH in Enum MiLANGUAGES.

To use this code with Office 2007 you'd need to change the reference to Microsoft Office Document Imaging 12.0 Type Library and recompile.

You could also use late binding, but then you would either have to give up using WithEvents (not valid for As Object) and the OnOCRProgress event entirely... or else use additional code or a C++ helper DLL to bind to the event.

MODI was removed in Office 2010, but you might look at:



Attached Demo

The attachment is large because of included image files.

The program just grabs the file names from a hard-coded folder, then loads and OCRs them one by one and displays the resulting text in a RichTextBox. A status line reports progress on each image as it works.

A Timer control is used to work through the queue of images, primarily to help avoid the program being marked unresponsive by Windows.

The demo helps illustrate the "garbage in, garbage out" nature of OCR: the quality of the results depends on what you feed into it.
Attached Files

Viewing all articles
Browse latest Browse all 1448

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>