Currently we use a VO program to OCR documents. It is based on the Office-generated iDocument class (if I recall correctly; it was long ago and I didn't write it). Today I asked my trainee to look for .NET-based solutions so that we can rewrite it as the first independent X# program for this client.
He reported back that the OCR programs he found, mostly written in C#, either depend on paid components or are unacceptably slower than the VO program. So either we 'transport' this VO solution to X#, keep the VO program forever, or, we hope, find an existing sample or solution that we can use to create an X# project.
Did you have a look at Tesseract and its .NET wrappers?
I suggest you start here: github.com/charlesw/tesseract
It is licensed under the Apache License 2.0, which means it is free and you can use it in a commercial product.
Thanks for your reply. Yes, Jelle tried that. It comes with a small high-resolution sample TIFF, and that works, and quickly too. However, we found:
1. At low resolutions it's fast enough but not accurate.
2. At high resolutions the speed is comparable to that of the VO program, but it is still not accurate. We miss crucial information from the documents we scan, whereas the VO program gets it 100% right.
Do you (or anybody else) have experience with the OCR results? Or a way to increase reliability? Or an alternative?
If it works better in VO, there's little incentive for my client to convert it to a .NET program...