I saw different articles speaking about OCR form recognition (data extraction) and they said that they used Neural Network in order to do form recognition, so what's the relation between Artificial Neural network (ANN) and form recognition? If I want to extract fields from a BusinessCard, is it required to use ANN or is it optional? In other words when do I need to use ANN and when I don't?
It's a little different. ANN is just an "expert" in all OCR. But OCR engines contain many experts. When you study ANN you will build a simple OCR engine using just ANN but this does not compare to modern engines that use this in conjunction with tri-grams, morphology, data types ( very important for BCR and Forms ), dictionaries, connected components algorithm, etc. So look at it as just one of the tools in the bag of tricks to extract quality results. A good engine will incorporate ANN and all the others. In BCR there are additional considerations and it should be very heavy on connected components, dictionaries first, then use ANN and pattern matching for the actually recognition.
ANN is one way to perform OCR. There are others. Hence if you want to extract fields from a BusinessCard using ANN is only optional.
Good question. I recently spent some time playing with OCRopus, a Google project that does OCR - you can get it for free and play with it yourself. I'm pretty sure that it has an ANN as one of the modules behind it. However, the whole process of Optical Character Recognition can have many steps (lots of different little modules that each do something and pass the results to the next module).
So, here are some of the things I remember as being done by modules in that project:
Basically, you can do the above using little bits of code that don't involve a neural net. So it's simpler doing it with these little bits of code.
The neural net I think is used just to recognize the individual characters - which character of a group of possible characters is it.
There's a training command in the OCRopus that I had running for over a week on end, and it kept sending line samples to the map, slowly changing the map as it went. I think it was training the ANN part.