Android OCR Application Using Tesseract


1

I am currently developing an Android application based on OCR (Optical Character Recognition). I've downloaded the "tesseract-android" project, which contains tools for compiling the Tesseract, Leptonica, and JPEG libraries for use on Android. I am developing with Eclipse on Windows Vista.

I've also downloaded the necessary tools (android-ndk, apache ant, ...) and carefully followed all the steps to build this project and add it as a library to my basic application.

My app opens the camera to take a picture and then processes this picture via the Tesseract API in order to convert it into text.

My questions are:
1. Is it true that this procedure doesn't work under Windows?
2. When compiling, I get the following error: "java.lang.IllegalArgumentException: Data path must contain subfolder tessdata!"

What could be the potential error? The concerned portion of the java code is:

File myDir = getExternalFilesDir(Environment.MEDIA_MOUNTED); 
TessBaseAPI baseApi = new TessBaseAPI(); 
baseApi.init(myDir, "eng");

I've also tried to use "/tess-two/external/tesseract-3.01/tessdata/tessconfigs" instead of "myDir", but the error remains the same.

I would highly appreciate any help.

Thanks in advance.

2012-04-04 06:22
by user1312014
See a similar discussion ... http://stackoverflow.com/questions/19533273/best-ocr-optical-character-recognition-example-in-androi - Xar E Ahmer 2014-05-26 07:30


1

Q1. It should work on any operating system. I've been able to ndk-build on Win7, Mac OS Lion, and Ubuntu without any issues.

Q2. Make sure that you have permissions to write to the external storage, and have sufficient space to do so.

If that still fails, open the File Explorer in DDMS and double-check that your application is setting up the directory structure and copying over the traineddata file.

I had an odd issue where it was creating the eng.traineddata file, but the file was 0 bytes, which led to all sorts of strange failures.

You could also create the directory structure manually to keep making progress, and fix this initialisation issue later on (but don't forget it!).
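
The same checks can be done in code before calling init. This is a minimal sketch in plain Java, assuming the language file is named `<lang>.traineddata` and sits inside a `tessdata` subfolder of the data path (which is what the exception message and the 0-byte file above suggest). `TessDataCheck` and `describeProblem` are hypothetical names for illustration, not part of tess-two.

```java
import java.io.File;

/** Hypothetical helper (not part of tess-two): sanity-checks a Tesseract data path. */
final class TessDataCheck {

    /**
     * Returns null if the layout looks usable, otherwise a short description
     * of the first problem found. dataPath is the directory that is expected
     * to contain a "tessdata" subfolder.
     */
    static String describeProblem(File dataPath, String lang) {
        File tessdata = new File(dataPath, "tessdata");
        if (!tessdata.isDirectory()) {
            return "missing folder: " + tessdata.getPath();
        }
        // Assumption: the trained data file is named <lang>.traineddata.
        File trained = new File(tessdata, lang + ".traineddata");
        if (!trained.isFile()) {
            return "missing file: " + trained.getPath();
        }
        if (trained.length() == 0) {
            // A failed or interrupted copy can leave an empty file behind.
            return "empty (0-byte) file: " + trained.getPath();
        }
        return null;
    }
}
```

Logging the returned string before calling init should make it easier to tell a missing folder from a failed copy.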

2012-06-03 16:43
by Jimmy
Hi @Jimmy, what exactly am I looking for in DDMS? I assume I select the device, but can you provide an example of a path? - greenhouse 2015-01-07 04:12


1

I was facing the same problem. Worked for me when I removed "tessdata" from the path.

Before (fail): path = "/mnt/sdcard/tesseract/tessdata"; 
After (success): path = "/mnt/sdcard/tesseract/";

Then, baseApi.init(path, "eng") worked with no exceptions.

Of course, the tessdata folder should be in that path, with the desired .traineddata file inside it.
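
One way to guard against this mistake is to normalize the path before passing it to init. This is a sketch in plain Java, assuming (as the exception message says) that init expects the parent of the tessdata folder. `TessPath.dataPathFor` is a hypothetical helper name, not part of tess-two.

```java
import java.io.File;

/** Hypothetical helper (not part of tess-two). */
final class TessPath {

    /**
     * If the given path points at the "tessdata" folder itself, returns its
     * parent directory; otherwise returns the path unchanged.
     */
    static String dataPathFor(String path) {
        File f = new File(path);
        if ("tessdata".equals(f.getName()) && f.getParent() != null) {
            return f.getParent();
        }
        return path;
    }
}
```

With this helper, baseApi.init(TessPath.dataPathFor(path), "eng") receives the parent folder whether path ends in "tessdata" or not.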

2015-01-29 16:18
by georgepmarques


0

Path errors while compiling native code are usually not related to your Java code. Your Java code would be responsible for runtime problems. Check your build scripts and post more log messages.

2012-04-04 07:02
by Konstantin Pribluda
These are some of the log messages:
04-04 14:32:28.569: E/2130968577(561): java.lang.IllegalArgumentException: Data path must contain subfolder tessdata!
04-04 14:32:28.569: E/2130968577(561): at com.googlecode.tesseract.android.TessBaseAPI.init(TessBaseAPI.java:167)

*I've read somewhere that, for Tesseract, android-ndk and apache-ant don't work under Windows; however, I've used both normally without any problem. But I can't find the source of the problem when running the project - user1312014 2012-04-04 14:41

Apparently it does not find some paths it expects. Maybe you should create them on first launch, before initializing Tesseract. You could also try this pure Java solution (it could be enough for you, but look at the sources, as there is no actual release): http://sourceforge.net/projects/javaocr - Konstantin Pribluda 2012-04-04 18:16
I think that it doesn't work for Android applications, since I'm using Eclipse to develop an Android application, not a Java project. Am I wrong? - user1312014 2012-04-05 08:38
I doubt it. But I agree that while Eclipse is popular because it is free, there are better IDEs - Konstantin Pribluda 2012-04-05 09:23