OCR on iPhone demo

Update: Source code for demo project released.


I finally got around to building a proof-of-concept implementation of tesseract-ocr for the iPhone. Months ago, I documented the steps that helped get the library cross-compiled for the iPhone's ARM processor, and how to build a fat library for use with the simulator as well. Several folks have helped immensely in noting how to actually run the engine in Objective-C++. Thanks to everyone who has commented so far.

Anyway, below is a short video of the POC in action. The basic workflow is: select an image from the photo library or camera, crop tightly on the box of text you'd like to convert, wait while it processes, then select/copy or email the text.

There are loads of improvements that could be implemented (image histogram adjustment, rotation/perspective correction, automatic text box/layout detection, content detection – dates, links, contact information…), but this is a nice point to stop and document.

I realize that there are several OCR applications available for the iPhone, including a few that also run the engine on the device rather than handing it off to a web service. This started as an educational project on cross-compiling, and to fill a personal want for a handheld OCR app of my own. For these reasons, I'm going to open-source the entire app. Look for it after this semester ends, when I'll have some more time to properly document the code. In the meantime, enjoy these code snippets demonstrating how to initialize the engine and process an image.

Initialize the engine:

    NSString *dataPath = [[self applicationDocumentsDirectory] stringByAppendingPathComponent:@"tessdata"];

    // Set up the data in the docs dir:
    // we want to copy the data to the Documents folder if it doesn't already exist.
    NSFileManager *fileManager = [NSFileManager defaultManager];

    // If the expected store doesn't exist, copy the default store.
    if (![fileManager fileExistsAtPath:dataPath]) {
        // Get the path to the app bundle (with the tessdata dir).
        NSString *bundlePath = [[NSBundle mainBundle] bundlePath];
        NSString *tessdataPath = [bundlePath stringByAppendingPathComponent:@"tessdata"];
        if (tessdataPath) {
            [fileManager copyItemAtPath:tessdataPath toPath:dataPath error:NULL];
        }
    }

    NSString *dataPathWithSlash = [[self applicationDocumentsDirectory] stringByAppendingString:@"/"];
    setenv("TESSDATA_PREFIX", [dataPathWithSlash UTF8String], 1);

    // Init the tesseract engine.
    tess = new TessBaseAPI();
    tess->SimpleInit([dataPath cStringUsingEncoding:NSUTF8StringEncoding],  // Path to tessdata -- no ending /.
                     "eng",   // ISO 639-3 string or NULL.
                     false);  // Not numeric mode.

Process an image. This should be threaded, as it's a heavy process:

    CGSize imageSize = [uiImage size];
    double bytes_per_line  = CGImageGetBytesPerRow([uiImage CGImage]);
    double bytes_per_pixel = CGImageGetBitsPerPixel([uiImage CGImage]) / 8.0;

    CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider([uiImage CGImage]));
    const UInt8 *imageData = CFDataGetBytePtr(data);

    // This could take a while. Maybe it needs to happen asynchronously.
    char *text = tess->TesseractRect(imageData,
                                     (int)bytes_per_pixel,
                                     (int)bytes_per_line,
                                     0, 0,  // Left, top.
                                     (int)imageSize.width,
                                     (int)imageSize.height);
    CFRelease(data);

    // Do something useful with the text!
    NSLog(@"Converted text: %@", [NSString stringWithCString:text encoding:NSUTF8StringEncoding]);

    delete[] text;
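A note on that threading comment: the sketch below (plain C++, not code from the app) shows the shape of running recognition off the calling thread. `recognizeRect` is a hypothetical stand-in for the heavy `tess->TesseractRect` call.

```cpp
#include <future>
#include <string>

// Hypothetical stand-in for the expensive tess->TesseractRect() call.
std::string recognizeRect(const unsigned char* /*pixels*/, int width, int height) {
    return "recognized " + std::to_string(width) + "x" + std::to_string(height);
}

// Kick recognition onto a background thread. The caller (e.g. the UI thread)
// stays responsive and collects the text only when it's ready.
std::future<std::string> recognizeAsync(const unsigned char* pixels, int width, int height) {
    return std::async(std::launch::async, recognizeRect, pixels, width, height);
}

// Usage:
//   auto pending = recognizeAsync(imageData, 320, 240);
//   // ... keep servicing the run loop ...
//   std::string text = pending.get();  // blocks only if recognition isn't done
```

In the app itself, the same idea is more naturally expressed with `performSelectorInBackground:withObject:` or an NSOperation.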

Enjoy the video!


30 Responses to “OCR on iPhone demo”

  1. Mark Pfeiffer Says:

    Didn’t understand a word of this. But I suspect it is way cool.

  2. Robert Says:

    Thanks, Mark. In simple terms – for you ;-) – snap a photo of printed text on your camera phone, and this converts it to editable text which you can save or email.

  3. Jan Says:

    Thank you for the snippet, but where do you put your tessdata folder?

  4. Robert Says:

    If you add it to your project, it should get copied into the app bundle during the build. The code snippet looks for the data folder in the app's Documents directory; if it's not found (i.e. on first run), it copies it from the app bundle to the Documents dir.

  5. Jan Says:

    OK, I had done this, but I get the error:
    : *** -[EventDetailViewController applicationDocumentsDirectory]: unrecognized selector sent to instance 0x46a390.

  6. Nolan Says:

    I got everything compiled and working great on a few pictures, but mostly I'm getting EXC_BAD_ACCESS inside the TessBaseAPI::HistogramRect method. Xcode isn't giving me anything more to work from, but all my variables and data look solid. Could you post your project (.a and all)? Or maybe you've seen this error before?

  7. exploration » Blog Archive » cross-compiling for iPhone dev Says:

    [...] Proof-of-concept demo. Also, updated the script for building with the 10.6 [...]

  8. Colin Says:

    Hi. I’m attempting to compile this using the iOS 4 SDK and am having some problems. I’m using a version compiled with the instructions here:
    The sample project is written in a previous version of the OS SDK, and isn’t compatible.

    When I try to copy the sample code into a new project in SDK version 4, I get the following error on the @class line:
    error: forward declaration of ‘struct TessBaseAPI’

    I also get the following error on the tess = new TessBaseAPI(); line:
    error: invalid use of incomplete type ‘struct TessBaseAPI’

    Any ideas?

  9. Robert Says:

    I’m afraid I’ve only just started using the 4.0 SDK…and haven’t tried to migrate this yet.

  10. Hanno Says:

    Hi Robert,

    Any idea how to get it to work on iOS 4, or OS 3.2 on the iPad?
    That would be helpful for me.

    Thanks, Hanno

  11. Robert Says:

    It works fine for me on iOS4. Haven’t tried running on an iPad, but I don’t see why that would be a problem, either.
    Note, I’ve only used the 3.x SDK to compile the app.

  12. Sat Says:

    Hi Robert,
    Sorry if I wasn't clear in my earlier post, but I downloaded your source from GitHub (Pocket-OCR).

    Sorry for the dumb question:
    do I need to build "libtesseract_full.a" again? Is whatever is included in your Xcode project not sufficient? Will it not work in my environment? Is it separate for separate projects?

    I am using iPhone SDK 4 and Xcode 3.2.3.

    Any help would be great.

  13. Robert Says:

    Yes, you need to build the tesseract library (v2.0.4), or at least drag an existing copy of it into the PocketOCR project.

  14. sam Says:

    I was able to cross-compile tesseract 2.0.4 for the iPhone and run PocketOCR. I used a new tessdata file trained to identify numbers. When I run tesseract on Mac OS X with that tessdata, it gives correct output, but when I run it on the iPhone I get an extra number and some spaces added to the result. If I ignore the spaces and the extra number added to the end, my output is identical to the Mac OS X result. Can anyone tell me what the problem is?

  15. federicocappelli.myopenid.com/ Says:

    Hi all, I imported the lib into my Xcode project and imported baseapi.h (all found in https://github.com/MarceloEmmerich/Tesseract-iPhone-Demo), but the compiler probably tries to compile baseapi.h as Objective-C, and I get these errors:

    ocrObj.mm:69: error: invalid use of incomplete type 'struct TessBaseAPI'
    baseapi.h:32: error: forward declaration of 'struct TessBaseAPI'
    and some errors in baseapi.h.

    Anyone have ideas?

  16. Espen Overaae Says:

    In the version of tesseract I have, TessBaseAPI is inside the namespace tesseract. To get rid of that error, add this line:

    using namespace tesseract;

    to the file(s) where you include baseapi.h
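For context, "invalid use of incomplete type" (which a couple of commenters hit above) is what a C++ compiler reports when only a forward declaration of a class is in scope at the point of `new`. A minimal reproduction, unrelated to tesseract itself:

```cpp
class Engine;  // forward declaration only: the compiler knows the name, not the layout
// Engine* e = new Engine();  // would fail here: invalid use of incomplete type 'class Engine'

class Engine {  // full definition; from here on the type is complete
public:
    int ready = 1;
};

Engine* makeEngine() {
    return new Engine();  // fine once the definition is visible
}
```

So the fix is to make the real class definition visible where `new TessBaseAPI()` is called: include the right baseapi.h, add `using namespace tesseract;` if your tesseract version declares the class inside that namespace (as Espen notes), and make sure the including file ends in .mm so it's compiled as Objective-C++.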

  17. Thuan Says:

    Hi, thanks for the write up! Is there a reason why we have to copy the tessdata files to the app’s documents directory and not directly load them from the app bundle? Thanks in advance.

  18. bob Says:

    [dataPath cStringUsingEncoding:NSUTF8StringEncoding] should be written as [dataPath UTF8String]

  19. Ashraf Says:

    Hi there,
    I'm using tesseract in my application, but it seems there's an error with the initialization; it closes the application. I think it can't find the tesserdata folder somehow.
    I added the "tesserdata" folder to the resources and also added "basepi.h" and the library, yet it doesn't work. It always crashes at this line:

    tess->SimpleInit([dataPath cStringUsingEncoding:NSUTF8StringEncoding], // Path to tessdata-no ending /.
    "eng", // ISO 639-3 string or NULL.

    Thanks in advance.

  20. Robert Says:

    PocketOCR on github should have tessdata already included. (Unless you’ve changed the datapath var, you’ve got your tessdata dir named incorrectly as “tessERdata”). Otherwise, if you’re adding new tessdata, ensure that you add it as a Folder rather than as a Group in Xcode, and make sure that it’s included in the copy resources build step for the app target.

  21. ryen Says:

    Question about image resolution:

    1. Are you using compressed images? (I think you might be, because I thought the iPhone camera returns PNG format.) If so, did you have to compile/link in the extra libraries for other non-TIFF formats when building tesseract?

  22. Ashraf Says:

    Hi there. I'm using tesseract in my application, and I added the library and the header file, but the output of the application is strange: it gives strange characters. When I run the OCRDemo application on the same photos, it gives good output. There should be no differences between them, as I copied the code into my application.
    Thanks in advance.

  23. Srinivas Says:

    Hi Robert,
    Thanks for the demo app on OCR, but when I take a snap from the camera or gallery, it doesn't parse the image; instead, it crashes at:

    char* text = tess->TesseractRect(imageData,
                                     0, 0,
                                     imageSize.width, imageSize.height);

    Could you help me get it running with my own images?

  24. Manuel Lugo Says:

    Hey Robert, thanks for the post. I was able to get it to work with a full-size picture, but what I need is to capture only a smaller part of a picture. I have a mask on the camera view to let the user know what to shoot. The image is 320x110 pixels, and when I try to send it to the OCR ([self ocrImage: image]) I get an EXC_BAD_ACCESS error. Do bytes_per_line or bytes_per_pixel need to change? Can you give me a hand with this?


  25. Robert Says:

    My POC app uses the built in iOS cropping, but sends that image (a square) to the OCR engine without a problem. Do you have a code example?

  26. Manuel lugo Says:

    OK, I just tried again, and square pictures are working no matter the size (100x100, 200x200, or 320x320). My problem comes when I pass a picture with rectangular proportions…

    CGImageRef cgImage = [[self imageByScalingAndCroppingForSize:CGSizeMake(320,110) withSourceImage:image] CGImage];
    UIImage *copyOfImage = [[UIImage alloc] initWithCGImage:cgImage];
    [self ocrImage: copyOfImage];

    and it points to this line:
    char* text = tess->TesseractRect(imageData,(int)bytes_per_pixel,(int)bytes_per_line, 0, 0,(int) imageSize.height,(int) imageSize.width);
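One thing worth double-checking in the call above: the snippet in the post passes width before height, while this call passes `imageSize.height` before `imageSize.width`. A standalone sketch (not code from the app) of why a swap like that is harmless on square images but reads past the buffer on rectangular ones:

```cpp
#include <cstddef>

// Byte offset of pixel (x, y) in a bitmap whose rows are bytes_per_line apart.
// (bytes_per_line can exceed width * bytes_per_pixel when rows are padded.)
size_t pixelOffset(int x, int y, int bytes_per_pixel, int bytes_per_line) {
    return static_cast<size_t>(y) * bytes_per_line
         + static_cast<size_t>(x) * bytes_per_pixel;
}

// One past the last byte that a scan of a width x height region touches.
size_t bytesTouched(int width, int height, int bytes_per_pixel, int bytes_per_line) {
    return pixelOffset(width - 1, height - 1, bytes_per_pixel, bytes_per_line)
         + bytes_per_pixel;
}

// For a 320x110 RGBA image (stride 1280), scanning (width=320, height=110)
// touches exactly 1280 * 110 = 140800 bytes: the whole buffer and no more.
// Scanning (width=110, height=320) instead asks for 320 rows where only 110
// exist, touching offsets far past the end of the buffer. On a square image
// the swap is harmless, which matches the symptom described above.
```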

  27. James Elsey Says:

    Hi Robert,

    I've checked out your Pocket OCR project, built it, and deployed it to my 4S. I'm unable to get it to recognise anything at all. There are no tesseract errors in the debug logs; it just seems that tesseract is hopeless at finding any meaningful text.

    I've tried taking pictures of book covers with clear printed text, road signs, car registration plates; everything I use either returns a blank value or some nonsense like "GFSDGT".

    I've also built tesseract 3.01 and included it in my own app, with similar results. I've also tried about 5 other apps on the App Store, and all have similar success rates.

    Is it safe to say that tesseract has a low success rate at finding text? Or could it be that all the above apps (only one of which I've had a hand in) are poorly implemented?

  28. Robert Says:

    I find that you need to crop out any distracting elements to get the OCR to perform well. With PocketOCR, pinch-zoom the image to get your text near the upper left corner.

    Also, good, even lighting with high contrast between the text and background helps.

    I’d think that image processing before the recognizing step would be a good idea for better results.

    I’m assuming that you’re using English. Otherwise, be sure to include the appropriate training data.

    Good luck!

  29. Deepak Says:

    Hi, Robert

    You have done a remarkable job. I read the whole write-up, which has given me loads of confidence to start on this project.
    Well, I was wondering how to download tesseract for the Mac; it is nowhere on Google's website. Please refer me to a download link.

    Thanks in advance

  30. Robert Says:

    What do you mean by "tesseract for Mac"? The tesseract-ocr project is hosted on Google Code: https://code.google.com/p/tesseract-ocr/
    You can compile the libraries from that source. There may also be static libs floating around on GitHub.
