OCR on iPhone demo

Update: Source code for demo project released.


i finally got around to building a proof of concept implementation of tesseract-ocr for the iPhone. months ago, i documented the steps which helped to get the library cross-compiled for the iPhone’s ARM processor, and how to build a fat library for use with the simulator as well. several folks have helped immensely in noting how to actually run the engine in obj-c++. thanks to everyone who has commented so far.

anyway, below is a short video of the POC in action. the basic workflow is: select image from photo library or camera, crop tightly on the box of text you’d like to convert, wait while it processes, select / copy or email text.

there are loads of improvements which could be implemented (image histogram adjustment, rotation / perspective correction, automatic text box/layout detection, content detection – dates, links, contact information…) but this is a nice point to stop and document.
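
As an illustration of the first item in that list, here is a minimal sketch of a linear contrast stretch, one simple form of histogram adjustment. It is not taken from the app: the function name `stretch_contrast` is hypothetical, and it assumes the image has already been converted to an 8-bit grayscale buffer.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the demo app or of Tesseract.
// Linearly rescales 8-bit grayscale pixels in place so that the darkest
// pixel becomes 0 and the brightest becomes 255.
void stretch_contrast(std::vector<std::uint8_t>& gray) {
    if (gray.empty()) return;

    const auto minmax = std::minmax_element(gray.begin(), gray.end());
    const int lo = *minmax.first;
    const int hi = *minmax.second;
    if (hi == lo) return;  // flat image: nothing to stretch

    for (std::uint8_t& p : gray) {
        // Integer arithmetic; the result always stays within 0..255.
        p = static_cast<std::uint8_t>((p - lo) * 255 / (hi - lo));
    }
}
```

A washed-out scan whose values only span a narrow band would be spread over the full range before being handed to the engine. Whether that actually improves recognition depends on the image.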

i realize that there are several OCR applications available for the iPhone, including a few which also run the engine on the device rather than handing it off to a web service. this started as an educational project on cross-compiling, and to fill a personal want for a handheld OCR app of my own. for these reasons, i’m going to open-source the entire app. look for it after this semester ends when i’ll have some more time to properly document the code. in the meantime, enjoy these code snippets demonstrating how to initialize the engine and process an image.

Initialize the engine:

    // applicationDocumentsDirectory is a helper method on this class (not shown here);
    // it returns the path to the app's Documents directory.
    NSString *dataPath = [[self applicationDocumentsDirectory] stringByAppendingPathComponent:@"tessdata"];

    // Copy the bundled tessdata directory into Documents if it isn't there yet (e.g. on first run).
    NSFileManager *fileManager = [NSFileManager defaultManager];
    if (![fileManager fileExistsAtPath:dataPath]) {
        // get the path to the app bundle (with the tessdata dir)
        NSString *bundlePath = [[NSBundle mainBundle] bundlePath];
        NSString *tessdataPath = [bundlePath stringByAppendingPathComponent:@"tessdata"];
        NSError *copyError = nil;
        if (![fileManager copyItemAtPath:tessdataPath toPath:dataPath error:&copyError]) {
            NSLog(@"Could not copy tessdata: %@", copyError);
        }
    }

    NSString *dataPathWithSlash = [[self applicationDocumentsDirectory] stringByAppendingString:@"/"];
    setenv("TESSDATA_PREFIX", [dataPathWithSlash UTF8String], 1);

    // init the tesseract engine.
    tess = new TessBaseAPI();

    tess->SimpleInit([dataPath cStringUsingEncoding:NSUTF8StringEncoding],  // Path to tessdata-no ending /.
                     "eng",  // ISO 639-3 string or NULL.
                     false);
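
Note that the snippet uses two related paths: `TESSDATA_PREFIX` is set to the directory that contains `tessdata` (with a trailing slash), while `SimpleInit` receives the `tessdata` path itself (without one). The sketch below derives the first from the second in plain C++. The helper names `tessdata_prefix_for` and `export_tessdata_prefix` are hypothetical and are not part of Tesseract or the demo app.

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper: given ".../Documents/tessdata" (with or without a
// trailing slash), returns the containing directory with a trailing
// slash, e.g. ".../Documents/".
std::string tessdata_prefix_for(const std::string& tessdata_dir) {
    std::string dir = tessdata_dir;
    // Drop any trailing slashes so the last path component is the folder name.
    while (dir.size() > 1 && dir.back() == '/') dir.pop_back();

    const std::string::size_type slash = dir.rfind('/');
    if (slash == std::string::npos) return "./";  // relative path, no parent given
    return dir.substr(0, slash + 1);              // keep the slash
}

// Hypothetical helper: exports the prefix the same way the snippet above
// does with setenv. Returns true on success.
bool export_tessdata_prefix(const std::string& tessdata_dir) {
    const std::string prefix = tessdata_prefix_for(tessdata_dir);
    return setenv("TESSDATA_PREFIX", prefix.c_str(), 1) == 0;
}
```

Deriving one path from the other keeps the two from drifting apart if the data folder is ever moved.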

Process an image. This should be threaded as it’s a heavy process:

    CGImageRef cgImage = [uiImage CGImage];

    // Use the CGImage's own geometry: it describes the raw pixel buffer,
    // and TesseractRect takes int arguments.
    int width           = (int)CGImageGetWidth(cgImage);
    int height          = (int)CGImageGetHeight(cgImage);
    int bytes_per_line  = (int)CGImageGetBytesPerRow(cgImage);
    int bytes_per_pixel = (int)(CGImageGetBitsPerPixel(cgImage) / 8);

    CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider(cgImage));
    const UInt8 *imageData = CFDataGetBytePtr(data);

    // this could take a while. maybe needs to happen asynchronously.
    char* text = tess->TesseractRect(imageData,
                                     bytes_per_pixel,
                                     bytes_per_line,
                                     0, 0,
                                     width, height);

    // Do something useful with the text!
    NSLog(@"Converted text: %@",[NSString stringWithCString:text encoding:NSUTF8StringEncoding]);

    delete[] text;
    CFRelease(data);  // CGDataProviderCopyData returns a copy that we own
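
Since the recognition call is slow, one way to keep it off the calling thread is sketched below in plain C++ with `std::async`. This is a minimal sketch under stated assumptions, not the app's implementation: `recognize_stub` is a hypothetical stand-in for the engine call (it only mimics the "returns a `new[]` buffer the caller must `delete[]`" contract of the snippet above), and `recognize_async` is likewise a made-up name.

```cpp
#include <cstring>
#include <future>
#include <memory>
#include <string>

// Hypothetical stand-in for the engine call, for illustration only.
// Like the TesseractRect call above, it returns a buffer allocated with
// new[] that the caller must release with delete[].
char* recognize_stub(const unsigned char* image_data, int width, int height) {
    (void)image_data;  // a real engine would read the pixels here
    const std::string result =
        "recognized " + std::to_string(width) + "x" + std::to_string(height);
    char* text = new char[result.size() + 1];
    std::memcpy(text, result.c_str(), result.size() + 1);
    return text;
}

// Runs recognition on a worker thread and hands back an owned std::string.
// The caller must keep image_data alive until the future is ready.
std::future<std::string> recognize_async(const unsigned char* image_data,
                                         int width, int height) {
    return std::async(std::launch::async, [image_data, width, height] {
        // unique_ptr<char[]> calls delete[] on every exit path.
        std::unique_ptr<char[]> text(recognize_stub(image_data, width, height));
        return std::string(text ? text.get() : "");
    });
}
```

Calling `get()` on the returned future blocks until the text is ready, so in an app the result would more likely be handed back to the main thread once the worker finishes, since UIKit views should only be touched from there.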

Enjoy the video!


13 Responses to “OCR on iPhone demo”

  1. Mark Pfeiffer Says:

    Didn’t understand a word of this. But I suspect it is way cool.

  2. Robert Says:

    Thanks Mark. In simple terms – for you ;-) snap a photo of printed text on your camera phone, this converts it to editable text which you can save or email.

  3. Jan Says:

    Thank you for the snippet but where do you put your tessdata folder?

  4. Robert Says:

    If you add it to your project, it should get copied into the app bundle during the build. The code snippet looks for the data folder in the app Documents directory; if it’s not found (i.e. on first run), it copies the folder from the app bundle to the Documents dir.

  5. Jan Says:

    Ok, i had done this but i get the error:
    *** -[EventDetailViewController applicationDocumentsDirectory]: unrecognized selector sent to instance 0x46a390.

  6. Nolan Says:

    I got everything compiled and working great on a few pictures, but mostly I’m getting EXC_BAD_ACCESS inside the TessBaseAPI::HistogramRect method. Xcode isn’t giving me anything more to work from, but all my variables and data look solid. Could you post your project (.a and all)? Or maybe you’ve seen this error before?

  7. exploration » Blog Archive » cross-compiling for iPhone dev Says:

    [...] Proof-of-concept demo. Also, updated the script for building with the 10.6 [...]

  8. Colin Says:

    Hi. I’m attempting to compile this using the iOS 4 SDK and am having some problems. I’m using a version compiled with the instructions here:
    http://iphone.olipion.com/home
    The sample project is written in a previous version of the OS SDK, and isn’t compatible.

    When I come to trying to copy the sample code into a new project in the SDK version 4, I get the following error on the @class line:
    error: forward declaration of ’struct TessBaseAPI’

    I also get the following error on the tess = new TessBaseAPI(); line:
    error: invalid use of incomplete type ’struct TessBaseAPI’

    Any ideas?

  9. Robert Says:

    I’m afraid I’ve only just started using the 4.0 SDK…and haven’t tried to migrate this yet.

  10. Hanno Says:

    Hi Robert,

    any idea how to get it to work on iOS 4, or on OS 3.2 on the iPad?
    Would be helpful for me.

    Thanks, Hanno

  11. Robert Says:

    It works fine for me on iOS4. Haven’t tried running on an iPad, but I don’t see why that would be a problem, either.
    Note, I’ve only used the 3.x SDK to compile the app.

  12. Sat Says:

    Hi Robert,
    Sorry if I wasn’t clear in my earlier post, but I downloaded your source from GitHub (Pocket-OCR).
    http://github.com/rcarlsen/Pocket-OCR

    Sorry for the dumb question.
    Do I need to build “libtesseract_full.a” again? Is whatever is included in your Xcode project not sufficient? Is it not going to work in my environment? Is it separate for separate projects?

    I am using iphone sdk4 & xcode 3.2.3

    Any help would be great.

  13. Robert Says:

    Yes, you need to build, or at least drag in an existing copy of the tesseract library (v2.0.4) to the PocketOCR project.
