cross-compiling for iPhone dev

Update: Proof-of-concept demo. Also, updated the script for building with the 10.6 SDK.

Update #2: Source code for demo project released.

Update #3: script for use with tesseract v3 posted.

I recently needed to use an open-source library in an iPhone project. Recalling the work it took to compile the libraries needed for openFrameworks, I started looking for a more generic way to build for iPhone development. Thankfully, LateNiteSoft wrote a great article about using a shell script to cross-compile Linux projects, building a universal binary with versions for the Simulator and the device.

I adapted their code snippets to build tesseract-ocr for iPhone, referring to the setup for freetype and freeimage to fill in some C++ gaps. The library seems to have built correctly; I’ll know for sure when I incorporate it into a project soon.

To use it, copy the script into the project directory, next to the configure script. For a simple project which generates one monolithic library, set the LIBFILE variable to the path and name of the library without its file extension (the script appends .a itself). I’ve only used this for static libraries; other work may be necessary to correctly generate dynamic libraries (however, the iPhone SDK prohibits linking to dynamic libraries, so in this case it seems moot). Run ./build_fat.sh to kick off the process. Look for the compiled libraries in the “lnsout” directory. There’s no error checking, so caveat emptor. :)
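
Before kicking off a long build, it can help to confirm the layout described above. The following is a minimal sketch and not part of the original script; the function name preflight is made up for illustration, and it only checks that an executable configure script and build_fat.sh sit in the same directory.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of build_fat.sh itself).
# Verifies the layout described above: build_fat.sh sits next to an
# executable ./configure script in the project directory.
preflight() {
    dir=${1:-.}
    if [ ! -x "$dir/configure" ]; then
        echo "preflight: no executable configure script in $dir" >&2
        return 1
    fi
    if [ ! -f "$dir/build_fat.sh" ]; then
        echo "preflight: build_fat.sh is not next to configure in $dir" >&2
        return 1
    fi
    return 0
}

# Usage (from the project directory):
#   preflight . && ./build_fat.sh && ls lnsout
```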

Cross-compile shell script follows:

#!/bin/sh

# build_fat.sh
#
# Created by Robert Carlsen on 15.07.2009.
# Updated 6.12.2009 to build i386 (simulator) on an x86_64 platform (10.6 SDK)
# build an arm / i386 lib of standard linux project
#
# adapted from:
# http://latenitesoft.blogspot.com/2008/10/iphone-programming-tips-building-unix.html
#
# initially configured for tesseract-ocr

# Set up the target lib file / path
# easiest to just build the package normally first and watch where the files are created.
LIBFILE=ccmain/libtesseract_full

# Select the desired iPhone SDK
export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneOS3.0.sdk

# Set up relevant environment variables
export CPPFLAGS="-I$SDKROOT/usr/lib/gcc/arm-apple-darwin9/4.0.1/include/ -I$SDKROOT/usr/include/ -miphoneos-version-min=2.2"
export CFLAGS="$CPPFLAGS -arch armv6 -pipe -no-cpp-precomp -isysroot $SDKROOT"
export CPP="$DEVROOT/usr/bin/cpp $CPPFLAGS"
export CXXFLAGS="$CFLAGS"

# Dynamic library location generated by the Unix package
LIBPATH=$LIBFILE.dylib
LIBNAME=`basename $LIBPATH`

export LDFLAGS="-L$SDKROOT/usr/lib/ -Wl,-dylib_install_name,@executable_path/$LIBNAME"

# Static library that will be generated for ARM
LIBPATH_static=$LIBFILE.a
LIBNAME_static=`basename $LIBPATH_static`

# TODO: add custom flags as necessary for package
./configure CXX=$DEVROOT/usr/bin/arm-apple-darwin9-g++-4.0.1 CC=$DEVROOT/usr/bin/arm-apple-darwin9-gcc-4.0.1 LD=$DEVROOT/usr/bin/ld --host=arm-apple-darwin

make -j4

# Copy the ARM library to a temporary location
mkdir -p lnsout
cp $LIBPATH_static lnsout/$LIBNAME_static.arm

# Do it all again for native cpu
make distclean

# Restore default environment variables
unset CPPFLAGS CFLAGS CPP LDFLAGS CXXFLAGS DEVROOT SDKROOT

export DEVROOT=/Developer
export SDKROOT=$DEVROOT/SDKs/MacOSX10.6.sdk

export CPPFLAGS="-I$SDKROOT/usr/lib/gcc/i686-apple-darwin10/4.0.1/include/ -I$SDKROOT/usr/include/ -mmacosx-version-min=10.5"
export CFLAGS="$CPPFLAGS -pipe -no-cpp-precomp -isysroot $SDKROOT -arch i386"
export CPP="$DEVROOT/usr/bin/cpp $CPPFLAGS"
export CXXFLAGS="$CFLAGS"

# Overwrite LDFLAGS
# Dynamic linking, relative to executable_path
# Use otool -D to check the install name
export LDFLAGS="-Wl,-dylib_install_name,@executable_path/$LIBNAME"

# TODO: error checking
./configure
make -j4

# Copy the native library to the temporary location
cp $LIBPATH_static lnsout/$LIBNAME_static.i386

# Create fat lib by combining the two versions
/usr/bin/lipo -arch arm lnsout/$LIBNAME_static.arm -arch i386 lnsout/$LIBNAME_static.i386 -create -output lnsout/$LIBNAME_static

unset CPPFLAGS CFLAGS CPP LDFLAGS CXXFLAGS DEVROOT SDKROOT
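
The script has no error checking (see the TODO before the second configure). One way to add it is sketched here; die, run_step and require_file are hypothetical helper names, not part of the original script. The idea is to wrap each build step so that a failure stops the script instead of letting lipo combine stale or wrong-architecture libraries.

```shell
#!/bin/sh
# Hypothetical helpers for build_fat.sh; the names are illustrative only.

# Print a message to stderr and stop the script.
die() {
    echo "build_fat.sh: $*" >&2
    exit 1
}

# Run a command and abort the whole script if it fails.
run_step() {
    "$@" || die "step failed: $*"
}

# Abort unless the expected library was actually produced.
require_file() {
    [ -f "$1" ] || die "expected file not found: $1"
}

# Intended use inside the script (shown as comments so this sketch
# has no side effects when sourced); ./configure is wrapped the same way:
#   run_step make -j4
#   require_file "$LIBPATH_static"
#   run_step cp "$LIBPATH_static" "lnsout/$LIBNAME_static.arm"
```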

For reference, here is the original script (written for use with 10.5):

#!/bin/sh

# build_fat.sh
#
# Created by Robert Carlsen on 15.07.2009.
# build an arm / i686 lib of standard linux project
#
# adapted from:
# http://latenitesoft.blogspot.com/2008/10/iphone-programming-tips-building-unix.html
#
# initially configured for tesseract-ocr

# Set up the target lib file / path
# easiest to just build the package normally first and watch where the files are created.
LIBFILE=ccmain/libtesseract_full

# Select the desired iPhone SDK
export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneOS2.2.sdk

# Set up relevant environment variables
export CPPFLAGS="-I$SDKROOT/usr/lib/gcc/arm-apple-darwin9/4.0.1/include/ -I$SDKROOT/usr/include/"
export CFLAGS="$CPPFLAGS -arch armv6 -pipe -no-cpp-precomp -isysroot $SDKROOT"
export CPP="$DEVROOT/usr/bin/cpp $CPPFLAGS"
export CXXFLAGS="$CFLAGS"

# Dynamic library location generated by the Unix package
LIBPATH=$LIBFILE.dylib
LIBNAME=`basename $LIBPATH`

export LDFLAGS="-L$SDKROOT/usr/lib/ -Wl,-dylib_install_name,@executable_path/$LIBNAME"

# Static library that will be generated for ARM
LIBPATH_static=$LIBFILE.a
LIBNAME_static=`basename $LIBPATH_static`

# TODO: add custom flags as necessary for package
./configure CXX=$DEVROOT/usr/bin/arm-apple-darwin9-g++-4.0.1 CC=$DEVROOT/usr/bin/arm-apple-darwin9-gcc-4.0.1 LD=$DEVROOT/usr/bin/ld --host=arm-apple-darwin

make -j4

# Copy the ARM library to a temporary location
mkdir -p lnsout
cp $LIBPATH_static lnsout/$LIBNAME_static.arm

# Do it all again for native cpu
make distclean

# Restore default environment variables
unset CPPFLAGS CFLAGS CPP LDFLAGS CXXFLAGS

# Overwrite LDFLAGS
# Dynamic linking, relative to executable_path
# Use otool -D to check the install name
export LDFLAGS="-Wl,-dylib_install_name,@executable_path/$LIBNAME"

# TODO: error checking
./configure
make -j4

# Copy the native library to the temporary location
cp $LIBPATH_static lnsout/$LIBNAME_static.i386

# Create fat lib by combining the two versions
$DEVROOT/usr/bin/lipo -arch arm lnsout/$LIBNAME_static.arm -arch i386 lnsout/$LIBNAME_static.i386 -create -output lnsout/$LIBNAME_static


108 Responses to “cross-compiling for iPhone dev”

  1. skay Says:

    thanks a lot for the post.
    its been a great help to me.

    so once i include libtesseract_full.a in the framework, which method do i call (and how) in my .m file to process an image into text? A quick reply would be helpful.
    thanks in advance

  2. Robert Says:

    I haven’t had a chance to actually implement the library in a project. There are some comments on the tesseract Google Code forum about this topic. Browsing the baseapi header may give some insight, too.

    Good luck and please share what you find out!

  3. skay Says:

    Every folder in tesseract has a .a file.
    so do we have to run same script for all .a files and include all of them in framework.

  4. Robert Says:

    This build script only targets libtesseract_full.a. In this case, tesseract compiles a monolithic static library which is all you need to include in your project. The readme (or install) txt files in the tesseract src indicate this.

    Your mileage may vary with other source code.

  5. skay Says:

    tesseractmain.h has a method api_main() which we probably have to call from our api.
    However when i try to import

    // Copyright 2009 __MyCompanyName__. All rights reserved.
    //

    #import
    #include "tesseractmain.h"

    it says file not found for tesseractmain.h

    when i add the file tesseractmain.h it gives a compilation error for the C++ syntax used in tesseractmain.h

    how do i actually include it.

  6. Robert Says:

    is the file extension of your source .mm? mixing objective-c and c++ code requires that extension. easiest to do it in the Finder, then reimport the renamed file(s) into Xcode, removing the old files as well.

  7. skay Says:

    thanks a lot robert.
    it worked . i was using .mm file but there was other problem which i sorted out.
    however i am clueless how to call methods in baseapi.h
    might be it ll take some time to understand from code.
    can u help me giving some clue how to go about it.

  8. Robert Says:

    again, I haven’t had a chance to use this library in a project. I’d imagine that you create a new tesseract object, init it with the language model, then provide it with image data. Here’s my best guess without actually trying it myself:

    TessBaseAPI *tess = new TessBaseAPI();
    NSString *datapath = [NSString stringWithString:[[NSBundle mainBundle] bundlePath]];
    tess->Init([datapath UTF8String], NULL);

    // this is the main method for doing char recognition
    // you’ll have to provide the raw image data yourself
    /*
    char* returnedString = TesseractRect(const unsigned char* imagedata,
    int bytes_per_pixel, int bytes_per_line,
    int left, int top, int width, int height);
    */

    good luck!

  9. skay Says:

    thanks a lot robert
    figured it out. The app is up and running now.

  10. Robert Says:

    would you mind sharing the steps you ended up using?

  11. skay Says:

    will surely do that in a day or 2 :)
    once again thanks for support. it was very helpful.

  12. skay Says:

    i am trying to use ImageMagick to convert the image into tiff and passing image to tesseract to process it.
    there is a small hole right now. ImageMagick is converting successfully to bmp and jpg .
    tesseract reads bmp files but it should be with bpp 1,2,4,8 and image i get using ImageMagick is 32BPP.
    trying my hand at it.
    will keep u posted.
    Do u have any idea to change image bpp so that its readable by tesseract without tiff support.

  13. Matt Says:

    Firstly, thanks for the shell script, I’d been struggling trying to figure out the instructions from LateNiteSoft.

    At the moment I’m trying to get baseapi to work. My problem is getting the image into the format needed to send as an argument to tesseract. Currently I have something like this:

    TessBaseAPI::InitWithLanguage("DataPath", NULL, NULL, NULL, false, 0, NULL);

    char* text;
    NSBundle *bundle = [NSBundle mainBundle];
    NSString *imgPath = [bundle pathForResource:@"testImage" ofType:@"tif"];
    UIImage *uiImage = [UIImage imageWithContentsOfFile:imgPath];

    CGSize imageSize = [uiImage size];
    double bytes_per_line = CGImageGetBytesPerRow([uiImage CGImage]);
    double bytes_per_pixel = CGImageGetBitsPerPixel([uiImage CGImage])/8.0;

    //This line doesn’t work
    unsigned char* imageData = [uiImage bitmapData];

    text = TessBaseAPI::TesseractRect((const unsigned char*)imageData, bytes_per_pixel,bytes_per_line, 0, 0, imageSize.width, imageSize.height);

    Anyone have any ideas? Is it something I should be doing in C++ instead of objective-C?

    Also, I’m a complete beginner so there might be lots of errors in the code above.

  14. Robert Says:

    i don’t believe that there is a property called “bitmapData” in the UIImage class. however, you can get a CGImageRef with [(UIImage *)image CGImage]. maybe there is a way to get NSData / CFData from there?

    perhaps this dev note is helpful for retrieving the pixel data from an image:
    http://developer.apple.com/iphone/library/qa/qa2007/qa1509.html
    or perhaps this article:
    http://www.cocoadev.com/index.pl?NSBitmapImageRepFromCGImage

  15. Ankit Gupta Says:

    Hey,

    Thanks for the script! I am on sdk 3.1 beta 3 (don’t think this should matter though). and I am trying to run it but am running into this error.

    /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    Could you guide me as to what I am doing wrong?

    Thanks :)
    Ankit

  16. Ankit Gupta Says:

    HI,

    Sorry for the previous comment. I had a previous version of tesseract. I downloaded the new one and the script ran fine. I have included the library libtesseract_full.a in my project and tesseractmain.h in my header. But it says no such file or directory when I try to compile!

    Ankit.

  17. Ankit Gupta Says:

    Sorry about the multiple posts but I started getting this error again.

    /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    I hadn’t downloaded the data files earlier and it worked fine. But when I put the data files in the tessdata directory and reran the script, got this error. Any thoughts on how to get around this?

    Thanks,
    Ankit

  18. Robert Says:

    no, i’m not sure about that particular issue. will the lipo tool work correctly when you remove the data files? i’ve successfully generated the lib with the english language data files in tessdata.

    maybe one of the earlier commenters who have actually used the resulting lib in a project would be able give you some more guidance. i haven’t yet had a chance to use this in a project myself.

  19. Lars Says:

    Hi, I’ve compiled tesseract successfully and included header and framework search paths in xcode,
    This line works both compiling and linking:
    TessBaseAPI *tess = new TessBaseAPI();
    but when I add this line:
    static int result = tess->InitWithLanguage(NULL,NULL,NULL,NULL,false,0,NULL);
    I get a linking error. Unknown symbol. I’ve renamed the source file to .mm

    I’ve tried to find a solution. There are solutions on the net like “make sure your file is included in the project…” and “declare it as extern”, but I can’t get it to work. Has anybody managed to use tesseract in a program yet? Some helping code maybe? or some clues?
    Thanks, Lars.

  20. Stefano Says:

    this is working for me

    - (NSString *)readAndProcessImage:(UIImage *)uiImage {
    CGSize imageSize = [uiImage size];
    double bytes_per_line = CGImageGetBytesPerRow([uiImage CGImage]);
    double bytes_per_pixel = CGImageGetBitsPerPixel([uiImage CGImage]) / 8.0;

    CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider([uiImage CGImage]));
    const UInt8 *imageData = CFDataGetBytePtr(data);

    char* text = TessBaseAPI::TesseractRect(imageData,
    bytes_per_pixel,
    bytes_per_line,
    0, 0,
    imageSize.width, imageSize.height);

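    NSString *result = [NSString stringWithUTF8String:text];
    delete [] text;   // TesseractRect returns a new[]-allocated string
    CFRelease(data);  // balances CGDataProviderCopyData
    return result;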
    }

  21. Robert Says:

    Great! Thanks for that code snippet.

  22. Lars Says:

    I tried it all over again. It works!
    I’ll leave a link to my project – when (or if) I have something useful…

    Thanks,
    Lars.

  23. Kevin Says:

    Lars a link to your project would be fantastic.

    I managed to compile the libtesseract_full.a and add it. Thanks Robert!!
    However I cant figure out how to include “baseapi.h” or other header files without getting a dozen errors

    -Kevin

  24. Kevin Says:

    So in case anyone got stuck on my last question: the .m file that includes “baseapi.h” needs to be renamed to .mm.

    Has anyone had luck installing languages in Tesseract?

    When I call:
    TessBaseAPI::InitWithLanguage(NULL, NULL, NULL, NULL, false, 0, NULL);

    I get the error
    Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset

    I have downloaded the English files and put them into tessdata before compiling tesseract for the iPhone/iPhone simulator. Any ideas?

  25. Robert Says:

    yes, as noted above, any source file which mixes obj-c and c++ needs to have a .mm file extension.

    i’m currently getting the same error with the location of the tessdata folder. if you note the configure process, this location seems to be hardcoded to the prefix variable.

    the library works in the simulator if you install/symlink the tessdata folder to /usr/local/share/tessdata, but of course this doesn’t fly in the iPhone sandbox.

    it seems as though TessBaseAPI::Init() expects the first argument to be the location of the data folder, but including the data files in the app bundle and passing that location to Init() has no effect on the error.

    i’m at a loss for the moment, and the semester has just begun so time is at a premium. i welcome any other suggestion on this matter.

  26. dflynn Says:

    To resolve this error:
    Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset

    Locate the ‘Executables’ node in the xcode project tree.
    Double-click your app executable.
    In the window that opens, choose the ‘Arguments’ tab.
    Add a new Environment Variable called TESSDATA_PREFIX and set it equal to the path on disk preceding the “tessdata” directory. That will get you past the error and allow tesseract to init in the Simulator.

    In order to work on the phone, you need to set TESSDATA_PREFIX to a different environment variable representing the app install directory. The app install directory will need a subdirectory called tessdata containing all the language files.
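
A quick way to sanity-check a candidate value before launching the Simulator is sketched below. It assumes, based only on the error message above, that tesseract looks for <prefix>/tessdata/<lang>.unicharset; the function name check_tessdata is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: confirm that a candidate TESSDATA_PREFIX contains
# tessdata/<lang>.unicharset, the file named in the error message above.
check_tessdata() {
    prefix=${1%/}          # tolerate a trailing slash
    lang=${2:-eng}
    if [ -f "$prefix/tessdata/$lang.unicharset" ]; then
        echo "ok: $prefix/tessdata/$lang.unicharset"
        return 0
    fi
    echo "missing: $prefix/tessdata/$lang.unicharset" >&2
    return 1
}

# Usage: check_tessdata /usr/local/share eng
```

If the check passes for a given directory, that directory is the value to try for TESSDATA_PREFIX in the Arguments tab.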

  27. Kevin Says:

    Thanks dflynn. Your fix works perfectly for the simulator.

    For running on the iPhone:
    I have started looking for this environmental variable that will find the /documents directory. This directory path changes each time the app is installed….

    I can find this location programmatically at run time with:
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *documentsDirectory = [paths objectAtIndex:0];

    However that seems to be useless since we cannot change the TESSDATA_PREFIX environment variable at run time.

    I’ll keep searching and keep you updated. Thanks again.

  28. Kevin Says:

    K I got it working on the iPhone.
    Turns out you can change the Environment Variable within the code.

    My Implementation:
    I am creating a tessdata subdirectory in the Applications Directory Folder.
    Then copying the language files from my bundle into that folder.

    Then setting the TESSDATA_PREFIX environment variable at run time with:
    setenv("TESSDATA_PREFIX", [tessDataPath UTF8String], 1);

    And that will do it.

  29. Morten Says:

    I have used the scripts and followed the comments but I can’t get it to work. Can those of you who worked out how to use it please post a short guide describing what you did? Ex. did you change anything in the build_fat.sh file, which properties/settings do you have to change in xcode and which header files should you include in the .m-file?

  30. Laurent Says:

    Hi Everyone

    Has anyone been able to build it with Snow Leopard?
    I am able to build the library on the ARM architecture, include it in my project but impossible to get it to work on the Simulator. I get the error “ld: warning: in /Developer/photoStrip/lib/iPhone/libtesseract_full.a, file is not of required architecture” when I build the App.

    My Setup is Snow Leopard 10.6.2, SDK 3.1, XCode 3.2.

    Any help would be greatly appreciated

    Thanks

  31. Robert Says:

    on your computer, it has likely been built for x86_64. you may have to tinker to get the library to build for i386 explicitly. i noticed this change myself after updating to Snow Leopard. (the instructions were written when I was running 10.5)

    you can check by using otool:
    otool -h libtesseract_full.a

    check the cputype and subtype. look in:
    /usr/include/mach/machine.h

  32. Laurent Says:

    Thanks Robert. I’ll check all this later today and will let you know.

  33. Laurent Says:

    Robert I recompiled clean and ran the otool command and this is what I get:

    otool -h libtesseract_full.a

    Archive : libtesseract_full.a (architecture arm)
    libtesseract_full.a(libtesseract_full.o) (architecture arm):
    Mach header
    magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
    0xfeedface 12 0 0x00 1 3 1248 0x00002000

    Archive : libtesseract_full.a (architecture x86_64)
    libtesseract_full.a(libtesseract_full.o) (architecture x86_64):
    Mach header
    magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
    0xfeedfacf 16777223 3 0x00 1 3 1480 0x00002000

    To do the ‘lipo’ I had to change the ‘-arch i386 libtesseract_full.a.386’ argument to ‘-arch x86_64 libtesseract_full.a.386’. The problem seems to be my CPU type: I should have 7 instead of 16777223, which indicates an unknown value.

    I have 3 machine.h but not where you mentioned: /System/Library/Frameworks/Kernel.framework/Versions/A/Headers/mach /Developer/SDKs/MacOSX10.6.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/mach /Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/mach
    The first 2 are exactly the same.
    The 10.5 is an older version from 3/25/08.

    I checked the 10.6 and this is what I have as CPU type and subtype (the 10.5 looks exactly the same).

    #define CPU_ARCH_MASK 0xff000000 /* mask for architecture bits */
    #define CPU_ARCH_ABI64 0x01000000 /* 64 bit ABI */
    /* * Machine types known by all. */
    #define CPU_TYPE_ANY ((cpu_type_t) -1)
    #define CPU_TYPE_VAX ((cpu_type_t) 1)

    #define CPU_TYPE_MC680x0 ((cpu_type_t) 6)
    #define CPU_TYPE_X86 ((cpu_type_t) 7)
    #define CPU_TYPE_I386 CPU_TYPE_X86 /* compatibility */
    #define CPU_TYPE_X86_64 (CPU_TYPE_X86 | CPU_ARCH_ABI64)

    #define CPU_TYPE_ARM ((cpu_type_t) 12)


    #define CPU_SUBTYPE_VAX_ALL ((cpu_subtype_t) 0)
    #define CPU_SUBTYPE_VAX780 ((cpu_subtype_t) 1)
    #define CPU_SUBTYPE_VAX785 ((cpu_subtype_t) 2)
    #define CPU_SUBTYPE_VAX750 ((cpu_subtype_t) 3)
    #define CPU_SUBTYPE_VAX730 ((cpu_subtype_t) 4)
    #define CPU_SUBTYPE_UVAXI ((cpu_subtype_t) 5)
    #define CPU_SUBTYPE_UVAXII ((cpu_subtype_t) 6)
    #define CPU_SUBTYPE_VAX8200 ((cpu_subtype_t) 7)
    #define CPU_SUBTYPE_VAX8500 ((cpu_subtype_t) 8)
    #define CPU_SUBTYPE_VAX8600 ((cpu_subtype_t) 9)
    #define CPU_SUBTYPE_VAX8650 ((cpu_subtype_t) 10)
    #define CPU_SUBTYPE_VAX8800 ((cpu_subtype_t) 11)
    #define CPU_SUBTYPE_UVAXIII ((cpu_subtype_t) 12)

    Do you have any idea why I get this unknown CPU type? Any suggestion?

    Thanks – Laurent

  34. Robert Says:

    Laurent:

    I *believe* that cputype 16777223 is x86_64. The build script above uses the native cpu to do the compile for the simulator, which is the correct output for your computer, OS and XCode. However, it seems as though the simulator needs i386.

    You’ll have to add CFLAGS and CPPFLAGS variables with -arch i386 before the second configure (for the simulator). Try this:
    # Set up relevant environment variables
    export CPPFLAGS="-arch i386"
    export CFLAGS="$CPPFLAGS"
    export CPP="$CPPFLAGS"
    export CXXFLAGS="$CFLAGS"

    Also, some information I wish I had read a long time ago:
    http://developer.apple.com/mac/library/documentation/Darwin/Conceptual/64bitPorting/building/building.html

  35. Laurent Says:

    Hi Robert

    Unfortunately same results. I think CPU 16777223 means unknown but not 100% sure. I will keep on playing with all these flags but if anyone has a recommendation that would be great

    Thanks

  36. Laurent Says:

    Robert, I opened a thread on the iPhone developer’s forum and a guy from Apple gave me this first answer:

    “CPU type 16777223 is x86_64. (See mach/machine.h.) The Simulator only uses 32-bit i386. You need to configure your build to use i386.
    Note that recent compiler versions default to x86_64 on 64-bit machines. Make sure you’re not using the default anywhere.”

    I asked him more information on the configure option but you were right on CPU type 16777223. It does not mean ‘unknown’ but X86_64.
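
The numbers reported by otool -h can be decoded from the machine.h definitions quoted earlier in the thread: CPU_TYPE_X86_64 is CPU_TYPE_X86 (7) with the CPU_ARCH_ABI64 bit (0x01000000) set. A small sketch, using the hypothetical function name cputype_name and covering only the three types that come up here:

```shell
#!/bin/sh
# Hypothetical helper: map the cputype column printed by `otool -h`
# to an architecture name, using the values from mach/machine.h.
CPU_ARCH_ABI64=$((0x01000000))
CPU_TYPE_X86=7
CPU_TYPE_ARM=12
CPU_TYPE_X86_64=$((CPU_TYPE_X86 | CPU_ARCH_ABI64))

cputype_name() {
    case $1 in
        "$CPU_TYPE_X86")    echo i386 ;;
        "$CPU_TYPE_ARM")    echo arm ;;
        "$CPU_TYPE_X86_64") echo x86_64 ;;
        *)                  echo "unknown ($1)" ;;
    esac
}

# Usage:
#   cputype_name 16777223    # prints x86_64
```

For the fat library to work on both the device and the Simulator, the two slices should decode to arm and i386.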

  37. Rob Says:

    Hi,
    I hope someone can help me out because this is driving me insane…
    I believe I’ve compiled the library so that it can run on both the phone and the simulator, having tweaked the build script when it refused to work due to i386/x86_64 issues, and then tweaked it again when it refused to work with OS X 10.6 due to fopen$UNIX2003 problems… but I’m stuck actually trying to get it to run.

    Sorry if I’m being dumb, but when I call TessBaseAPI::SimpleInit([documentsDirectory UTF8String], “eng”, false); the programs crashes on that line with exit code 1. Is this because it can’t find the data files? or am I doing something stupid?

    Thanks in advance, and thanks for the original build script

  38. Robert Says:

    yes, there’s an issue with the paths…look at a previous comment for setting an environment variable in Xcode for the tess data path.

  39. Rob Says:

    Thanks Robert, it’s working perfectly now.
    Excellent blog post.

  40. Jan Says:

    Hello, I don’t understand which files I have to include. I have compiled the static library and I want to use it. Can you help me please?

  41. Robert Says:

    you need to include: libtesseract_full.a library, baseapi.h and the tessdata folder with the language files.

    the above comments have snippets for setting an environment variable for the data folder, initializing the tesseract engine and processing an image.

  42. Laurent Says:

    Hi Rob

    Would you mind posting your script? It seems you figured out how to get this cross-compiling working with both the simulator and the device. I am still stuck with the Simulator.

    Thanks

  43. Jan Says:

    Thanks Robert for the new Script it works perfect.
    I do the following things in a .mm(rename in XCode) file and get many errors:
    #include “libtesseract_full.a”
    Should this work?

    Thanks

  44. Robert Says:

    @Jan…you add libtesseract_full.a to the Frameworks group in XCode, rather than as an include in your source files.

    you’ll need to add #include “baseapi.h” to your code, however…and drag baseapi.h from the tesseract source folder into your XCode project. you need to also add the tessdata folder to the project.

  45. Jan Says:

    Thank you so much. It works.

  46. Mart Says:

    Hi Robert, thanks for a great write up!
    I have managed to compile and include the libtesseract_full.a, but when I try to use the API with Stefano’s code above my app crashes without a message.
    A few questions:
    - In the code above it looks like Stefano calls TesseractRect(const UInt8 *, double, double, int, int, float, float), but the method signature says: TesseractRect(const unsigned char*, int, int, int, int, int, int). Could it possibly work anyway? Tried to typecast all variables to int , but still no luck.
    -Could the app crash because of a faulty tessdata directory reference?
    Thankful for any help.

  47. Robert Says:

    @mart yes, absolutely..look above for specific details about including and referencing the tessdata folder. also, you can look at my updated link at the top of the page for obj-c++ code snippets for invoking tesseract.

  48. dreampowder Says:

    hi there, is it possible that someone can send me a compiled tesseract static library via e-mail? i am new to obj-c and the xcode environment and i don’t know how to implement all those scripts and code shown above.

    my computer is macbook 2.1 with snow leopard 10.6. i’d be grateful if someone can send me the library to

    ‘coskun.serdar-at-gmail.com’.

  49. Scott Says:

    I got the OCR code all to work but am I crazy or are pictures taken using an iPhone not recognized well at all. I have a 3g and if I take a picture and run it through, for the most part comes back with garbage. Now if I run through it some text at say 20 point that I capture on my Mac, it works much better. Does one need the resolution of the 3gs to make this useful?

  50. Robert Says:

    the autofocus lens on the 3GS helps greatly. there is an external lens for the iPhones without an autofocus lens which is supposed to help the macro focus.

  51. exploration » Blog Archive » OCR on iPhone demo Says:

    [...] around to building a proof of concept implementation of tesseract-ocr for the iPhone. months ago, i documented the steps which helped to get the library cross-compiled for the iPhone’s ARM processor, and [...]

  52. Stéphane Tavera Says:

    Hi everyone,
    Robert, congrats for the hard work done and to share this.
    I was able to compile the library, and added libtesseract_full.a and baseapi.h to the demo iPhone project.
    My config : snow leopard, 10.6.2, xcode 3.2.1
    In build info, I changed the “Mac OS X Deployment Target” to “10.6”
    However, I get an error at build (see trace)

    Anyone can help ?
    thanks in advance !

    trace :

    Ld build/Release-iphoneos/OCR.app/OCR normal armv6
    cd /Users/st/Pocket-OCR
    setenv IPHONEOS_DEPLOYMENT_TARGET 3.1.2
    setenv MACOSX_DEPLOYMENT_TARGET 10.6
    setenv PATH “/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin:/Developer/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin”
    /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/g++-4.2 -arch armv6 -isysroot /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS3.1.2.sdk -L/Users/st/Pocket-OCR/build/Release-iphoneos -L/Users/st/Pocket-OCR -L/Users/st/Pocket-OCR/../../tesseract-ocr-svn/api -L/Users/st/Pocket-OCR/../../tesseract-ocr-svn -L/Users/st/Pocket-OCR/../../tesseract-2.04/libraries -F/Users/st/Pocket-OCR/build/Release-iphoneos -filelist /Users/st/Pocket-OCR/build/OCR.build/Release-iphoneos/OCR.build/Objects-normal/armv6/OCR.LinkFileList -mmacosx-version-min=10.6 -dead_strip -miphoneos-version-min=3.1.2 -framework Foundation -framework UIKit -framework CoreGraphics -ltesseract_full -framework MessageUI -o /Users/st/Pocket-OCR/build/Release-iphoneos/OCR.app/OCR

    ld: library not found for -ltesseract_full
    collect2: ld returned 1 exit status
    Command /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/g++-4.2 failed with exit code 1

  53. Robert Says:

    Check the “Link Binary with Libraries” section of the OCR Target of the Xcode project. I had it included relative to the project path on my development machine. Ensure that your version of the libtesseract_full is included there (it should be displayed in black, rather than red text)

  54. Stéphane Tavera Says:

    About previous post.
    I just had forgotten to update SDKROOT like this :
    export SDKROOT=$DEVROOT/SDKs/iPhoneOS3.1.2.sdk
    and everything works ;-)
    Once again, congrats and many thanks for sharing.
    Isn’t a keynote supposed to begin ;-) ?

  55. elninom Says:

    I ran “./configure”, “make” and “sudo make install”,
    then ran “./build_fat.sh”, but I got this error:

    /usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    Do you know how to solve this error?

  56. Robert Says:

    Have you edited the build_fat script to point to your installation of the iPhone SDK?
    I believe that the error message indicates that the library was compiled for intel rather than arm.

    Also, when using the build script, you don’t need to run configure and make…it does those things twice: once for arm (iPhone SDK) and again for the host architecture (generally intel, either i386 or x86_64).

    Is this for tesseract? Do you have the iPhone SDK installed?
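
In mach/machine.h, cputype 7 is x86 and 12 is ARM, so the lipo message says the file saved as .arm was in fact built for Intel. Two things the first pass depends on can be checked before configure runs: that SDKROOT points at an installed SDK, and that the ARM compiler named in the script exists. A minimal sketch; check_sdk is a hypothetical name, and the compiler path is the one the script already uses.

```shell
#!/bin/sh
# Hypothetical check: make sure the SDK directory and the ARM compiler
# that build_fat.sh refers to actually exist before configure runs.
check_sdk() {
    devroot=$1
    sdkroot=$2
    cc=$devroot/usr/bin/arm-apple-darwin9-gcc-4.0.1
    if [ ! -d "$sdkroot" ]; then
        echo "SDK not found: $sdkroot" >&2
        echo "installed SDKs:" >&2
        ls "$devroot/SDKs" >&2 2>/dev/null
        return 1
    fi
    if [ ! -x "$cc" ]; then
        echo "ARM compiler not found: $cc" >&2
        return 1
    fi
    return 0
}

# Usage inside the script, after DEVROOT and SDKROOT are exported:
#   check_sdk "$DEVROOT" "$SDKROOT" || exit 1
```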

  57. elninom Says:

    @Robert Thank you for replying.
    I’ve installed (re-installed again) and tried on iPhone SDK 3.1.3 and 3.2 beta. Also, tried to compiling tesseract 2.04 and 3.0 svn.
    I checked and corrected the directory and SDK path. I entered only ./build_fat.sh command but still no luck. It gives same and weird error.

    ./configure: line 1965: test: /developer/pocket: binary operator expected
    ./configure: line 1968: test: /developer/pocket: binary operator expected
    checking whether build environment is sane… yes
    /bin/sh: /developer/pocket: No such file or directory
    configure: WARNING: `missing’ script is too old or missing
    checking for a thread-safe mkdir -p… config/install-sh -c -d
    checking for gawk… no
    checking for mawk… no
    checking for nawk… no

    make[4]: Nothing to be done for `all-am’.
    make[3]: Nothing to be done for `all-am’.
    make[2]: Nothing to be done for `all-am’.
    /usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    I’ve been stuck at this point for 5 days and have tried everything. Please help me elninomelninom@gmail.com

  58. Robert Says:

    Do you have spaces in your path?
    i.e. /developer/pocket (is there more to the path here…?)

    Try removing (or escaping) spaces in your path.
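
The “binary operator expected” lines are what `test` prints when an unquoted variable expands to more than one word, which is what a space in the path causes. A minimal sketch of a pre-flight check, assuming it is run from the source directory before the build script; `path_has_space` is a hypothetical helper.

```shell
#!/bin/sh
# Returns 0 if the path contains a space, which breaks unquoted uses in configure.
path_has_space() {
  case "$1" in
    *" "*) return 0 ;;
    *)     return 1 ;;
  esac
}

if path_has_space "$PWD"; then
  echo "warning: '$PWD' contains a space; move the source tree or the build may fail" >&2
fi
```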

  59. elninom Says:

    I don’t have /developer/pocket path or pocket directory.
    How can I make it?

  60. hytgbn Says:

    Hi, I was using the tesseract library with Leopard 10.5 and Xcode 3.1.x (maybe 3.1.4).

    It works very well on the simulator and the device, cool.

    First of all, I appreciate your blog, which helped me compile it in my environment.

    But I updated OS X a few days ago and installed Xcode 3.2.

    After the update I get an error while linking.

    It says it cannot find some objects which are referenced from some .o file.

    The GOMP library and the fopen and fdopen functions are the cause.

    After updating Xcode from 3.1 to 3.2, is there some framework I must add?
    Or is there an option I have to declare?
    Or do I have to compile it again? (Actually, I compiled it again and again :( )

    If you know something, please give me a clue.

    Thank you.

  61. Robert Says:

    Without the specific error log it’s difficult to tell. I’m using 10.6 with XCode 3.2 and it all works fine. There are two versions of the build_fat script…one for 10.5, another for 10.6.
    Maybe try recompiling the tesseract library with the 10.6 version of the build script.

    Are you using your own iPhone project or the Pocket OCR project?

  62. Wilson Says:

    Robert, I set out a few days ago to take Tesseract for a test drive and your posts/sample code have been very helpful (Thx!). I’ve managed to get the Pocket OCR project to run on the Simulator, but I get the dreaded “libtesseract_full.a file is not of the required architecture” error when I try to build & run for my iPhone 3Gs. I’m not an XCode veteran and unfortunately couldn’t implement the fix you provided above on 11/24/09 (re: adding some new environment variables). Setting these environment variables using the “env” command didn’t seem quite right. Would you mind giving me the “for dummies” version of this fix? Thx in advance.

    I’ve included some potentially relevant info about my setup below:
    - Running Snow Leopard 10.6.2, Using SDK 3.1 & XCode 3.2
    - Running otool -h on libtesseract_full.a yields cputype of 7 and cpusubtype of 3 (note: a review of my /usr/include/mach/machine.h file revealed cputype 7 defined as CPU_TYPE_X86 and cpusubtype 3 as either CPU_SUBTYPE_X86_ALL, CPU_SUBTYPE_X86_64_ALL, CPU_SUBTYPE_386, or CPU_SUBTYPE_I386_ALL)

  63. Robert Says:

    The output from otool indicates that it’s not a fat library, but one built only for your native computer. Also, there may be two libtesseract_full.a files. Look for the one in the “lnsout” directory inside the tesseract folder. If that folder doesn’t exist, run the build_fat script again and look for it.
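
One way to tell the two files apart is `lipo -info`, which lists the architectures in a library. The sketch below parses that output; the helper names are hypothetical, and the exact name of the arm slice (arm, armv6, armv7) depends on how the library was compiled, so only the i386 slice is checked here.

```shell
#!/bin/sh
# Extract the architecture list from one line of `lipo -info` output.
# Handles both the "fat file ... are: a b" and "Non-fat file ... is architecture: a" forms.
archs_from_lipo_info() {
  echo "$1" | sed -e 's/.*are: *//' -e 's/.*is architecture: *//' -e 's/ *$//'
}

# Returns 0 if the wanted architecture appears in the list.
has_arch() {
  for a in $(archs_from_lipo_info "$1"); do
    [ "$a" = "$2" ] && return 0
  done
  return 1
}

# On a Mac with the developer tools installed (library path from this thread):
if command -v lipo >/dev/null 2>&1 && [ -f lnsout/libtesseract_full.a ]; then
  info=$(lipo -info lnsout/libtesseract_full.a)
  echo "$info"
  has_arch "$info" i386 || echo "simulator slice missing" >&2
fi
```

A library that reports a single architecture is a thin build and will only link for that one target.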

  64. Wilson Says:

    Robert, you were right… I had the wrong libtesseract_full.a included in my project (I was using the one in /ccmain instead of the one in /lnsout). Tesseract is now working on my iPhone! Thx again for your help!

  65. S Woodside Says:

    Very interesting! I’ve tried it out and it’s a bit slow … do you have any idea on how to make it faster? Did you look at that at all?

  66. kazuar Says:

    Hello,

    Thanks for this great article.

    I’m getting the same “libtesseract_full.a file is not of the required architecture” error that everyone keeps getting. I would like to know where the problem is.
    Please see the following details:
    1) I run Snow Leopard 10.6.4 with XCode 3.2.3 and the iPhoneOS4 SDK.
    2) I’ve changed the reference to the iPhone SDK in the script.
    3) I also changed darwin9 to darwin10 everywhere.

    I still get the same message: /usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0)).

    Another thing I’ve noticed is that while this message appears, two files are created in the lnsout directory:
    1) libtesseract_full.a.arm
    2) libtesseract_full.a.i386
    Both of them have the same size: 3.6MB

    Does this mean that I can ignore this message? Is there anything else I can check?

    Any help would be appreciated.

    Thanks,
    Kazuar

  67. kazuar Says:

    Hello again,

    I’ve actually succeeded compiling tesseract with another tutorial that I’ve found here:
    http://iphone.olipion.com/cross-compilation/tesseract-ocr

    I only had to make small changes.
    When I finished compiling, I got libtesseract_full.a.
    I renamed it to libtesseract_full.a.arm and then ran the lipo command with the i386 file I got from the initial compile.
    Now I have libtesseract_full.a with no problems.

    I hope it will work in a real project.

    I will let you know.

    Thanks,
    Kazuar

  68. Avicene Says:

    @kazuar were you able to compile the library on Mac OS 10.6 and iOS4?
    I tried to use the steps in this blog but arm-apple-darwin9 compilers are nowhere to be found.
    I am working on this script to make it work for Mac OS 10.6 and iOS4.
    I don’t want to reinvent the wheel, so if anybody is aware of similar work done elsewhere, please let me know.
    Thanks.

  69. Damian Says:

    kazuar, did you get it to run on OS4? I can’t get it to run. If you did, please send me your script, because this is driving me mad.
    damcho@gmail.com
    thanks

  70. Robert Says:

    I have Pocket OCR running on my iPhone 4 (iOS 4 of course). I believe that I compiled it using the 3.x SDK still, however.

  71. Amol Says:

    Hello Robert,

    Thanks for your awesome script.

    But I have a few questions. Do I need libtiff, jpeglib and all the other libraries mentioned on the tesseract-ocr site to read compressed images? Or has the script done all the necessary things for us?

    I know i might sound naive, but please do reply.

    -Amol

  72. Robert Says:

    The script builds tesseract quite simply, with only what is provided in the source distro. However, the iPhone SDK and PocketOCR app send raw image data to the engine and therefore don’t need the external dependencies.

  73. JohnDoe Says:

    I cant find /usr/local is there such a directory?

  74. Robert Says:

    In OS X (or any *nix-like operating system), there will be a /usr dir, and most likely /usr/local as well. This dir (like other usual *nix dirs) is hidden by the OS X Finder.

    You can get there by using the “Go to folder…” function in the Finder: Command+Shift+G

    Of course, you can also cd into it using the Terminal.

    Info about the filesystem: http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

  75. Patrick Says:

    Hi Robert,

    Your script is wonderful. Thanks for the great help there.

    I have no problems running the app on my simulator but when I load it into my iphone, it will terminate itself with the following error message:

    Unable to load unicharset file /var/mobile/Applications/1635E488-7449-4F87-BD20-A7CEF5E357A0/Documents/tessdata/eng.unicharset

    Usually when i encounter this problem in the simulator, I will delete the application on the iphone simulator and/or restart Xcode and rebuild & install the app which will do the trick. But for the iPhone device, it seems that i do not have a Documents/tessdata directory instead it is AppName.app/tessdata.

    How can I solve this problem? Thanks in advance!

  76. Robert Says:

    In Pocket OCR the tessdata folder is included in the app bundle and copied to the app’s Documents dir on the first run. Note that recently I updated the project on Github for using tesseract 3.0 and made a change to the name of the tessdata folder that it’s looking to copy (it’s now looking for “tessdata-svn” in the app bundle, and by extension, that is what is included in the Xcode project as a folder reference).

  77. Patrick Says:

    Hi Robert,

    To make things clearer, I have been using the older version of the scripts with tesseract 2.04 and not tesseract 3.0. There are no issues running on the simulator, but it is unable to run on the device. What I did was adapt PocketOCR’s viewDidLoad section into my code.

    BTW is there any way i can access the Application directory of the iPhone app on Mac computer? I want to take a look at what’s inside that folder. Thanks!

    Cheers

  78. Robert Says:

    Patrick,

    I was noting that if you’ve recently cloned the github version of PocketOCR then I’ve modified it to work with tesseract 3.0 and changed the tessdata folder name to tessdata-svn (completely arbitrary…and probably shouldn’t have committed that change, but there’s that).

    The iPhone Simulator apps live in your home directory:
    ~/Library/Application Support/iPhone Simulator//Applications//

  79. Patrick Says:

    Nope, I’m using the older version which I took some time back when it was still on tesseract 2.04. PocketOCR is able to run on my device smoothly, no problems. But when I cloned some parts into my app, it doesn’t run on my device.

    Prompting with that error msg for device:

    1. Unable to load unicharset file /var/mobile/Applications/1635E488-7449-4F87-BD20-A7CEF5E357A0/Documents/tessdata/eng.unicharset

    Whereas this happens only on simulator:

    2. Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset

    Sorry if it seems like I’m repeating. But if anyone encounters the second error, I would suggest to set environment in code, remove the app from simulator, clean and build project again. This works for me.

    For the first error, I’m still working on it. hopefully I can find it soon.

  80. Robert Says:

    The path to the tessdata dir is hardcoded into the lib at compile time but can be overridden at run time by an environment variable. It sounds like you missed this bit when “cloning” some of the code…

    …erm, you mention setting the env var…so then i’m not sure about the second error.

    the first error seems as though the tessdata dir isn’t getting copied from the app bundle…

  81. Patrick Says:

    Robert,

    yes, it failed to copy tessdata dir from app bundle over to application document dir.

    Now, i’m trying to figure out what happened.

  82. Patrick Says:

    I found the problem: it was how I imported the tessdata folder into my application bundle. Instead of “Recursively create groups for any added folders”, select “Create Folder References for any added folders”, and check “Copy items into destination group’s folder (if needed)”.
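
The difference between the two import options shows up in the built product: with folder references, tessdata exists as a real subdirectory of the .app bundle; with groups, the files are flattened into the bundle root. A rough sketch of a check on the build output; `bundle_has_tessdata` is a hypothetical helper and the bundle path in the comment is taken from the build log earlier in this thread.

```shell
#!/bin/sh
# Returns 0 if the app bundle contains tessdata as a real subdirectory
# holding at least one eng.* file; returns 1 otherwise.
bundle_has_tessdata() {
  app=$1
  [ -d "$app/tessdata" ] || return 1
  for f in "$app"/tessdata/eng.*; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# Example:
# bundle_has_tessdata build/Release-iphoneos/OCR.app || echo "tessdata folder missing from bundle" >&2
```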

  83. Bob M Says:

    HI Robert,

    After a few tweaks I managed to get the app running in Xcode device debugging mode. Thanks for such a great time saver. What I think I’m seeing is that in the threadedReadAndProcessImage method of the OCRDisplayViewController class, when the image data is obtained from the input UIImage, the bytes_per_pixel local var value is 2, i.e. 16 bpp. The comments for the TesseractRect method of the baseAPI class state “Greyscale of 8 and color of 24 or 32 bits per pixel may be given.” I’m using a clean text image (not the camera) for input. I thought at first perhaps the camera image quality was the issue. The image gets processed, but the output text is clearly not what is in the image. I’m wondering if it’s the bpp value of 16 for the input data. Has anyone else seen anything similar? Could the bpp be the problem? Any insight would be greatly appreciated.
    Thanks

  84. Robert Says:

    That’s very interesting and could certainly be an oversight on my part. Pocket OCR was a proof of concept for one of my first forays into cross-compiling for iPhone and was also one of my first iOS apps.

    Once it was running and successfully recognized text in photos I was taking (I promise it works, at least sometimes :) ) I hadn’t really looked into optimization or into training tesseract, either.

    I can certainly look into the bpp issue; otherwise, let us know if you get a chance to tinker with it first….

  85. Bob M Says:

    It appears that the original image returned by the image picker returns a bpp of 16 even though the image ‘picked’ was saved to the phone as a .png. (screen snapshot) RGB color model, which tells me that should be 24 bpp which according to baseapi is valid. Strange to say the least.
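
The arithmetic in question is small enough to spell out. The sketch below only restates the baseapi comment quoted above (8, 24 or 32 bits per pixel) and the division by 8 that yields bytes_per_pixel; both function names are invented for the example, and it makes no claim about why the picker returns 16 bpp.

```shell
#!/bin/sh
# Bytes per pixel from bits per pixel, as in the bytes_per_pixel calculation discussed above.
bytes_per_pixel() {
  echo $(( $1 / 8 ))
}

# Returns 0 if the bit depth is one the quoted TesseractRect comment lists as accepted.
bpp_supported() {
  case "$1" in
    8|24|32) return 0 ;;
    *)       return 1 ;;
  esac
}

for bits in 8 16 24 32; do
  if bpp_supported "$bits"; then
    echo "$bits bpp ($(bytes_per_pixel "$bits") bytes/pixel): listed as accepted"
  else
    echo "$bits bpp ($(bytes_per_pixel "$bits") bytes/pixel): not listed"
  fi
done
```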

  86. jose Says:

    Hi,

    I’m really new to iPhone; my background is in Java development. I want to build the OCR lib using your script configured for my system, but I don’t understand how to run it. Sorry, I know it’s a really basic question, but could you give an explanation for dummies? I understand that I have to modify LIBFILE, but I don’t know with what, or how it ties into the script. Basically I need help.

    thanks a lot and excuse my bad english

  87. jose Says:

    Hi, I just compiled tesseract version 3. Everything works OK and I can execute tesseract on the command line (which tells me it’s installed), but it doesn’t copy any compiled file to folderTesseract/ccmain/. For version 2 the file is called libtesseract_full.a and for version 3 I think it’s called libtesseract_api.a

    anyway there’s no file copied to ccmain which I need to run the ./build_fat

    any ideas? do I have to edit the makefile of the tesseract? is the file anywhere else? I can’t figure out what it is…. Any ideas will be more than welcome as I’m lost right now!

    regards,

    jose

  88. Robert Says:

    Check out the newer blog post which has an updated build script for tesseract v3.

  89. Felix Says:

    I have been able to build the library and run the example, but most of the time it doesn’t correctly recognize the text in the image. For instance, when I take a screenshot of an email on an iPhone 3GS, it works more or less.
    Using the camera doesn’t work; the text is not recognized. I am wondering if it is necessary to include something to train the OCR, or similar…

    Thanks!

  90. apekshit sharma Says:

    Hello, I downloaded the example and am running it with the iOS 4.0 SDK. The application launches successfully. I tap the Start Camera button, then capture an image containing text from the camera view. But when I tap the USE button, an “EXC_BAD_ACCESS” message occurs and the application crashes. I debugged it and found that the error occurs in the method - (NSString *) ocrImage: (UIImage *) uiImage, on this line:
    char* text = tess->TesseractRect(imageData,(int)bytes_per_pixel,(int)bytes_per_line, 0, 0,(int) imageSize.height,(int) imageSize.width);
    Can you please tell me what the problem is? Thanks in advance.

  91. CSoft Says:

    I used your script to compile the source code and ran into some problems. Can you help me fix them? Thanks very much.

    Making all in vs2008
    Making all in dlltest
    make[3]: Nothing to be done for `all’.
    Making all in include
    Making all in leptonica
    make[4]: Nothing to be done for `all’.
    make[4]: Nothing to be done for `all-am’.
    make[3]: Nothing to be done for `all-am’.
    make[2]: Nothing to be done for `all-am’.
    cp: ccmain/libtesseract_full.a: No such file or directory
    /usr/bin/lipo: can’t open input file: lnsout/libtesseract_full.a.arm (No such file or directory)

  92. Karl Says:

    Robert, taking the time to answer everyone here and providing source code to such a superb demonstration, all I can say is that you are the King!
    Thank you so much!

  93. Jes Says:

    First off, thanks for what is, for me, a very important insight into cross-compiling and Xcode usage of tesseract.
    I wonder if any of you have some insight into how to use, for example, the leptonica command pixConvertTo8(“image.tif”); directly in the following?
    I mean how to preprocess an image without having to save it to a file and then open that resulting file as a UIImage…

    In other words:
    is there a way to “transfer” the raw image data from UIImage -> PIX (see pix.h from the leptonica library) and back from PIX -> UIImage

    - (NSString *)readAndProcessImage:(UIImage *)uiImage {

        // preprocess UIImage here with fx: pixConvertTo8();

        CGSize imageSize = [uiImage size];
        double bytes_per_line = CGImageGetBytesPerRow([uiImage CGImage]);
        double bytes_per_pixel = CGImageGetBitsPerPixel([uiImage CGImage]) / 8.0;

        CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider([uiImage CGImage]));
        const UInt8 *imageData = CFDataGetBytePtr(data);

        char* text = TessBaseAPI::TesseractRect(imageData,
                                                bytes_per_pixel,
                                                bytes_per_line,
                                                0, 0,
                                                imageSize.width, imageSize.height);

        return [NSString stringWithUTF8String:text];
    }

  94. jes Says:

    I solved the above. Now I am stuck after providing the Init method with the name of my own traineddata file, like so:
    tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "sid");
    even though all my language files plus sid.traineddata were correctly copied from the bundle to the iPhone simulator folder.

    I still get this weird error:
    actual_tessdata_num_entries_ kMaxNumTessdataEntries);
    if (swap) {
    actual_tessdata_num_entries_ = reverse32(actual_tessdata_num_entries_);
    }
    ASSERT_HOST(actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES);
    fread(offset_table_, sizeof(inT64),
    actual_tessdata_num_entries_, data_file_);
    if (swap) {
    for (i = 0 ; i < actual_tessdata_num_entries_; ++i) {
    offset_table_[i] = reverse64(offset_table_[i]);
    }
    }
    if (global_tessdata_manager_debug_level) {
    tprintf("TessdataManager loaded %d types of tesseract data files.\n",
    actual_tessdata_num_entries_);
    for (i = 0; i < actual_tessdata_num_entries_; ++i) {
    tprintf("Offset for type %d is %lld\n", i, offset_table_[i]);
    }
    }

    any insights?

  95. Robert Says:

    For the benefit of all else, what did you do to solve the prior issue?

  96. jes Says:

    argh, i didnt mean to double post. Here is the error I get when i run this line: tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], “sid”);

    Error: actual_tessdata_num_entries_ PIX and back problem by using a file as a transition (works ok for now). i’ll post back any more elegant solution on that i get running.
    ill post some code if necessary. but it’s plain save uiimage – perform processing – save the processed image – load the saved image to uiimage…

  97. jes Says:

    hmm… the code snippet broke the comment due to a less than or equal sign! last try .
    Here is the error I get when i run this line: tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], “sid”);

    Error: actual_tessdata_num_entries_ “less than or equal” TESSDATA_NUM_ENTRIES:Error:Assert failed in file tessdatamanager.cpp line 55
    Program received signal: “EXC_BAD_ACCESS”

    By the way, i solved the UIImage -> PIX and back problem by using a file as a transition (works ok for now). i’ll post back any more elegant solution on that i get running.
    ill post some code if necessary. but it’s plain save uiimage – perform processing – save the processed image – load the saved image to uiimage…

  98. zinuk Says:

    Hi Robert, I am using SDK 4.3 and the device runs 4.0.
    The Tesseract engine works only in the simulator, but when I tried to build, it showed me this error:
    Error openning data file /var/mobile/Applications/EA00CB9B-5E99-49E2-BC8D-575217EC4361/Documents/tessdata/eng.traineddata

    But I placed eng.traineddata in the simulator folder. I can’t understand what is wrong here. Please help me.

  99. zinuk Says:

    Sorry to repeat myself. I have succeeded in building PocketOCR for iOS 4.3 and a device running 4.0.
    When I make a clone and try to build it, it works fine in my simulator but gives an error like this:
    /var/mobile/Applications/EA00CB9B-5E99-49E2-BC8D-575217EC4361/Documents/tessdata/eng.traineddata

    Moreover, I just placed eng.traineddata in this directory:
    ……/Applications/EA00CB9B-5E99-49E2-BC8D-575217EC4361/Documents/tessdata/eng.traineddata
    I can’t understand whether it needs any other files along with eng.traineddata, or how to set up those files for tessdata-svn.
    Please, can anybody help me?

  100. Robert Says:

    I put the eng file in the tessdata dir within the Xcode project. It should be added to the app target’s copy resources step, and will be included in the app bundle. On the first run of the app, it will try to copy the eng file from the app bundle to the Documents dir.

    Try including the eng file in the project rather than copying it manually (which you can’t do on a device)
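
The first-run copy described above follows a simple pattern: do nothing if the data is already in Documents, fail if the bundle has no tessdata folder, otherwise copy it across. The app does this in its own code; the shell function below is only a stand-in for that logic with a hypothetical name and hypothetical directory arguments.

```shell
#!/bin/sh
# Stand-in for the first-run copy: bundle dir -> Documents dir.
copy_tessdata_once() {
  bundle=$1
  docs=$2
  [ -d "$docs/tessdata" ] && return 0      # already copied on an earlier run
  [ -d "$bundle/tessdata" ] || return 1    # nothing to copy: folder missing from the bundle
  mkdir -p "$docs" && cp -R "$bundle/tessdata" "$docs/"
}

# Example with placeholder directories:
# copy_tessdata_once ./OCR.app ./Documents || echo "tessdata not in bundle" >&2
```

The failure branch corresponds to the “Unable to load” and “Error openning data file” reports in this thread: if the folder never made it into the bundle, there is nothing for the first run to copy.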

  101. Nick Says:

    Thank you for this wonderful project and your continuing support Robert.

    I successfully compiled and ran the project on OSX 10.7 with XCode 4.1 and iOS 4.3.

    In addition to the changes already listed in the discussion, I had to add “thresholder.h” to the project from the tesseract source code and remove “armv7” from “valid architectures” in Build Settings.

    Thanks again!

  102. Florin O. Says:

    Hi,

    Thanks for your sample project and build script. I didn’t manage to build with it.

    I built separate frameworks with

    export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
    export SDKROOT=/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS4.2.sdk/
    export CC=$DEVROOT/usr/bin/gcc
    export LD=$DEVROOT/usr/bin/ld
    export CPP=$DEVROOT/usr/bin/cpp
    export CXX=$DEVROOT/usr/bin/g++
    export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
    export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer

    ./configure --host=arm-apple-darwin10
    make
    ./configure --host=arm-apple-darwin10 --prefix=/mypath

    And from there I included the frameworks from libtesseract_api.a to libtesseract_wordrec.a. Actually I corrected the path for the frameworks in your sample project.

    Everything compiles correctly in XCode, but when it launches the app on my iPod Touch I get
    dyld: Library not loaded: /usr/local/lib/libtesseract_api.3.dylib
    Referenced from: /var/mobile/Applications/0049CF1E-D6B3-4B21-B78D-F8E490030088/OCR.app/OCR
    Reason: image not found

    Any idea what should i include?

    Thanks

  103. Robert Says:

    Check the linking step of the compile. Ensure that you have included the set of static libs in the app’s build target. It looks like the app linked against the dynamic lib on your computer rather than the static lib in the project.
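
To confirm which kind of library the app was linked against, `otool -L` lists the dynamic libraries a binary depends on. A sketch that filters that listing for tesseract entries; `tesseract_dylibs` is a hypothetical helper that reads the otool output on stdin.

```shell
#!/bin/sh
# Read `otool -L` output on stdin and print any tesseract dynamic libraries it lists.
# An empty result suggests tesseract was linked statically.
tesseract_dylibs() {
  grep -E 'libtesseract[^ ]*\.dylib' | sed -e 's/^[[:space:]]*//' -e 's/ (.*$//'
}

# Example, run on a Mac against the built app binary:
# otool -L build/Release-iphoneos/OCR.app/OCR | tesseract_dylibs
```

If this prints a path under /usr/local/lib, the linker picked up the dynamic library installed on the build machine, which does not exist on the device.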

  104. Matt Says:

    Is there an updated script for LLVM? I dug through this script today but unfortunately running Lion & Xcode4.2 won’t let me run the script. Do you know if an updated version for LLVM exists?

  105. Abdulla Says:

    Hi Robert,

    I see your steps are “To build the tesseract library, download the source code and compile appropriately for the iPhone (arm processor). Add the library to the XCode project and build.”

    I compiled it with the script provided, but I have no idea how to “Add the library to the XCode project”. This is required for debugging, as I am facing some issues.

    I am trying to do this because the PocketOCR git repository does not have the latest Tesseract. Please let me know how to debug the Tesseract library. I’ve built it with external scripts and I don’t know how to make it into an Xcode project.

  106. Robert Says:

    You only need to drag the Tesseract static lib (the .a file) generated by the script into the PocketOCR Xcode project. Ensure that the PocketOCR target is selected in the import menu. You do not need to create an Xcode project for the Tesseract library.

    You’ll also have to include relevant header files from the Tesseract project – which differ based on the version of Tesseract you are using. Finally, you need to include the Tesseract data dir, which contains the language files. These files need to be copied into the app bundle in the appropriate Build Phase of the PocketOCR target.

    Good luck!

  107. guanyu Says:

    Hi, I tried to build the static lib, but when I build in Xcode I get this error:
    ld: warning: ignoring file /android_developer/Pocket-OCR/libtesseract_wordrec_armv7.a, file was built for archive which is not the architecture being linked (i386)
    Undefined symbols for architecture i386:
    “tesseract::TessBaseAPI::End()”, referenced from:
    -[OCRDisplayViewController dealloc] in OCRDisplayViewController.o
    I know my .a file is wrong. I have been troubled by this problem for a long time. Can you send me the .a files? Thank you very much. My email is guanyu7891@gmail.com

  108. chris Says:

    For anyone having issues with eng.unicharset (or noticing that the program seems to crash at SimpleInit with no message):
    If, like me, you have done everything suggested above and it still does not work, I fixed it by doing the following:
    In ‘Build Phases’ go to ‘Copy Bundle Resources’; you may notice that the eng (or whichever lang you added) files are there but listed as
    eng.word-dawg …in AppName/tessdata
    I *think* this means it is taking the files you have in your tessdata folder and just adding them to the main bundle with everything else…so not in a tessdata directory.

    i think this because i fixed it by adding a new entry and in the dialog to choose another file to add i chose ‘Add Other…’ and then added the actual directory ‘tessdata’.
    i chose Create folder references for any added folders (not 100% sure on this it just looked right, sorry i am new to this as well)
    now you will notice that it will add a directory named tessdata and not just the files.
    i think you can delete references to the other files but at this point i want to continue working and not deal with this anymore ;)
