cross-compiling for iPhone dev

Update: Proof-of-concept demo. Also, updated the script for building with the 10.6 SDK.

Update #2: Source code for demo project released.

I recently needed to use an open-source library in an iPhone project. Recalling the earlier work required to compile the libraries needed for openFrameworks, I started looking for a more generic way to build for iPhone development. Thankfully, LateNiteSoft wrote a great article about using a shell script to cross-compile Linux projects, building a Universal Binary with versions for the Simulator and Device.

I configured their provided code snippets to build tesseract-ocr for iPhone, referring to the set-up for freetype and freeimage to fill in some C++ gaps. The library seems to have built correctly; I’ll know for sure when I incorporate it into a project, soon.

To use it, copy the script into the project directory, next to the configure script. For a simple project which generates one monolithic library, edit the LIBFILE variable to reflect the location and name of the library. I’ve only used this for static libraries…other work may be necessary to correctly generate dynamic libraries (however, the iPhone SDK prohibits linking to dynamic libraries, so in this case it seems moot). Run ./build_fat.sh to kick off the process. Look for the compiled libraries in the “lnsout” directory. There’s no error checking, so caveat emptor. :)

Cross-compile shell script follows:

#!/bin/sh

# build_fat.sh
#
# Created by Robert Carlsen on 15.07.2009.
# Updated 6.12.2009 to build i386 (simulator) on an x86_64 platform (10.6 SDK)
# build an arm / i386 lib of standard linux project
#
# adapted from:
# http://latenitesoft.blogspot.com/2008/10/iphone-programming-tips-building-unix.html
#
# initially configured for tesseract-ocr

# Set up the target lib file / path
# easiest to just build the package normally first and watch where the files are created.
LIBFILE=ccmain/libtesseract_full

# Select the desired iPhone SDK
export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneOS3.0.sdk

# Set up relevant environment variables
export CPPFLAGS="-I$SDKROOT/usr/lib/gcc/arm-apple-darwin9/4.0.1/include/ -I$SDKROOT/usr/include/ -miphoneos-version-min=2.2"
export CFLAGS="$CPPFLAGS -arch armv6 -pipe -no-cpp-precomp -isysroot $SDKROOT"
export CPP="$DEVROOT/usr/bin/cpp $CPPFLAGS"
export CXXFLAGS="$CFLAGS"

# Dynamic library location generated by the Unix package
LIBPATH=$LIBFILE.dylib
LIBNAME=`basename $LIBPATH`

export LDFLAGS="-L$SDKROOT/usr/lib/ -Wl,-dylib_install_name,@executable_path/$LIBNAME"

# Static library that will be generated for ARM
LIBPATH_static=$LIBFILE.a
LIBNAME_static=`basename $LIBPATH_static`

# TODO: add custom flags as necessary for package
./configure CXX=$DEVROOT/usr/bin/arm-apple-darwin9-g++-4.0.1 CC=$DEVROOT/usr/bin/arm-apple-darwin9-gcc-4.0.1 LD=$DEVROOT/usr/bin/ld --host=arm-apple-darwin

make -j4

# Copy the ARM library to a temporary location
mkdir -p lnsout
cp $LIBPATH_static lnsout/$LIBNAME_static.arm

# Do it all again for native cpu
make distclean

# Restore default environment variables
unset CPPFLAGS CFLAGS CPP LDFLAGS CXXFLAGS DEVROOT SDKROOT

export DEVROOT=/Developer
export SDKROOT=$DEVROOT/SDKs/MacOSX10.6.sdk

export CPPFLAGS="-I$SDKROOT/usr/lib/gcc/i686-apple-darwin10/4.0.1/include/ -I$SDKROOT/usr/include/ -mmacosx-version-min=10.5"
export CFLAGS="$CPPFLAGS -pipe -no-cpp-precomp -isysroot $SDKROOT -arch i386"
export CPP="$DEVROOT/usr/bin/cpp $CPPFLAGS"
export CXXFLAGS="$CFLAGS"

# Overwrite LDFLAGS
# Dynamic linking, relative to executable_path
# Use otool -D to check the install name
export LDFLAGS="-Wl,-dylib_install_name,@executable_path/$LIBNAME"

# TODO: error checking
./configure
make -j4

# Copy the native library to the temporary location
cp $LIBPATH_static lnsout/$LIBNAME_static.i386

# Create fat lib by combining the two versions
/usr/bin/lipo -arch arm lnsout/$LIBNAME_static.arm -arch i386 lnsout/$LIBNAME_static.i386 -create -output lnsout/$LIBNAME_static

unset CPPFLAGS CFLAGS CPP LDFLAGS CXXFLAGS DEVROOT SDKROOT
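
The script above has no error checking (see its TODO comments), so a failed configure or make still falls through to the copy and lipo steps. A minimal sketch of how each step could be guarded follows. The helper names `run_step` and `require_file` are hypothetical, introduced here only for illustration; they are not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: echo a build step, run it, and report a failure.
# Returns non-zero when the command fails so the caller can stop the build.
run_step() {
    echo "==> $*"
    "$@" || { echo "build_fat.sh: step failed: $*" >&2; return 1; }
}

# Hypothetical helper: confirm that a build product exists before copying it.
require_file() {
    [ -f "$1" ] || { echo "build_fat.sh: expected file not found: $1" >&2; return 1; }
}
```

Inside the script, the steps could then be written as `run_step make -j4 || exit 1` and `require_file "$LIBPATH_static" || exit 1`, so that a failed ARM build stops the script before a stale native library is copied into lnsout.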

For reference, here is the original script (written for use with 10.5):

#!/bin/sh

# build_fat.sh
#
# Created by Robert Carlsen on 15.07.2009.
# build an arm / i686 lib of standard linux project
#
# adapted from:
# http://latenitesoft.blogspot.com/2008/10/iphone-programming-tips-building-unix.html
#
# initially configured for tesseract-ocr

# Set up the target lib file / path
# easiest to just build the package normally first and watch where the files are created.
LIBFILE=ccmain/libtesseract_full

# Select the desired iPhone SDK
export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneOS2.2.sdk

# Set up relevant environment variables
export CPPFLAGS="-I$SDKROOT/usr/lib/gcc/arm-apple-darwin9/4.0.1/include/ -I$SDKROOT/usr/include/"
export CFLAGS="$CPPFLAGS -arch armv6 -pipe -no-cpp-precomp -isysroot $SDKROOT"
export CPP="$DEVROOT/usr/bin/cpp $CPPFLAGS"
export CXXFLAGS="$CFLAGS"

# Dynamic library location generated by the Unix package
LIBPATH=$LIBFILE.dylib
LIBNAME=`basename $LIBPATH`

export LDFLAGS="-L$SDKROOT/usr/lib/ -Wl,-dylib_install_name,@executable_path/$LIBNAME"

# Static library that will be generated for ARM
LIBPATH_static=$LIBFILE.a
LIBNAME_static=`basename $LIBPATH_static`

# TODO: add custom flags as necessary for package
./configure CXX=$DEVROOT/usr/bin/arm-apple-darwin9-g++-4.0.1 CC=$DEVROOT/usr/bin/arm-apple-darwin9-gcc-4.0.1 LD=$DEVROOT/usr/bin/ld --host=arm-apple-darwin

make -j4

# Copy the ARM library to a temporary location
mkdir -p lnsout
cp $LIBPATH_static lnsout/$LIBNAME_static.arm

# Do it all again for native cpu
make distclean

# Restore default environment variables
unset CPPFLAGS CFLAGS CPP LDFLAGS CXXFLAGS

# Overwrite LDFLAGS
# Dynamic linking, relative to executable_path
# Use otool -D to check the install name
export LDFLAGS="-Wl,-dylib_install_name,@executable_path/$LIBNAME"

# TODO: error checking
./configure
make -j4

# Copy the native library to the temporary location
cp $LIBPATH_static lnsout/$LIBNAME_static.i386

# Create fat lib by combining the two versions
$DEVROOT/usr/bin/lipo -arch arm lnsout/$LIBNAME_static.arm -arch i386 lnsout/$LIBNAME_static.i386 -create -output lnsout/$LIBNAME_static

70 Responses to “cross-compiling for iPhone dev”

  1. skay Says:

    thanks a lot for the post.
    its been a great help to me.

    so once i include libtesseract_full.a in framework, which method (and how) do i call in my .m file to process image into text. A quick reply would be helpful.
    thanks in advance

  2. Robert Says:

    I haven’t had a chance to actually implement the library in a project. There are some comments on the tesseract Google code forum about this topic. Browsing the baseapi header may give some insight, too.

    Good luck and please share what you find out!

  3. skay Says:

    Every folder in tesseract has a .a file.
    so do we have to run same script for all .a files and include all of them in framework.

  4. Robert Says:

    This build script only targets libtesseract_full.a. In this case, tesseract compiles a monolithic static library which is all you need to include in your project. The readme (or install) txt files in the tesseract src indicate this.

    Your mileage may vary with other source code.

  5. skay Says:

    tesseractmain.h has a method api_main() which we probably have to call from our api.
    However when i try to import

    // Copyright 2009 __MyCompanyName__. All rights reserved.
    //

    #import
    #include "tesseractmain.h"

    it says file not found for tesseractmain.h

    when i add file tesseractmain.h it gives compilation error for C++ syntax used in tesseractmain.h

    how do i actually include it.

  6. Robert Says:

    is the file extension of your source .mm? mixing objective-c and c++ code requires that extension. easiest to do it in the Finder, then reimport the renamed file(s) into Xcode, removing the old files as well.

  7. skay Says:

    thanks a lot robert.
    it worked . i was using .mm file but there was other problem which i sorted out.
    however i am clueless how to call methods in baseapi.h
    might be it ll take some time to understand from code.
    can u help me giving some clue how to go about it.

  8. Robert Says:

    again, I haven’t had a chance to use this library in a project. I’d imagine that you create a new tesseract object, init it with the language model, then provide it with image data. Here’s my best guess without actually trying it myself:

    TessBaseAPI *tess = new TessBaseAPI();
    NSString *datapath = [NSString stringWithString:[[NSBundle mainBundle] bundlePath]];
    tess->Init([datapath UTF8String], NULL);

    // this is the main method for doing char recognition
    // you’ll have to provide the raw image data yourself
    /*
    char* returnedString = TesseractRect(const unsigned char* imagedata,
    int bytes_per_pixel, int bytes_per_line,
    int left, int top, int width, int height);
    */

    good luck!

  9. skay Says:

    thanks a lot robert
    figured it out. The app is up and running now.

  10. Robert Says:

    would you mind sharing the steps you ended up using?

  11. skay Says:

    will surely do that in a day or 2 :)
    once again thanks for support. it was very helpful.

  12. skay Says:

    i am trying to use ImageMagick to convert the image into tiff and passing image to tesseract to process it.
    there is a small hole right now. ImageMagick is converting successfully to bmp and jpg .
    tesseract reads bmp files but it should be with bpp 1,2,4,8 and image i get using ImageMagick is 32BPP.
    trying my hand on it.
    will keep u posted.
    Do u have any idea to change image bpp so that its readable by tesseract without tiff support.

  13. Matt Says:

    Firstly, thanks for the shell script, I’d been struggling trying to figure out the instructions from LateNiteSoft.

    At the moment I’m trying to get baseapi to work. My problem is getting the image into the format needed to send as an argument to tesseract. Currently I have something like this:

    TessBaseAPI::InitWithLanguage("DataPath", NULL, NULL, NULL, false, 0, NULL);

    char* text;
    NSBundle *bundle = [NSBundle mainBundle];
    NSString *imgPath = [bundle pathForResource:@"testImage" ofType:@"tif"];
    UIImage *uiImage = [UIImage imageWithContentsOfFile:imgPath];

    CGSize imageSize = [uiImage size];
    double bytes_per_line = CGImageGetBytesPerRow([uiImage CGImage]);
    double bytes_per_pixel = CGImageGetBitsPerPixel([uiImage CGImage])/8.0;

    //This line doesn’t work
    unsigned char* imageData = [uiImage bitmapData];

    text = TessBaseAPI::TesseractRect((const unsigned char*)imageData, bytes_per_pixel,bytes_per_line, 0, 0, imageSize.width, imageSize.height);

    Anyone have any ideas? Is it something I should be doing in C++ instead of objective-C?

    Also, I’m a complete beginner so there might be lots of errors in the code above.

  14. Robert Says:

    i don’t believe that there is a property called “bitmapData” in the UIImage class. however, you can get a CGImageRef with [(UIImage *)image CGImage]. maybe there is a way to get NSData / CFData from there?

    perhaps this dev note is helpful for retrieving the pixel data from an image:
    http://developer.apple.com/iphone/library/qa/qa2007/qa1509.html
    or perhaps this article:
    http://www.cocoadev.com/index.pl?NSBitmapImageRepFromCGImage

  15. Ankit Gupta Says:

    Hey,

    Thanks for the script! I am on sdk 3.1 beta 3 (don’t think this should matter though). and I am trying to run it but am running into this error.

    /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    Could you guide me as to what I am doing wrong?

    Thanks :)
    Ankit

  16. Ankit Gupta Says:

    HI,

    Sorry for the previous comment. I had a previous version of tesseract. I downloaded the new one and the script ran fine. I have included the library libtesseract_full.a in my project and tesseractmain.h in my header. But it says no such file or directory when I try to compile!

    Ankit.

  17. Ankit Gupta Says:

    Sorry about the multiple posts but I started getting this error again.

    /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    I hadn’t downloaded the data files earlier and it worked fine. But when I put the data files in the tessdata directory and reran the script, got this error. Any thoughts on how to get around this?

    Thanks,
    Ankit

  18. Robert Says:

    no, i’m not sure about that particular issue. will the lipo tool work correctly when you remove the data files? i’ve successfully generated the lib with the english language data files in tessdata.

    maybe one of the earlier commenters who have actually used the resulting lib in a project would be able give you some more guidance. i haven’t yet had a chance to use this in a project myself.

  19. Lars Says:

    Hi, I’ve compiled tesseract successfully and included header and framework search paths in xcode,
    This line works both compiling and linking:
    TessBaseAPI *tess = new TessBaseAPI();
    but when I add this line:
    static int result = tess->InitWithLanguage(NULL,NULL,NULL,NULL,false,0,NULL);
    I get a linking error. Unknown symbol. I’ve renamed the source file to .mm

    I’ve tried to find a solution. There are solutions on the net like “make sure your file is included in the project…” and “declare it as extern”, but I can’t get it to work. Has anybody managed to use tesseract in a program yet? Some helping code maybe? or some clues?
    Thanks, Lars.

  20. Stefano Says:

    this is working for me

    - (NSString *)readAndProcessImage:(UIImage *)uiImage {
    CGSize imageSize = [uiImage size];
    double bytes_per_line = CGImageGetBytesPerRow([uiImage CGImage]);
    double bytes_per_pixel = CGImageGetBitsPerPixel([uiImage CGImage]) / 8.0;

    CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider([uiImage CGImage]));
    const UInt8 *imageData = CFDataGetBytePtr(data);

    char* text = TessBaseAPI::TesseractRect(imageData,
    bytes_per_pixel,
    bytes_per_line,
    0, 0,
    imageSize.width, imageSize.height);

    return [NSString stringWithUTF8String:text];
    }

  21. Robert Says:

    Great! Thanks for that code snippet.

  22. Lars Says:

    I tried it all over again. It works!
    I’ll leave a link to my project – when (or if) I have something useful…

    Thanks,
    Lars.

  23. Kevin Says:

    Lars a link to your project would be fantastic.

    I managed to compile the libtesseract_full.a and add it. Thanks Robert!!
    However I cant figure out how to include “baseapi.h” or other header files without getting a dozen errors

    -Kevin

  24. Kevin Says:

    So in case anyone got stuck on my last question.. The .m file in which you include “baseapi.h” needs to be renamed to .mm.

    Has anyone had luck installing languages in Tesseract?

    When I call:
    TessBaseAPI::InitWithLanguage(NULL, NULL, NULL, NULL, false, 0, NULL);

    I get the error
    Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset

    I have downloaded the english files and put them into tessdata before compiling tesseract for the iPhone/iPhone simulator. Any ideas?

  25. Robert Says:

    yes, as noted above, any source file which mixes obj-c and c++ needs to have a .mm file extension.

    i’m currently getting the same error with the location of the tessdata folder. if you note the configure process, this location seems to be hardcoded to the prefix variable.

    the library works in the simulator if you install/symlink the tessdata folder to /usr/local/share/tessdata, but of course this doesn’t fly in the iPhone sandbox.

    it seems as though TessBaseAPI::Init() expects the first argument to be the location of the data folder, but including the data files in the app bundle and passing that location to Init() has no effect on the error.

    i’m at a loss for the moment, and the semester has just begun so time is at a premium. i welcome any other suggestion on this matter.

  26. dflynn Says:

    To resolve this error:
    Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset

    Locate the ‘Executables’ node in the xcode project tree.
    Double-click your app executable.
    In the window that opens, choose the ‘Arguments’ tab.
    Add a new Environment Variable called TESSDATA_PREFIX and set it equal to the path on disk preceding the “tessdata” directory. That will get you past the error and allow tesseract to init in the Simulator.

    In order to work on the phone, you need to set TESSDATA_PREFIX to a different environment variable representing the app install directory. The app install directory will need a subdirectory called tessdata containing all the language files.
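
As a small illustration of what “the path on disk preceding the tessdata directory” means, the sketch below derives that value from the location of a tessdata folder. The function name `tessdata_prefix` is hypothetical, and keeping a trailing slash on the result is an assumption; whether Tesseract requires it may depend on the version in use.

```shell
#!/bin/sh
# Hypothetical helper: given the path of a tessdata directory, print the value
# to use for TESSDATA_PREFIX (the path preceding "tessdata").
# Assumption: a trailing slash is kept so that prefix + "tessdata/..." is a valid path.
tessdata_prefix() {
    dir=${1%/}    # drop one trailing slash, if present
    if [ "$(basename "$dir")" != "tessdata" ]; then
        echo "not a tessdata directory: $1" >&2
        return 1
    fi
    echo "$(dirname "$dir")/"
}
```

For the default location from the error message, `tessdata_prefix /usr/local/share/tessdata` prints `/usr/local/share/`, which is the kind of value to enter for the environment variable in Xcode.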

  27. Kevin Says:

    Thanks dflynn. Your fix works perfectly for the simulator.

    For running on the iPhone:
    I have started looking for this environmental variable that will find the /documents directory. This directory path changes each time the app is installed….

    I can find this location programmatically during run time with:
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *documentsDirectory = [paths objectAtIndex:0];

    However that seems to be useless since we cannot change the TESSDATA_PREFIX Environment Variable during run time..

    I’ll keep searching and keep you updated. Thanks again.

  28. Kevin Says:

    K I got it working on the iPhone.
    Turns out you can change the Environment Variable within the code.

    My Implementation:
    I am creating a tessdata subdirectory in the Applications Directory Folder.
    Then copying the language files from my bundle into that folder.

    Then setting the TESSDATA_PREFIX Environment Variable during run time with:
    setenv("TESSDATA_PREFIX", [tessDataPath UTF8String], 1);

    And that will do it.

  29. Morten Says:

    I have used the scripts and followed the comments but I can’t get it to work. Can those of you who worked out how to use it please post a short guide describing what you did? Ex. did you change anything in the build_fat.sh file, which properties/settings do you have to change in xcode and which header files should you include in the .m-file?

  30. Laurent Says:

    Hi Everyone

    Has anyone been able to build it with Snow Leopard?
    I am able to build the library on the ARM architecture, include it in my project but impossible to get it to work on the Simulator. I get the error “ld: warning: in /Developer/photoStrip/lib/iPhone/libtesseract_full.a, file is not of required architecture” when I build the App.

    My Setup is Snow Leopard 10.6.2, SDK 3.1, XCode 3.2.

    Any help would be greatly appreciated

    Thanks

  31. Robert Says:

    on your computer, it has likely been built for i686 / x86_64. you may have to tinker to get the library to build for i386 explicitly. i noticed this change myself after updating to Snow Leopard. (the instructions were written when I was running 10.5)

    you can check by using otool:
    otool -h libtesseract_full.a

    check the cputype and subtype. look in:
    /usr/include/mach/machine.h

  32. Laurent Says:

    Thanks Robert. I’ll check all this later today and will let you know.

  33. Laurent Says:

    Robert I recompiled clean and ran the otool command and this is what I get:

    otool -h libtesseract_full.a

    Archive : libtesseract_full.a (architecture arm)
    libtesseract_full.a(libtesseract_full.o) (architecture arm):
    Mach header
    magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
    0xfeedface 12 0 0x00 1 3 1248 0x00002000

    Archive : libtesseract_full.a (architecture x86_64)
    libtesseract_full.a(libtesseract_full.o) (architecture x86_64):
    Mach header
    magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
    0xfeedfacf 16777223 3 0x00 1 3 1480 0x00002000

    To do the ‘lipo’ I had to change the ‘-arch i386 libtesseract_full.a.386’ command to ‘-arch x86_64 libtesseract_full.a.386’. The problem seems to be my CPU type as I should have 7 instead of 16777223, which indicates an unknown value.

    I have 3 machine.h but not where you mentioned:
    /System/Library/Frameworks/Kernel.framework/Versions/A/Headers/mach
    /Developer/SDKs/MacOSX10.6.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/mach
    /Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/mach
    The first 2 are exactly the same.
    The 10.5 is an older version from 3/25/08.

    I checked the 10.6 and this is what I have as CPU type and subtype (the 10.5 looks exactly the same).

    #define CPU_ARCH_MASK 0xff000000 /* mask for architecture bits */
    #define CPU_ARCH_ABI64 0x01000000 /* 64 bit ABI */
    /* Machine types known by all. */
    #define CPU_TYPE_ANY ((cpu_type_t) -1)
    #define CPU_TYPE_VAX ((cpu_type_t) 1)

    #define CPU_TYPE_MC680x0 ((cpu_type_t) 6)
    #define CPU_TYPE_X86 ((cpu_type_t) 7)
    #define CPU_TYPE_I386 CPU_TYPE_X86 /* compatibility */
    #define CPU_TYPE_X86_64 (CPU_TYPE_X86 | CPU_ARCH_ABI64)

    #define CPU_TYPE_ARM ((cpu_type_t) 12)


    #define CPU_SUBTYPE_VAX_ALL ((cpu_subtype_t) 0)
    #define CPU_SUBTYPE_VAX780 ((cpu_subtype_t) 1)
    #define CPU_SUBTYPE_VAX785 ((cpu_subtype_t) 2)
    #define CPU_SUBTYPE_VAX750 ((cpu_subtype_t) 3)
    #define CPU_SUBTYPE_VAX730 ((cpu_subtype_t) 4)
    #define CPU_SUBTYPE_UVAXI ((cpu_subtype_t) 5)
    #define CPU_SUBTYPE_UVAXII ((cpu_subtype_t) 6)
    #define CPU_SUBTYPE_VAX8200 ((cpu_subtype_t) 7)
    #define CPU_SUBTYPE_VAX8500 ((cpu_subtype_t) 8)
    #define CPU_SUBTYPE_VAX8600 ((cpu_subtype_t) 9)
    #define CPU_SUBTYPE_VAX8650 ((cpu_subtype_t) 10)
    #define CPU_SUBTYPE_VAX8800 ((cpu_subtype_t) 11)
    #define CPU_SUBTYPE_UVAXIII ((cpu_subtype_t) 12)

    Do you have any idea why I get this unknown CPU type? Any suggestion?

    Thanks – Laurent

  34. Robert Says:

    Laurent:

    I *believe* that cputype 16777223 is x86_64. The build script above uses the native cpu to do the compile for the simulator, which is the correct output for your computer, OS and XCode. However, it seems as though the simulator needs i386.

    You’ll have to add CFLAGS and CPPFLAGS variables with -arch i386 before the second configure (for the simulator). Try this:
    # Set up relevant environment variables
    export CPPFLAGS="-arch i386"
    export CFLAGS="$CPPFLAGS"
    export CPP="$CPPFLAGS"
    export CXXFLAGS="$CFLAGS"

    Also, some information I wish I had read a long time ago:
    http://developer.apple.com/mac/library/documentation/Darwin/Conceptual/64bitPorting/building/building.html

  35. Laurent Says:

    Hi Robert

    Unfortunately same results. I think CPU 16777223 means unknown but not 100% sure. I will keep on playing with all these flags but if anyone has a recommendation that would be great

    Thanks

  36. Laurent Says:

    Robert, I opened a thread on the iPhone developer’s forum and a guy from Apple gave me this first answer:

    “CPU type 16777223 is x86_64. (See mach/machine.h.) The Simulator only uses 32-bit i386. You need to configure your build to use i386.
    Note that recent compiler versions default to x86_64 on 64-bit machines. Make sure you’re not using the default anywhere.”

    I asked him more information on the configure option but you were right on CPU type 16777223. It does not mean ‘unknown’ but X86_64.
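
The arithmetic behind that answer can be checked directly: per the machine.h excerpt quoted earlier, CPU_TYPE_X86_64 is CPU_TYPE_X86 (7) combined with CPU_ARCH_ABI64 (0x01000000, or 16777216), which gives 16777223. The sketch below decodes the decimal cputype values that appear in this thread. The function name `cputype_name` is hypothetical, and only the three types discussed here are covered.

```shell
#!/bin/sh
# Hypothetical helper: map a decimal cputype (as printed by otool -h or lipo)
# to an architecture name, using constants from the quoted mach/machine.h.
cputype_name() {
    x86=7                          # CPU_TYPE_X86
    arm=12                         # CPU_TYPE_ARM
    abi64=$(( 0x01000000 ))        # CPU_ARCH_ABI64, 16777216 in decimal
    x86_64=$(( $x86 | $abi64 ))    # CPU_TYPE_X86_64, 16777223 in decimal
    case "$1" in
        "$x86")    echo "i386" ;;
        "$x86_64") echo "x86_64" ;;
        "$arm")    echo "arm" ;;
        *)         echo "unknown" ;;
    esac
}
```

Read this way, the lipo error seen in several comments (cputype 7 where 12 was expected) means the file labelled as the ARM build was actually compiled for i386.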

  37. Rob Says:

    Hi,
    I hope someone can help me out because this is driving me insane…
    I believe I’ve compiled the library so that it can run on both the phone and the simulator, having tweaked the build script when it refused to work due to i386/x86_64 issues, and then tweaked it again when it refused to work with OS X 10.6 due to fopen$UNIX2003 problems… but I’m stuck actually trying to get it to run.

    Sorry if I’m being dumb, but when I call TessBaseAPI::SimpleInit([documentsDirectory UTF8String], "eng", false); the program crashes on that line with exit code 1. Is this because it can’t find the data files? or am I doing something stupid?

    Thanks in advance, and thanks for the original build script

  38. Robert Says:

    yes, there’s an issue with the paths…look at a previous comment for setting an environment variable in Xcode for the tess data path.

  39. Rob Says:

    Thanks Robert, it’s working perfectly now.
    Excellent blog post.

  40. Jan Says:

    Hello, I don’t understand which files i have to include. I have compiled the static library and i want to use the library. can you help me please.

  41. Robert Says:

    you need to include: libtesseract_full.a library, baseapi.h and the tessdata folder with the language files.

    the above comments have snippets for setting an environment variable for the data folder, initializing the tesseract engine and processing an image.

  42. Laurent Says:

    Hi Rob

    Would you mind posting your script? It seems you figured out how to get this cross compiling working with both the simulator and the device. I am still stuck with the Simulator

    Thanks

  43. Jan Says:

    Thanks Robert for the new Script it works perfect.
    I do the following things in a .mm(rename in XCode) file and get many errors:
    #include "libtesseract_full.a"
    Should this work?

    Thanks

  44. Robert Says:

    @Jan…you add libtesseract_full.a to the Frameworks group in XCode, rather than as an include in your source files.

    you’ll need to add #include "baseapi.h" to your code, however…and drag baseapi.h from the tesseract source folder into your XCode project. you need to also add the tessdata folder to the project.

  45. Jan Says:

    Thank you so much. It works.

  46. Mart Says:

    Hi Robert, thanks for a great write up!
    I have managed to compile and include the libtesseract_full.a, but when i try to use the API with Stefano’s code above my app crashes w/o message.
    A few questions:
    - In the code above it looks like Stefano calls TesseractRect(const UInt8 *, double, double, int, int, float, float), but the method signature says: TesseractRect(const unsigned char*, int, int, int, int, int, int). Could it possibly work anyway? Tried to typecast all variables to int , but still no luck.
    -Could the app crash because of a faulty tessdata directory reference?
    Thankful for any help.

  47. Robert Says:

    @mart yes, absolutely..look above for specific details about including and referencing the tessdata folder. also, you can look at my updated link at the top of the page for obj-c++ code snippets for invoking tesseract.

  48. dreampowder Says:

    hi there, is it possible that someone can send me a compiled tesseract static library via e-mail? i am new to obj-c and the xcode environment and i don’t know how to implement all those scripts and code shown above.

    my computer is macbook 2.1 with snow leopard 10.6. i’d be grateful if someone can send me the library to

    ‘coskun.serdar-at-gmail.com’.

  49. Scott Says:

    I got the OCR code all to work but am I crazy or are pictures taken using an iPhone not recognized well at all. I have a 3g and if I take a picture and run it through, for the most part comes back with garbage. Now if I run through it some text at say 20 point that I capture on my Mac, it works much better. Does one need the resolution of the 3gs to make this useful?

  50. Robert Says:

    the autofocus lens on the 3GS helps greatly. there is an external lens for the iPhones without an autofocus lens which is supposed to help the macro focus.

  51. exploration » Blog Archive » OCR on iPhone demo Says:

    [...] around to building a proof of concept implementation of tesseract-ocr for the iPhone. months ago, i documented the steps which helped to get the library cross-compiled for the iPhone’s ARM processor, and [...]

  52. Stéphane Tavera Says:

    Hi everyone,
    Robert, congrats for the hard work done and to share this.
    I was able to compile the library, and added libtesseract_full.a and baseapi.h to the demo iPhone project.
    My config : snow leopard, 10.6.2, xcode 3.2.1
    In build info, I changed the “Mac OS X Deployment Target” to “10.6”
    However, I get an error at build (see trace)

    Anyone can help ?
    thanks in advance !

    trace :

    Ld build/Release-iphoneos/OCR.app/OCR normal armv6
    cd /Users/st/Pocket-OCR
    setenv IPHONEOS_DEPLOYMENT_TARGET 3.1.2
    setenv MACOSX_DEPLOYMENT_TARGET 10.6
    setenv PATH "/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin:/Developer/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin"
    /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/g++-4.2 -arch armv6 -isysroot /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS3.1.2.sdk -L/Users/st/Pocket-OCR/build/Release-iphoneos -L/Users/st/Pocket-OCR -L/Users/st/Pocket-OCR/../../tesseract-ocr-svn/api -L/Users/st/Pocket-OCR/../../tesseract-ocr-svn -L/Users/st/Pocket-OCR/../../tesseract-2.04/libraries -F/Users/st/Pocket-OCR/build/Release-iphoneos -filelist /Users/st/Pocket-OCR/build/OCR.build/Release-iphoneos/OCR.build/Objects-normal/armv6/OCR.LinkFileList -mmacosx-version-min=10.6 -dead_strip -miphoneos-version-min=3.1.2 -framework Foundation -framework UIKit -framework CoreGraphics -ltesseract_full -framework MessageUI -o /Users/st/Pocket-OCR/build/Release-iphoneos/OCR.app/OCR

    ld: library not found for -ltesseract_full
    collect2: ld returned 1 exit status
    Command /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/g++-4.2 failed with exit code 1

  53. Robert Says:

    Check the “Link Binary with Libraries” section of the OCR Target of the Xcode project. I had it included relative to the project path on my development machine. Ensure that your version of the libtesseract_full is included there (it should be displayed in black, rather than red text)

  54. Stéphane Tavera Says:

    About previous post.
    I just had forgotten to update SDKROOT like this :
    export SDKROOT=$DEVROOT/SDKs/iPhoneOS3.1.2.sdk
    and everything works ;-)
    Once again, congrats and many thanks for sharing.
    Isn’t a keynote supposed to begin ;-) ?
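
Since a mismatched SDKROOT is an easy mistake to make, a check along the following lines could run before configure. This is only a sketch: `find_sdk` is a hypothetical helper, and it assumes the SDK folders follow the `iPhoneOS<version>.sdk` naming used in the build script.

```shell
#!/bin/sh
# Hypothetical helper: look for iPhoneOS<version>.sdk inside an SDKs folder.
# $1 = SDKs directory (for example "$DEVROOT/SDKs"), $2 = wanted version.
# Prints the SDK path on success; otherwise lists what is installed.
find_sdk() {
    sdks_dir=$1
    wanted=$2
    if [ -d "$sdks_dir/iPhoneOS$wanted.sdk" ]; then
        echo "$sdks_dir/iPhoneOS$wanted.sdk"
        return 0
    fi
    echo "iPhoneOS$wanted.sdk not found in $sdks_dir; installed SDKs:" >&2
    ls "$sdks_dir" >&2
    return 1
}
```

In the build script this could replace the hard-coded export, for instance `SDKROOT=$(find_sdk "$DEVROOT/SDKs" 3.1.2) || exit 1` followed by `export SDKROOT`.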

  55. elninom Says:

    I wrote “./configure”, “make” and “sudo make install”
    Then entered “./build_fat.sh”; but I’ve got this error:

    /usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    Do you know how to solve this error?

  56. Robert Says:

    Have you edited the build_fat script to point to your installation of the iPhone SDK?
    I believe that the error message indicates that the library was compiled for intel rather than arm.

    Also, when using the build script, you don’t need to run configure and make…it does those things twice: once for arm (iPhone SDK) and again for the host architecture (generally intel, either i386 or x86_64).

    Is this for tesseract? Do you have the iPhone SDK installed?

  57. elninom Says:

    @Robert Thank you for replying.
    I’ve installed (re-installed again) and tried on iPhone SDK 3.1.3 and 3.2 beta. Also, tried compiling tesseract 2.04 and 3.0 svn.
    I checked and corrected the directory and SDK path. I entered only ./build_fat.sh command but still no luck. It gives same and weird error.

    ./configure: line 1965: test: /developer/pocket: binary operator expected
    ./configure: line 1968: test: /developer/pocket: binary operator expected
    checking whether build environment is sane… yes
    /bin/sh: /developer/pocket: No such file or directory
    configure: WARNING: `missing’ script is too old or missing
    checking for a thread-safe mkdir -p… config/install-sh -c -d
    checking for gawk… no
    checking for mawk… no
    checking for nawk… no

    make[4]: Nothing to be done for `all-am’.
    make[3]: Nothing to be done for `all-am’.
    make[2]: Nothing to be done for `all-am’.
    /usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0))

    I’ve been stuck at this point for 5 days and have tried everything. Please help me: elninomelninom@gmail.com

  58. Robert Says:

    Do you have spaces in your path?
    i.e. /developer/pocket (is there more to the path here…?)

    Try removing (or escaping) the spaces in your path.
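
The “binary operator expected” messages from test are typical of an unquoted shell variable expanding to several words, which is what a space in a directory name produces. A small check for this, using a hypothetical helper named has_space (the example path in the comment is made up for illustration):

```shell
#!/bin/sh
# Sketch: warn when the build directory path contains a space.
# has_space is a hypothetical helper, not part of build_fat.sh.
has_space() {
    case "$1" in
        *" "*) return 0 ;;   # e.g. "/some/dir/My Project" (illustrative path)
        *)     return 1 ;;
    esac
}

if has_space "$(pwd)"; then
    echo "Warning: '$(pwd)' contains a space; configure may fail." >&2
else
    echo "Path OK: $(pwd)"
fi
```

Running this from the directory that holds the configure script shows whether the project needs to be moved to a path without spaces.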

  59. elninom Says:

    I don’t have /developer/pocket path or pocket directory.
    How can I make it?

  60. hytgbn Says:

    Hi, I was using the tesseract library on Leopard 10.5 with Xcode 3.1.x (maybe 3.1.4).

    It worked fine on both the simulator and the device. Cool.

    First of all, I appreciate your blog, which helped me compile it in my environment.

    But I updated OS X a few days ago and installed Xcode 3.2.

    Since the update I get an error while linking.

    It says it cannot find some symbols that are referenced from some .o file.

    The GOMP library and the fopen and fdopen functions are the cause.

    After updating Xcode from 3.1 to 3.2, is there some framework I need to add?
    Is there an option I have to set?
    Or do I have to compile the library again? (Actually, I have compiled it again and again :( )

    If you know something, please give me a clue.

    Thank you.

  61. Robert Says:

    Without the specific error log it’s difficult to tell. I’m using 10.6 with Xcode 3.2 and it all works fine. There are two versions of the build_fat script: one for 10.5, another for 10.6.
    Maybe try recompiling the tesseract library with the 10.6 version of the build script.

    Are you using your own iPhone project or the Pocket OCR project?

  62. Wilson Says:

    Robert, I set out a few days ago to take Tesseract for a test drive and your posts/sample code have been very helpful (Thx!). I’ve managed to get the Pocket OCR project to run on the Simulator, but I get the dreaded “libtesseract_full.a file is not of the required architecture” error when I try to build & run for my iPhone 3Gs. I’m not an XCode veteran and unfortunately couldn’t implement the fix you provided above on 11/24/09 (re: adding some new environment variables). Setting these environment variables using the “env” command didn’t seem quite right. Would you mind giving me the “for dummies” version of this fix? Thx in advance.

    I’ve included some potentially relevant info about my setup below:
    - Running Snow Leopard 10.6.2, using SDK 3.1 & Xcode 3.2
    - Running otool -h on libtesseract_full.a yields cputype of 7 and cpusubtype of 3 (note: a review of my /usr/include/mach/machine.h file revealed cputype 7 defined as CPU_TYPE_X86 and cpusubtype 3 as either CPU_SUBTYPE_X86_ALL, CPU_SUBTYPE_X86_64_ALL, CPU_SUBTYPE_386, or CPU_SUBTYPE_I386_ALL)

  63. Robert Says:

    The output from otool indicates that it’s not a FAT library, but one built only for your native computer. Also, there may be two libtesseract_full.a files. Look for the one in the “lnsout” directory inside the tesseract folder. If that folder doesn’t exist, run the build_fat script again and look for it.
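
The cputype numbers in the otool and lipo output can be decoded with a small lookup. A sketch with values taken from mach/machine.h; the function name cputype_name is hypothetical:

```shell
#!/bin/sh
# Map a Mach-O cputype number (as printed by `otool -h`) to a readable name.
# Values come from <mach/machine.h>; cputype_name is a hypothetical helper.
cputype_name() {
    case "$1" in
        7)        echo "x86 (i386)" ;;   # CPU_TYPE_X86
        12)       echo "arm" ;;          # CPU_TYPE_ARM
        18)       echo "ppc" ;;          # CPU_TYPE_POWERPC
        16777223) echo "x86_64" ;;       # CPU_TYPE_X86 | CPU_ARCH_ABI64
        *)        echo "unknown ($1)" ;;
    esac
}

cputype_name 7
cputype_name 12
```

A library destined for the device should report cputype 12; a value of 7 means the file contains only Intel code.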

  64. Wilson Says:

    Robert, you were right… I had the wrong libtesseract_full.a included in my project (I was using the one in /ccmain instead of the one in /lnsout). Tesseract is now working on my iPhone! Thx again for your help!

  65. S Woodside Says:

    Very interesting! I’ve tried it out and it’s a bit slow … do you have any idea on how to make it faster? Did you look at that at all?

  66. kazuar Says:

    Hello,

    Thanks for this great article.

    I’m having the same “libtesseract_full.a file is not of the required architecture” error that everyone keeps getting. I would like to know where the problem is.
    Please see the following details:
    1) I run Snow Leopard 10.6.4 with Xcode 3.2.3 and the iPhone OS 4 SDK.
    2) I’ve changed the reference to the iPhone SDK in the script.
    3) I also changed darwin9 to darwin10 everywhere.

    I still get the same message: /usr/bin/lipo: specifed architecture type (arm) for file (lnsout/libtesseract_full.a.arm) does not match it’s cputype (7) and cpusubtype (3) (should be cputype (12) and cpusubtype (0)).

    Another thing I’ve noticed is that, although this message appears, two files are created in the lnsout directory:
    1) libtesseract_full.a.arm
    2) libtesseract_full.a.i386
    Both of them have the same size: 3.6MB

    Does this mean that I can ignore this message? Is there anything else I can check?

    Any help would be appreciated.

    Thanks,
    Kazuar

  67. kazuar Says:

    Hello again,

    I’ve actually succeeded in compiling tesseract with another tutorial that I found here:
    http://iphone.olipion.com/cross-compilation/tesseract-ocr

    I only had to make a few small changes.
    When the build finished, I got libtesseract_full.a.
    I renamed it to libtesseract_full.a.arm and then ran the lipo command with the i386 library I got from the initial compile.
    Now I have a libtesseract_full.a with no problems.

    I hope it will work in a real project.

    I will let you know.

    Thanks,
    Kazuar
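
The manual merge described in comment 67 could look roughly like the following. This is a sketch under assumptions: both single-architecture files sit in the current directory under the names used above, and merge_slices is a hypothetical helper rather than anything from build_fat.sh:

```shell
#!/bin/sh
# Sketch of a manual merge of two single-architecture static libraries.
# merge_slices is a hypothetical helper; the file names are assumptions.
merge_slices() {
    arm_lib="$1"; sim_lib="$2"; out_lib="$3"
    for f in "$arm_lib" "$sim_lib"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f"
            return 1
        fi
    done
    # lipo -create combines single-architecture files into one fat file;
    # lipo -info then lists the architectures in the result.
    lipo -create "$arm_lib" "$sim_lib" -output "$out_lib" &&
        lipo -info "$out_lib"
}

merge_slices libtesseract_full.a.arm libtesseract_full.a.i386 libtesseract_full.a ||
    echo "merge skipped or failed" >&2
```

The merged file should list both an arm and an i386 architecture; if it lists only one, one of the inputs was built for the wrong target.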

  68. Avicene Says:

    @kazuar were you able to compile the library on Mac OS 10.6 and iOS 4?
    I tried to follow the steps in this blog, but the arm-apple-darwin9 compilers are nowhere to be found.
    I am working on this script to make it work for Mac OS 10.6 and iOS 4.
    I don’t want to reinvent the wheel, so if anybody is aware of similar work done elsewhere, please let me know.
    Thanks.

  69. Damian Says:

    kazuar, did you get it to run on iOS 4? I can’t get it to run. If you did, please send me your script, because this is driving me mad.
    damcho@gmail.com
    thanks

  70. Robert Says:

    I have Pocket OCR running on my iPhone 4 (iOS 4 of course). I believe that I compiled it using the 3.x SDK still, however.
