Create an OCR-Project

How to create and manage an OCR-Project.

In ABBYY FineReader (AFR), ocr-project folders allow you to manage a 150- to 250-page page section of the project as a single group, instead of as separate files. Each ocr-project has its own folder structure, which includes a copy of the original images plus hidden files that keep track of the text that has been recognized, your program settings, user patterns, and languages or language groups. These hidden files are critically important if we want to go back and re-output the page section at any time in the future without having to recreate boxes for every page. To avoid the chance of losing them, we always unhide these files when creating a new project. The following figure shows a typical ocr-project folder structure, with hidden files visible.
Figure 1. ocr-project folder showing hidden files

  1. Begin with a collection of all the images in the page section. These are stored in the ebnn/_images folder, in accordance with the folder structure spelled out for the _images Folder.
  2. In AFR, select File > New OCR Project, or use the button on the Main toolbar. This creates an empty project.
  3. Add your image collection to the ocr-project by clicking File > Open Image.... In the dialogue box that opens, navigate to your image collection, select all images, and click Open. The images are added to the project (or appended to the end of an existing project, if one is open), and copies of them are saved in the ocr-project folder.
  4. Use File > Save OCR Project... to save the project to the 1-afr-project folder. Name the project in the form edition-section, such as eb03-c01 or eb09-s03. If you are making changes to an existing project, retain the same name and overwrite the previous version.

  5. Close AFR and open the project folder in Windows File Explorer. Change the view options to show all hidden files. Right-click FRBatch.pac, desktop.ini, and packet.ico, select Properties, and clear the Hidden checkbox in the file attributes.
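
If many projects need this treatment, step 5 could be scripted instead of done by hand. The following Python sketch is an illustration only and is not part of the documented workflow. It assumes the three files sit directly in the saved project folder, and the function names find_tracking_files and unhide_tracking_files are hypothetical. On Windows it clears only the Hidden attribute through the Win32 GetFileAttributesW and SetFileAttributesW calls; on other systems it just reports which files it found.

```python
import sys
from pathlib import Path
from typing import List

# File names taken from step 5; other files in the folder are left untouched.
TRACKING_FILES = ("FRBatch.pac", "desktop.ini", "packet.ico")

FILE_ATTRIBUTE_HIDDEN = 0x2            # Win32 attribute bit for "Hidden"
INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF   # returned by GetFileAttributesW on failure


def find_tracking_files(project_dir: Path) -> List[Path]:
    """Return the step-5 files that actually exist in project_dir."""
    return [project_dir / name for name in TRACKING_FILES
            if (project_dir / name).is_file()]


def unhide_tracking_files(project_dir: Path) -> List[Path]:
    """Clear the Hidden attribute on the tracking files; return the files found."""
    found = find_tracking_files(project_dir)
    if sys.platform != "win32":
        return found  # the Hidden attribute only exists on Windows

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetFileAttributesW.argtypes = [wintypes.LPCWSTR]
    kernel32.GetFileAttributesW.restype = wintypes.DWORD
    kernel32.SetFileAttributesW.argtypes = [wintypes.LPCWSTR, wintypes.DWORD]
    kernel32.SetFileAttributesW.restype = wintypes.BOOL

    for path in found:
        attrs = kernel32.GetFileAttributesW(str(path))
        if attrs == INVALID_FILE_ATTRIBUTES:
            raise ctypes.WinError(ctypes.get_last_error())
        if attrs & FILE_ATTRIBUTE_HIDDEN:
            # Keep every other attribute (System, Archive, ...) as it was.
            new_attrs = attrs & ~FILE_ATTRIBUTE_HIDDEN
            if not kernel32.SetFileAttributesW(str(path), new_attrs):
                raise ctypes.WinError(ctypes.get_last_error())
    return found
```

As with the manual procedure, close AFR before running it. Call unhide_tracking_files with the path of the saved project folder, then confirm in File Explorer that the three files are visible.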