Edition-Section System

File organization depends on two basic folder types

In the early stage of processing, we work on one artificial group of 150-250 printed pages at a time. These groups are identified with the edition-section system, which we use for all processing work until entry files are created and validated. At that point, we reorganize the entries by letter of the alphabet within each edition.

Edition
We are digitizing four editions of the Encyclopedia Britannica. We organize all text initially by the print edition number, using the naming convention eb followed by a two-digit code for the edition. Thus eb07 is the seventh edition.
Section
Each edition is organized alphabetically, so we subdivide each edition by letter of the alphabet. The full text of some letters is enormous, so we further segment each letter into numbered page sections. Section names begin with the letter, followed by a two-digit code for the section number. Thus b03 is the third section of the letter "B."

Combining the naming conventions for editions and sections gives a shorthand for a working segment of text. For example, eb11-w02 is the second section of the letter "W" in the eleventh edition. We use this edition-section system to organize our workflow during the OCR process.
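
As an illustration, the naming convention can be checked and generated with a few lines of Python. This is a minimal sketch, not part of the project's tooling: the names SEGMENT_RE, Segment, parse_segment, and format_segment are hypothetical, and the sketch assumes that identifiers always use a lowercase letter and a hyphen separator, as in the eb11-w02 example.

```python
import re
from typing import NamedTuple

# Assumed pattern, inferred from examples such as "eb11-w02":
# "eb" + two-digit edition, a hyphen, one lowercase letter + two-digit section.
SEGMENT_RE = re.compile(r"eb(\d{2})-([a-z])(\d{2})")


class Segment(NamedTuple):
    """Hypothetical container for the parts of an edition-section name."""
    edition: int
    letter: str
    section: int


def parse_segment(name: str) -> Segment:
    """Split a name like 'eb11-w02' into edition, letter, and section."""
    match = SEGMENT_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not an edition-section name: {name!r}")
    return Segment(int(match.group(1)), match.group(2), int(match.group(3)))


def format_segment(edition: int, letter: str, section: int) -> str:
    """Build an edition-section name, zero-padding both numeric codes."""
    # Lowercasing the letter is an assumption based on the examples.
    return f"eb{edition:02d}-{letter.lower()}{section:02d}"
```

For example, parse_segment("eb11-w02") yields edition 11, letter "w", section 2, and format_segment(7, "B", 3) yields "eb07-b03". Rejecting anything that does not match the full pattern is a deliberate choice here, so that a mistyped folder name fails early rather than being silently misfiled.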