OCR for eb03

Scanning eb03 requires special settings.

Every edition of Encyclopedia Britannica has a slightly different layout, but eb03 raises two additional considerations for scanning: the long-s and ligatures.

Figure 1. Typical page from eb03


Long-s

Most books printed before 1820 use ſ, the "long-s," alongside the ordinary s, and eb03 is no exception. ABBYY FineReader recognizes it well, but far from perfectly. Because we replace ſ with s in the TEI editions, you can type the s on the keyboard when making corrections. Typographers followed specific rules for ſ: it appears at the beginning and in the middle of words, but never at the end. Bear this in mind when trying to determine the intended word.
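The replacement and the placement rule described above can be sketched in a few lines of Python. This is only an illustration, not part of the project's tooling: the function names `normalize_long_s` and `suspicious_words` are hypothetical, and the sketch assumes the OCR text is available as a Unicode string in which the long-s is the character U+017F.

```python
import re

LONG_S = "\u017f"  # the long-s character, ſ

def normalize_long_s(text: str) -> str:
    """Replace every long-s with an ordinary s (hypothetical helper)."""
    return text.replace(LONG_S, "s")

# A long-s that is not followed by another word character sits at the
# end of a word, which the placement rule says should never happen.
_FINAL_LONG_S = re.compile(r"\w*\u017f(?!\w)")

def suspicious_words(text: str) -> list[str]:
    """Return the words that end in a long-s (hypothetical helper)."""
    return _FINAL_LONG_S.findall(text)

if __name__ == "__main__":
    sample = "firſt in Paleſtine waſ"
    print(normalize_long_s(sample))
    print(suspicious_words(sample))
```

A word flagged by `suspicious_words` is probably a recognition error and worth checking against the page image before the ſ is replaced.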

Ligatures

A combination of letters joined into a single letter-form, such as the "st" in "first" and "Palestine" on lines 2 and 3 of Figure 1, is called a "ligature." Ligatures were widespread in printed matter before the 1850s. ABBYY FineReader will often misrecognize ligatures unless it is trained to identify them when recognizing the page. For instructions, see Train OCR.