Formulas and Equations
Scanning simple and complex math and chemical formulas.
Formulas and equations show up in many technical articles. Sometimes the formula is on a separate line, but often it runs in with the texts. ABBYY FineReader does well with simple formulas, but it cannot manage complex ones. We want to retain accuracy for simple formulas, but complex ones must be left until later, and a note should be placed in the inventory file for the page.
Simple
A simple formula or equation is one that is limited to a
single line, and any fractions can be expressed using the existing character set. Recognition accuracy may be lower for a formula than for text, for several
reasons. AFR does not have a dictionary for checking them against
known words. The subscript and superscript numerals and letters pose problems because of
their small size. And they use large numbers of special characters.
- Greek characters (normally italicized) are in the "Basic Greek" symbols subset.
- Special math symbols, like ≤ and ≈, are in the "Mathematical Operators" subset, though you will likely need to change your font from Times New Roman to Arial Unicode MS to find most of them.
Complex
Complex formulas and equations are two or more lines,
which AFR cannot read. If the formula occurs on a separate line, as in the illustration, then treat it
like an illustration. If it occurs on a line of text, do not worry about correcting it.
Note:
Add a comment in the inventory file for the page, so we can
go back and correct it later.