Ninth Edition
Encyclopedia Britannica, Ninth Edition: A Machine-Readable Text Transcription
format | segment | version | size (ZIP) | # of files | date | GitHub repository | download |
---|---|---|---|---|---|---|---|
TEI (XML) | A-J | 1.0 | 75 MB | 9605 | 2023-02-28 | eb09/XML | ZIP file |
TEI (XML) | K-Z | 1.0 | 67 MB | 8167 | 2023-02-28 | eb09/XML | ZIP file |
Plain text (TXT) | A-Z | 1.0 | 71.7 MB | 17772 | 2022-11-01 | eb09/TXT | ZIP file |
Release notes
- 2023-02-28: XML Release v1.0 (TEI encoding) (2 ZIP files)
- 2022-11-01: We're excited to publish v1.0 of the complete text of this edition as TXT files.
Content notes
Plain text files
- Page breaks are indicated in-line as [edition:volume:page].
- Footnotes and marginal notes are included in-line at the point of the siglum, as ^[1. This is note text.]
- Tables are out of scope and indicated in the text with [table].
- Formulas are out of scope and are left uncorrected.
- Further information is available at Editorial Standards.
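The in-line conventions above lend themselves to simple pattern matching. The following is a minimal sketch, not the project's own tooling: it assumes that a page-break marker is three colon-separated integers (for example [9:8:222]) and that a note never contains nested square brackets. Both assumptions, the sample sentence, and the names PAGE_BREAK, NOTE and split_entry are illustrative only.

```python
import re

# Assumption: a page break looks like [9:8:222] (edition:volume:page, all integers).
PAGE_BREAK = re.compile(r"\[(\d+):(\d+):(\d+)\]")
# Assumption: a note looks like ^[...] and holds no nested square brackets.
NOTE = re.compile(r"\^\[([^\[\]]*)\]")


def split_entry(text):
    """Return (clean_text, page_breaks, notes) for one entry's plain text."""
    page_breaks = [tuple(int(n) for n in m.groups())
                   for m in PAGE_BREAK.finditer(text)]
    notes = [m.group(1).strip() for m in NOTE.finditer(text)]
    # Strip notes and page-break markers, then collapse leftover whitespace.
    clean = NOTE.sub("", text)
    clean = PAGE_BREAK.sub("", clean)
    clean = re.sub(r"\s+", " ", clean).strip()
    return clean, page_breaks, notes


# Hypothetical sample sentence, written only to exercise the patterns.
sample = ("The river rises^[1. This is note text.] in the hills "
          "[9:8:222] and flows south [table].")
clean, page_breaks, notes = split_entry(sample)
```

The [table] placeholder is deliberately left in the cleaned text, so a downstream reader can still see where a table was omitted.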
TEI files
- Index terms for the entry content are in the <profileDesc> section of the <teiHeader>. Each includes a prefaced URI for the named authority file. See Master Files for further information.
- Page breaks include a prefaced URI for the online source image. The URI resolves to a full URL when output to display formats.
- Footnotes and marginal notes are included in-line at the point of the siglum.
- Tables are out of scope and are left uncorrected.
- Formulas are out of scope and are left uncorrected.
- Further information is available at Editorial Standards.
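As a rough illustration of reading the index terms from a TEI file, the sketch below walks every element under <profileDesc> and collects its tag, attributes, and text. The element names inside <profileDesc>, the ref attribute, and the auth: prefix in the sample are assumptions made for the example; the actual files may be structured differently, so check them against the Editorial Standards before relying on this.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' part that ElementTree prepends to tag names."""
    return tag.rsplit("}", 1)[-1]


def profile_desc_terms(xml_text):
    """List (tag, attributes, text) for each non-empty element under <profileDesc>."""
    root = ET.fromstring(xml_text)
    terms = []
    for element in root.iter():
        if local_name(element.tag) != "profileDesc":
            continue
        for child in element.iter():
            if child is element:
                continue
            text = (child.text or "").strip()
            if text:
                terms.append((local_name(child.tag), dict(child.attrib), text))
    return terms


# Hypothetical sample: the inner element names and the "auth:" prefix are
# invented for illustration and are not taken from the released files.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <textClass>
        <keywords>
          <term ref="auth:12345">Geography</term>
        </keywords>
      </textClass>
    </profileDesc>
  </teiHeader>
</TEI>"""

terms = profile_desc_terms(SAMPLE)
```

Matching on the local name rather than a fixed namespace keeps the sketch usable whether or not the files declare the TEI namespace.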
Storage format
- Files. Each entry is in its own file, with a header for the Knowledge Project. Files can be downloaded individually from the GitHub repository.
- ZIP file. To download the complete edition in either TXT or XML format, use the ZIP file(s).
- Directories. Files are organized in directories named for the first letter of the entry followed by the volume number of the print edition. For example, the directory j12 contains all entries in volume 12 that begin with the letter 'J'.
- File names are meaningful. Example:
kp-eb0908-022205-1234-v1.txt
- kp = Knowledge Project
- eb0908 = 9th ed., print vol. 8
- 022205 = print page 222, 5th entry on the page
- 1234 = last 4 digits of the source image file name (makes file names unique)
- v1 = version 1
- This work is licensed under a Creative Commons Attribution 4.0 International License.
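The file-name convention described under Storage format can be decoded mechanically. The sketch below is one possible reading of it: it assumes every field has the fixed width seen in the example (two digits each for edition and volume, four for the page, two for the entry position, four for the image suffix). The names EntryFile, FILE_NAME and parse_entry_file_name are illustrative and not part of the project.

```python
import re
from typing import NamedTuple


class EntryFile(NamedTuple):
    edition: int
    volume: int
    page: int
    entry_on_page: int
    image_suffix: str   # kept as text so leading zeros survive
    version: int
    extension: str


# Assumption: field widths match the example kp-eb0908-022205-1234-v1.txt.
FILE_NAME = re.compile(
    r"kp-eb(\d{2})(\d{2})-(\d{4})(\d{2})-(\d{4})-v(\d+)\.(\w+)"
)


def parse_entry_file_name(name):
    """Split a file name into its documented parts, or raise ValueError."""
    match = FILE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    edition, volume, page, entry, image, version, ext = match.groups()
    return EntryFile(int(edition), int(volume), int(page), int(entry),
                     image, int(version), ext)


info = parse_entry_file_name("kp-eb0908-022205-1234-v1.txt")
```

Here info.volume is 8 and info.page is 222, matching the breakdown given for the example name. Names that do not fit the assumed widths raise ValueError rather than being guessed at.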