Page Numbers
Specifies the encoding method for page numbers in TEI.
All page numbers must be encoded along with the edition and volume number of the print original. We further include a link to the online image file used in the OCR process for the page.
Page and/or image information is included in three different places:
- The <TEI> element at the beginning of every file.
- The <div type="entry"> element at the start of the entry text.
- The <pb> (page beginning) element indicating the beginning of a new page in the entry.
The <TEI> element
<TEI xml:id="kp-eb0302-0698-0691" xml:lang="en" xmlns="http://www.tei-c.org/ns/1.0"/>
- The @xml:id attribute identifies the first print page of the source material. The first cluster after "kp" identifies the source as the 3rd edition, volume 2. The next cluster records the last 4 digits of the source image filename. The last cluster records the print page number for the start of the entry.
- @xml:lang records the principal language of the text ("en" = English).
- @xmlns supplies the namespace for the document (in this case, it gives the URI for the TEI namespace).
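The cluster structure of @xml:id described above can be unpacked mechanically. The following is a minimal Python sketch, not part of the project's own tooling: the function name parse_page_id and the returned field names are hypothetical, and the pattern assumes two digits each for edition and volume, as inferred from the example.

```python
import re

# Assumed layout, inferred from the example id "kp-eb0302-0698-0691":
# "kp-eb" + 2-digit edition + 2-digit volume, then a 4-digit image number,
# then a 4-digit print page number.
PAGE_ID = re.compile(r"kp-eb(\d{2})(\d{2})-(\d{4})-(\d{4})")


def parse_page_id(xml_id):
    """Split a page @xml:id into its numeric parts (hypothetical helper)."""
    match = PAGE_ID.fullmatch(xml_id)
    if match is None:
        raise ValueError(f"not a page id: {xml_id!r}")
    edition, volume, image, page = (int(group) for group in match.groups())
    return {"edition": edition, "volume": volume, "image": image, "page": page}


print(parse_page_id("kp-eb0302-0698-0691"))
```

For the example id, this yields edition 3, volume 2, image 698, and print page 691.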
The <div> element
<div facs="ia:gri_33125011196710/page/n698" type="entry">
- We use @facs to identify the source information by adding the URI of the image used in the OCR process.
  - "ia:" is a shorthand notation for the online archive where the image is hosted (in this case, Internet Archive). When output to HTML or PDF, these abbreviations are expanded to their full URI value, producing a valid URI that resolves to the online image.
- We include @type with the value entry to indicate that the text is an encyclopedia entry.
- See Misnumbered Pages note below.
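The prefix expansion described for @facs can be illustrated with a short Python sketch. The guidelines do not state the full URI that "ia:" expands to, so the base URL below is an assumption chosen for illustration, and expand_facs is a hypothetical helper rather than the project's actual output code.

```python
# Assumed prefix table: the guidelines say "ia:" stands for Internet Archive
# but do not give the expansion, so this base URL is an illustration only.
PREFIXES = {"ia": "https://archive.org/details/"}


def expand_facs(facs):
    """Expand an abbreviated @facs value to a full URI (hypothetical helper)."""
    prefix, sep, rest = facs.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in {facs!r}")
    return PREFIXES[prefix] + rest


print(expand_facs("ia:gri_33125011196710/page/n698"))
```

Keeping the expansion in one table means a hosting change would only require updating the prefix definition, not every @facs value.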
The <pb> element
<pb break="no" facs="ia:gri_33125011196710/page/n699" xml:id="kp-eb0302-0699-0692"/>
- <pb> counts as a white space; setting @break to "no" turns this off.
- Include @facs with the URI for the original page image.
- Add the @xml:id for the new page number. The last four digits of the image URI are also inserted into the @xml:id to ensure uniqueness.
- See Misnumbered Pages note below.
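As a rough illustration of how the <pb> attributes fit together, the Python sketch below derives the @xml:id from the image number at the end of @facs and assembles the element. The helper names page_id and pb_element, the no_break parameter, and the "/page/nNNN" pattern are assumptions drawn from the example above, not a documented part of the workflow.

```python
import re


def page_id(edition, volume, facs, page):
    """Build a page @xml:id such as "kp-eb0302-0699-0692" (hypothetical helper).

    The image number is read from the trailing "/page/nNNN" of @facs and
    zero-padded to four digits, following the example in the guidelines.
    """
    match = re.search(r"/page/n(\d+)$", facs)
    if match is None:
        raise ValueError(f"no image number in {facs!r}")
    image = int(match.group(1))
    return f"kp-eb{edition:02d}{volume:02d}-{image:04d}-{page:04d}"


def pb_element(edition, volume, facs, page, no_break=False):
    """Serialize a <pb>; no_break=True adds break="no" to suppress white space."""
    brk = ' break="no"' if no_break else ""
    return f'<pb{brk} facs="{facs}" xml:id="{page_id(edition, volume, facs, page)}"/>'


print(pb_element(3, 2, "ia:gri_33125011196710/page/n699", 692, no_break=True))
```

With these inputs the sketch reproduces the example <pb> shown at the start of this section.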
Pagebreaks within notes
Notice: We adapted this technique from the Women Writers Project.
Note text can run over to the next page(s). Link the <pb> in the note to the corresponding <pb> in the main text using @xml:id and @corresp on both <pb> elements, to clarify that they both reference the same page break.
- For the <pb> in the note, <pb corresp="kp-eb0301-0059-0036" xml:id="pbn336"/>
  - Add @corresp with the value of the corresponding @xml:id in the main text <pb>.
  - Create an @xml:id value for the <pb>. It should begin with pbn followed by the edition number and the referenced page number.
- For the <pb> in the main text, <pb corresp="pbn336" facs="ia:gri_33125011196827/page/n59" xml:id="kp-eb0301-0059-0036"/>
  - Add @corresp with the value of the corresponding @xml:id in the note <pb>.
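Because the two <pb> elements must point at each other, the pairing lends itself to an automated check. The following Python sketch is one possible approach, not the project's validation routine: the function name unpaired_pbs and the sample markup are hypothetical, and the sample omits the TEI namespace for brevity (in a real file the tag would be qualified with the TEI namespace URI).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample, built from the two <pb> examples in the guidelines.
SAMPLE = """<div>
  <p>Main text <pb corresp="pbn336" facs="ia:gri_33125011196827/page/n59"
     xml:id="kp-eb0301-0059-0036"/> continues.</p>
  <note>Note text <pb corresp="kp-eb0301-0059-0036" xml:id="pbn336"/> continues.</note>
</div>"""


def unpaired_pbs(root):
    """Return xml:ids of <pb> elements whose @corresp is not reciprocated."""
    pbs = {pb.get(XML_ID): pb for pb in root.iter("pb") if pb.get(XML_ID)}
    problems = []
    for xml_id, pb in pbs.items():
        target = pb.get("corresp")
        if target is None:
            continue  # ordinary page break with no counterpart in a note
        partner = pbs.get(target)
        if partner is None or partner.get("corresp") != xml_id:
            problems.append(xml_id)
    return problems


print(unpaired_pbs(ET.fromstring(SAMPLE)))
```

An empty list means every linked <pb> has a partner that links back to it.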
Misnumbered pages
n="misnumbered 0150" xml:id="kp-eb0302-0162-0148"
Although rare, page numbers in the print editions are sometimes obviously in error. In such cases, use the correct page number in @xml:id. Then add @n with the value misnumbered nnn, replacing nnn with the printed page number. This indicates the original page number was out of sequence.
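The rule above can be sketched in Python as follows. This is a hedged illustration only: page_attributes is a hypothetical helper, and the four-digit zero padding of the printed number is an assumption taken from the example value "misnumbered 0150".

```python
def page_attributes(edition, volume, image, correct_page, printed_page=None):
    """Return @xml:id and, for a misprinted number, @n (hypothetical helper)."""
    attrs = {
        # @xml:id always carries the corrected page number.
        "xml:id": f"kp-eb{edition:02d}{volume:02d}-{image:04d}-{correct_page:04d}"
    }
    if printed_page is not None and printed_page != correct_page:
        # @n preserves the erroneous number as it appears in print.
        attrs["n"] = f"misnumbered {printed_page:04d}"
    return attrs


print(page_attributes(3, 2, 162, 148, printed_page=150))
```

Pages whose printed number is correct receive only @xml:id, so the presence of @n is itself the signal that the print original was out of sequence.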
