Page Numbers
Specifies encoding method for page numbers in TEI.
All page numbers must be encoded along with the edition and volume number of the print original. We further include a link to the online image file used in the OCR process for the page.
Page and/or image information is included in three different places:
- The <TEI> element at the beginning of every file.
- The <div type="entry"> element at the start of the entry text.
- The <pb> (page beginning) element indicating the beginning of a new page in the entry.
The <TEI> element
<TEI xml:id="kp-eb0302-0698-0691" xml:lang="en" xmlns="http://www.tei-c.org/ns/1.0"/>
- The @xml:id attribute identifies the first print page of the source material. The first cluster after "kp" identifies the source as the 3rd edition, volume 2. The next cluster records the last 4 digits of the source image filename. The last cluster records the print page number for the start of the entry.
- @xml:lang records the principal language of the text ("en" = English).
- @xmlns supplies the XML namespace for the document (in this case, it gives the URI for the TEI namespace).
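The cluster layout described above can be checked mechanically. The following is a minimal Python sketch, not part of the project's tooling: the function name `parse_page_id`, the `PageId` type, and the exact pattern (a literal "eb" followed by two digits each for edition and volume) are assumptions inferred from the single example value.

```python
import re
from typing import NamedTuple


class PageId(NamedTuple):
    """Hypothetical container for the clusters of a page @xml:id."""
    edition: int
    volume: int
    image: str  # last four digits of the source image filename
    page: str   # print page number, kept as a zero-padded string


# Pattern inferred from the example "kp-eb0302-0698-0691" (an assumption).
_ID_RE = re.compile(r"^kp-eb(\d{2})(\d{2})-(\d{4})-(\d{4})$")


def parse_page_id(xml_id: str) -> PageId:
    """Split a page @xml:id into edition, volume, image and page clusters."""
    match = _ID_RE.match(xml_id)
    if match is None:
        raise ValueError(f"not a page identifier: {xml_id!r}")
    edition, volume, image, page = match.groups()
    return PageId(int(edition), int(volume), image, page)


print(parse_page_id("kp-eb0302-0698-0691"))
# PageId(edition=3, volume=2, image='0698', page='0691')
```

Keeping the image and page clusters as strings preserves the zero padding, so the value can be reassembled without loss.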
The <div> element
<div facs="ia:gri_33125011196710/page/n698" type="entry">
- We use @facs to identify the source information by adding the URI of the image used in the OCR process.
  - "ia:" is a shorthand notation for the online archive where the image is hosted (in this case, Internet Archive). When output to HTML or PDF, these abbreviations are expanded to their full URI value, producing a valid URI that resolves to the online image.
- We include @type with the value entry to indicate that the text is an encyclopedia entry.
- See Misnumbered Pages note below.
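The prefix expansion mentioned above might look like the following Python sketch. The `PREFIXES` table, the function name `expand_facs`, and the base URL are assumptions for illustration; the guidelines do not state which base URI the project's output stage actually uses.

```python
# Hypothetical prefix table. The base URL is an assumption about how "ia:"
# is expanded; it is not taken from the project's stylesheets.
PREFIXES = {"ia": "https://archive.org/details/"}


def expand_facs(value: str) -> str:
    """Expand a known shorthand prefix in a @facs value to a full URI."""
    prefix, sep, rest = value.partition(":")
    if not sep or prefix not in PREFIXES:
        return value  # no known prefix: leave the value unchanged
    return PREFIXES[prefix] + rest


print(expand_facs("ia:gri_33125011196710/page/n698"))
# https://archive.org/details/gri_33125011196710/page/n698
```

Values without a known prefix, including ones that are already full URIs, pass through untouched.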
The <pb> element
<pb break="no" facs="ia:gri_33125011196710/page/n699" xml:id="kp-eb0302-0699-0692"/>
- <pb> counts as white space; setting @break to "no" turns this off.
- Include @facs with the URI for the original page image.
- Add the @xml:id for the new page number. The last four digits of the image URI are also inserted into the @xml:id to ensure uniqueness.
- See Misnumbered Pages note below.
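As an illustration of how the page @xml:id relates to @facs, here is a small Python sketch that derives the identifier from the image URI and the print page number. The function name `pb_xml_id`, the "/page/n" pattern, and the zero padding to four digits are assumptions inferred from the example above.

```python
import re


def pb_xml_id(edition: int, volume: int, facs: str, page: int) -> str:
    """Build a <pb> @xml:id from the @facs value and the print page number."""
    # Assumes the image number is the trailing "n<digits>" part of @facs.
    match = re.search(r"/page/n(\d+)$", facs)
    if match is None:
        raise ValueError(f"no image number found in {facs!r}")
    # Keep the last four digits, left-padded with zeros (n699 -> 0699).
    image = match.group(1).zfill(4)[-4:]
    return f"kp-eb{edition:02d}{volume:02d}-{image}-{page:04d}"


print(pb_xml_id(3, 2, "ia:gri_33125011196710/page/n699", 692))
# kp-eb0302-0699-0692
```

Deriving the image cluster from @facs rather than typing it twice keeps the two attributes from drifting apart.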
Pagebreaks within notes
Notice: We adapted this technique from the Women Writers Project.
Note text can run over to the next page(s). Link the <pb> in the note to the corresponding <pb> in the main text using @xml:id and @corresp on both <pb> elements, to clarify that they both reference the same page break.
- For the <pb> in the note, <pb corresp="kp-eb0301-0059-0036" xml:id="pbn336"/>
  - Add @corresp with the value of the corresponding @xml:id in the main text <pb>.
  - Create an @xml:id value for the <pb>. It should begin with pbn followed by the edition number and the referenced page number.
- For the <pb> in the main text, <pb corresp="pbn336" facs="ia:gri_33125011196827/page/n59" xml:id="kp-eb0301-0059-0036"/>
  - Add @corresp with the value of the corresponding @xml:id in the note <pb>.
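Because the two <pb> elements must point at each other, the pairing lends itself to an automated check. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `unpaired_pbs` and the sample markup around the <pb> elements are hypothetical, and the check assumes @corresp holds a bare @xml:id value as in the examples above.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix so the check works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def unpaired_pbs(root: ET.Element) -> list[str]:
    """Return @xml:id values of <pb> elements whose @corresp is not reciprocated."""
    pbs = {
        el.get(XML_ID): el
        for el in root.iter()
        if isinstance(el.tag, str) and local_name(el.tag) == "pb" and el.get(XML_ID)
    }
    problems = []
    for pb_id, pb in pbs.items():
        target = pb.get("corresp")
        if target is None:
            continue  # ordinary page break with no note counterpart
        other = pbs.get(target)
        if other is None or other.get("corresp") != pb_id:
            problems.append(pb_id)
    return problems


# Hypothetical sample: only the two <pb> elements come from the guidelines.
sample = """<div>
  <p>Main text <pb corresp="pbn336" facs="ia:gri_33125011196827/page/n59" xml:id="kp-eb0301-0059-0036"/> continues.</p>
  <note>Note text <pb corresp="kp-eb0301-0059-0036" xml:id="pbn336"/> continues.</note>
</div>"""

print(unpaired_pbs(ET.fromstring(sample)))  # []
```

An empty list means every @corresp has a matching partner; any identifier returned points at a <pb> whose counterpart is missing or does not link back.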
Misnumbered pages
n="misnumbered 0150" xml:id="kp-eb0302-0162-0148"
Although rare, page numbers in the print editions are sometimes obviously in error. In such cases, use the correct page number in @xml:id. Then add @n with the value misnumbered nnn, replacing nnn with the printed page number. This indicates that the printed page number was out of sequence.
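The rule can be sketched in Python as follows. The function name `misnumbered_attrs` and its parameters are hypothetical, and padding the printed page number to four digits is an assumption based on the example value "misnumbered 0150".

```python
def misnumbered_attrs(edition: int, volume: int, image: int,
                      correct_page: int, printed_page: int) -> dict[str, str]:
    """Return @n and @xml:id values for a page whose printed number is wrong."""
    return {
        # The printed (erroneous) number is recorded in @n.
        "n": f"misnumbered {printed_page:04d}",
        # The corrected number goes into the last cluster of @xml:id.
        "xml:id": f"kp-eb{edition:02d}{volume:02d}-{image:04d}-{correct_page:04d}",
    }


print(misnumbered_attrs(3, 2, 162, 148, 150))
# {'n': 'misnumbered 0150', 'xml:id': 'kp-eb0302-0162-0148'}
```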