Page breaks and page numbering [169]


Encoding of page breaks and page numbering using the <pb/> element and its n= attribute, including guidelines for creating idealized page number sequences


The WWP encodes page breaks using the TEI <pb/> element, which is empty (it has no content). By convention, the <pb/> element is understood to mark the start of a new page, so each page of text, even the first, should be preceded by a <pb/> element. The <pb/> element goes before any other information about the page, including collation, forme work, etc.

The page number is encoded in two ways. The actual printed page number is part of the forme work and is encoded using <mw type="pageNum">. An idealized page number is also captured on the n= attribute of the <pb/> element.
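As a minimal sketch of the two encodings side by side, consider a hypothetical page whose number is printed in brackets as "[12]". The page content and the way the brackets are transcribed inside <mw> are assumptions made for illustration; only the <pb/> element, its n= attribute, and <mw type="pageNum"> come from the guideline itself.

```xml
<!-- Fragment only, not a complete document. -->
<!-- <pb/> comes first, before any other information about the page. -->
<pb n="12"/>                      <!-- idealized number: brackets omitted -->
<mw type="pageNum">[12]</mw>      <!-- printed number, part of the forme work (transcription of brackets assumed) -->
<p>Hypothetical text of the page ...</p>
```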

"Idealization" of the page number means correcting errors in sequencing, omitting casual variations in the way page numbers are printed (brackets, etc.), and supplying page numbers which are not printed on the page. The idealized page numbers will usually have the same form (arabic numbers, roman numerals, etc.) as the actual page numbers, unless there are overriding reasons to do otherwise. If the document has a separate numbering system for the front or back matter, the idealized numbers should follow that system as well.

Every page in the document, including the title page, should have an idealized page number, recorded on the <pb/> for that page. Page numbering of this sort should start with the first page of the text, which will usually be the title page but might be a frontispiece or some other page before the title page. Note that in texts where the frontispiece is on a verso page, the recto of that page should be included in the pagination *even if it is not present in the OT*. (This is because a page can’t have a verso without a recto, so we have to assume that a recto exists in the original.) We do not include blank preliminary pages in the pagination unless they are preceded by a non-blank page, or unless they are the blank recto to a printed verso (as in the case of the frontispiece mentioned above). In other words, we start our count from the first non-blank page (or, if it is a verso, from its recto) and include all pages following it in the pagination.
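The frontispiece case might look roughly like the sketch below. It assumes unnumbered front matter that takes lowercase roman numerals, and it shows only the pages present in the hypothetical text. The guideline does not say whether a <pb/> is also recorded for the absent recto, so the recto appears here only as a comment and is simply counted.

```xml
<!-- Fragment only. The recto of the frontispiece leaf is not present in the OT, -->
<!-- but it is counted as page i, so the count starts from it. -->
<pb n="ii"/>     <!-- frontispiece, on the verso of the first leaf -->
<!-- ... frontispiece ... -->
<pb n="iii"/>    <!-- title page -->
<!-- ... title page ... -->
```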

See the examples for some specific cases.

Special cases:

Pages that are not numbered, but are accounted for in the explicit numbering of the OT (i.e., there simply is no ink on the page, e.g., 1, 2, 3, , 5, 6, ...) will not have an <mw>, but will have the appropriate n= on the <pb/>. (In this case, the blank page’s <pb/> would have n="4".)
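A sketch of the 1, 2, 3, , 5 sequence described above, with placeholder comments standing in for the page content:

```xml
<!-- Fragment only. -->
<pb n="3"/>
<mw type="pageNum">3</mw>
<!-- ... text of page 3 ... -->
<pb n="4"/>
<!-- page 4 is blank: no <mw> is recorded, but the page is still counted -->
<pb n="5"/>
<mw type="pageNum">5</mw>
<!-- ... text of page 5 ... -->
```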

Pages that are not numbered and are not counted in the OT page number reference system (e.g., 1, 2, 3, 4, , , 5, 6 ...) will, where possible, be referred to by the word “facing” and their facing page’s number. (In this case n="facing 4" and n="facing 5".) Typically these will be extra leaves tipped in, such as illustrations or sheets of errata. Note that unnumbered pages of this sort will almost always come in pairs (i.e., the two sides of a leaf), since otherwise the universal rule that odd numbers fall on the right-hand page would be violated. Also, unnumbered pages of this sort will almost always start with an odd number. The only exception would be where the page numbering is seriously disordered in other ways. See 081 for more detail.
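The 1, 2, 3, 4, , , 5, 6 case could be encoded along these lines. The tipped-in leaf is imagined here as an illustration, which is an assumption; the n= values are the ones given above.

```xml
<!-- Fragment only. -->
<pb n="4"/>
<mw type="pageNum">4</mw>
<!-- ... text of page 4 ... -->
<pb n="facing 4"/>   <!-- first side of the tipped-in leaf: unnumbered and uncounted -->
<!-- ... hypothetical illustration ... -->
<pb n="facing 5"/>   <!-- second side of the same leaf -->
<pb n="5"/>
<mw type="pageNum">5</mw>
<!-- ... text of page 5 ... -->
```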

For tipped-in pages which occur within unnumbered page sequences (e.g. unnumbered front matter), the idealized page numbers should be a simple sequence; there is no need to indicate “facing x”, etc. In fact, in most cases it may be impossible to tell that these pages are tipped in (at least from the xerox), since there will be no discontinuous pagination to give it away.


Example 1. A work which has frontmatter numbered in lowercase roman numerals, followed by the body which is numbered in arabic numbers: the frontmatter should be numbered <pb n="i"/>, <pb n="ii"/>, etc., followed by the body numbered <pb n="1"/>, <pb n="2"/>, etc.

Example 2. A work in which the frontmatter does not have numbers printed on the page, followed by a body numbered in arabic numbers: the frontmatter should be numbered <pb n="i"/>, <pb n="ii"/>, etc., followed by the body numbered <pb n="1"/>, <pb n="2"/>, etc. (Similarly, if the body does not have numbers printed on the page, the page numbering should still be recorded in arabic numerals on the n= attribute of <pb/>: n="1", n="2", n="3", etc.)
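The transition from unnumbered frontmatter to a numbered body might be sketched as follows. That the unnumbered pages carry no <mw type="pageNum"> is an inference from the rule that this element records the printed number; other forme work on those pages, if any, is left out of the sketch.

```xml
<!-- Fragment only. -->
<pb n="i"/>      <!-- title page: no number printed, so no <mw type="pageNum"> (assumed) -->
<!-- ... title page ... -->
<pb n="ii"/>
<!-- ... remaining frontmatter ... -->
<pb n="1"/>      <!-- body begins; the printed number is recorded as well -->
<mw type="pageNum">1</mw>
<!-- ... text of page 1 ... -->
```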

Example 3. A work in which no page numbers appear at all, which contains a title page, frontmatter, and a body: number the title and frontmatter continuously with lowercase roman numerals; number the body 1, 2, 3, etc. If there were a frontispiece preceding the title page, the page numbering for the frontmatter should start with the frontispiece (or, if that is on a verso, with its hypothetical recto; see above).

Example 4. A document which has separately numbered subsections such as plays (e.g., 1-30 for the first play, 1-25 for the second, 1-34 for the third...): the numbering for each section should be encoded as n="1", n="2", etc. It is not necessary that each n= value be unique in the document.
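For separately numbered subsections, the boundary between the first play (ending on page 30) and the second (restarting at 1) might look like this sketch; the page content is a placeholder.

```xml
<!-- Fragment only. -->
<pb n="30"/>     <!-- last page of the first play -->
<mw type="pageNum">30</mw>
<!-- ... end of first play ... -->
<pb n="1"/>      <!-- first page of the second play: n="1" recurs, which is acceptable -->
<mw type="pageNum">1</mw>
<!-- ... start of second play ... -->
```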
