Gap: general [159]


General notes on the use of <gap> to encode material omitted from transcription


The WWP uses <gap> to encode material omitted from the transcription, whether the omission is deliberate or results from damage to the text. The following is a list of the attributes of <gap> and the values the WWP allows for each.

1. reason=

Use only the following values:

"damaged": for cases where the page has been damaged in some way (torn, folded, creased)

"deleted": for cases where the text has been illegibly deleted (this implies an intentional deletion, not simply the presence of a bug)

"illegible": for cases where the page is intact but the original text is illegible for some reason other than intentional deletion (e.g. page is illegibly stained, letter is uninked, a bug was squashed, etc.)

"flawed-reproduction": for cases where the reproduction causes illegibility but we have reason to believe that the original is still legible (illegible gutters, edge cut off by xeroxing or filming, darkening which results in a black fog on the page (microfilm or xerox underexposure), an object superimposed on the original when filmed or xeroxed)

"excerpt": for cases where we are deliberately omitting text because our OT is an excerpt (as in the case of the Elizabeth I speeches)

"omitted": for features which our policy is to ignore (bookplates, embossing, modern handwriting, etc.)

"other": ONLY use this where none of the other values is appropriate. We will review the cases where “other” is used and add to the list above as necessary.
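
The list above lends itself to a mechanical check. The following is a minimal sketch in Python, assuming the transcription can be parsed as XML without namespaces; the function name and the sample fragment are hypothetical and are not part of any WWP tool. It collects reason= values that are not on the list and counts uses of "other" so that they can be reviewed.

```python
import xml.etree.ElementTree as ET

# The seven reason= values listed above.
ALLOWED_REASONS = {
    "damaged", "deleted", "illegible", "flawed-reproduction",
    "excerpt", "omitted", "other",
}

def check_gap_reasons(xml_text):
    """Return (bad, other_count) for the <gap> elements in xml_text.

    bad is a list of reason= values that are not on the allowed list
    (None when the attribute is missing); other_count is the number of
    gaps that use "other" and so need review.
    """
    root = ET.fromstring(xml_text)
    bad = []
    other_count = 0
    for gap in root.iter("gap"):
        reason = gap.get("reason")
        if reason not in ALLOWED_REASONS:
            bad.append(reason)
        elif reason == "other":
            other_count += 1
    return bad, other_count

# Hypothetical sample, not taken from any WWP text.
sample = (
    '<p>The <gap reason="illegible"/> of the <gap reason="torn"/> '
    'and <gap reason="other"/></p>'
)
bad, other_count = check_gap_reasons(sample)
```

Here "torn" is reported as a value that is not allowed (the list above would call for "damaged"), and the single use of "other" is counted for review.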

2. desc=

Use only the following values:

"printed": for text or images printed on the page (formerly “printed text”; texts that still use the old value need to be updated)

"handwriting": for handwriting

"attachment": for anything stuck, glued, stapled, or otherwise affixed to the page (bookplates, etc.)

"embossing": for anything added to the page by embossing or other forms of pressure (including pricking holes in the page)

"unknown": where we don’t know what’s omitted (e.g. if the page is torn away and we don’t know what was on it)
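
Since older texts may still carry the former value “printed text”, updating them can be sketched as a small normalization step. This is only an illustration in Python: the function and table names are hypothetical, and the only legacy value handled is the one mentioned above.

```python
# The five desc= values listed above.
ALLOWED_DESCS = {"printed", "handwriting", "attachment", "embossing", "unknown"}

# The only legacy value mentioned in these notes.
LEGACY_DESCS = {"printed text": "printed"}

def normalize_desc(value):
    """Map a legacy desc= value to its current form; raise if it is not allowed."""
    value = LEGACY_DESCS.get(value, value)
    if value not in ALLOWED_DESCS:
        raise ValueError(f"desc value not on the list: {value!r}")
    return value
```

A value such as "bookplate" raises an error rather than being guessed at; by the list above a bookplate would be an "attachment", but that decision is left to the encoder.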

3. extent=

Use only the following methods:

For omitted letters or words: indicate the number of letters or words omitted. If you can tell exactly how many are omitted, the value should be simply “3 letters” or “5 words”. If it is unclear exactly how many letters are omitted, the value should be “approximately 2 letters” or “approximately 6 words”. Use the surrounding text to judge the approximate extent of the omission. The omission should be indicated in letters if the omitted text is only part of a word; it should be indicated in words if the omitted text is several words.
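
The wording for letters and words is regular enough to generate or check mechanically. The sketch below, in Python, is an assumption-laden illustration rather than WWP tooling: the names are hypothetical, and the singular form for a count of one (“1 letter”) is an assumption, since the examples above show only plurals.

```python
import re

# Accepts "3 letters", "approximately 6 words", and so on.
# Assumption: the singular ("1 letter", "1 word") is also acceptable.
EXTENT_PATTERN = re.compile(r"^(approximately )?[1-9][0-9]* (letter|word)s?$")

def format_extent(count, unit, exact=True):
    """Build an extent= value for omitted letters or words.

    unit is "letter" or "word"; pass exact=False when the count is an
    estimate judged from the surrounding text.
    """
    if unit not in ("letter", "word"):
        raise ValueError("unit must be 'letter' or 'word'")
    if count < 1:
        raise ValueError("count must be at least 1")
    noun = unit if count == 1 else unit + "s"
    prefix = "" if exact else "approximately "
    return f"{prefix}{count} {noun}"
```

For example, format_extent(3, "letter") gives “3 letters” and format_extent(6, "word", exact=False) gives “approximately 6 words”. The pattern can be used to flag existing values that follow some other wording; it does not cover line counts or page ranges.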

For parts of pages (most often in cases of excerption): indicate the number of lines of text omitted, not counting blank lines. For multiple-column layouts, count the lines in the first column, then the lines in the second column, etc.; don’t bother with “1 whole column + 6 lines” or the like. Note that for excerpts where the omitted text includes at least one whole page plus part of the page where the excerpt begins or ends, there should be separate <gap> elements: one indicating the omitted lines on the partially transcribed page and others indicating the omitted whole pages surrounding the excerpt.

For multiple whole pages: indicate both the collation of the omitted pages and the page number range omitted. For details of the formula to use, see entries 196 and 197.
