Sample TEI Documents

This is a small collection of sample TEI files gathered from a variety of real-world TEI projects. They cover a wide range of documents and encoding approaches, and may be useful both as a source of ideas for encoding specific features and as illustrations of how particular elements are used. When using these examples as models, bear in mind:

In Vitro Samples

These samples were encoded for training purposes by Melanie Chernyk, to demonstrate various features of TEI.

Proust, À la recherche du temps perdu (TEI Tite)

This sample includes the first volume of the first edition of Proust's À la recherche du temps perdu (Paris: Gallimard, 1919, under the imprint of the Éditions de la Nouvelle Revue Française). It contains Du Côté de Chez Swann, Part I - "Combray" and Part II - "Un Amour de Swann." The text is encoded with TEI Tite, a very simple and constrained TEI schema used for data capture by vendors. It was generously contributed by Jeff Drouin at the University of Illinois.

Women Writers Project

The Northeastern University Women Writers Project is a digital collection of early modern women's writing in English, specializing in detailed structural and content markup. Because no page images are provided, the markup also captures a fairly detailed representation of the appearance of the source text. The sample files included here represent several different genres, plus a taxonomy file containing a genre classification. The WWP uses a TEI customization that adds several WWP-specific elements (including <vuji>, <mw>, and <mcr>). These files also illustrate the use of XInclude, which pulls the taxonomy file (among other things) into the TEI header.
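
As a rough illustration of how XInclude can bring a taxonomy into a TEI header, the following minimal sketch resolves an include with Python's standard xml.etree.ElementInclude module. The file name taxonomy.xml, the category identifiers, the placement of the include inside <classDecl>, and the helper names load_in_memory and resolve_includes are all assumptions made for this example; they are not taken from the WWP sample files.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical taxonomy file content (names and categories are invented).
TAXONOMY_XML = f"""<taxonomy xmlns="{TEI_NS}" xml:id="genre">
  <category xml:id="g.drama"><catDesc>Drama</catDesc></category>
  <category xml:id="g.verse"><catDesc>Verse</catDesc></category>
</taxonomy>"""

# Hypothetical TEI document whose header pulls the taxonomy in via XInclude.
DOCUMENT_XML = f"""<TEI xmlns="{TEI_NS}" xmlns:xi="http://www.w3.org/2001/XInclude">
  <teiHeader>
    <encodingDesc>
      <classDecl>
        <xi:include href="taxonomy.xml"/>
      </classDecl>
    </encodingDesc>
  </teiHeader>
</TEI>"""


def load_in_memory(href, parse, encoding=None):
    """Stand-in loader: serve the taxonomy from a string instead of from disk."""
    if href == "taxonomy.xml" and parse == "xml":
        return ET.fromstring(TAXONOMY_XML)
    raise OSError(f"cannot load {href!r}")


def resolve_includes(xml_text):
    """Parse a document and replace each xi:include with the loaded element."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=load_in_memory)
    return root


if __name__ == "__main__":
    resolved = resolve_includes(DOCUMENT_XML)
    # After resolution the header contains the taxonomy's <category> elements.
    for category in resolved.iter(f"{{{TEI_NS}}}category"):
        print(category.get("{http://www.w3.org/XML/1998/namespace}id"))
```

With real files on disk, the default loader can be used instead of the in-memory stand-in, so that the href is read from the file system.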

The Swinburne Archive

The Swinburne Archive is a digital collection, or virtual archive, devoted to the life and work of Victorian poet Algernon Charles Swinburne. The files included here represent a fairly comprehensive picture of how XML can be used to represent a large and complex document, including a number of generated files that are used for visualizations of various kinds.

Chicago Foreign Language Press Survey

The Newberry Library has been digitizing the Chicago Foreign Language Press Survey, which was a project of the Illinois Works Projects Administration between 1936 and 1941. The Survey selected and translated articles published between 1861 and 1938 in Chicago newspapers, covering twenty-two linguistic and ethnic groups in the city.

ECCO, Charles Macklin's King Henry VII: or the popish impostor

These samples demonstrate the evolution of TEI data from an early SGML version, through a P5 XML version, to a final version to which morphosyntactic markup has been added automatically using Phil Burns' MorphAdorner tool.

Samples from the Humanities Computing Media Centre

These samples include one file from the Colonial Despatches Project and one from the Mariage Project.