<?xml version="1.0" encoding="utf-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title level="a">TEI - Keeping It Simple</title>
            <author>
               <name>Thomas Hansen</name>
               <address>
                  <addrLine>Society for Danish Language and Literature</addrLine>
                  <addrLine><ref target="mailto:th@dsl.dk">th@dsl.dk</ref></addrLine>
               </address>
            </author>
            <editor role="acceptingeditor">
               <name>Christine McWebb</name>
               <address>
                  <addrLine>University of Waterloo</addrLine>
               </address>
            </editor>
            <editor role="recommendingreader">
               <name>Julia Flanders</name>
               <address>
                  <addrLine>Brown University</addrLine>
               </address>
            </editor>
         </titleStmt>
         <publicationStmt>
            <publisher>Digital Medievalist, University of Lethbridge</publisher>
            <pubPlace>Lethbridge AB, Canada T1K 3M4</pubPlace>
            <availability>
               <p>© Thomas Hansen, 2011. Creative Commons Attribution-NonCommercial licence</p>
            </availability>
            <date n="received" when="2011-09-11">September 11, 2011</date>
            <date n="revised" when="2011-12-01">December 1, 2011</date>
            <date n="published" when="2012-02-07">February 7, 2012</date>
         </publicationStmt>
         <seriesStmt>
            <title>Digital Medievalist</title>
            <idno type="issue">7</idno>
            <idno type="date">2011</idno>
         </seriesStmt>
         <sourceDesc>
            <p>Born digital</p>
         </sourceDesc>
      </fileDesc>
      <encodingDesc>
         <projectDesc>
            <p>Article from Digital Medievalist Journal (URL: <ref
                  target="http://www.digitalmedievalist.org/"/>)</p>
         </projectDesc>
         <refsDecl>
            <p>Citations from the text of this article should be by paragraph number (given in the
               xml:id attribute of each p element).</p>
         </refsDecl>
      </encodingDesc>
      <profileDesc>
         <creation/>
         <langUsage>
            <language ident="en-GB">en-GB</language>
         </langUsage>
         <textClass>
            <keywords scheme="DM">
               <term type="DMType">article</term>
               <term type="keyword">Descriptive Markup Portability</term>
               <term type="keyword">Manuscript Description</term>
               <term type="keyword">Sustainability</term>
               <term type="keyword">Text Encoding Initiative (TEI)</term>
               <term type="keyword">eXtensible Markup Language (XML)</term>
            </keywords>
         </textClass>
      </profileDesc>
      <revisionDesc>
         <change who="http://www.digitalmedievalist.org/about/#cf"><date when="2011-12-01"/>cf
            initial encoding</change>
      </revisionDesc>
   </teiHeader>
   <text>
      <front>
         <argument n="abstract">
            <p>This article discusses the reasons for implementing TEI P5 in a Danish publishing
               project. We argue that while the standard performs well as a sustainable storage and
               interchange format, it is generally too complicated to operate efficiently. We show
               how to cope with this difficulty by introducing a template that makes the daily work
               easier.</p>
         </argument>
      </front>
      <body>
         <div>
            <head>Introduction</head>
            <p xml:id="hansen.p0001">Alvin Toffler, in his book <emph>The Third Wave</emph> (<ref
                  target="#Toffler1980">1980</ref>), described how civilization had evolved through
               what he characterized as three waves sweeping through history. The First Wave brought
               an end to the nomadic lifestyle of hunting and gathering and marked the beginning of
               the <soCalled>Agricultural Age,</soCalled> in which people settled in communities
               supported by agriculture and animal husbandry. The Agricultural Age lasted roughly
               from 8000 BC to 1650-1750, when in Europe another wave started rolling. This Second
               Wave was the beginning of the <soCalled>Industrial Age,</soCalled> in which the
               development of machines enabled mass production of everything from consumer goods to
               communications. Then again, around 1980, a Third Wave was gathering momentum in the
               US. This time, change was driven by information-processing machines introducing yet
               another significant increase in productivity. The age of the Third Wave was the
                  <soCalled>Information Age,</soCalled> in which we now live.</p>
            <p xml:id="hansen.p0002">During the First Wave, people mostly produced for self-use, and
               thus the economy depended on access to land and labor. During the Second Wave, machines
               and technology introduced an increase in productivity, making the production of
               energy and know-how crucial parameters. In the Information Age, a state of disruption
               has reigned: As Information and Communication Technologies allow know-how, which was
               previously bound to specialists, to be captured, stored, and shared as needed, people
               are able to produce customized goods and services for themselves. This increase in
               productivity causes entire Second Wave markets of standardized, mass-produced goods
               to fall apart, and to give way to a multitude of standards and forms of dissemination
                  (<ref target="#Toffler1980">Toffler 1980</ref>, 196-204; <ref
                  target="#SimonsBlack2009">Simons and Black 2009</ref>).</p>
            <p xml:id="hansen.p0003">Despite the inherently shareable nature of digital data,
               keeping them in a closed circuit of proprietary formats and tools proves
               counter-productive; especially if information is supposed to transcend computer
               environments, domains of application, and the passage of time. In this paper, we
               shall discuss how the commitment to creating portable data has been met as the
               technology of descriptive markup has matured. Section 1 describes the introduction of
               SGML at the Diplomatarium Danicum as a measure to prevent lock-in by word-processor
               formats. At this stage, markup is produced purely for self-use, first in print, then
               later in a Web publication. In Section 2, we consider how implementing the Text
               Encoding Initiative (TEI) as a document format might enable higher productivity and
               better sharing of data. Section 3 then outlines a general
               approach to the implementation process. While simplifying the application format
               might yield a product which is consistent and easy to manage, the outcome might not
               prove equally easy for encoders to use. To address this issue, Section 4 describes
               how markup routines can be rationalized by a template that may be transformed into a
               more richly structured TEI document. Section 5 follows with some closing remarks on
               the possible impact of using well-defined markup formats.</p>
         </div>
         <div>
            <head>Producing for self-use</head>
            <p xml:id="hansen.p0004">As part of The Society for Danish Language and Literature, DSL,
                  <ref target="http://diplomatarium.dk/">Diplomatarium Danicum</ref> publishes
               critical editions of medieval legal documents. In the late 1990s, problems with
               word-processor formats and their lack of semantic and pragmatic coding were starting
               to prevent information from flowing even to the nearest printer. In an attempt to
               address this portability issue, DSL introduced SGML as the storage format at
               Diplomatarium Danicum, and, to make for a gentle transition, a document type was modeled on the
               legacy print publication. The DTD defined a mere 30 element types, most of which were
               free-form text fields. For instance, a manuscript is described within a MANUSCRIPT
               wrapper subordinating two elements: first an ID (identifier) element with a siglum
               value, then an INF (information) element, in which all information relating to the
               text-witness is recorded. Similarly, a single PUBLICATIONS element recorded all
               bibliographic information about the manuscript. The SGML files were converted into
               print and proofread before being typeset.</p>
            <p xml:id="hansen.p0005">At first, most markup work was done by assistants, but,
               generally, the SGML revolution was a quiet one, and soon everybody had adopted the
               SGML <emph>modus operandi</emph>. As part of a community that places a premium on
               flexibility and integrity, the methodology of defining one's own categorization
               scheme added a sense of continuity, and, of course, it was also free and required
               little more than a text editor to operate.</p>
            <p xml:id="hansen.p0006">Shortly thereafter, influenced by the internet and the uptake
               of the growing tool chest of XML technologies, DSL saw an opportunity for a
               rationalization even more in keeping with the Third Wave promise of efficiency:
               information captured digitally should also be distributed digitally, and at a
               fraction of the usual cost. So, after a total of approximately 15,000 texts, the
               print edition was discontinued and succeeded by a Web publication in 2001. Although
               SGML gave way to XML, descriptive markup was still produced for self-use, primarily to
               let editors continue doing what they were used to, and only to be transformed into
               HTML in a customized web application.</p>
            <p xml:id="hansen.p0007">In addition to the advantage of being stored in a scalable,
               text-based format, the material is syntactically coherent and examples of tag abuse
               are rare. However, in terms of data longevity, the transition from word processing to
               the XML work chain had only replaced one closed circuit with another one that was
               less closed. The fact that the format is largely undocumented, completely
               idiosyncratic, and too coarsely structured to lend itself well to format conversion
               and query, means it is difficult to maintain. Moreover, whether the end result was a
               print or online publication, it was still a one-way road from data capture to
               exposure, and it was obvious that the result was not taking advantage of everything
               the technology had to offer.</p>
         </div>
         <div rend="P12">
            <head>Introducing TEI</head>
            <p xml:id="hansen.p0008">In 2007, TEI released version P5 (<ref
                  target="#BurnardBauman2007">Burnard and Bauman 2007</ref>) of a standard that had
               been in development since 1990. From the perspective of the Diplomatarium, a
               significant improvement was the incorporation of the manuscript description module.
               At the same time, a three-year grant from the Carlsberg Foundation had enabled the
               development of a repository capable of holding all the documents published under the
               project. Initially, the repository was supposed to support the ongoing work on some
               8,500 documents from the period 1413-1450. Then, as soon as a common format had been
               established, the old one was to be deprecated and the existing 3,000 XML documents
               from the period 1401-1412 were to be converted and incorporated into the holdings. A
               digitization of the 15,000 printed documents, however, was not part of the plan. The
               first deliverables were two technical reports presenting a way of expressing the
               features of interest in the TEI header and text markup (<ref target="#Hansen2010a"
                  >Hansen 2010a</ref>, <ref target="#Hansen2010b">2010b</ref>). While the technical
               reports explore the details of the implementation, here we assess the strategic
               reasons behind it. Unlike the deprecated XML application, TEI has the advantage
               of being viable outside the project and of offering better possibilities for
               multi-purpose content. In other words, TEI provides a format which on the one hand is
               general and popular, and, on the other, articulate and flexible.</p>
            <div>
               <head>General and popular</head>
               <p xml:id="hansen.p0009">With the prospect of having to manage some 25,000 documents,
                  the main motivation for defining a general document format is to establish a joint
                  basis for tools and procedures operating on the material. Not having to configure
                  software for multiple formats should, <emph>ceteris paribus</emph>, minimize the
                  need for one-off integration and make the development and maintenance of tools and
                  procedures less error-prone. However, more significantly, since data must also be
                  exchangeable as static documents, and are no longer produced with the sole purpose
                  of being consumed by custom-fit applications, a popular, well-documented format is
                  an advantage.</p>
               <p xml:id="hansen.p0010">In considering possible use cases for such documents, an
                  exchange could take place in-house with other DSL projects. For instance, the
                  documents could be processed and used in language corpora for corpus-based
                  dictionaries; something that would expose the material to fields like linguistics
                  and language history. But also, given the high costs of creating digital
                  resources, a lot of the money that used to be spent on digitization and
                  publication is now diverted into preservation purposes instead. Indeed, the plans
                  for large centralized repositories, <emph>research infrastructures</emph> like
                  CLARIN (<ptr target="http://www.clarin.eu/external/"/>), bear testament to an
                  incipient specialization between projects that produce data and organizations that
                  preserve them. With potential consumers such as research infrastructures entering the
                  field, a market for text resources emerges; and since markets – understood here
                  simply as switchboards of goods – are likely to turn to standards for quality
                  assessment, well-documented formats like TEI are the ones major players like
                  CLARIN are willing to adopt.</p>
               <p xml:id="hansen.p0011">On the other hand, because particular extensions to the
                  standard are less likely to be universally deployed, we have exercised some
                  self-constraint not to deviate from the standard. While this is mainly to minimize
                  the risk of rendering any effort obsolete, it also reflects a modest hope that
                  such larger concentrations of standard-conformant material might stimulate further
                  development of tools and techniques.</p>
            </div>
            <div>
               <head>Articulate and flexible</head>
               <p xml:id="hansen.p0012">If the elements in the running text are supposed to let us
                  infer the meaning of passages in the marked up document (as suggested in <ref
                     target="#Sperberg-McQueenHuitfeldtRenear2000">Sperberg-McQueen, Huitfeldt, and
                     Renear 2000</ref>) then, of course, the element types have to fit the content;
                  otherwise, tag abuse and communication breakdown might occur. The fact that TEI is
                  developed to mark up features of <emph>any</emph> written artefact (<ref
                     target="#Lavagnino2006">Lavagnino 2006</ref>) means that its terminology is
                  general enough to allow common features of an otherwise heterogeneous document
                  material to be expressed in standard terms. With the incorporation of the
                  so-called manuscript module in the P5 version of the standard in 2007 (<ref
                     target="#Driscoll2006">Driscoll 2006</ref>), it has become particularly useful
                  for detailed annotation of just the kind of material (European medieval
                  manuscripts) with which the Diplomatarium Danicum deals.</p>
               <p xml:id="hansen.p0013">Besides reflecting the breadth and depth of coverage, TEI's
                  comprehensive schema provides enough structure to facilitate the level of
                  processing we want. The expressive power of XML's hierarchical content model
                  allows many processing details to be derived from an element's place in the
                  document hierarchy, and the ability to precisely address parts of documents by
                  means of path expressions adds robust handles to the text. This is the reason why
                  we have opted for a much more structured and granular markup approach than in the
                  deprecated model.</p>
               <p xml:id="hansen.p0014">Finally, in terms of flexibility, TEI is designed for a wide
                  range of implementations; a wealth of information may be given in more or less
                  fine-grained and structured ways. However, since a schema is only fully functional
                  if it helps avoid compromising the product with format inconsistencies (e.g.
                  dates, language codes appearing in different shapes) and structural
                  irregularities, some important customization details are expected to be settled on
                  the implementation level. Basically, this boils down to a question of picking out
                  the elements and attributes needed, and deciding how these should be
                  populated.</p>
            </div>
         </div>
         <div rend="P12">
            <head>Applying the TEI</head>
            <p xml:id="hansen.p0015">Although concise schemas like TEI Lite (Burnard and
               Sperberg-McQueen 2006) have been widely adopted, the Diplomatarium draws upon
               features not included in it. Since TEI Lite also offers features
               that are not needed, a functional schema is best established either by adding to
               a subset like tei_bare, or by stripping away from the entire tei_all schema.
               Either way, since we are aiming for portability, the modification should comply with
               TEI's conformance criteria (<ref target="#BurnardBauman2007">Burnard and Bauman
                  2007</ref>, ch. 23.3). According to these, documents should:</p>
            <list type="ordered">
               <item>be well-formed; </item>
               <item>validate against the tei_all schema; </item>
               <item>use the definitions in the Guidelines; </item>
               <item>contain only elements in the TEI namespace: http://www.tei-c.org/ns/1.0,
                  and</item>
               <item>have a schema derived from an ODD (One Document Does it all) file.</item>
            </list>
            <p xml:id="hansen.p0016">A good reason for using the ODD format (<ref
                  target="#BurnardBauman2007">Burnard and Bauman 2007</ref>, ch. 22) is to have a
               transparent way of documenting the customization with respect to the unmodified
               starting point; something enabling others to know which of the 500+ elements of the
               tei_all schema are in use, and whether this usage accords with the Guidelines. At the
               same time, the ODD is a source for generating different types of schemas and
               documentation by means of designated tools. But, more significantly, since we are
               committed to working within the TEI framework and not complicating it with
               extensions, we regard the implementation of TEI as a process of simplification.</p>
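            <p>As a rough orientation, a customization of this kind might start from a schema
               specification like the sketch below. The identifier "dd_sketch" and the choice of
               modules are assumptions made for illustration only; they do not reproduce the
               project's actual ODD: <quote>
                  <ab type="code">&lt;schemaSpec ident="dd_sketch" start="TEI"&gt;<lb/>
                     &lt;!-- the four modules a TEI customization normally starts from --&gt;<lb/>
                     &lt;moduleRef key="tei"/&gt;<lb/> &lt;moduleRef key="header"/&gt;<lb/>
                     &lt;moduleRef key="core"/&gt;<lb/> &lt;moduleRef key="textstructure"/&gt;<lb/>
                     &lt;!-- manuscript description, added on top --&gt;<lb/> &lt;moduleRef
                     key="msdescription"/&gt;<lb/> &lt;/schemaSpec&gt;</ab></quote></p>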
            <p xml:id="hansen.p0017">The first step is an elimination of the tei_all elements not
               needed. Using ODD to build a schema accepting elements from the msdescription
               (manuscript description) module, we use an empty moduleRef element with the key
               attribute set to "msdescription", and add the list of the elements <emph>not</emph>
               needed as values of the except attribute: <quote>
                  <ab type="code">&lt;moduleRef key="msdescription" except="accMat acquisition
                     adminInfo altIdentifier binding bindingDesc catchwords collation collection
                     colophon custEvent custodialHist depth explicit finalRubric foliation heraldry
                     incipit institution locus locusGrp msPart musicNotation objectType origDate
                     origPlace origin provenance recordHist rubric scriptDesc secFol signatures
                     source stamp surrogates textLang typeDesc watermark"/&gt;</ab></quote></p>
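            <p>Where only a handful of elements is wanted, the same kind of selection could
               presumably be stated positively with the include attribute of moduleRef instead of
               except. The element list in the sketch below is an illustrative subset chosen for
               this example, not the complement of the list above: <quote>
                  <ab type="code">&lt;moduleRef key="msdescription" include="msDesc msIdentifier
                     physDesc objectDesc supportDesc support dimensions height width sealDesc
                     seal"/&gt;</ab></quote></p>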
            <p xml:id="hansen.p0018">Having carved out a block of elements by going over the
               relevant modules as sketched above, we focus on the remaining elements of the
               application; each one can be re-declared by an elementSpec (element specification)
               with ident (identifier), module, and mode attributes. Without going into details, we will
               concentrate on simplifying the content model in the content element, and the list of
               attributes in the attList: <quote>
                  <ab type="code">&lt;elementSpec ident="dimensions" module="msdescription"
                     mode="change"&gt;<lb/> &lt;content&gt; … &lt;/content&gt;<lb/> &lt;attList&gt;
                     … &lt;/attList&gt;<lb/> &lt;/elementSpec&gt;</ab></quote></p>

            <p xml:id="hansen.p0019">For instance, in the unmodified schema, the content model of
               the dimensions element allows child elements to be omitted altogether, or the
               element to be filled with an unlimited number of either dim elements or the elements
               height, depth, and width constituting the model.dimLike class. Expressed in ODD as a
               RELAX NG pattern, this rather wide range of possibilities looks like this:<quote>
                  <ab type="code">&lt;content&gt;<lb/> &lt;rng:group&gt;<lb/>
                     &lt;rng:zeroOrMore&gt;<lb/> &lt;rng:choice&gt;<lb/> &lt;rng:ref name="<ref
                        target="http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-dim.html"
                        >dim</ref>"/&gt;<lb/> &lt;rng:ref name="<ref
                        target="http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-model.dimLike.html"
                        >model.dimLike</ref>"/&gt;<lb/> &lt;/rng:choice&gt;<lb/>
                     &lt;/rng:zeroOrMore&gt;<lb/> &lt;/rng:group&gt;<lb/>
                  &lt;/content&gt;</ab></quote></p>
            <p xml:id="hansen.p0020">The model we are looking for, however, is one that requires the
               operator to provide exactly one height and one width element every time. So instead
               we may write: <quote>
                  <ab type="code">&lt;content&gt;<lb/> &lt;rng:ref name="height"/&gt;<lb/>
                     &lt;rng:ref name="width"/&gt;<lb/> &lt;/content&gt;</ab>
               </quote> This modification is <soCalled>clean,</soCalled> because documents
               validating against the modified schema also validate against the unmodified tei_all
               schema from which we started.</p>
            <p xml:id="hansen.p0021">A look at the attributes of the dimensions element suggests
               further simplifications. Originally, the dimensions element ships with 27 attributes,
               but, for our purpose, only the unit attribute is necessary; and so deleting the rest
               should minimize the risk of their being used by mistake. Each unwanted attribute is therefore re-declared
               in an attDef element with the attribute name as the value of the ident (identifier)
               attribute, and the mode of the change set to "delete":<quote>
                  <ab type="code">&lt;attList&gt;<lb/> &lt;attDef ident="type"
                     mode="delete"/&gt;<lb/> &lt;attDef ident="quantity" mode="delete"/&gt;<lb/>
                     &lt;attDef ident="extent" mode="delete"/&gt;<lb/> …</ab>
               </quote></p>
            <p xml:id="hansen.p0022">A portable data standard is not only characterized by an agreed
               set of elements and attributes, but also by an agreed set of permissible values for
               these. A document exchanged between two parties would not be mutually understandable
               if either or both parties used internal coding schemes to populate elements. Having
               deleted all but one of 27 attributes, we not only want the unit attribute to be
               required, but also to make "cm" (centimetres) the only applicable value of it. This
               is done by providing a list of values containing the value item identified by the
               string "cm" as a replacement for any other value list:<quote>
                  <ab type="code">&lt;attDef ident="unit" mode="change" usage="req"&gt;<lb/>
                     &lt;valList type="closed" mode="replace"&gt;<lb/> &lt;valItem
                     ident="cm"&gt;<lb/> &lt;desc&gt;centimetres&lt;/desc&gt;<lb/>
                     &lt;/valItem&gt;<lb/> &lt;/valList&gt;<lb/> &lt;/attDef&gt;</ab>
               </quote></p>
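            <p>Read together, the modified content model and attribute list would admit an instance
               like the first one in the sketch below and reject the second, which lacks a width
               element and uses a unit outside the closed value list. The measurements are invented
               for illustration: <quote>
                  <ab type="code">&lt;!-- valid: exactly one height and one width, unit="cm" --&gt;<lb/>
                     &lt;dimensions unit="cm"&gt;<lb/> &lt;height&gt;21.5&lt;/height&gt;<lb/>
                     &lt;width&gt;30&lt;/width&gt;<lb/> &lt;/dimensions&gt;<lb/> &lt;!-- invalid:
                     width is missing and "mm" is not in the value list --&gt;<lb/> &lt;dimensions
                     unit="mm"&gt;<lb/> &lt;height&gt;215&lt;/height&gt;<lb/>
                     &lt;/dimensions&gt;</ab></quote></p>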
            <p xml:id="hansen.p0023">Where we wish to constrain element values—say, have "Danish
               Society for Language and Literature" as the only valid content of the publisher
               element—we write a pattern stating it: <quote>
                  <ab type="code">&lt;elementSpec ident="publisher" module="core"
                     mode="change"&gt;<lb/> &lt;content&gt;<lb/> &lt;rng:value&gt;Danish Society for
                     Language and Literature&lt;/rng:value&gt;<lb/> &lt;/content&gt;<lb/>
                     &lt;/elementSpec&gt;</ab>
               </quote></p>
            <p xml:id="hansen.p0024">The ability to translate business rules, such as to
                  <emph>always</emph> measure the dimensions of a document and <emph>always</emph>
               in centimetres, into required elements and fixed values improves the consistency of
               the end product. Of course, since we are dealing with many different kinds of
               information, far from all values can be fixed this way. However, when it comes to
               dealing with missing information, a general coding scheme seems to make sense. For
               example, if a manuscript has been issued without seals, then according to the
               original schema the seal description (sealDesc) element may be omitted. But such
               omissions could also mean that the information is missing because it is irrelevant,
               because it is still undetermined, or simply left out by mistake. Having already been
               made mandatory, such elements are also assigned a set of values to help clarify this
               particular issue:</p>
            <table rend="frame">
               <row role="label">
                  <cell>Value</cell>
                  <cell>Type</cell>
                  <cell>Meaning</cell>
               </row>
               <row>
                  <cell>Empty</cell>
                  <cell>strings</cell>
                  <cell>Information does not exist</cell>
               </row>
               <row>
                  <cell>0</cell>
                  <cell>numbers</cell>
                  <cell>Information does not exist</cell>
               </row>
               <row>
                  <cell>1000</cell>
                  <cell>dates</cell>
                  <cell>Information does not exist</cell>
               </row>
               <row>
                  <cell>Nil</cell>
                  <cell>strings</cell>
                  <cell>Information is undetermined</cell>
               </row>
               <row>
                  <cell>99999999</cell>
                  <cell>numbers/dates</cell>
                  <cell>Information is undetermined</cell>
               </row>
            </table>
            <p xml:id="hansen.p0025">These values are applicable almost everywhere information is to
               be recorded. "Almost," since one of the few places where TEI has declared a closed
               set of permissible attribute values is the cert attribute, which may only be
               populated with "high," "medium," or "low". Had our values been added there, the
               modification would have been an extension. There are also compromises. In order to be
               able to state non-existent date information we have chosen "1000", because W3C date
               data types such as xs:date do not accept "0".</p>
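            <p>As a sketch of how the table might be read in practice, the two fragments below show
               a description whose measurements and seals have not yet been determined. Placing
               "Nil" inside a p element within sealDesc is an assumption made for this
               illustration; the project's actual encoding may differ: <quote>
                  <ab type="code">&lt;!-- numbers: 99999999 = information is undetermined --&gt;<lb/>
                     &lt;dimensions unit="cm"&gt;<lb/> &lt;height&gt;99999999&lt;/height&gt;<lb/>
                     &lt;width&gt;99999999&lt;/width&gt;<lb/> &lt;/dimensions&gt;<lb/> &lt;!--
                     strings: Nil = information is undetermined --&gt;<lb/>
                     &lt;sealDesc&gt;&lt;p&gt;Nil&lt;/p&gt;&lt;/sealDesc&gt;</ab></quote></p>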
         </div>
         <div rend="P12">
            <head>TEI by proxy</head>
            <p xml:id="hansen.p0026">Although modifications might yield more functional schemas with
               better guidance for the encoder, the real benefits seem to accrue to those managing
               the data. In order to make documents more manageable, they have been made
               structurally homogeneous, but at the same time quite verbose. Indeed, despite the
               help offered by schema and tools, instance documents with line upon line of deeply
               nested elements are not necessarily easy for encoders to use.</p>
            <p xml:id="hansen.p0027">To make data entry easier we devised a template implemented as
               an XML Schema Document, from which TEI documents are derived using an XSLT
               stylesheet. Appreciating the sense of continuity that a fully self-invented
               application can provide, the architectural principle has been to respect the existing
               workflow of the project. Compared to the TEI document, the template is "flat" and
               comprehensible, and it comes with default values "nil" and "99999999" declaring the
               information undetermined. While working, the editor resolves whether the features are
               present or irrelevant in the particular situation.</p>
            <p xml:id="hansen.p0028">The relation between template and TEI documents is not
               one-to-one. Where tasks can be automated and spare editors from unnecessary typing,
               this is done by the stylesheet. For instance, to keep texts and translations
               parallel, the numbers of words and paragraphs are computed, as in the TEI snippet below: <quote>
                  <ab type="code">&lt;extent&gt;Base text, number of words: &lt;num
                     n="words"&gt;535&lt;/num&gt;, paragraphs: &lt;num
                     n="paragraphs"&gt;23&lt;/num&gt;.<lb/> Translation, number of words: &lt;num
                     n="words"&gt;592&lt;/num&gt;,<lb/> paragraphs: &lt;num
                     n="paragraphs"&gt;23&lt;/num&gt;<lb/> &lt;/extent&gt;</ab></quote></p>
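            <p>The counting step can be sketched outside XSLT as well. The Python fragment below is
               only an illustration under stated assumptions: the function names are hypothetical,
               words are split on whitespace (the project's stylesheet may tokenize differently),
               and the sample paragraphs are made up:<quote>
                  <ab type="code"><![CDATA[
```python
def extent_counts(paragraphs):
    """Return (word count, paragraph count) for a list of paragraph strings."""
    # Assumption: a word is any whitespace-delimited token.
    words = sum(len(paragraph.split()) for paragraph in paragraphs)
    return words, len(paragraphs)


def extent_snippet(base, translation):
    """Format both sets of counts as an <extent> string in the style shown above."""
    parts = []
    for label, paragraphs in (("Base text", base), ("Translation", translation)):
        words, count = extent_counts(paragraphs)
        parts.append(
            f'{label}, number of words: <num n="words">{words}</num>, '
            f'paragraphs: <num n="paragraphs">{count}</num>'
        )
    return "<extent>" + ". ".join(parts) + "</extent>"


# Made-up sample paragraphs, not project data.
base = ["alpha beta gamma", "delta epsilon"]
translation = ["one two three four", "five six"]
print(extent_snippet(base, translation))
```
]]></ab></quote> Equal paragraph counts for base text and translation are a quick
               sign that the two have stayed parallel.</p>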
            <p xml:id="hansen.p0029">When a template has been proof-read and reviewed, it is
               discarded; after all, template documents are meant to live and die with the project,
               whereas the TEI product is intended for a multitude of purposes.</p>
            <p xml:id="hansen.p0030">Stored in a repository, the TEI documents may be enriched with
               new markup by applying either automatic routines or manual procedures. For instance,
               when the texts have been established, incipits could be automatically generated and
               placed in the TEI header, while place and personal names could be recorded in
               semi-automatic procedures using XQuery. This ability to segment work is a strategic
               advantage at a time when long-term project funding is becoming increasingly difficult to
               obtain.</p>
            <p xml:id="hansen.p0031">To give an idea of what information is recorded by the project,
               we will walk through the template explaining how it maps to TEI P5.</p>
            <div>
               <head>editorInitials</head>
               <p xml:id="hansen.p0032">The editor begins by identifying himself, choosing his
                  initials from a closed set of values. Currently there are six editors and six
                  values. So, for example, the snippet <code>&lt;editorInitials&gt; mh
                     &lt;/editorInitials&gt;</code> transforms into more structured TEI syntax:<quote>
                     <ab type="code">&lt;editor&gt; &lt;name xml:id="mh"&gt;<lb/> &lt;forename
                        type="first"&gt;Markus&lt;/forename&gt;<lb/>
                        &lt;surname&gt;Hedemann&lt;/surname&gt;<lb/> &lt;/name&gt;<lb/>
                        &lt;/editor&gt;</ab></quote></p>
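               <p>The expansion of initials into a full editor element amounts to a lookup in a
                  closed table. The following Python sketch is not the project's stylesheet; the
                  table and function names are hypothetical, and only the entry for "mh" is taken
                  from the example above:<quote>
                     <ab type="code"><![CDATA[
```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical lookup table; only "mh" comes from the article's example.
EDITORS = {"mh": ("Markus", "Hedemann")}


def expand_editor(template_xml):
    """Turn <editorInitials> into a TEI-style <editor> element."""
    initials = ET.fromstring(template_xml).text.strip()
    if initials not in EDITORS:
        # The set of values is closed, so unknown initials are an error.
        raise ValueError(f"unknown editor initials: {initials!r}")
    forename, surname = EDITORS[initials]
    editor = ET.Element("editor")
    name = ET.SubElement(editor, "name", {XML_ID: initials})
    ET.SubElement(name, "forename", {"type": "first"}).text = forename
    ET.SubElement(name, "surname").text = surname
    return editor


editor = expand_editor("<editorInitials> mh </editorInitials>")
print(ET.tostring(editor, encoding="unicode"))
```
]]></ab></quote></p>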
            </div>
            <div>
               <head>textId</head>
               <p xml:id="hansen.p0033">To identify the text, the editor then fills in a number
                  following the pattern <emph>yyyymmddxyz</emph>. The template
                     <code>&lt;textId&gt;14201127001&lt;/textId&gt;</code> yields TEI <code>&lt;idno
                     type="dd"&gt; 14201127001 &lt;/idno&gt;</code>.</p>
            </div>
            <div>
               <head>revision</head>
               <p xml:id="hansen.p0034">In order to be able to track the status of a document, the
                  editor enters initials and date in a log registering four stages: first when the
                  document is established; then during editing at three proof-reading stages, i.e.
                  proofFirst, proofSecond, and proofThird. In this template snippet, the document
                  has been established and proof-read once:<quote>
                     <ab type="code">&lt;revision&gt;<lb/> &lt;established who="#mh"
                        when="2010-06-02"/&gt;<lb/> &lt;proofFirst who="#jon"
                        when="2010-10-10"/&gt;<lb/> &lt;proofSecond who="#nil"
                        when="99999999"/&gt;<lb/> &lt;proofThird who="#nil"
                        when="99999999"/&gt;<lb/> &lt;/revision&gt;…</ab></quote></p>
               <p xml:id="hansen.p0035">The stylesheet renders this in more human-readable TEI:<quote>
                     <ab type="code">&lt;revisionDesc&gt;<lb/> &lt;change when="2010-06-02"
                        who="#mh"&gt;Document established by Markus Hedemann, June 2,
                        2010&lt;/change&gt;<lb/> &lt;change when="2010-10-10" who="#jon"&gt;Proof
                        read once by Jonathan Adams, October 10, 2010&lt;/change&gt;<lb/> &lt;change
                        when="99999999" who="#nil"&gt;nil&lt;/change&gt;<lb/> &lt;change
                        when="99999999" who="#nil"&gt;nil&lt;/change&gt;<lb/>
                        &lt;/revisionDesc&gt;</ab></quote></p>
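               <p>How a log entry might be turned into a change element can be sketched as
                  follows. This is an illustration in Python rather than the project's XSLT; the
                  names and phrases are copied from the TEI output above, while the function name
                  and the fallback behaviour for unlisted stages are assumptions:<quote>
                     <ab type="code"><![CDATA[
```python
import xml.etree.ElementTree as ET
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
NAMES = {"#mh": "Markus Hedemann", "#jon": "Jonathan Adams"}
PHRASES = {"established": "Document established", "proofFirst": "Proof read once"}
UNDETERMINED = "99999999"  # template default for "not yet determined"


def revision_desc(revision):
    """Build <revisionDesc> with one <change> per stage of the template log."""
    desc = ET.Element("revisionDesc")
    for stage in revision:
        who, when = stage.get("who"), stage.get("when")
        change = ET.SubElement(desc, "change", {"when": when, "who": who})
        if when == UNDETERMINED or who == "#nil":
            change.text = "nil"  # stage not reached yet
            continue
        day = date.fromisoformat(when)
        # Assumption: stages or editors missing from the tables fall back to raw values.
        change.text = (
            f"{PHRASES.get(stage.tag, stage.tag)} by {NAMES.get(who, who)}, "
            f"{MONTHS[day.month - 1]} {day.day}, {day.year}"
        )
    return desc


TEMPLATE = """<revision>
  <established who="#mh" when="2010-06-02"/>
  <proofFirst who="#jon" when="2010-10-10"/>
  <proofSecond who="#nil" when="99999999"/>
</revision>"""

desc = revision_desc(ET.fromstring(TEMPLATE))
for change in desc:
    print(change.text)
```
]]></ab></quote></p>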
            </div>
            <div>
               <head>textCreationTimeEarliest, textCreationTimeLatest</head>
               <p xml:id="hansen.p0036">Having described the file itself, the editor turns to an
                  account of the circumstances surrounding the issuing of the original document.
                  This starts with a <emph>terminus post quem</emph> (the earliest possible date) and a <emph>terminus ante
                     quem</emph> (the latest possible date), which in datable manuscripts are often the same value. These are
                  defined by means of the XML Schema built-in datatype xs:date, which accepts a
                  pattern such as <emph>yyyy-mm-dd</emph>:<quote>
                     <ab type="code"
                        >&lt;textCreationTimeEarliest&gt;1420-11-27&lt;/textCreationTimeEarliest&gt;<lb/>
                        &lt;textCreationTimeLatest&gt;1420-11-27&lt;/textCreationTimeLatest&gt;</ab></quote></p>
            </div>
            <div>
               <head>textCreationTimeCertainty</head>
               <p xml:id="hansen.p0037">Depending on whether the date of issuing appears explicitly,
                  textCreationTimeCertainty is filled with one of two values: "high", indicating
                  that the information can be read from the text-witness, or "low", stating that the
                  information cannot be read, but has been established by other criteria:<quote>
                     <ab type="code"
                        >&lt;textCreationTimeCertainty&gt;high&lt;/textCreationTimeCertainty&gt;</ab></quote>
                  Stating levels of certainty is a way of meeting the processing expectation of
                  uncertain dates being rendered in square brackets.</p>
            </div>
            <div>
               <head>textCreationPlace, textCreationPlaceCertainty</head>
               <p xml:id="hansen.p0038">Similar to the account of the document date, the place and
                  the certainty must also be given if possible. This is done in the
                  textCreationPlace and textCreationPlaceCertainty elements. Unlike
                  textCreationTimeCertainty, the textCreationPlaceCertainty element can be "switched
                  off" by means of the empty value. A template snippet containing the previous five elements:<quote>
                     <ab type="code"
                        >&lt;textCreationTimeEarliest&gt;1420-11-27&lt;/textCreationTimeEarliest&gt;<lb/>
                        &lt;textCreationTimeLatest&gt;1420-11-27&lt;/textCreationTimeLatest&gt;<lb/>
                        &lt;textCreationTimeCertainty&gt;high&lt;/textCreationTimeCertainty&gt;<lb/>
                        &lt;textCreationPlace&gt;Roskilde&lt;/textCreationPlace&gt;<lb/>
                        &lt;textCreationPlaceCertainty&gt;high&lt;/textCreationPlaceCertainty&gt;</ab></quote>
                  transforms into the following TEI structure: <quote>
                     <ab type="code">&lt;creation&gt;<lb/> &lt;date notBefore="1420-11-27"<lb/>
                        notAfter="1420-11-27"<lb/> cert="high"&gt;1420, 27
                        November&lt;/date&gt;<lb/> &lt;placeName
                        cert="high"&gt;Roskilde&lt;/placeName&gt;<lb/>
                     &lt;/creation&gt;</ab></quote></p>
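               <p>The date mapping, together with the square-bracket convention for uncertain
                  dates, might look roughly like this in Python. The sketch uses the TEI P5
                  attribute names notBefore and notAfter; the function names and the way a date
                  range is displayed are assumptions made for illustration:<quote>
                     <ab type="code"><![CDATA[
```python
import xml.etree.ElementTree as ET
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def creation_date(earliest, latest, certainty):
    """Build a TEI-style <date> from the three template values."""
    first, last = date.fromisoformat(earliest), date.fromisoformat(latest)
    if first > last:
        raise ValueError("earliest date is later than latest date")
    element = ET.Element(
        "date", {"notBefore": earliest, "notAfter": latest, "cert": certainty}
    )
    if first == last:
        element.text = f"{first.year}, {first.day} {MONTHS[first.month - 1]}"
    else:
        # Assumption: a range is simply shown as the two ISO dates.
        element.text = f"{earliest}/{latest}"
    return element


def render(date_element):
    """Display text for a date; uncertain dates go in square brackets."""
    text = date_element.text
    return f"[{text}]" if date_element.get("cert") == "low" else text


certain = creation_date("1420-11-27", "1420-11-27", "high")
uncertain = creation_date("1420-11-27", "1420-11-27", "low")
print(render(certain))
print(render(uncertain))
```
]]></ab></quote></p>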
            </div>
            <div>
               <head>summaryText</head>
               <p xml:id="hansen.p0039">Finally, a summaryText corresponds directly to the TEI
                  summary describing the "intellectual content of an item" under the msContents wrapper:<quote>
                     <ab type="code">&lt;msContents&gt;<lb/> &lt;summary&gt; King Erik 7. of
                        Pomerania summons… &lt;/summary&gt;<lb/>
                  &lt;/msContents&gt;</ab></quote></p>
            </div>
            <div>
               <head>witness</head>
               <p xml:id="hansen.p0040">The witness element is a wrapper for 16 elements, most of
                  which are mapped directly to equivalent TEI P5 element types. The first five of
                  these elements identify the text-witness in much the same way as the elements
                  under the TEI P5 msIdentifier (manuscript identifier) element.</p>
            </div>
            <div>
               <head>witnessSigil</head>
               <p xml:id="hansen.p0041">First, the editor provides a unique witness siglum. This
                  value corresponds to the value of the xml:id attribute in the TEI witness
                  element.</p>
            </div>
            <div>
               <head>archivePlaceName</head>
               <p xml:id="hansen.p0042">Second, a value corresponding to the TEI settlement element
                  is provided. This element is meant to contain "the name of a settlement, such as a
                  city, town, or village, identified as a single geo-political or administrative
                  unit".</p>
            </div>
            <div>
               <head>archiveName</head>
               <p xml:id="hansen.p0043">The archiveName corresponds to the TEI repository
                  element.</p>
            </div>
            <div>
               <head>inventoryNumber</head>
               <p xml:id="hansen.p0044">Intentional value corresponding to the TEI idno element.
               </p>
            </div>
            <div>
               <head>manuscriptName</head>
               <p xml:id="hansen.p0045">Intentional value corresponding to the TEI msName element.
                  The template element <code>&lt;manuscriptName&gt;Langebeks Diplomatarium,
                     p. 117&lt;/manuscriptName&gt;</code> mirrors the TEI <code>&lt;msName&gt;Langebeks
                     Diplomatarium, p. 117&lt;/msName&gt;</code>. When processed, the template elements:<quote>
                     <ab type="code">&lt;witnessSigil&gt;A&lt;/witnessSigil&gt;<lb/>
                        &lt;archivePlaceName&gt;Copenhagen&lt;/archivePlaceName&gt;<lb/>
                        &lt;archiveName&gt;Rigsarkivet&lt;/archiveName&gt;<lb/>
                        &lt;inventoryNumber&gt;NKR c-2732&lt;/inventoryNumber&gt;<lb/>
                        &lt;manuscriptName&gt;empty&lt;/manuscriptName&gt;</ab></quote> are
                  transformed into TEI P5 as:<quote>
                     <ab type="code">&lt;witness xml:id="A"&gt;<lb/> &lt;msDesc&gt;<lb/>
                        &lt;msIdentifier&gt;<lb/>
                        &lt;settlement&gt;Copenhagen&lt;/settlement&gt;<lb/>
                        &lt;repository&gt;Rigsarkivet&lt;/repository&gt;<lb/> &lt;idno&gt;NKR
                        c-2732&lt;/idno&gt;<lb/> &lt;msName&gt;empty&lt;/msName&gt;<lb/>
                        &lt;/msIdentifier&gt;<lb/> …</ab></quote></p>
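               <p>Since these five elements map one-to-one, the step is essentially a renaming
                  table. A minimal Python sketch, assuming the five elements sit directly under the
                  template's witness wrapper (the function and table names are hypothetical):<quote>
                     <ab type="code"><![CDATA[
```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Template element name -> TEI element name, in output order.
MAPPING = [
    ("archivePlaceName", "settlement"),
    ("archiveName", "repository"),
    ("inventoryNumber", "idno"),
    ("manuscriptName", "msName"),
]


def witness_to_tei(witness):
    """Build <witness><msDesc><msIdentifier> from the flat template elements."""
    tei = ET.Element("witness", {XML_ID: witness.findtext("witnessSigil")})
    ms_identifier = ET.SubElement(ET.SubElement(tei, "msDesc"), "msIdentifier")
    for source, target in MAPPING:
        ET.SubElement(ms_identifier, target).text = witness.findtext(source)
    return tei


TEMPLATE = """<witness>
  <witnessSigil>A</witnessSigil>
  <archivePlaceName>Copenhagen</archivePlaceName>
  <archiveName>Rigsarkivet</archiveName>
  <inventoryNumber>NKR c-2732</inventoryNumber>
  <manuscriptName>empty</manuscriptName>
</witness>"""

tei = witness_to_tei(ET.fromstring(TEMPLATE))
print(ET.tostring(tei, encoding="unicode"))
```
]]></ab></quote></p>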
            </div>
            <div>
               <head>manuscriptMaterial</head>
               <p xml:id="hansen.p0046">Having identified the manuscript, the editor accounts for
                  the physical description of the material in a series of nine elements. First, the
                  manuscriptMaterial is selected from a closed set of five string values: <list
                     type="ordered">
                     <item>"empty" – the manuscript material is irrelevant;</item>
                     <item>"mixed" – the manuscript material is part paper, part parchment;</item>
                     <item>"nil" – the manuscript material has not been determined yet;</item>
                     <item>"paper" – the manuscript material is paper;</item>
                     <item>"parch" – the manuscript material is parchment.</item>
                  </list></p>
            </div>
            <div>
               <head>manuscriptWidth, manuscriptHeight, and manuscriptPlica</head>
               <p xml:id="hansen.p0047">The dimensions of the original document are given in
                  centimeters as xs:decimal values. While the first two correspond directly to the
                  TEI width and height elements, manuscriptPlica is not defined in TEI terms. The
                  element describes a fold reinforcing the inferior part of the manuscript (Cárcel
                  Ortí 1997, 127). The template snippet: <quote>
                     <ab type="code">&lt;manuscriptMaterial&gt;parch&lt;/manuscriptMaterial&gt;<lb/>
                        &lt;manuscriptHeight&gt;17.2&lt;/manuscriptHeight&gt;<lb/>
                        &lt;manuscriptWidth&gt;24.3&lt;/manuscriptWidth&gt;<lb/>
                        &lt;manuscriptPlica&gt;0.6&lt;/manuscriptPlica&gt;<lb/> …</ab></quote>
                  transforms into the following chunk of TEI:<quote>
                     <ab type="code">&lt;extent&gt; &lt;dimensions unit="cm"&gt;<lb/>
                        &lt;height&gt;17.2 (plica: 0.6)&lt;/height&gt;<lb/>
                        &lt;width&gt;24.3&lt;/width&gt;<lb/> &lt;/dimensions&gt;<lb/>
                        &lt;/extent&gt;</ab></quote></p>
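               <p>Because TEI offers no dedicated element for the plica, its value is folded into
                  the height. A Python sketch of that step, with a check against the lexical form
                  of xs:decimal (the function name is hypothetical, and the project's stylesheet
                  may handle validation differently):<quote>
                     <ab type="code"><![CDATA[
```python
import re
import xml.etree.ElementTree as ET

# Lexical form of xs:decimal: optional sign, digits, optional fraction.
XS_DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def dimensions(height, width, plica):
    """Build <extent><dimensions> with the plica noted inside <height>."""
    for value in (height, width, plica):
        if not XS_DECIMAL.fullmatch(value):
            raise ValueError(f"not an xs:decimal: {value!r}")
    extent = ET.Element("extent")
    dims = ET.SubElement(extent, "dimensions", {"unit": "cm"})
    ET.SubElement(dims, "height").text = f"{height} (plica: {plica})"
    ET.SubElement(dims, "width").text = width
    return extent


result = ET.tostring(dimensions("17.2", "24.3", "0.6"), encoding="unicode")
print(result)
```
]]></ab></quote></p>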
            </div>
            <div>
               <head>conditionDescription</head>
               <p xml:id="hansen.p0048">The conditionDescription describes the physical state of the
                  document and thus corresponds to the TEI condition element. The template:
                     <code>&lt;conditionDescription&gt;The document is severely damaged by fire and
                     water&lt;/conditionDescription&gt;</code> transforms into TEI:<quote>
                     <ab type="code">&lt;condition&gt;<lb/> &lt;ab&gt;The document is severely
                        damaged by fire and water&lt;/ab&gt;<lb/>
                  &lt;/condition&gt;</ab></quote></p>
            </div>
            <div>
               <head>layoutDescription</head>
               <p xml:id="hansen.p0049">The layoutDescription holds a set of layout descriptions
                  applicable to a manuscript; it corresponds to the TEI layoutDesc element. A
                  template snippet: <code>&lt;layoutDescription&gt; The text is arranged in two
                     columns&lt;/layoutDescription&gt;</code> is transformed into:<quote>
                     <ab type="code">&lt;layoutDesc&gt; &lt;ab&gt;The text is arranged in two
                        columns&lt;/ab&gt;<lb/> &lt;/layoutDesc&gt;</ab></quote></p>
            </div>
            <div>
               <head>handDescription</head>
               <p xml:id="hansen.p0050">The handDescription element corresponds to the TEI handNote
                  (note on hand) element; it describes a particular style or hand distinguished
                  within a manuscript. This template:<quote>
                     <ab type="code">&lt;handDescription&gt;The text is written by the same scribe
                        as<lb/>&lt;ref target="14251102001"/&gt;, &lt;ref target="14251102002"/&gt;
                        and<lb/> &lt;ref target="14251102003"/&gt;<lb/>
                        &lt;/handDescription&gt;</ab></quote> transforms into TEI:<quote>
                     <ab type="code">&lt;handDesc&gt;<lb/> &lt;handNote&gt;<lb/> &lt;ab&gt;The text
                        is written by the same scribe as<lb/> &lt;ref
                        target="14251102001"/&gt;,<lb/> &lt;ref target="14251102002"/&gt; and<lb/>
                        &lt;ref target="14251102003"/&gt;<lb/> &lt;/ab&gt;<lb/>
                        &lt;/handNote&gt;<lb/> &lt;/handDesc&gt;</ab></quote></p>
            </div>
            <div>
               <head>additionsToText</head>
               <p xml:id="hansen.p0051">An account of significant additions found within a
                  manuscript, such as marginalia or other annotations, is delivered in the
                  additionsToText element. It corresponds to the TEI additions element. A template
                  such as <code>&lt;additionsToText&gt; On the verso the inscription: &lt;q&gt;Item
                     Hr. Peder Griis&lt;ex&gt;s&lt;/ex&gt;es gaffvebreff.
                     1413&lt;/q&gt;&lt;/additionsToText&gt;</code> corresponds to TEI:<quote>
                     <ab type="code">&lt;additions&gt;<lb/> &lt;ab&gt; On the verso the
                        inscription:<lb/> &lt;q&gt;Item Hr. Peder Griis&lt;ex&gt;s&lt;/ex&gt;es
                        gaffvebreff. 1413&lt;/q&gt;&lt;/ab&gt;<lb/>
                  &lt;/additions&gt;</ab></quote></p>
            </div>
            <div>
               <head>seal</head>
               <p xml:id="hansen.p0052">Another feature of interest is the presence of seals. A seal
                  is described by a wrapper (seal) element subordinating four elements: <list
                     type="ordered">
                     <item>sealNumber;</item>
                     <item>sealStatus;</item>
                     <item>sealDescription; and</item>
                     <item>sealReferenceWork.</item>
                  </list></p>
            </div>
            <div>
               <head>sealNumber</head>
               <p xml:id="hansen.p0053">First, the seals are numbered from left to right with
                  xs:integer values. If a document happens to be issued without seals, the default
                  value "99999999" is changed to "0", stating that the information is
                  irrelevant.</p>
            </div>
            <div>
               <head>sealStatus</head>
               <p xml:id="hansen.p0054">Depending on whether the document has, has had, or simply
                  was issued without seals, a value from a closed set of four values is selected:
                     <list type="ordered">
                     <item>'empty' – the document bears no traces of seals;</item>
                     <item>'missing' – the seal is missing;</item>
                     <item>'nil' – it is undetermined whether the document is sealed or not;</item>
                     <item>'pendant' – the seal is pendant.</item>
                  </list></p>
            </div>
            <div>
               <head>sealDescription</head>
               <p xml:id="hansen.p0055">
                  <anchor/> If a seal is extant, or in any way known, the information is given here:
                  first the name of the holder, then the method of sealing, and finally the
                  material.</p>
            </div>
            <div>
               <head>sealReferenceWork</head>
               <p xml:id="hansen.p0056">Whenever possible, a bibliographic reference to
                  sigillographic sources is given. For instance, a document issued with seals is
                  described in the template as:<quote>
                     <ab type="code">&lt;seal&gt;<lb/> &lt;sealNumber&gt;1&lt;/sealNumber&gt;<lb/>
                        &lt;sealStatus&gt;pendant&lt;/sealStatus&gt;<lb/>
                        &lt;sealDescription&gt;Seal of Jens Olufsen in black wax. Legend: &lt;q&gt;S
                        IOHANNES OLAVI&lt;/q&gt;&lt;/sealDescription&gt;<lb/>
                        &lt;sealReferenceWork&gt;DAS 1061&lt;/sealReferenceWork&gt;<lb/>
                        &lt;/seal&gt;<lb/> …</ab></quote> In TEI:<quote>
                     <ab type="code">&lt;sealDesc&gt;<lb/> &lt;seal n="1" type="pendant"&gt;<lb/>
                        &lt;ab&gt;The seal of Jens Olufsen in black wax. Legend: <lb/> &lt;q&gt;S
                        IOHANNES OLAVI&lt;/q&gt; &lt;ref&gt;DAS 1061&lt;/ref&gt;<lb/>
                        &lt;/ab&gt;<lb/> &lt;/seal&gt;<lb/> &lt;/sealDesc&gt;</ab></quote></p>
               <p xml:id="hansen.p0057">A document issued <emph>without</emph> seals retains the
                  seal element, but it is filled in with values stating explicitly that there never
                  were seals on the document:<quote>
                     <ab type="code">&lt;seal&gt;<lb/> &lt;sealNumber&gt;0&lt;/sealNumber&gt;<lb/>
                        &lt;sealStatus&gt;empty&lt;/sealStatus&gt;<lb/>
                        &lt;sealDescription&gt;empty&lt;/sealDescription&gt;<lb/>
                        &lt;sealReferenceWork&gt;empty&lt;/sealReferenceWork&gt;<lb/>
                        &lt;/seal&gt;</ab></quote> TEI:<quote>
                     <ab type="code">&lt;sealDesc&gt;<lb/> &lt;seal n="0" type="empty"&gt;<lb/>
                        &lt;ab&gt;empty &lt;ref&gt;empty&lt;/ref&gt;&lt;/ab&gt;<lb/>
                        &lt;/seal&gt;<lb/> &lt;/sealDesc&gt;</ab></quote></p>
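               <p>In both cases the four template elements collapse into attributes and mixed
                  content of a single TEI seal element. The Python sketch below illustrates one way
                  to do this; it is not the project's stylesheet, the function name is
                  hypothetical, and whitespace handling is simplified:<quote>
                     <ab type="code"><![CDATA[
```python
import xml.etree.ElementTree as ET


def seal_to_tei(seal):
    """Map a template <seal> to <sealDesc><seal n=".." type=".."><ab>..</ab></seal>."""
    desc = ET.Element("sealDesc")
    tei = ET.SubElement(
        desc,
        "seal",
        {"n": seal.findtext("sealNumber"), "type": seal.findtext("sealStatus")},
    )
    ab = ET.SubElement(tei, "ab")
    description = seal.find("sealDescription")
    ab.text = description.text
    ab.extend(list(description))  # keeps child elements such as <q>
    ET.SubElement(ab, "ref").text = seal.findtext("sealReferenceWork")
    return desc


TEMPLATE = """<seal>
  <sealNumber>1</sealNumber>
  <sealStatus>pendant</sealStatus>
  <sealDescription>Seal of Jens Olufsen in black wax. Legend: <q>S IOHANNES OLAVI</q></sealDescription>
  <sealReferenceWork>DAS 1061</sealReferenceWork>
</seal>"""

seal_desc = seal_to_tei(ET.fromstring(TEMPLATE))
print(ET.tostring(seal_desc, encoding="unicode"))
```
]]></ab></quote></p>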
            </div>
            <div>
               <head>witnessHistory</head>
               <p xml:id="hansen.p0058">When known, facts from the history of a manuscript are
                  recorded. The witnessHistory element corresponds to the TEI history element. The
                  template: <code>&lt;witnessHistory&gt; The letter was registered in the registry of
                     the letters at Vallø (1541), published &lt;ref&gt;Thiset, Adel.
                     Brevkister 137&lt;/ref&gt;&lt;/witnessHistory&gt;</code> corresponds to TEI:<quote>
                     <ab type="code">&lt;history&gt;<lb/> &lt;ab&gt;The letter was registered in the
                        registry of the letters at Vallø (1541), published &lt;ref&gt;Thiset, Adel.
                        Brevkister 137&lt;/ref&gt;&lt;/ab&gt;<lb/> &lt;/history&gt;</ab></quote></p>
            </div>
            <div>
               <head>filiationDescription</head>
               <p xml:id="hansen.p0059">In case other surviving manuscripts are related to a
                  document, such information may be given in the filiationDescription element. This
                  element is modeled on the TEI filiation element. Thus, a snippet like this:
                     <code>&lt;filiationDescription&gt; The document is an apograph from the
                     document of 1388, January 21, Diplomatarium Danicum III,
                     331&lt;/filiationDescription&gt;</code> converts into TEI:<quote>
                     <ab type="code">…&lt;/summary&gt;<lb/> &lt;msItemStruct&gt;<lb/>
                        &lt;filiation&gt;<lb/> &lt;ab&gt;The document is an apograph from the
                        document of 1388, January 21, Diplomatarium Danicum III, 331&lt;/ab&gt;<lb/>
                        &lt;/filiation&gt;<lb/> &lt;/msItemStruct&gt;<lb/>…</ab></quote></p>
            </div>
            <div>
               <head>bibliographicEntry</head>
               <p xml:id="hansen.p0060">Bibliographic information is recorded in bibliographicEntry
                  elements, each corresponding to a TEI bibl element; these are wrapped in a
                  listBibl (bibliographic list). A template series of three bibliographicEntry elements:<quote>
                     <ab type="code">&lt;bibliographicEntry&gt;Kirkehist. Saml. V
                        99&lt;/bibliographicEntry&gt;<lb/> &lt;bibliographicEntry&gt;Bull. Dan. 358
                        nr. 466&lt;/bibliographicEntry&gt;<lb/> &lt;bibliographicEntry&gt;Rep. nr.
                        5872 (i udtog)&lt;/bibliographicEntry&gt;</ab></quote> is rendered in TEI as:<quote>
                     <ab type="code">&lt;additional&gt;<lb/> &lt;listBibl&gt;<lb/>
                        &lt;bibl&gt;Kirkehist. Saml. V 99&lt;/bibl&gt;<lb/> &lt;bibl&gt;Bull. Dan.
                        358 nr. 466&lt;/bibl&gt;<lb/> &lt;bibl&gt;Rep. nr. 5872&lt;/bibl&gt;<lb/>
                        &lt;/listBibl&gt;<lb/> &lt;/additional&gt;</ab></quote></p>
            </div>
            <div>
               <head>samplingMethod</head>
               <p xml:id="hansen.p0061">The samplingMethod element wraps three sub-elements: <list
                     type="ordered">
                     <item>textCompleteness stating whether the text appears in extenso (version),
                        or is an excerpt;</item>
                     <item>sourceSiglum containing an xs:IDREF pointing to one of the witnessSigil
                        values described earlier;</item>
                     <item>samplingNote containing an account of possible omissions.</item>
                  </list> Thus the following:<quote>
                     <ab type="code">&lt;samplingMethod&gt;<lb/>
                        &lt;textCompleteness&gt;excerpt&lt;/textCompleteness&gt;<lb/>
                        &lt;sourceSiglum&gt;A&lt;/sourceSiglum&gt;<lb/> &lt;samplingNote&gt;The
                        first three paragraphs have been omitted as they are unrelated to Danish
                        matters&lt;/samplingNote&gt;<lb/> &lt;/samplingMethod&gt;</ab></quote>
                  becomes in TEI:<quote>
                     <ab type="code">&lt;samplingDecl&gt;<lb/> &lt;ab&gt;Excerpt from
                        &lt;ref&gt;A&lt;/ref&gt;. The first three paragraphs have been omitted
                        because they are unrelated to Danish matters&lt;/ab&gt;<lb/>
                        &lt;/samplingDecl&gt;</ab></quote></p>
            </div>
            <div>
               <head>textLanguage</head>
               <p xml:id="hansen.p0062">The textLanguage element is filled in with one of currently
                  five enumerated values. Language codes are constructed according to BCP 47 (<ptr
                     target="http://www.rfc-editor.org/rfc/bcp/bcp47.txt"/>), and, where possible,
                  follow the ISO 639-1 standard. The set is open to extension; the current values are:
                     <list type="ordered">
                     <item>'gda' – Old Danish;</item>
                     <item>'gmh' – Middle High German;</item>
                     <item>'gml' – Middle Low German;</item>
                     <item>'la' – Latin;</item>
                     <item>'xno' – Anglo-Norman.</item>
                  </list> Thus, <code>&lt;textLanguage&gt;la&lt;/textLanguage&gt;</code> transforms
                  into TEI:<quote>
                     <ab type="code">&lt;langUsage&gt;<lb/> &lt;language ident="la"&gt;Main
                        language: Latin&lt;/language&gt;<lb/> &lt;/langUsage&gt;</ab></quote></p>
            </div>
            <div>
               <head>text</head>
               <p xml:id="hansen.p0063">The text element is similar to a TEI P5 div element. In the
                  Diplomatarium template, it may only be structured by TEI p (paragraph) elements.
                  Below paragraph level, a mixed content of text and eight TEI element types is
                  allowed. The elements available are: <list type="ordered">
                     <item>app – critical apparatus;</item>
                     <item>cit – citation;</item>
                     <item>damage;</item>
                     <item>ex – expansion;</item>
                     <item>gap;</item>
                     <item>hi – highlighted;</item>
                     <item>ref – reference;</item>
                     <item>supplied.</item>
                  </list> For instance:<quote>
                     <ab type="code">&lt;text&gt;<lb/> &lt;p&gt; Christierno
                        Hen&lt;ex&gt;n&lt;/ex&gt;ingi presbitero Roskildensis diocesis
                        &lt;/p&gt;<lb/> &lt;p&gt; Benigno etc.&lt;/p&gt;<lb/> &lt;p&gt; Cum itaque
                        &lt;damage&gt;si&lt;/damage&gt;cut exhibita nobis …&lt;/p&gt;<lb/> …<lb/>
                        &lt;/text&gt;</ab></quote> translates into roughly the same, but with
                  numbered paragraphs:<quote>
                     <ab type="code">&lt;text&gt;<lb/> &lt;body&gt;<lb/> &lt;div
                        xml:lang="la"&gt;<lb/> &lt;p n="a#1"&gt; Christierno
                        Hen&lt;ex&gt;n&lt;/ex&gt;ingi presbitero Roskildensis diocesis
                        &lt;/p&gt;<lb/> &lt;p n="a#2"&gt;Benigno etc.&lt;/p&gt;<lb/> &lt;p
                        n="a#3"&gt; Cum itaque &lt;damage&gt;si&lt;/damage&gt;cut exhibita nobis …
                        &lt;/p&gt; <lb/> …<lb/> &lt;/div&gt;<lb/> &lt;/body&gt;<lb/>
                        &lt;/text&gt;</ab></quote></p>
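               <p>The wrapping and numbering step can be sketched in Python as follows. The prefix
                  "a" is taken from the output above, but its meaning is not stated in the example,
                  so it is exposed here as a parameter; the function name is hypothetical and the
                  sample text is abridged:<quote>
                     <ab type="code"><![CDATA[
```python
import copy
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def number_paragraphs(template_text, lang, prefix="a"):
    """Wrap template paragraphs in text/body/div and number them prefix#1, prefix#2, ..."""
    text = ET.Element("text")
    body = ET.SubElement(text, "body")
    div = ET.SubElement(body, "div", {XML_LANG: lang})
    for index, paragraph in enumerate(template_text.findall("p"), start=1):
        numbered = copy.deepcopy(paragraph)  # leave the template untouched
        numbered.set("n", f"{prefix}#{index}")
        div.append(numbered)
    return text


TEMPLATE = """<text>
  <p> Christierno Hen<ex>n</ex>ingi presbitero Roskildensis diocesis </p>
  <p> Benigno etc.</p>
  <p> Cum itaque <damage>si</damage>cut exhibita nobis</p>
</text>"""

template = ET.fromstring(TEMPLATE)
tei_text = number_paragraphs(template, "la")
print([p.get("n") for p in tei_text.iter("p")])
```
]]></ab></quote></p>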
            </div>
            <div>
               <head>translation</head>
               <p xml:id="hansen.p0064">Similar to text, the translation element has a mixed content
                  of text and elements; however, the only two elements allowed here are note and ref
                  (reference).</p>
            </div>
         </div>
         <div>
            <head>Conclusion</head>
            <p xml:id="hansen.p0065">Creating the kind of multi-purpose content that can be shared
               when needed clearly means going further than the ambiguous commitment to descriptive
               markup and XML. The application format also has to be commonly known in order to make
               sense for others; this is why we consider TEI and its documentation format ODD the
               best bet for a sustainable storage and exchange format. We like to think of our usage
               of it as a <emph>simple</emph> one: first, because it is unextended and tries to stay
               clear of idiosyncrasies; second, because the different TEI instance documents remain
               structurally the same.</p>
            <p xml:id="hansen.p0066">Still, regardless of the strategic reasons for adopting a
               standard, the format must not prevent people from being productive. For many,
               adopting the XML <emph>modus operandi</emph> already means entering a different work
               chain with special tools and texts interspersed with angle brackets. If standards
               also complicate matters, a successful implementation might be a long way off.
               Since making a standard like TEI easier to use is an attainable goal, for example
               by means of a template, such measures should clearly be promoted.</p>
            <p xml:id="hansen.p0067">Although descriptive markup is a true Third Wave technology
               enabling customized applications, some of these, like DocBook and TEI, have evolved
               into complicated market standards best handled by specialists. However, that this
               kind of specialization is actually happening in projects such as the Diplomatarium
               Danicum and research infrastructures inspires confidence that the disruption which
               was the very hallmark of the Third Wave is perhaps starting to wear off. With a
               market where standard texts are checked in and out of repositories, we can hope for
               development of better tools and technologies that would take the field of scholarly
               text processing even further.</p>
         </div>
      </body>
      <back>
         <div>
            <listBibl>
               <bibl xml:id="BurnardBauman2007">Burnard, L. and Bauman, S., eds. 2007. <emph>TEI P5:
                     Guidelines for electronic text encoding and interchange</emph>. <ptr
                     target="http://www.tei-c.org/release/doc/tei-p5-doc/en/html//index.html"
                  />.</bibl>
               <bibl xml:id="BurnardSperberg-McQueen2006">Burnard, L. and Sperberg-McQueen, C. M.,
                  eds. 2006. <emph>TEI lite: Encoding for interchange: An introduction to the TEI —
                     revised for TEI P5 release</emph>. <ptr
                     target="http://www.tei-c.org/release/doc/tei-p5-exemplars/html/teilite.doc.html"
                  />.</bibl>
               <bibl xml:id="Driscoll2006">Driscoll, M.J. 2006. P5-MS: A general purpose tagset for
                  manuscript description. <emph>Digital Medievalist</emph> 2.1. Accessed December
                  14, 2010.</bibl>
               <bibl xml:id="Hansen2010a">Hansen, T. 2010a. <emph>Metadata for diplomatarium danicum
                     texts. Technical report</emph>. Copenhagen: Society for Danish Language and
                  Literature. <ptr target="http://diplomatarium.dk/docs/Metadata_DD_texts.pdf"
                  />.</bibl>
               <bibl xml:id="Hansen2010b">---. 2010b. <emph>General text format and markup for
                     diplomatarium danicum texts. Technical report.</emph> Copenhagen: Society for
                  Danish Language and Literature. <ptr
                     target="http://diplomatarium.dk/docs/General_text_format_DD_texts.pdf"
                  />.</bibl>
               <bibl xml:id="Lavagnino2006">Lavagnino, J. 2006. When not to use TEI. In
                     <emph>Electronic textual editing</emph>, eds. Lou Burnard, Katherine O'Brien
                  O'Keeffe, and John Unsworth. Modern Language Association. <ptr
                     target="http://www.tei-c.org/About/Archive_new/ETE/Preview/lavagnino.xml"
                  />.</bibl>
               <bibl xml:id="Ortí1997">Ortí, María Milagros Cárcel. 1997. <emph>Vocabulaire
                     international de la diplomatique</emph>. Valencia: Commission internationale de
                  diplomatique.</bibl>
               <bibl xml:id="SimonsBlack2009">Simons, G.F., and Black, H.A. 2009. Third wave writing
                  and publishing. <emph>SIL Forum for Language Fieldwork</emph> 2009-005. <ptr
                     target="http://www.sil.org/SILepubs/Pubs/52287/SILForum2009-005.pdf"/>.</bibl>
               <bibl xml:id="Sperberg-McQueenHuitfeldtRenear2000">Sperberg-McQueen, C.M., Huitfeldt,
                  C., and Renear, A.H. 2000. Meaning and interpretation of markup. <emph>Markup
                     languages: Theory and practice</emph> 2.3:215-234. <ref
                     target="http://cmsmcq.com/2000/mim.html"
                  >http://cmsmcq.com/2000/mim.html</ref>.</bibl>
               <bibl xml:id="Toffler1980">Toffler, A. 1980. <emph>The third wave</emph>. New York: Bantam Books.</bibl>
            </listBibl>
         </div>
      </back>
   </text>
</TEI>
