<TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt>
				<title level="a">Opening the Illustrated Incunable Short Title
					Catalog on CD-ROM: an end-user's approach to an essential
					database</title>
				<author>
					<name>Jonathan Green</name>
					<address><addrLine>College of Charleston</addrLine></address>
				</author>
				<editor role="acceptingeditor">
					<name>D. P. O'Donnell</name>
					<address><addrLine>University of Lethbridge</addrLine></address>
				</editor>
				<editor role="recommendingreader">
					<name>I. Lancashire</name>
					<address><addrLine>University of Toronto</addrLine></address>
				</editor>
				<respStmt>
					<resp>TEI encoding by</resp>
					<name>Daniel Paul O'Donnell</name>
				</respStmt>
			</titleStmt>
			<editionStmt>
				<edition>Version 1.0 (Publication copy)</edition>
			</editionStmt>
			<extent>Approximately 8,300 words.</extent>
			<publicationStmt>
				<publisher>Curriculum Redevelopment Centre, University of
					Lethbridge</publisher>
				<pubPlace>Lethbridge AB, Canada T1K 3M4 </pubPlace>
				<availability status="unknown">
					<p>© Jonathan Green, 2005. Creative Commons
						Attribution-NonCommercial licence, 2.5</p>
				</availability>
				<date n="received" when="2004-09-23">September 23, 2004</date>
				<date n="revised" when="2004-12-22">December 22, 2004</date>
				<date n="published" when="2005-04-20">April 20, 2005</date>
			</publicationStmt>
			<seriesStmt>
				<title>Digital Medievalist</title>
				<idno type="volume">1</idno>
				<idno type="issue">1</idno>
				<idno type="date">Spring 2005</idno>
			</seriesStmt>
			<notesStmt>
				<note type="abstract" anchored="true">
					<p>The <choice>
							<expan>
								<title level="m">Illustrated Incunable Short Title Catalog
									on CD-ROM</title>
							</expan>
							<abbr>IISTC</abbr>
						</choice>, now in its second edition, provides an unrivaled
						wealth of information on fifteenth-century printing and, as a
						computer database, allows for rapid searching that would not be
						possible with printed reference works. However, the database's
						search interface suffers from numerous problems, as Paul
						Needham described in a thorough review essay. This article
						presents a solution to those problems that can be implemented
						by the end user, and also shows what kind of useful information
						can be obtained from the IISTC by doing so. The solution
						entails exporting all records to a very large text file,
						analyzing the file with scripts written in Perl, importing the
						information into a full-featured database application, and
						conducting queries with the database application's more robust
						and better documented interface. With the IISTC data directly
						accessible, the database fields can be manipulated to implement
						features missing in the original IISTC, including separate
						fields for each part of the imprint data and a count of
						recorded copies. Query-generated output demonstrated here
						includes a table of incunables with the highest number of copies
						recorded in the IISTC; printers of Ulm, the number of their
						signed editions, and their dates; and the number of signed
						editions printed each year through the end of the fifteenth
						century. Sample scripts for recreating the results described
						here, as well as instructions for implementing them and a
						discussion of points to consider when doing so, are found in
						the appendices.</p>
				</note>
				<note type="acknowledgements" anchored="true">
					<p>The author wishes to thank Alvan Bregmann, Bryce Inouye, and
						Paul Needham for their kind assistance and helpful suggestions
						for this article.</p>
				</note>
			</notesStmt>
			<sourceDesc>
				<p>Original Composition</p>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<projectDesc>
				<p>Article from Digital Medievalist Journal (URL:
					http://www.digitalmedievalist.org/)</p>
			</projectDesc>
			<refsDecl>
				<p>Citations from the text of this article should be by paragraph
					number.</p>
			</refsDecl>
		</encodingDesc>
		<profileDesc>
			<creation/>
			<langUsage>
				<language ident="ENG-US">US English</language>
				<language ident="LAT">Latin</language>
				<language ident="FRA">French</language>
				<language ident="ITA">Italian</language>
			</langUsage>
			<textClass>
				<keywords scheme="DM">
					<term type="DMType">Project Report</term>
					<term>Illustrated Incunable Short Title Catalog (IISTC)</term>
					<term>bibliography</term>
					<term>databases</term>
					<term>user interfaces</term>
					<term>scripting languages (Perl)</term>
					<term>incunabula</term>
					<term>British Library</term>
				</keywords>
			</textClass>
		</profileDesc>
	</teiHeader>
	<text xml:lang="ENG-US">
		<body>
			<div>
				<head>Introduction</head>
				<p xml:id="green.dm.1.1.p.0010">Compliance with open standards in
					software and multimedia projects is an excellent thing for the
					projects' users, and so it is often promoted as a virtue that
					programmers and digital content creators should strive for.
					Developers have not always shared this concern, unfortunately,
					with the result that the users who expected to make use of some
					electronic resource in their research are occasionally prevented
					from finding all the answers they had sought. What is the end
					user to do in such a situation, besides write to the publisher
					and ask that the needed feature be added in the next version?
					Sometimes the user can do more, perhaps much more, depending on
					the program or project. In the case of one invaluable database,
					the <choice>
						<expan>
							<title level="m">Illustrated Incunable Short Title Catalog on
								CD-ROM</title>
						</expan>
						<abbr>IISTC</abbr>
					</choice> (<ref target="#britishlibrary1998" type="bibliographic"
						>British Library 1998</ref>), there is a wealth of useful
					information trapped behind an inadequate user interface.
					Medievalists working on fifteenth-century literature or early
					printing have two options: they can practice the patience of a
					recusant or they can seek a radical solution, namely, exporting
					all 28,360 records, extracting the necessary information, and
					importing it into the fields of a standards-compliant database.
					The questions about early printing that can then be more easily
					answered illustrate one reason why standards compliance is so
					important for humanities computing projects: designers and
					developers can never anticipate all the research inquiries that
					scholars may wish to pursue.</p>
			</div>
			<div>
				<head>Problems</head>
				<p xml:id="green.dm.1.1.p.0020">As a catalog of all known incunable
					editions with an extensive if not yet complete list of known
					copies, the IISTC comes closer than any other presently available
					reference work to being a worldwide incunable census. It is
					therefore an essential tool for research libraries and scholars
					in many fields. As a computer database rather than a printed
					catalog, the IISTC promises quick answers to scholars' questions.
					In a thorough review article, Paul Needham praises the IISTC as a
					milestone in the history of incunable bibliography but also
					identifies numerous deficiencies in the database and particularly
					in the idiosyncratic user interface (<ref target="#needhamp1999"
						type="bibliographic">Needham 1999</ref>). The application of
					computer technology to bibliography makes the IISTC a
						<quote>revolutionary publication in incunable study</quote>, by
					allowing searches that <quote>from printed reference works...can
						be made only laboriously, or for practical purposes cannot be
						made at all</quote> (479). And yet, because of the limitations
					of the software provided for searching the IISTC database,
					several types of inquiry remain laborious, impractical, or at
					times impossible. Needham specifically mentions the following
					shortcomings:</p>
				<list type="ordered">
					<item>One can save only entire records, rather than exporting
						only particular fields (478).</item>
					<item>The interface software is unstable, particularly when
						conducting complex searches (483).</item>
					<item>Much useful information is found not in the incunable
						records but in various unexportable lists whose data cannot
						easily be viewed. <quote>Similarly, there is no
							way with the current software of the IISTC to select a city,
							and then view or capture a list of all the recorded printing
							shops of the city</quote> (486 n. 58).</item>
					<item>Editions with unsigned dates have been assigned a
							<term>Year of Publication</term> through an opaque and
						problematic process, so that the field is <quote>inadequate for
							even the roughest of statistical analysis</quote> (489).
						Editions assigned to a range of years, such as 1477-79, will
						only turn up in a search for the first or last year in the
						range, but not under 1478 (520).</item>
					<item>Libraries in some countries are recorded inconsistently, so
						that it is often <quote>difficult or impossible to compile,
							with a single search, all the incunables of a given
							library</quote> (498).</item>
					<item>The asterisk denoting that Hain had personally inspected a
						particular imprint is indistinguishable from the
						asterisk-as-a-wildcard character in bibliography searches;
						consequently, <quote>the IISTC's software silently mishandles
							Hain's asterisks</quote> (499).</item>
					<item>The drop-down lists in the <term>Bibliography</term> field
						are in strict alphabetical order, making it impossible to
						search catalog numbers in numerical sequence. This is a
						drawback for works such as Kurt Ohly and Vera Sack's catalog of
						Frankfurt incunables, and a substantial problem for searching
						the older standard Hain-Copinger-Reichling series
						(498-99).</item>
					<item>The <term>Notes</term> field can be searched only through
							<term>All Fields</term> searches (501).</item>
					<item>There is no easy way to search for only signed or unsigned
						cities, printers, or dates. That is, the user interface search
						capabilities cannot reliably differentiate between editions
						that bear the name of a Basel printer and editions that
						scholars have attributed to a Basel printer (520).</item>
					<item>Many of these problems are related to the search software's
						failure to implement true string searches, instead treating
							<code>William Caxton</code> as <code>William AND
							Caxton</code> with unpredictable results (520).</item>
				</list>
				<p xml:id="green.dm.1.1.p.0030">In addition, although the
					still-incomplete IISTC is unmatched as a worldwide incunable
					census by any other resource including the <ref
						target="#gesamtkatalogderwiegendrucke1925-"
						type="bibliographic">
						<title level="m">Gesamtkatalog der Wiegendrucke</title>
					</ref>, the IISTC lists the present-day locations of incunable
					editions, but does not display the total number of copies. This
					is not a serious problem if one can see at a glance that there
					are only one or two copies of a given incunable, but rather
					irksome if there are dozens, and a huge handicap if one wishes to
					compare the number of recorded copies for more than a few
					editions. To answer the question, <quote>For which incunable does
						the IISTC record the largest census?</quote> it is best to know
					the answer beforehand: Anton Koberger's edition of the Latin
						<title level="m">Nuremberg Chronicle</title>, 12 July 1493
					(Goff S-307), and even then one must calculate the total number
					of copies by hand (Needham arrives at <quote>more than
						780</quote> [<ref target="#needhamp1999" type="bibliographic">Needham
						1999</ref>, 497]). Even an experienced scholar of
					fifteenth-century printing might have difficulty naming the
					second-, third-, or tenth-largest incunable census, or the
					largest census for works printed in German or another of the
					vernacular languages. This information lies within the database,
					but the IISTC interface prevents users from accessing it.</p>
			</div>
			<div>
				<head>A solution</head>
				<p xml:id="green.dm.1.1.p.0040">Repetitive tasks, such as adding up
					the number of copies for 28,360 editions, are best left to
					computers, and therein lies the solution, which has but four
					essential steps:</p>
				<list type="simple">
					<item>Export all 28,360 records in the IISTC as a plain text
						file; </item>
					<item>Transform the text file into a standard file format; </item>
					<item>Import the information into a database or spreadsheet
						application; </item>
					<item>Search the new database using standardized and
						well-documented search tools. </item>
				</list>
				<p>While many different approaches and various software packages
					could be used to implement these steps, the following discussion
					is based on readily available software and consumer applications
					that are practically standard issue in the computing
					infrastructure of many colleges and universities. (For a full
					account of the process of importing the IISTC records into a
					database, see <ref target="#green.dm.1.1.appendix.1"
						type="navigation">appendix 1</ref>.)</p>
				<p xml:id="green.dm.1.1.p.0050">Because the IISTC allows any number
					of records to be selected and exported, a user could choose to
					export all 28,360 records to a plain-text file, at least in
					theory. The operation itself can take hours or even days, perhaps
					as a legacy of the IISTC's providing only a 16-bit Windows
					interface coded in Visual Basic 3. The minimal requirements of
					the IISTC software mean that it can run on quite antiquated
					hardware, although at the cost of increased liability to crash
					and increasingly uncertain interoperability with a library's
					newer computer infrastructure. Exporting all the records is
					nevertheless possible and, as the following will show, quite
					useful.</p>
				<p xml:id="green.dm.1.1.p.0060">The export of every IISTC record
					results in a very long list of records such as the following:</p>
<ab type="code">				
					<!-- l replaced -->The Illustrated ISTC (2nd Edition) <lb/>
					<lb/>
					<!-- l replaced -->Author: Aesopus<lb/>
					<!-- l replaced -->Title: Vita, after Rinucius, et Fabulae, Lib. I-IV, prose
						version of Romulus [German]. Add: Fabulae extravagantes.
						Fabulae novae (Tr: Rinucius). Fabulae Aviani. Fabulae collectae
						[German] (Tr: Heinrich Steinhöwel). Leonardus Brunus Aretinus:
						De duobus amantibus Guiscardo et Sigismunda [German] (Tr:
						Nikolaus von Wyle)<lb/>
					<!-- l replaced -->Imprint: [Augsburg: Anton Sorg, about 1479]<lb/>
					<!-- l replaced -->Language: German  Format: f°<lb/>
					<!-- l replaced -->Notes: General+Production:<lb/>
					<!-- l replaced -->Woodcuts<lb/>
					<!-- l replaced -->Cataloguing Source: Goff A120<lb/>
					<!-- l replaced -->Bibliography: HR, Supplement 333; Schreiber, Manuel 3028a;
						Schramm IV p. 50; GW 353<lb/>
					<!-- l replaced -->Locations: <lb/>
					<!-- l replaced -->  British Isles: London, Victoria and Albert Museum<lb/>
					<!-- l replaced -->  USA: LC(R); MMu(P)L<lb/>
					<!-- l replaced -->  Germany: Dresden KupferstichKab<lb/>
					<!-- l replaced -->ISTC No:  ia00120000<lb/>
					<lb/>
					<!-- l replaced -->(c) British Library Board and (c) Primary Source Media<lb/>
				
</ab>
				<p>One could copy all of the information in this record by hand
					into a database or spreadsheet table, with
						<term>Author</term> as one column, <term>Title</term> another,
					and so on. Doing so would be arduous and repetitive, and therefore
					best left to a computer. Fortunately, there are well-documented
					and accessible scripting languages such as Perl that are exactly suited
					for this task. (For Perl software and documentation, see <ptr
						target="http://www.cpan.org"/>; <ptr
						target="http://www.perl.org"/>; <ptr
						target="http://www.activestate.com"/>.) One must only tell the
					computer:</p>
				<quote>
					<p>Read through all 495,923 lines in the exported text file;
						whenever you find a line that begins with <code>Title:</code>,
						save everything between the non-printing tab character and the
						end of the line; now look for a line that begins with
							<code>ISTC No:</code>, and do the same. Finally, print the
						ISTC number as an index, then a tab character to separate the
						fields, then the title, and then a new line character. And then
						get back to work!</p>
				</quote>
				<p>The script, in Perl as written by a medievalist (and explained
					more fully in <ref target="#green.dm.1.1.appendix.2"
						type="navigation">appendix 2</ref>), might look something like
					this:</p>
				<ab type="code">
					<!-- l replaced -->$batch = "istc.txt";    # plain-text export of all IISTC records<lb/>
					<!-- l replaced -->open BATCH, $batch or die "Cannot open $batch for read: $!";<lb/>
					<!-- l replaced -->while (&lt;BATCH&gt;) {    # read the export one line at a time<lb/>
					<!-- l replaced -->  if (/^Title:\t(.*?)$/) {    # keep the text after "Title:" and the tab<lb/>
					<!-- l replaced -->    $match = $1;<lb/>
					<!-- l replaced -->    $hit = 1;    # a title is now waiting for its ISTC number<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  if (/^ISTC.*(i.\d{8})/ and ($hit == 1)) {    # "i", one character, eight digits<lb/>
					<!-- l replaced -->    $hit = 0;<lb/>
					<!-- l replaced -->    $istc_number = $1;<lb/>
					<!-- l replaced -->    print "$istc_number\t$match\n";    # one tab-delimited row per record<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>That is, if the entirety of the IISTC is exported as plain text
					to the file <code>istc.txt</code> and the Perl script invoked as
					written, it will produce a very long list that begins in the
					following way:</p>
				<ab type="code">
					<!-- l replaced -->ia00000500  Orhot Hayyim<lb/>
					<!-- l replaced -->ia00001000  Abbey of the Holy Ghost<lb/>
					<!-- l replaced -->ia00001500  Abbey of the Holy Ghost<lb/>
					<!-- l replaced -->ia00002000  Abbey of the Holy Ghost<lb/>
					<!-- l replaced -->ia00003000  Abbreviamentum statutorum<lb/>
					<!-- l replaced -->ia00004000  Abbreviamentum statutorum<lb/>
					<!-- l replaced -->ia00004500  Abbreviamentum statutorum<lb/>
					<!-- l replaced -->ia00005000  Abbreviamentum statutorum<lb/>
					<!-- l replaced -->ia00005500  Abecedarium<lb/>
					<!-- l replaced -->ia00008000  Dialogus in astrologiae defensionem cum vaticinio
						a diluvio ad annos 1702. With additions by Domicus Palladius
						Soranus<lb/>
					<!-- l replaced -->ia00009000  Trutina rerum coelestium et terrestrium. With
						additions by Augustinus Beganus and Ludovicus Ponticus<lb/>
					<!-- l replaced -->ia00009100  De luminaribus et diebus criticis<lb/>
				</ab>
				<p>If the output is redirected to a file, then one is left at the
					end with a tab-delimited table containing a list of ISTC index
					numbers and their corresponding titles, which can be imported
					into the database application of one's choice. With similar
					scripts that search not for <code>Title:</code> but for
						<code>Author:</code> or <code>Imprint:</code>, for example, the
					rest of the information can be extracted as well and then
					imported in turn. While the IISTC search interface is
					idiosyncratic, inadequately documented, and crash-prone, the
					database industry has spent decades and billions of dollars on
					standardizing, documenting, and crash-proofing its
					software.</p>
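				<p>The remark about similar scripts can be illustrated with a
					minimal sketch that gathers several fields in a single pass. It
					is not the script given in the appendices. It assumes, as in the
					sample record above, that every label is followed by a tab and
					that each field occupies one line; the subroutine name
						<code>extract_fields</code> and the column order are
					hypothetical choices made for this illustration:</p>
				<ab type="code">
					use strict;<lb/>
					use warnings;<lb/>
					<lb/>
					# Hypothetical helper: turn the lines of the export into tab-delimited<lb/>
					# rows (ISTC number, author, title, imprint). Assumes one field per<lb/>
					# line and a tab after each label, as in the sample record.<lb/>
					sub extract_fields {<lb/>
					  my @lines = @_;<lb/>
					  my (%record, @rows);<lb/>
					  for my $line (@lines) {<lb/>
					    $line =~ s/\r?\n\z//;    # strip Unix or Windows line endings<lb/>
					    if ($line =~ /^(Author|Title|Imprint):\t(.*)$/) {<lb/>
					      $record{$1} = $2;<lb/>
					    }<lb/>
					    elsif ($line =~ /^ISTC No:\s*(i.\d{8})/) {<lb/>
					      # The ISTC number ends a record: emit one row, then start afresh.<lb/>
					      my $istc_number = $1;<lb/>
					      my @fields = map { defined $record{$_} ? $record{$_} : '' } qw(Author Title Imprint);<lb/>
					      push @rows, join("\t", $istc_number, @fields);<lb/>
					      %record = ();<lb/>
					    }<lb/>
					  }<lb/>
					  return @rows;<lb/>
					}<lb/>
					<lb/>
					print "$_\n" for extract_fields(&lt;STDIN&gt;);<lb/>
				</ab>
				<p>Invoked as <code>perl extract.pl &lt; istc.txt &gt;
						istc.tab</code> (the file names are again hypothetical), the
					sketch would write one row per record, leaving a column empty
					where a record lacks that field.</p>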
				<p xml:id="green.dm.1.1.p.0070">While using another software
					package to replace the IISTC interface is useful, any spreadsheet
					or database application will have its own limitations on what it
					can do with the records in their present form. Opening up the
					IISTC has the added advantage, however, that the records can be
					manipulated further. For example, the <term>Imprint</term> field
					could be split up into <term>city</term>, <term>printer</term>,
					and <term>date</term> fields; or a flag could be added to mark
					each of these parts as <soCalled>signed</soCalled> or
						<soCalled>unsigned</soCalled>; or fields could be created for
					the first, last, or average of all dates attributed to unsigned
					imprints. The pattern matching and string manipulation
					capabilities of Perl are quite robust and can even be made to
					deal with defective IISTC records. (For one possible
					implementation of a script to analyze the <term>Imprint</term>
					field, see <ref target="#green.dm.1.1.appendix.3"
						type="navigation">appendix 3</ref>.) The same kind of
					manipulation can be done on the <term>Locations</term> field to
					provide a count of the number of copies identified for each of
					ten geographic regions, which can in turn be added to yield an
					overall sum. A discussion of the particular challenges here and
					sample scripts are provided in <ref
						target="#green.dm.1.1.appendix.4" type="navigation">appendix
						4</ref>. </p>
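				<p>How such a split of the <term>Imprint</term> field might work
					can be suggested by a minimal sketch, which is not the script
					given in appendix 3. It assumes that the city precedes the first
					colon and that the date follows the last comma, and it treats
					any part containing square brackets as unsigned. That is a
					simplification: a date such as <code>June and [17] Oct.
						1499</code> has a signed year but would be flagged as unsigned.
					The subroutine name <code>split_imprint</code> is hypothetical,
					and the use of <code>-</code> to mark an unsigned part is an
					assumption made for this illustration:</p>
				<ab type="code">
					use strict;<lb/>
					use warnings;<lb/>
					<lb/>
					# Hypothetical helper: split an imprint into city, printer, and date,<lb/>
					# and build a three-character flag string ("+" signed, "-" unsigned).<lb/>
					sub split_imprint {<lb/>
					  my ($imprint) = @_;<lb/>
					  # City: up to the first colon. Date: after the last comma.<lb/>
					  my ($city, $printer, $date) = $imprint =~ /^(.*?):\s*(.*),\s*([^,]*)$/<lb/>
					    or return;<lb/>
					  # An imprint enclosed in a single pair of brackets is wholly unsigned.<lb/>
					  my $whole = $imprint =~ /^\[[^\[\]]*\]$/;<lb/>
					  my $flags = '';<lb/>
					  for my $part ($city, $printer, $date) {<lb/>
					    my $unsigned = $whole || $part =~ /[\[\]]/;<lb/>
					    $flags .= $unsigned ? '-' : '+';<lb/>
					    $part =~ tr/[]//d;    # remove the brackets from the stored value<lb/>
					  }<lb/>
					  return ($city, $printer, $date, $flags);<lb/>
					}<lb/>
					<lb/>
					for my $imprint ('Nuremberg: Anton Koberger, 12 July 1493',<lb/>
					                 '[Augsburg: Anton Sorg, about 1479]') {<lb/>
					  print join("\t", split_imprint($imprint)), "\n";<lb/>
					}<lb/>
				</ab>
				<p>Splitting the date at the last comma keeps printer names that
					themselves contain commas, such as <code>Aldus Manutius,
						Romanus</code>, in one piece. An imprint without a colon and a
					comma returns nothing, so that defective records can be set aside
					for inspection.</p>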
			</div>
			<div>
				<head>Results</head>
				<p xml:id="green.dm.1.1.p.0080">Is the effort worth it? While
					learning enough Perl to write the necessary scripts takes some
					time, it is much more manageable than learning, say, Latin.
					Whether that is time well spent depends on one's needs, and how
					much one prefers to let a computer handle repetitive search and
					tabulation. As noted above, Needham regrets that there is no way
					to quickly view a list of recorded print shops for a given city
					using the IISTC (<ref target="#needhamp1999" type="bibliographic"
						>Needham 1999</ref>, 486 n. 58), even though the IISTC holds
					this information. With a database application as an interface,
					however, one can quickly extract the required information. Thus
					we can discover that the IISTC records the following printers for
					Ulm:</p>
				<ab type="code">
					<!-- l replaced -->Conrad Dinckmut<lb/>
					<!-- l replaced -->Conrad Dinckmut?<lb/>
					<!-- l replaced -->Hans Hauser<lb/>
					<!-- l replaced -->Johann Reger<lb/>
					<!-- l replaced -->Johann Reger, for Justus de Albano<lb/>
					<!-- l replaced -->Johann Schäffler<lb/>
					<!-- l replaced -->Johann Zainer<lb/>
					<!-- l replaced -->Johann Zainer, not before 1478<lb/>
					<!-- l replaced -->Johann Zainer?<lb/>
					<!-- l replaced -->Lienhart Holle<lb/>
				</ab>
				<p>This example, like the others here, was created using Microsoft
					Access. This software is neither a model of standards compliance
					nor inexpensive; it is, however, the most widespread of desktop
					database applications. With its <soCalled>query by
						design</soCalled> functionality, one can graphically select the
						<term>printers</term> and <term>cities</term> database fields,
					specify that the latter should correspond to <code>Ulm</code>,
					and let Access automate the process of generating the correct
					query statement; the process requires less than a minute to set
					in motion and just seconds to execute. One can just as easily
					formulate a more exact question, for example, <quote>For what Ulm
						printers do we have incunable editions with signed city,
						printer, and year? How many are there? When were they
						printed?</quote> By pencil-and-paper methods, the following
					table would take quite some time to construct, but with a
					database application just a few minutes or, with experience,
					seconds:</p>
				<table rend="border">
					<head>Printers of Ulm and their signed imprints</head>
					<row role="data">
						<cell rend="rowhead">
							<term>Printer</term>
						</cell>
						<cell rend="integer">
							<term>Signed editions</term>
						</cell>
						<cell rend="integer">
							<term>First signed year</term>
						</cell>
						<cell rend="integer">
							<term>Last signed year</term>
						</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Conrad Dinckmut</cell>
						<cell rend="integer">31</cell>
						<cell rend="integer">1482</cell>
						<cell rend="integer">1496</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Johann Reger</cell>
						<cell rend="integer">10</cell>
						<cell rend="integer">1486</cell>
						<cell rend="integer">1499</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Johann Reger, for Justus de Albano</cell>
						<cell rend="integer">1</cell>
						<cell rend="integer">1486</cell>
						<cell rend="integer">1486</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Johann Schäffler</cell>
						<cell rend="integer">8</cell>
						<cell rend="integer">1492</cell>
						<cell rend="integer">1499</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Johann Zainer</cell>
						<cell rend="integer">35</cell>
						<cell rend="integer">1473</cell>
						<cell rend="integer">1500</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Lienhart Holle</cell>
						<cell rend="integer">6</cell>
						<cell rend="integer">1482</cell>
						<cell rend="integer">1484</cell>
					</row>
				</table>
				<p>As the IISTC records relatively few imprints after 1501, the
					last signed year does not, of course, indicate that a printer
					ceased operation around that time. The SQL statement used for the
					search may seem complicated at first glance, but one does not
					have to glance at it even a first time, thanks to the query
					design system of Microsoft Access and other consumer database
					applications:</p>
				<ab type="code">
					<!-- l replaced -->SELECT istc.Printer, istc.City, Min(istc.first_year) AS
						MinOffirst_year, Max(istc.last_year) AS MaxOflast_year,
						Count(istc.istc_number) AS CountOfistc_number<lb/>
					<!-- l replaced -->FROM istc<lb/>
					<!-- l replaced -->WHERE (((istc.Flags) Like "+++"))<lb/>
					<!-- l replaced -->GROUP BY istc.Printer, istc.City<lb/>
					<!-- l replaced -->HAVING (((istc.City)="ulm"));<lb/>
				</ab>
				<p>(Note that the table name is <code>istc</code>, and the
					relevant fields are <code>Printer</code>, <code>City</code>,
						<code>first_year</code>, <code>last_year</code>,
						<code>istc_number</code>, and <code>Flags</code>, the last of
					which records whether the city, printer, and date are signed or
					attributed.)</p>
				<p xml:id="green.dm.1.1.p.0090">The preceding table should not be
					confused with an authoritative statement based on extensive
					research. It is rather a quickly constructed summary that
					provides a first impression of the overall situation of early
					printing in Ulm, but that is by itself a useful function for a
					computer database.</p>
				<p xml:id="green.dm.1.1.p.0100">What if one wanted to see a rough
					overview of the development in the number of editions printed each
					year? (See, for example, <ref target="#neddermeyeru1998"
						type="bibliographic">Neddermeyer 1998</ref>, 2:609-10.) As
					noted above, the IISTC <term>Year of Publication</term> field is
					entirely inadequate for this, and the IISTC does not permit
					searching of editions with signed dates only. After the IISTC
					records have been imported into a database, one possibility would
					be to take the average of dates that appear as [1479-81], or one
					might choose instead to consider only the 12,072 imprints with a
					signed date. If one takes the latter option, one can quickly
					paste the resulting data into Microsoft Excel—another omnipresent
					if not inherently standards-friendly spreadsheet application—to
					construct a graph such as the following:</p>
				<figure>
					<graphic url="support/figure1.png"/>
					<figDesc>Number of incunable editions with signed dates in the
						IISTC as graphed by Microsoft Excel</figDesc>
				</figure>
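				<p>The alternative mentioned above, averaging a range such as
					[1479-81], first requires expanding the abbreviated end year. The
					following is a minimal sketch under stated assumptions: the
					subroutine name <code>year_range</code> is hypothetical,
					qualifiers such as <code>about</code> or <code>not after</code>
					are ignored, and only the first year or range in the date string
					is considered:</p>
				<ab type="code">
					use strict;<lb/>
					use warnings;<lb/>
					<lb/>
					# Hypothetical helper: return the first year, the last year, and their<lb/>
					# average for a date string such as "[1479-81]" or "12 July 1493".<lb/>
					sub year_range {<lb/>
					  my ($date) = @_;<lb/>
					  my ($first) = $date =~ /\b(1[45]\d\d)\b/ or return;<lb/>
					  my $last = $first;<lb/>
					  # The end year may be abbreviated (1479-81) or given in full (1499-1501).<lb/>
					  if ($date =~ /\b$first-(\d{2,4})\b/) {<lb/>
					    my $end = $1;<lb/>
					    $last = substr($first, 0, 4 - length($end)) . $end;<lb/>
					    $last += 100 if $last &lt; $first;    # e.g. 1499-01 is read as 1501<lb/>
					  }<lb/>
					  return ($first, $last, ($first + $last) / 2);<lb/>
					}<lb/>
					<lb/>
					printf "%s\t%s\t%s\n", year_range('[1479-81]');<lb/>
				</ab>
				<p>Averaging is only one possible convention. Storing the first and
					last years in separate fields, like the <code>first_year</code>
					and <code>last_year</code> fields used in the query shown
					earlier, preserves more of the information in the record.</p>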
				<p>Needham notes that a search of IISTC's <term>Year of
						Publication</term> field would find a seeming contraction in
					book printing between 1477 and 1479, but that this reflects
					idiosyncrasies in the IISTC search software rather than an actual
					shrinkage in production (<ref target="#needhamp1999"
						type="bibliographic">Needham 1999</ref>, 489). The summary of
					incunable production for the years 1475-1485 (below) finds that
					this apparent contraction was indeed spurious, but perhaps not
					that for 1482 through 1484, when signed editions declined from
					359 to 295, or by 18%,
					over two years (the only decline lasting more than a single
					year). Additional work is required to determine how widespread
					this phenomenon was or what its causes might have been (see also
						<ref target="#neddermeyeru1998" type="bibliographic"
						>Neddermeyer 1998</ref>, 1:420-22), but the graph at least
					provides the right place to start, where the numbers provided by
					the IISTC search interface do not.</p>
				<table rend="border">
					<head>Editions per year, total (IISTC), and signed only</head>
					<row role="data">
						<cell rend="rowhead">
							<term>Year</term>
						</cell>
						<cell rend="integer">
							<term>Editions (IISTC)</term>
						</cell>
						<cell rend="integer">
							<term>Editions (signed only)</term>
						</cell>
					</row>
					<row role="data">
						<cell rend="integer">1475</cell>
						<cell rend="integer">835</cell>
						<cell rend="integer">242</cell>
					</row>
					<row role="data">
						<cell rend="integer">1476</cell>
						<cell rend="integer">589</cell>
						<cell rend="integer">231</cell>
					</row>
					<row role="data">
						<cell rend="integer">1477</cell>
						<cell rend="integer">672</cell>
						<cell rend="integer">257</cell>
					</row>
					<row role="data">
						<cell rend="integer">1478</cell>
						<cell rend="integer">657</cell>
						<cell rend="integer">266</cell>
					</row>
					<row role="data">
						<cell rend="integer">1479</cell>
						<cell rend="integer">563</cell>
						<cell rend="integer">245</cell>
					</row>
					<row role="data">
						<cell rend="integer">1480</cell>
						<cell rend="integer">1177</cell>
						<cell rend="integer">285</cell>
					</row>
					<row role="data">
						<cell rend="integer">1481</cell>
						<cell rend="integer">734</cell>
						<cell rend="integer">342</cell>
					</row>
					<row role="data">
						<cell rend="integer">1482</cell>
						<cell rend="integer">816</cell>
						<cell rend="integer">359</cell>
					</row>
					<row role="data">
						<cell rend="integer">1483</cell>
						<cell rend="integer">932</cell>
						<cell rend="integer">334</cell>
					</row>
					<row role="data">
						<cell rend="integer">1484</cell>
						<cell rend="integer">717</cell>
						<cell rend="integer">295</cell>
					</row>
					<row role="data">
						<cell rend="integer">1485</cell>
						<cell rend="integer">1118</cell>
						<cell rend="integer">312</cell>
					</row>
				</table>
				<p xml:id="green.dm.1.1.p.0110">And what of the <title level="m"
						>Nuremberg Chronicle</title>? It stands at the head of the list
					of most-preserved incunables, but what follows it? According to
					the IISTC, the <title level="m">Nuremberg Chronicle</title>
					vastly outnumbers its closest competitor:</p>
				<table rend="border">
					<head>Incunables with highest census counts in the IISTC</head>
					<row role="data">
						<cell rend="rowhead">
							<term>Author</term>
						</cell>
						<cell>
							<term>Abbreviated title</term>
						</cell>
						<cell>
							<term>Reference</term>
						</cell>
						<cell>
							<term>Imprint</term>
						</cell>
						<cell rend="integer">
							<term>Copies</term>
						</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Schedel, Hartmann</cell>
						<cell>Liber chronicarum</cell>
						<cell>HC 14508*</cell>
						<cell>Nuremberg: Anton Koberger, 12 July 1493</cell>
						<cell rend="integer">786</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Aristoteles</cell>
						<cell>Opera [Greek]...</cell>
						<cell>HC 1657*</cell>
						<cell>Venice: Aldus Manutius, Romanus, 1495-98</cell>
						<cell rend="integer">319</cell>
					</row>
					<row role="data">
						<cell rend="rowhead"/>
						<cell>Biblia latina...</cell>
						<cell>HC 3173*</cell>
						<cell>[Strassburg: Adolf Rusch, for Anton Koberger at
							Nuremberg, not after 1480]</cell>
						<cell rend="integer">287</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Politianus, Angelus</cell>
						<cell>Opera...</cell>
						<cell>HC 13218*</cell>
						<cell>Venice: Aldus Manutius, Romanus, July 1498</cell>
						<cell rend="integer">270</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Euclides</cell>
						<cell>Elementa geometriae...</cell>
						<cell>HC 6693*</cell>
						<cell>Venice: Erhard Ratdolt, 25 May 1482</cell>
						<cell rend="integer">266</cell>
					</row>
					<row role="data">
						<cell rend="rowhead"/>
						<cell>Epistolae diversorum philosophorum...[Greek]</cell>
						<cell>HC 6659*</cell>
						<cell>Venice: Aldus Manutius, Romanus, 1499</cell>
						<cell rend="integer">266</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Firmicus Maternus, Julius</cell>
						<cell>Mathesis (De nativitatibus libri VIII)...</cell>
						<cell>HC 14559*</cell>
						<cell>Venice: Aldus Manutius, Romanus, June and [17] Oct.
							1499</cell>
						<cell rend="integer">257</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Ubertinus de Casali</cell>
						<cell>Arbor vitae crucifixae Jesu Christi</cell>
						<cell>HC 4551*</cell>
						<cell>Venice: Andreas de Bonetis, 12 Mar. 1485</cell>
						<cell rend="integer">252</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Boethius</cell>
						<cell>Opera</cell>
						<cell>H 3351*</cell>
						<cell>Venice: Johannes and Gregorius de Gregoriis, de Forlivio,
							1491-92</cell>
						<cell rend="integer">251</cell>
					</row>
					<row role="data">
						<cell rend="rowhead">Antoninus Florentinus</cell>
						<cell>Summa theologica (Partes I-IV)...</cell>
						<cell>HC 1243*</cell>
						<cell>Venice: Nicolaus Jenson, 1477-80</cell>
						<cell rend="integer">242</cell>
					</row>
				</table>
				<p>These numbers should not be understood as the number of copies
					now existing, nor even as the number of copies recorded by the
					IISTC. Rather, one has to interpret them as one computer script's
					interpretation of the IISTC data, which is itself incomplete and
					sometimes ambiguous. Martin Davies, ISTC general editor, has
					stated, however, that ISTC data collected as of 1992 would
					proportionally reflect the total number of surviving copies:
						<quote>The numbers of copies of any particular edition...must,
						however, bear a fairly constant relation to the total now
						extant: the fewer recorded the scarcer an edition will prove to
						be</quote> (<ref target="#daviesmandgoldfinchj1992"
						type="bibliographic">Davies and Goldfinch 1992</ref>, 20). If
					exact precision is essential, then one would do well to verify
					all figures by hand. By my calculation, the IISTC records over
					350,000 individual copies for fifteenth-century editions, or
					between 65% and 80% of all surviving incunables by various
					estimates (see <ref target="#neddermeyeru1998"
						type="bibliographic">Neddermeyer 1998</ref>, 1:79). Based on
					these provisional figures, one would expect a complete census of
					surviving copies of the Latin <title level="m">Nuremberg
						Chronicle</title> to have somewhere between 1000 and 1250
					copies. Christoph Reske arrived at ca. 900 copies with another
					estimated 135 in private hands (<ref target="#reskec2000"
						type="bibliographic">Reske 2000</ref>, CD 275-77), while Paul
					Needham's ongoing count, largely restricted to copies in public
					libraries, already approaches 1200 (personal correspondence,
					cited by permission).</p>
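				<p>The extrapolation in the preceding paragraph can be retraced
					with a few lines of Perl. This is only a minimal sketch: the
					variable names are illustrative, the census count of 786 is
					taken from the table above, and the coverage figures of 65% and
					80% are the estimates just cited.</p>
				<ab type="code">
					<!-- l replaced --># Sketch only: variable names are illustrative<lb/>
					<!-- l replaced -->$recorded = 786;<lb/>
					<!-- l replaced --># IISTC census count for the Latin edition<lb/>
					<!-- l replaced -->foreach $percent (80, 65) {<lb/>
					<!-- l replaced -->  # assume the IISTC records this percentage<lb/>
					<!-- l replaced -->  # of all surviving copies<lb/>
					<!-- l replaced -->  $estimate = int(($recorded*100/$percent)+.5);<lb/>
					<!-- l replaced -->  print "at $percent% coverage: about $estimate copies\n";<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>Run as written, the script reports about 983 copies at 80%
					coverage and about 1209 copies at 65%, close to the rounded
					range given above.</p>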
			</div>
			<div>
				<head>Conclusion</head>
				<p xml:id="green.dm.1.1.p.0120">There are countless ways to graph,
					chart, and tabulate the IISTC data, but those that occur to this
					author may not be the same ones that would hold the interest of
					the present reader. The previous examples should be enough to
					demonstrate the utility of allowing other software applications
					to cooperate with the IISTC's data. Standard database searches
					will address most of the shortcomings of the IISTC identified by
					Needham (for example, by allowing the asterisk character to be
					escaped so that it is not interpreted as a wildcard in
					searches). The rest can be addressed, to the extent the
					underlying data allow, by some additional script writing. If
					the effort required is justified,
					the tools at one's disposal are flexible enough to provide an
					answer.</p>
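				<p>As one illustration of the asterisk problem, a script working
					on the exported data can treat the asterisk as an ordinary
					character. The following is a sketch under stated assumptions
					rather than a finished tool: it assumes a tab-delimited file
					named <code>bibliography.txt</code>, extracted from the
					Bibliography field in the manner of <ref
						target="#green.dm.1.1.appendix.2" type="navigation">appendix
						2</ref>, and both the file name and the reference searched
					for are chosen for illustration only.</p>
				<ab type="code">
					<!-- l replaced --># Sketch only: file name and search term are assumptions<lb/>
					<!-- l replaced -->$batch="bibliography.txt";<lb/>
					<!-- l replaced -->$wanted="HC 14508*";<lb/>
					<!-- l replaced --># quotemeta escapes the asterisk so that it is<lb/>
					<!-- l replaced --># matched literally, not as a special character<lb/>
					<!-- l replaced -->$pattern=quotemeta $wanted;<lb/>
					<!-- l replaced -->open BATCH, $batch or die "Cannot open $batch for read:$!";<lb/>
					<!-- l replaced -->while (&lt;BATCH&gt;) {<lb/>
					<!-- l replaced -->  # print every record that contains the reference<lb/>
					<!-- l replaced -->  print if /$pattern/;<lb/>
					<!-- l replaced -->}<lb/>
					<!-- l replaced -->close BATCH;<lb/>
				</ab>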
				<p xml:id="green.dm.1.1.p.0130">The preceding discussion may hold
					broader implications for designers and users of other electronic
					reference works. The general outlines of the solution offered
					here may be applicable to other electronic resources: exporting
					records, manipulating them, and re-importing them into another
					application is by no means a unique process. Even if the software
					in question has no export function, more sophisticated
					programming can often automate what would otherwise be manual
					copying and pasting.</p>
				<p xml:id="green.dm.1.1.p.0140">An important point of application
					design is that a tabular view of a database often makes
					significant phenomena easier to visualize and defective
					records easier to find. Providing unimpeded access to
					the data offers maximum flexibility and value for an
					application's users. Database fields that cannot be directly
					viewed and whose reliability cannot be easily verified, such as
					the IISTC's <term>Year of Publication</term> field, are
					necessarily less useful than they otherwise could be.</p>
				<p xml:id="green.dm.1.1.p.0150">While a search interface can have
					many uses, it cannot anticipate every question that might be
					asked, and so it can aid or supplement but never entirely replace
					access to the underlying data. Much of the effort required to
					make the IISTC data accessible to other software applications
					would not have been necessary if the IISTC had maintained
					consistent formatting and made use of an open format from the
					beginning. That it did not, however, does not mean that scholars
					and other end users have to wait for the British Library to
					redesign its project. If necessary, standards can also be imposed
					from below.</p>
			</div>
		</body>
		<back>
			<div xml:id="green.dm.1.1.appendix.1" type="appendix">
				<head>Appendix 1: A step-by-step description of opening the IISTC
					database</head>
				<p xml:id="green.dm.1.1.p.0160">The following discussion assumes
					that the user has the IISTC, Microsoft Windows, and Microsoft
					Office installed on his or her computer. While the IISTC runs
					only under Windows, similar results should be achievable with any
					database software.</p>
				<list type="ordered">
					<item>
						<p>Select all records in the IISTC. First, enter a search on
							the <term>Search</term> screen that returns all 28,360
							records, such as searching for <code>i*</code> in the
								<term>ISTC Number</term> field. On the <term>List
								Display</term> screen, click on <term>Select All</term>.
							Once this choice has been confirmed, the computer may become
							unusable for an hour or more until the operation has
							completed.</p>
					</item>
					<item>
						<p>After all records have been selected, click on
								<term>Export</term>, which is also on the <term>List
								Display</term> screen. This may also require considerable
							time before the dialogue box appears. Do not change the
							export range, but do change the export format to plain text
							by clicking on the button marked <term>Using Rich Text Format
								(RTF)</term>; once you click on it, the title will change
							to read <term>Using Plain Text (TXT)</term>, which is the
							desired format. Click on the button marked
								<term>Export</term>, then select a location for the
							exported file and give it a name. The examples here assume a
							filename of <code>istc.txt</code>. The export process may
							take many hours and tie up all the resources of the computer
							during that time. The resulting file will be over 22
							megabytes in size.</p>
					</item>
					<item>
						<p>The various fields in the IISTC can now be turned into
							tab-delimited tables one at a time using Perl scripts such as
							that found in <ref target="#green.dm.1.1.appendix.2"
								type="navigation">appendix 2</ref> and above. If the script
							is named <code>title.pl</code>, the output can be redirected
							to create a file named <code>title.txt</code>:</p>
						<ab type="code">
							<!-- l replaced -->perl title.pl &gt; title.txt<lb/>
						</ab>
						<p>Otherwise the output will appear on screen. Scripts
							virtually identical to that found in <ref
								target="#green.dm.1.1.appendix.2" type="navigation"
								>appendix 2</ref> can be used to create a series of files,
							each a tab-delimited table containing an ISTC number in one
							column and one additional field in the other. </p>
					</item>
					<item>
						<p>These tables can now be imported one at a time into a
							database application. Using Microsoft Access, create a new
							database file, then open the text files one at a time using
							the <term>File</term> &gt; <term>Get External Data</term>
							&gt; <term>Import</term> function. Specify that the file is
							delimited, that tabs separate the fields, and that it
							should be imported into a new table. Import the first field as
								<code>indexed (no duplicates)</code> and give it an
							appropriate title, so that the ISTC number can serve as the
							index of the imported database as well; the next stage of the
							import process will let you choose the ISTC number as the
							primary key for the database.</p>
						<p>Because some title records are longer than the 255-character
							limit that Access imposes on text fields, these records will
							be truncated and an error message will appear. Import the
							titles as a second table in the same way, but with the title
							field as a <term>memo</term> data type. The truncated titles
							can be used when sorting is necessary, while the full memo
							field will be available when the complete titles are needed,
							so both are useful.</p>
						<p>Import the rest of the text files in the same way, choosing
							appropriate titles for each table and an appropriate data
							type for each field. Maintain consistency in the naming of
							fields.</p>
					</item>
					<item>
						<p>The tables can now be joined one at a time into one large
							flat-file database table, which can simplify later searches.
							Select the <term>title</term> table, because it contains all
							28,360 records, and, for example, the <term>authors</term>
							table, which only contains 20,933. Create a query by
							selecting these two tables; the identical <term>ISTC
								Number</term> fields are already automatically joined.
							Right click on the link between the two tables, examine the
							join properties, and click on the third option: we need all
							the records from the <term>title</term> table as well as the
							corresponding records in <term>authors</term>. Under the
								<term>Query</term> menu, select <term>Make-Table
								Query</term> and choose a name, such as <code>istc2</code>.
							Running the query will create a new table that includes all
							the ISTC numbers and titles for all records as well as the
							author for works that have one. Repeating this process with
							the newly created table and the table containing the next
							field to be imported will eventually result in a large table
							with all of the fields readily accessible in the ISTC
							database. The table should have 28,360 records after every
							step.</p>
					</item>
					<item>
						<p>Some of the most useful information in the IISTC requires
							further analysis of its fields using more Perl scripts. A
							script to analyze the imprint field is found in <ref
								target="#green.dm.1.1.appendix.3" type="navigation"
								>appendix 3</ref>, while a script to provide a copy count
							is found in <ref target="#green.dm.1.1.appendix.4"
								type="navigation">appendix 4</ref>. Each of these Perl
							scripts will create new tab-delimited tables that can be
							imported into the database by following steps 4 and 5
							above.</p>
					</item>
					<item>
						<p>Because the IISTC's <term>Printing Regions</term> function
							assigns some incunables to more than one region, this
							information cannot be imported into the same table. If this
							information is required, the relevant records would have to
							be exported from the IISTC separately, the ISTC Numbers
							extracted, and a new table created that does not use the ISTC
							Number as an index. One can then limit one's searches
							according to the IISTC's printing regions by searching out
							only those records in the larger table for which an ISTC
							number in the <term>regions</term> table is associated with
							the desired region.</p>
					</item>
				</list>
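				<p>Where Access is not available, the join described in step 5
					above can be approximated in Perl. The following is a minimal
					sketch, not a tested replacement for the make-table query. It
					assumes that <code>title.txt</code> and
					<code>author.txt</code> are two-column, tab-delimited files
					created as in step 3 (the name <code>author.txt</code> is an
					assumption), and it keeps every title record whether or not an
					author is present.</p>
				<ab type="code">
					<!-- l replaced --># Sketch only: author.txt is an assumed file name<lb/>
					<!-- l replaced -->open AUTHOR, "author.txt" or die "Cannot open author.txt for read:$!";<lb/>
					<!-- l replaced -->while (&lt;AUTHOR&gt;) {<lb/>
					<!-- l replaced -->  chomp;<lb/>
					<!-- l replaced -->  # store each author under its ISTC number<lb/>
					<!-- l replaced -->  ($istc_number, $author) = split /\t/, $_, 2;<lb/>
					<!-- l replaced -->  $authors{$istc_number} = $author;<lb/>
					<!-- l replaced -->}<lb/>
					<!-- l replaced -->close AUTHOR;<lb/>
					<!-- l replaced -->open TITLE, "title.txt" or die "Cannot open title.txt for read:$!";<lb/>
					<!-- l replaced -->while (&lt;TITLE&gt;) {<lb/>
					<!-- l replaced -->  chomp;<lb/>
					<!-- l replaced -->  ($istc_number, $title) = split /\t/, $_, 2;<lb/>
					<!-- l replaced -->  # use an empty field if the record has no author<lb/>
					<!-- l replaced -->  $author = exists $authors{$istc_number} ? $authors{$istc_number} : "";<lb/>
					<!-- l replaced -->  print "$istc_number\t$title\t$author\n";<lb/>
					<!-- l replaced -->}<lb/>
					<!-- l replaced -->close TITLE;<lb/>
				</ab>
				<p>As with the other scripts, the output can be redirected to a
					file, and the process repeated with that file and the next
					field to be added.</p>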
				<p xml:id="green.dm.1.1.p.0170">This is by no means the only
					possible approach towards creating a database from the IISTC
					records, or even one that is particularly faithful to the ideal
					of standards compliance. While Microsoft Access has a large
					install base, it is quite expensive; open-source and
					standards-compliant database solutions such as MySQL exist, but
					none yet matches the ease of use of Access. A monolithic flat
					file may not be the best database for all circumstances. In
					addition, some questions are still best handled by recourse to
					further script writing, particularly if further text manipulation
					is required, such as numerically sorting entries in the
					bibliographic standard works.</p>
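				<p>Numerical sorting of bibliographic references is one such
					case. A plain text sort places HC 14508* before HC 1657*,
					because it compares the entries character by character. The
					sketch below uses four references from the table of census
					counts as sample data and sorts first by abbreviation, then by
					number. The regular expression is an assumption about the form
					of the references and would need to be adjusted for the full
					data.</p>
				<ab type="code">
					<!-- l replaced --># Sketch only: sample data from the census table<lb/>
					<!-- l replaced -->@refs = ("HC 14508*", "HC 1657*", "H 3351*", "HC 3173*");<lb/>
					<!-- l replaced --># split a reference into abbreviation and number<lb/>
					<!-- l replaced -->sub parts { $_[0] =~ /^(\D*?)\s*(\d+)/ ? ($1, $2) : ($_[0], 0) }<lb/>
					<!-- l replaced -->@sorted = sort {<lb/>
					<!-- l replaced -->  @x = parts($a);<lb/>
					<!-- l replaced -->  @y = parts($b);<lb/>
					<!-- l replaced -->  # compare abbreviations as text, then numbers as numbers<lb/>
					<!-- l replaced -->  $x[0] cmp $y[0] or $x[1] &lt;=&gt; $y[1];<lb/>
					<!-- l replaced -->} @refs;<lb/>
					<!-- l replaced -->print join("\n", @sorted), "\n";<lb/>
				</ab>
				<p>For the sample data, this prints H 3351*, HC 1657*, HC 3173*,
					and HC 14508*, in that order.</p>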
			</div>
			<div xml:id="green.dm.1.1.appendix.2" type="appendix">
				<head>Appendix 2: A Perl script for extracting fields from the
					istc.txt file</head>
				<p xml:id="green.dm.1.1.p.0180">The comment lines, which begin with
					the pound sign, explain the function of each line of code.</p>
				<ab type="code">
					<!-- l replaced -->$batch="istc.txt";    <lb/>
					<!-- l replaced --># Define the name of file to search<lb/>
					<!-- l replaced -->open BATCH, $batch or die "Cannot open $batch for read:$!";<lb/>
					<!-- l replaced --># Open the file, or close with an <lb/>
					<!-- l replaced --># error if it doesn't exist<lb/>
					<!-- l replaced -->while (&lt;BATCH&gt;) {<lb/>
					<!-- l replaced -->  # As long as there are lines <lb/>
					<!-- l replaced -->  # in the file left to search...<lb/>
					<!-- l replaced -->  if (/^Title:\t(.*?)$/) {<lb/>
					<!-- l replaced -->  # ...look for the pattern <lb/>
					<!-- l replaced -->  # "Title:&lt;tab character&gt;&lt;anything<lb/>
					<!-- l replaced -->  # else&gt;" <lb/>
					<!-- l replaced -->  # at the beginning of a line<lb/>
					<!-- l replaced -->    $match = $1;<lb/>
					<!-- l replaced -->    # Save "anything else"...<lb/>
					<!-- l replaced -->    $hit=1;<lb/>
					<!-- l replaced -->    # ...and set a flag that<lb/>
					<!-- l replaced -->    # we've found what we're <lb/>
					<!-- l replaced -->    # looking for <lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  if (/^ISTC.*(i.\d{8})/ and ($hit == 1)) {<lb/>
					<!-- l replaced -->  # Now, if we have a match already<lb/>
					<!-- l replaced -->  # saved, look for the <lb/>
					<!-- l replaced -->  # pattern "ISTC" at <lb/>
					<!-- l replaced -->  # the beginning of the <lb/>
					<!-- l replaced -->  # line, and then anything <lb/>
					<!-- l replaced -->  # else, and then "i", one more <lb/>
					<!-- l replaced -->  # character, and eight digits; <lb/>
					<!-- l replaced -->  # save all ten characters, as<lb/>
					<!-- l replaced -->  # that's the ISTC number<lb/>
					<!-- l replaced -->    $hit = 0;<lb/>
					<!-- l replaced -->    # Reset our flag<lb/>
					<!-- l replaced -->    $istc_number = $1;<lb/>
					<!-- l replaced -->    # Assign the matched ISTC number<lb/>
					<!-- l replaced -->    # to a variable<lb/>
					<!-- l replaced -->    print "$istc_number\t$match\n";<lb/>
					<!-- l replaced -->    # Print the ISTC number, a tab<lb/>
					<!-- l replaced -->    # character, the title, and a <lb/>
					<!-- l replaced -->    # new line character<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>The output, as explained above (<ref
						target="#green.dm.1.1.p.0060">§ 6</ref>), begins like this:</p>
				<ab type="code">
					<!-- l replaced -->ia00000500  Orhot Hayyim<lb/>
					<!-- l replaced -->ia00001000  Abbey of the Holy Ghost<lb/>
					<!-- l replaced -->ia00001500  Abbey of the Holy Ghost<lb/>
					<!-- l replaced -->ia00002000  Abbey of the Holy Ghost<lb/>
				</ab>
				<p>This output should be redirected to a file to be saved for
					further use like this:</p>
				<ab type="code">
					<!-- l replaced -->perl title.pl &gt; title.txt<lb/>
				</ab>
				<p>Very little needs to be changed in order to extract the rest of
					the fields. In the line <code>if (/^Title:\t(.*?)$/) {</code>,
					one need only replace <term>Title</term> with Author,
					Bibliography, Cataloguing Source, Collective Title, Format,
					Imprint, Language, Locations, or Notes.</p>
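				<p>Rather than editing the script for each field, the field name
					can be supplied on the command line. The following variant is a
					sketch under the same assumptions as the script above; the
					script name <code>field.pl</code> and the use of a command-line
					argument are additions made here for illustration.</p>
				<ab type="code">
					<!-- l replaced --># Sketch only: a variant of the script above that<lb/>
					<!-- l replaced --># reads the field name from the command line<lb/>
					<!-- l replaced -->$field = shift @ARGV or die "Usage: perl field.pl FieldName\n";<lb/>
					<!-- l replaced -->$batch="istc.txt";<lb/>
					<!-- l replaced -->open BATCH, $batch or die "Cannot open $batch for read:$!";<lb/>
					<!-- l replaced -->while (&lt;BATCH&gt;) {<lb/>
					<!-- l replaced -->  # \Q and \E protect any special characters<lb/>
					<!-- l replaced -->  # in the field name<lb/>
					<!-- l replaced -->  if (/^\Q$field\E:\t(.*?)$/) {<lb/>
					<!-- l replaced -->    $match = $1;<lb/>
					<!-- l replaced -->    $hit=1;<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  if (/^ISTC.*(i.\d{8})/ and ($hit == 1)) {<lb/>
					<!-- l replaced -->    $hit = 0;<lb/>
					<!-- l replaced -->    print "$1\t$match\n";<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>A field name that contains a space must be quoted on the
					command line:</p>
				<ab type="code">
					<!-- l replaced -->perl field.pl Author &gt; author.txt<lb/>
					<!-- l replaced -->perl field.pl "Cataloguing Source" &gt; source.txt<lb/>
				</ab>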
			</div>
			<div xml:id="green.dm.1.1.appendix.3" type="appendix">
				<head>Appendix 3: A Perl script for analyzing the IISTC imprint
					field</head>
				<p xml:id="green.dm.1.1.p.0190">On many occasions, it would be
					useful to turn the IISTC imprint field into separate fields for
					city, printer, and date of printing, and to clearly distinguish
					between signed and attributed information. The following script
					accomplishes this based on the output from the script in <ref
						target="#green.dm.1.1.appendix.2" type="navigation">appendix
						2</ref> as applied to the <term>Imprint</term> field.</p>
				<ab type="code">
					<!-- l replaced --># This script takes as input a <lb/>
					<!-- l replaced --># tab-delimited table of istc <lb/>
					<!-- l replaced --># numbers and imprint fields, <lb/>
					<!-- l replaced --># assumed here to be named 'imprint.txt'. <lb/>
					<!-- l replaced --># This script outputs the istc number <lb/>
					<!-- l replaced --># again as an index, followed by the <lb/>
					<!-- l replaced --># first imprint field only, then fields <lb/>
					<!-- l replaced --># containing the city and printer. Then <lb/>
					<!-- l replaced --># it outputs the years: the average <lb/>
					<!-- l replaced --># of all years in all imprint fields, <lb/>
					<!-- l replaced --># the earliest and then the latest such <lb/>
					<!-- l replaced --># year. The last column contains three <lb/>
					<!-- l replaced --># flags, either + or -. Signed cities, <lb/>
					<!-- l replaced --># printers, and dates appear as +++, <lb/>
					<!-- l replaced --># while the opposite would be ---. Years <lb/>
					<!-- l replaced --># appearing in single quotes ('1401') <lb/>
					<!-- l replaced --># have been ignored.<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced --># set imprint data file<lb/>
					<!-- l replaced -->$batch="imprint.txt";    <lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced --># open the file to process, or give an error<lb/>
					<!-- l replaced --># code <lb/>
					<!-- l replaced -->open BATCH, $batch or die "Cannot open $batch for read:$!"; <lb/>
					<!-- l replaced --># create column titles<lb/>
					<!-- l replaced -->print <lb/>
					<!-- l replaced -->"istc_number\timprint\tcity\tprinter\tavg_year\tfirst_year\tlast_year\tflags\n";<lb/>
					<!-- l replaced -->while (&lt;BATCH&gt;) {<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # first, reset all variables<lb/>
					<!-- l replaced -->  undef @allyears;<lb/>
					<!-- l replaced -->  undef @sort; <lb/>
					<!-- l replaced -->  $firstyear=0; <lb/>
					<!-- l replaced -->  $lastyear=0; <lb/>
					<!-- l replaced -->  $avgyear = 0; <lb/>
					<!-- l replaced -->  $yearcount = 0; <lb/>
					<!-- l replaced -->  $flags='+++';<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # save the input line as $record for later<lb/>
					<!-- l replaced -->  # use<lb/>
					<!-- l replaced -->  $record=$_;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # get first two tab-delimited fields, the<lb/>
					<!-- l replaced -->  # ISTC Number and first imprint line<lb/>
					<!-- l replaced -->  /^(.*?)\t(.*?)\t/;<lb/>
					<!-- l replaced -->  $istc_number=$1;<lb/>
					<!-- l replaced -->  $imprint=$2;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # search the imprint line for an optional<lb/>
					<!-- l replaced -->  # opening bracket, then the city, then a<lb/>
					<!-- l replaced -->  # colon, then the rest of the line<lb/>
					<!-- l replaced -->  $imprint=~/^(\[|)(.*?)(?:\]: |: )(.*)$/;<lb/>
					<!-- l replaced -->  $rightpart=$3;<lb/>
					<!-- l replaced -->  $city=$2;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # if an opening bracket was found, flag <lb/>
					<!-- l replaced -->  # the city as unsigned<lb/>
					<!-- l replaced -->  if ($1) {substr $flags, 0, 1, "-"}<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # split the rest of the line by commas,<lb/>
					<!-- l replaced -->  # forming the array @printer<lb/>
					<!-- l replaced -->  @printer=split /, /, $rightpart;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # fix 3 defective records: if there's<lb/>
					<!-- l replaced -->  # no comma found in the rest<lb/>
					<!-- l replaced -->  # of the line, and there's no number<lb/>
					<!-- l replaced -->  # to be found, add a dummy, <lb/>
					<!-- l replaced -->  # empty date element to array<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # fix defective imprint lines not<lb/>
					<!-- l replaced -->  # handled correctly: ip01005630 <lb/>
					<!-- l replaced -->  # (no year,), ic00216715 (no year,), ir00334450<lb/>
					<!-- l replaced -->  if ($#printer==0 and $printer[0]!~/\d/) {push @printer, " "}<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # fix for two defective records with no<lb/>
					<!-- l replaced -->  # imprint data: print the <lb/>
					<!-- l replaced -->  # istc number and then skip the rest of the loop<lb/>
					<!-- l replaced -->  if ($record=~/^([^\t]*?)\t$/) {<lb/>
					<!-- l replaced -->    $istc_number=$1;<lb/>
					<!-- l replaced -->    print "$istc_number\n";<lb/>
					<!-- l replaced -->    next;<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # remove the last element of @printer array;<lb/>
					<!-- l replaced -->  # it's usually the date field<lb/>
					<!-- l replaced -->  $date = pop @printer;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # fix for two deficient records containing neither<lb/>
					<!-- l replaced -->  # city nor printer, just dates<lb/>
					<!-- l replaced -->  if ($imprint !~/:/) {<lb/>
					<!-- l replaced -->    $date = $imprint;<lb/>
					<!-- l replaced -->    undef @printer;<lb/>
					<!-- l replaced -->    $city = "";<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # remove all brackets to test for a date; we<lb/>
					<!-- l replaced -->  # need to find the ca. 150 records of the<lb/>
					<!-- l replaced -->  # anomalous form 'City: printer, year, month <lb/>
					<!-- l replaced -->  # and day'<lb/>
					<!-- l replaced -->  $_ = $date; <lb/>
					<!-- l replaced -->  s/[\[\]]//g;<lb/>
					<!-- l replaced -->  $xdate=$_;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # remove all brackets from current last element<lb/>
					<!-- l replaced -->  # of @printer array<lb/>
					<!-- l replaced -->  $ydate=$printer[-1];<lb/>
					<!-- l replaced -->  $ydate=~s/[\[\]]//g;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # if $date doesn't contain a year, then check<lb/>
					<!-- l replaced -->  # the last element of @printer; if it does,<lb/>
					<!-- l replaced -->  # pop it onto the front of $date<lb/>
					<!-- l replaced -->  if ($xdate !~/1[45]\d{2}|undated/i and $ydate=~/1[45]\d{2}/)
						{<lb/>
					<!-- l replaced -->    $date=pop(@printer).$date;<lb/>
					<!-- l replaced -->    $_ = $date; <lb/>
					<!-- l replaced -->    s/[\[\]]//g;<lb/>
					<!-- l replaced -->    $xdate=$_;<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # now obliterate dates in single quotes regarded<lb/>
					<!-- l replaced -->  # as false<lb/>
					<!-- l replaced -->  $xdate=~s/'.*?'//g;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # match a year 1400 to 1599; if we find it, use it,<lb/>
					<!-- l replaced -->  # otherwise we have nothing to test. Testing the<lb/>
					<!-- l replaced -->  # match itself avoids picking up a stale $1 left<lb/>
					<!-- l replaced -->  # over from an earlier successful match<lb/>
					<!-- l replaced -->  if ($xdate=~/(1[45]\d{2})/) {$testyear=$1} else {$testyear=""}<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # if we have a date to test, get the last two digits<lb/>
					<!-- l replaced -->  if ($testyear) {$yeardigits=substr $testyear, 2, 2} else
						{$yeardigits='####'}<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # if the last two digits are surrounded by brackets,<lb/>
					<!-- l replaced -->  # flag the date as unsigned. [14]94 is treated<lb/>
					<!-- l replaced -->  # as signed, 14[9]4 as unsigned<lb/>
					<!-- l replaced -->  $_ = $imprint;<lb/>
					<!-- l replaced -->  if
						(/\[[^\]]*$yeardigits[^\]]*\]|\[$yeardigits|$yeardigits\]/ or
						$yeardigits eq '####') {<lb/>
					<!-- l replaced -->    substr $flags, 2, 1, "-";<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # split the input line again on the tabs<lb/>
					<!-- l replaced -->  @checkdates = split /\t/, $record;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # but discard the first two tabs<lb/>
					<!-- l replaced -->  $null=shift @checkdates;<lb/>
					<!-- l replaced -->  $null=shift @checkdates;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # and add the date field previously identified<lb/>
					<!-- l replaced -->  unshift (@checkdates, $date);<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # this next loop extracts all years from each<lb/>
					<!-- l replaced -->  # imprint field in turn<lb/>
					<!-- l replaced -->  foreach $possibledate (@checkdates) {<lb/>
					<!-- l replaced -->    $_=$possibledate;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->    # remove brackets, get rid of '1401' dates<lb/>
					<!-- l replaced -->    s/\[|\]|'.*?'//g;    <lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->    # find simple years, like 1493, 1494-,<lb/>
					<!-- l replaced -->    # 1498-1505<lb/>
					<!-- l replaced -->    @simple_years=/(1[45]\d{2})/g;  <lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->    # add the years found to the list<lb/>
					<!-- l replaced -->    push (@allyears, @simple_years);<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->    # find dates like 1476-80, including those<lb/>
					<!-- l replaced -->    # at the very end of a field<lb/>
					<!-- l replaced -->    $_=$possibledate;<lb/>
					<!-- l replaced -->    @complex_years=/(1[45]\d{2}[\-\/]\d{2})(?!\d)/g;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->    # first count the simple years in the next loop<lb/>
					<!-- l replaced -->    foreach $simpleyear(@simple_years) {<lb/>
					<!-- l replaced -->      $avgyear+=$simpleyear;<lb/>
					<!-- l replaced -->      $yearcount++;<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->    # and add the second part to the list of years<lb/>
					<!-- l replaced -->    # in the following loop<lb/>
					<!-- l replaced -->    foreach $complexyear (@complex_years) {<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->      # find the element to split on: either - or /<lb/>
					<!-- l replaced -->      $split=substr($complexyear,4,1);<lb/>
					<!-- l replaced -->      # ignoring $temp[0], as it is already a simple_year<lb/>
					<!-- l replaced -->      @temp = split /$split/, $complexyear;<lb/>
					<!-- l replaced -->      $temp[1]=substr($temp[0],0,2).$temp[1];<lb/>
					<!-- l replaced -->      push (@allyears, $temp[1]);<lb/>
					<!-- l replaced -->      $avgyear+=$temp[1];<lb/>
					<!-- l replaced -->      $yearcount++;<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  # round to nearest year<lb/>
					<!-- l replaced -->  if ($yearcount) {<lb/>
					<!-- l replaced -->    $avgyear=int(($avgyear/$yearcount)+.5);<lb/>
					<!-- l replaced -->  } else {<lb/>
					<!-- l replaced -->    $avgyear = "";<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # now sort the years numerically <lb/>
					<!-- l replaced -->  @sort = sort { $a &lt;=&gt; $b } @allyears;<lb/>
					<!-- l replaced -->  $firstyear=$sort[0];<lb/>
					<!-- l replaced -->  $lastyear= $sort[-1];<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # put the printer back together<lb/>
					<!-- l replaced -->  $printer=join ', ', @printer;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # add missing front or back brackets for aesthetics only<lb/>
					<!-- l replaced -->  $_ = $printer;<lb/>
					<!-- l replaced -->  if (/^[^\[]+\]/) {$printer='['.$printer}<lb/>
					<!-- l replaced -->  if (/\[[^\]]+$/) {$printer=$printer.']'}<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # now get rid of all brackets and store as $xprinter<lb/>
					<!-- l replaced -->  $_ = $printer;<lb/>
					<!-- l replaced -->  s/[\[\]]//g;<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->  # if the printer is enclosed in brackets, or<lb/>
					<!-- l replaced -->  # begins with a bracket, flag as unsigned<lb/>
					<!-- l replaced -->  $xprinter=$_;<lb/>
					<!-- l replaced -->  if ($imprint=~/\[[^\]]*\Q$xprinter\E[^\]]*/ or<lb/>
					<!-- l replaced -->  $printer=~/^\[/) {<lb/>
					<!-- l replaced -->    substr $flags, 1, 1, "-";<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  # output the information and continue on to the next<lb/>
					<!-- l replaced -->  # record<lb/>
					<!-- l replaced -->  print "$istc_number\t$imprint\t$city\t$xprinter\t";<lb/>
					<!-- l replaced -->  print "$avgyear\t$firstyear\t$lastyear\t$flags\n";<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>That is, the input file begins like this:</p>
				<ab type="code">
					<!-- l replaced -->ia00000500  [Spain or Portugal: Printer of Alfasi's Halakhot,
						before 1492?]  <lb/>
					<!-- l replaced -->ia00001000  Westminster: Wynkyn de Worde, [about 1496]  <lb/>
					<!-- l replaced -->ia00001500  Westminster: Wynkyn de Worde, [about 1497]  <lb/>
					<!-- l replaced -->ia00002000  Westminster: Wynkyn de Worde, [about 1500]  <lb/>
					<!-- l replaced -->ia00003000  [London: John Lettou and William de Machlinia,
						about 1482]  <lb/>
					<!-- l replaced -->ia00004000  [London]: Richard Pynson, 9 Oct. 1499  <lb/>
					<!-- l replaced -->ia00004500  [London]: Richard Pynson, 9 Oct. 1499  <lb/>
					<!-- l replaced -->ia00005000  [London]: Richard Pynson, '9 Oct. 1499' [about
						1503]  <lb/>
					<!-- l replaced -->ia00005500  [The Netherlands: Prototypography, about
						1465-80]  <lb/>
					<!-- l replaced -->ia00008000  Venice: Franciscus Lapicida, 20 Oct. 1494  <lb/>
				</ab>
				<p>After this further manipulation, the output consists of eight
					tab-separated fields:</p>
				<ab type="code">
					<!-- l replaced -->istc_number  imprint  city  printer  avg_year  first_year  last_year  flags<lb/>
					<!-- l replaced -->ia00000500  [Spain or Portugal: Printer of Alfasi's Halakhot,
						before 1492?]  Spain or Portugal  Printer of Alfasi's
						Halakhot  1492  1492  1492  ---<lb/>
					<!-- l replaced -->ia00001000  Westminster: Wynkyn de Worde, [about
						1496]  Westminster  Wynkyn de Worde  1496  1496  1496  ++-<lb/>
					<!-- l replaced -->ia00001500  Westminster: Wynkyn de Worde, [about
						1497]  Westminster  Wynkyn de Worde  1497  1497  1497  ++-<lb/>
					<!-- l replaced -->ia00002000  Westminster: Wynkyn de Worde, [about
						1500]  Westminster  Wynkyn de Worde  1500  1500  1500  ++-<lb/>
					<!-- l replaced -->ia00003000  [London: John Lettou and William de Machlinia,
						about 1482]  London  John Lettou and William de
						Machlinia  1482  1482  1482  ---<lb/>
					<!-- l replaced -->ia00004000  [London]: Richard Pynson, 9 Oct.
						1499  London  Richard Pynson  1499  1499  1499  -++<lb/>
					<!-- l replaced -->ia00004500  [London]: Richard Pynson, 9 Oct.
						1499  London  Richard Pynson  1499  1499  1499  -++<lb/>
					<!-- l replaced -->ia00005000  [London]: Richard Pynson, '9 Oct. 1499' [about
						1503]  London  Richard Pynson  1503  1503  1503  -+-<lb/>
					<!-- l replaced -->ia00005500  [The Netherlands: Prototypography, about
						1465-80]  The
						Netherlands  Prototypography  1473  1465  1480  ---<lb/>
					<!-- l replaced -->ia00008000  Venice: Franciscus Lapicida, 20 Oct.
						1494  Venice  Franciscus Lapicida  1494  1494  1494  +++<lb/>
				</ab>
			</div>
			<div xml:id="green.dm.1.1.appendix.4" type="appendix">
				<head>Appendix 4: An approach to counting incunables using the
					IISTC</head>
				<p xml:id="green.dm.1.1.p.0200">Turning the IISTC's
						<term>Locations</term> field into a numerical count of
					surviving copies presents new challenges, as the format for
					recording copies varies considerably between regions. American,
					German, and Italian libraries are always divided by semicolons;
					Belgian and Other libraries usually appear as <term>City, First
						Library, Second Library</term>; Dutch, Spanish, and most Other
					European libraries appear as <term>City First Library, Second
						Library</term>; and French and British records mix both
					formats. In addition, one hopes but can never be sure that the
					frequent records describing a library's holdings of a given
					incunable as <quote>(3, 1 defective)</quote> consistently mean
						<quote>three copies, of which one is defective</quote> rather
					than <quote>3 complete copies plus one defective one</quote>.
					Perfect accuracy in automatically counting the IISTC's countless
					incunables may not be possible, but a high degree of accuracy
					(verified by comparing computer-generated results with
					old-fashioned tabulation) is achievable and sufficient for
					answering many questions and for helping to formulate others.</p>
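				<p>The two readings of such a note can be made concrete with a
					minimal Perl sketch. The subroutine name
					<code>two_readings</code> and its pattern are illustrative
					assumptions, not part of the scripts in this appendix; the
					counting scripts given later adopt the first reading and treat
					<quote>(3, 1 defective)</quote> as three copies.</p>
				<ab type="code">
```perl
use strict;
use warnings;

# Return the copy count for a note such as "(3, 1 defective)" under both
# readings: the first number taken as the total, or the two numbers added.
sub two_readings {
    my ($note) = @_;
    if ($note =~ /\((\d+),\s*(\d+)\s+[^()]*\)/) {
        return ($1, $1 + $2);    # (inclusive reading, additive reading)
    }
    return;                      # not a note of this form
}

my ($inclusive, $additive) = two_readings("(3, 1 defective)");
print "inclusive: $inclusive\tadditive: $additive\n";
```
				</ab>
				<p>For this note the sketch reports 3 under the inclusive reading
					and 4 under the additive one, so each record of this kind
					leaves the total uncertain by the number of defective
					copies.</p>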
				<p xml:id="green.dm.1.1.p.0210">For the sake of simplicity, the
					process of counting the extant copies in the IISTC is broken
					down into two steps. First, a simple script (or rather, ten
					minor variations on a simple script) is used to extract only
					the relevant data from the full export of IISTC records. The
					following script searches out only copies in American
					libraries:</p>
				<ab type="code">
					<!-- l replaced -->$batch="istc.txt";    #name of file to search<lb/>
					<!-- l replaced -->open BATCH, $batch or die "Cannot open $batch for
						read:$!";<lb/>
					<!-- l replaced -->while (&lt;BATCH&gt;) {<lb/>
					<!-- l replaced -->  if (/^[ ]*USA:\t(.*?)$/) {<lb/>
					<!-- l replaced -->    $match = $1;<lb/>
					<!-- l replaced -->    $hit=1;<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  if (/^ISTC.*(i.\d{8})/ and ($hit == 1)) {<lb/>
					<!-- l replaced -->    $hit = 0;<lb/>
					<!-- l replaced -->    $istc_number = $1;<lb/>
					<!-- l replaced -->    print "$istc_number\t$match\n";<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>The output of this script is a long list of ISTC numbers and the
					libraries in which copies of the relevant incunable can be
					found:</p>
				<ab type="code">
					<!-- l replaced -->ia00000500  JTSL (1 leaf)<lb/>
					<!-- l replaced -->ia00001000  PML<lb/>
					<!-- l replaced -->ia00002000  FolgSL; PandJG<lb/>
					<!-- l replaced -->ia00003000  AmBML; Harv(L)L; LC(L); NewL (-); PML<lb/>
					<!-- l replaced -->ia00004000  Harv(L)L; LC; PML; UPaL; YU(B)L; EHLS (sold
						1981)<lb/>
					<!-- l replaced -->ia00005000  Harv(L)L; HEHL; LC(L)<lb/>
					<!-- l replaced -->ia00008000  CPhL; Harv(M)L; PML<lb/>
				</ab>
				<p>With minor variations in the fourth line of the script, similar
					files can be created for the other locations by which the IISTC
					organizes its copy attestations: Belgium, British Isles, Other
					European, France, Germany, Italy/Vatican, Netherlands,
					Spain/Portugal, and Other. For Spain/Portugal, for example, the
					output begins:</p>
				<ab type="code">
					<!-- l replaced -->ia00008000  Avila BP<lb/>
					<!-- l replaced -->ia00009200  Barcelona BCatal, BU; Córdoba BP; Madrid BN, BU;
						Sevilla Colombina, BU; Toledo BP; Vigo Massó; Lisboa BN<lb/>
					<!-- l replaced -->ia00012000  Avila BP<lb/>
					<!-- l replaced -->ia00014400  El Escorial RMon<lb/>
					<!-- l replaced -->ia00016500  Córdoba BCap<lb/>
					<!-- l replaced -->ia00017000  Sevilla Colombina; Coimbra BU<lb/>
				</ab>
				<p>The list of libraries in each location that own a given
					incunable is useful information that can be imported as ten new
					fields into the database as described in <ref
						target="#green.dm.1.1.appendix.1" type="navigation">appendix
						1</ref>. It would also be useful, however, to have a count
					of copies in each location that can easily be summed to
					provide a worldwide incunable count (as far as the IISTC is
					concerned, at least). The following script provides such a
					count for American, Italian, and German libraries. This script
					is invoked a bit differently from the preceding
					scripts, in that it expects two command-line arguments: the name
					of the file to be processed and the name of the file to be
					written. If this script were given the name
						<code>count1.pl</code>, it might be invoked as follows to read
					from the file <code>usa-libraries.txt</code> and create the file
						<code>usa-count.txt</code>:</p>
				<ab type="code">
					<!-- l replaced -->perl count1.pl usa-libraries.txt usa-count.txt<lb/>
				</ab>
				<p>The script is as follows:</p>
				<ab type="code">
					<!-- l replaced --># script to process library-output<lb/>
					<!-- l replaced --># files for consistently<lb/>
					<!-- l replaced --># semicolon-delimited countries: USA,<lb/>
					<!-- l replaced --># Italy, Germany<lb/>
					<!-- l replaced -->$in=shift;    # take input file from command line<lb/>
					<!-- l replaced -->$out = shift;  # take output filename from command line<lb/>
					<!-- l replaced -->open IN, $in or die "Cannot open $in for read:$!";<lb/>
					<!-- l replaced -->open OUT, "&gt;$out" or die "Cannot open $out for
						write:$!";<lb/>
					<!-- l replaced -->print OUT "istc_number\tlocations\tcount\n";<lb/>
					<!-- l replaced -->while (&lt;IN&gt;) {<lb/>
					<!-- l replaced -->  $copycount=0;<lb/>
					<!-- l replaced -->  /^(i.\d{8})\t(.*)$/;<lb/>
					<!-- l replaced -->  $istc_number=$1;<lb/>
					<!-- l replaced -->  $locations=$2;<lb/>
					<!-- l replaced -->  @libraries=split /;/, $locations;<lb/>
					<!-- l replaced -->  foreach $library (@libraries) {<lb/>
					<!-- l replaced -->    # get rid of nested parentheses; the lookahead keeps<lb/>
					<!-- l replaced -->    # notes that open with a copy count, e.g. (3) or (12)<lb/>
					<!-- l replaced -->    while ($library=~/\((?!\d{1,2}[,\)])[^\(]*?\)/) {<lb/>
					<!-- l replaced -->      $library=~s/\((?!\d{1,2}[,\)])[^\(]*?\)//g;<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced -->    # replace (3, 1 torn) with (3)<lb/>
					<!-- l replaced -->    $library=~s/\((\d{1,2})[^\(]*\)/($1)/g;<lb/>
					<!-- l replaced -->    if ($library=~/\((\d{1,2})\)/) {$copycount+=$1} else<lb/>
					<!-- l replaced -->    {$copycount++}<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  print OUT "$istc_number\t$locations\t$copycount\n";<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>The output of this script includes column headings. For German
					libraries, for example, it begins:</p>
				<ab type="code">
					<!-- l replaced -->istc_number  locations  count<lb/>
					<!-- l replaced -->ia00008000  Bamberg SB; München BSB  2<lb/>
					<!-- l replaced -->ia00009100  Gotha ForschLB; Tübingen UB  2<lb/>
					<!-- l replaced -->ia00009200  Augsburg SStB; Bamberg SB; Berlin SB; Darmstadt
						LHSB; Freiburg i.Br. UB; Giessen UB; Göttingen SUB; Heidelberg
						UB; Karlsruhe BLB; Mainz StB; München BSB (3); München UB;
						Passau SB; Würzburg UB (2)  17<lb/>
					<!-- l replaced -->ia00009300  Frankfurt(Main) StUB (imperfect)  1<lb/>
					<!-- l replaced -->ia00009900  Hannover KestnerM  1<lb/>
				</ab>
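				<p>The per-library logic can be checked against these sample
					records with a short sketch. The subroutine name
					<code>count_copies</code> is hypothetical, and the patterns are
					a compact variant written for this illustration rather than a
					copy of the script's own.</p>
				<ab type="code">
```perl
use strict;
use warnings;

# Count copies in a semicolon-delimited locations string.
sub count_copies {
    my ($locations) = @_;
    my $copycount = 0;
    foreach my $library (split /;/, $locations) {
        # Strip parenthetical notes, innermost first, but keep a note that
        # opens with a one- or two-digit copy count, e.g. (3) or (3, 1 torn).
        while ($library =~ s/\((?!\d{1,2}[,\)])[^\(]*?\)//g) { }
        # Use an explicit copy count where one is given; otherwise count one.
        if ($library =~ /\((\d{1,2})[,\)]/) {
            $copycount += $1;
        } else {
            $copycount++;
        }
    }
    return $copycount;
}

# Two of the German sample records shown above.
my $ia00009200 = "Augsburg SStB; Bamberg SB; Berlin SB; Darmstadt LHSB; "
               . "Freiburg i.Br. UB; Giessen UB; Göttingen SUB; Heidelberg UB; "
               . "Karlsruhe BLB; Mainz StB; München BSB (3); München UB; "
               . "Passau SB; Würzburg UB (2)";
my $ia00009300 = "Frankfurt(Main) StUB (imperfect)";

my $first  = count_copies($ia00009200);
my $second = count_copies($ia00009300);
print "ia00009200\t$first\n";
print "ia00009300\t$second\n";
```
				</ab>
				<p>Run on the two records, the sketch reports 17 and 1, which
					agrees with the sample output shown above.</p>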
				<p>The IISTC's variability in recording copies requires the script
					to be adapted for other locations, however. The next two scripts
					are minor variations on the preceding one. The first addresses
					locations such as Belgium that insert a comma between the name of
					the city and the libraries owning a particular incunable:</p>
				<ab type="code">
					<!-- l replaced --># script to process library-output files for<lb/>
					<!-- l replaced --># countries delimited as City, Library1, Library2:<lb/>
					<!-- l replaced --># Belgium, Other [usually]<lb/>
					<!-- l replaced -->$in=shift;    # take input file from command line<lb/>
					<!-- l replaced -->$out = shift;  # take output filename from command line<lb/>
					<!-- l replaced -->open IN, $in or die "Cannot open $in for read:$!";<lb/>
					<!-- l replaced -->open OUT, "&gt;$out" or die "Cannot open $out for
						write:$!";<lb/>
					<!-- l replaced -->print OUT "istc_number\tlocations\tcount\n";<lb/>
					<!-- l replaced -->while (&lt;IN&gt;) {<lb/>
					<!-- l replaced -->  undef @cities;<lb/>
					<!-- l replaced -->  $copycount=0;<lb/>
					<!-- l replaced -->  /^(i.\d{8})\t(.*)$/;<lb/>
					<!-- l replaced -->  $istc_number=$1;<lb/>
					<!-- l replaced -->  $locations=$2;<lb/>
					<!-- l replaced -->  @cities=split /;/, $locations;<lb/>
					<!-- l replaced -->  foreach $city (@cities) {<lb/>
					<!-- l replaced -->    # get rid of nested parentheses; the lookahead keeps<lb/>
					<!-- l replaced -->    # notes that open with a copy count, e.g. (3) or (12)<lb/>
					<!-- l replaced -->    while ($city=~/\((?!\d{1,2}[,\)])[^\(]*?\)/) {<lb/>
					<!-- l replaced -->      $city=~s/\((?!\d{1,2}[,\)])[^\(]*?\)//g;<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced -->    # replace (3, 1 torn) with (3)<lb/>
					<!-- l replaced -->    $city=~s/\((\d{1,2})[^\(]*\)/($1)/g;<lb/>
					<!-- l replaced -->    undef @libraries;<lb/>
					<!-- l replaced -->    if ($city =~ /,/) {<lb/>
					<!-- l replaced -->      @libraries=split /,/, $city;<lb/>
					<!-- l replaced -->      $null = shift @libraries;<lb/>
					<!-- l replaced -->    } else {$libraries[0] = $city}<lb/>
					<!-- l replaced -->    foreach $library (@libraries) {<lb/>
					<!-- l replaced -->      if ($library=~/\((\d{1,2})\)/) {<lb/>
					<!-- l replaced -->        $copycount+=$1;<lb/>
					<!-- l replaced -->      } else {$copycount++}<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  print OUT "$istc_number\t$locations\t$copycount\n";<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>The next script is for locations such as Spain/Portugal that
					separate libraries within a single city from each other with
					commas, but without a comma after the name of the city:</p>
				<ab type="code">
					<!-- l replaced --># Script to process library-output files for countries<lb/>
					<!-- l replaced --># delimited as City Library1, Library2: Other Europe,<lb/>
					<!-- l replaced --># Spain, Netherlands, France (mostly), Britain (usually)<lb/>
					<!-- l replaced -->$in=shift;    # take input file from command line<lb/>
					<!-- l replaced -->$out = shift;  # take output filename from command line<lb/>
					<!-- l replaced -->open IN, $in or die "Cannot open $in for read:$!";<lb/>
					<!-- l replaced -->open OUT, "&gt;$out" or die "Cannot open $out for
						write:$!";<lb/>
					<!-- l replaced -->print OUT "istc_number\tlocations\tcount\n";<lb/>
					<!-- l replaced -->while (&lt;IN&gt;) {<lb/>
					<!-- l replaced -->  undef @cities;<lb/>
					<!-- l replaced -->  $copycount=0;<lb/>
					<!-- l replaced -->  /^(i.\d{8})\t(.*)$/;<lb/>
					<!-- l replaced -->  $istc_number=$1;<lb/>
					<!-- l replaced -->  $locations=$2;<lb/>
					<!-- l replaced -->  @cities=split /;/, $locations;<lb/>
					<!-- l replaced -->  foreach $city (@cities) {<lb/>
					<!-- l replaced -->    # get rid of nested parentheses; the lookahead keeps<lb/>
					<!-- l replaced -->    # notes that open with a copy count, e.g. (3) or (12)<lb/>
					<!-- l replaced -->    while ($city=~/\((?!\d{1,2}[,\)])[^\(]*?\)/) {<lb/>
					<!-- l replaced -->      $city=~s/\((?!\d{1,2}[,\)])[^\(]*?\)//g;<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced -->    # replace (3, 1 torn) with (3)<lb/>
					<!-- l replaced -->    $city=~s/\((\d{1,2})[^\(]*\)/($1)/g;<lb/>
					<!-- l replaced -->    undef @libraries;<lb/>
					<!-- l replaced -->    @libraries=split /,/, $city;<lb/>
					<!-- l replaced -->    foreach $library (@libraries) {<lb/>
					<!-- l replaced -->      if ($library=~/\((\d{1,2})\)/) {<lb/>
					<!-- l replaced -->        $copycount+=$1;<lb/>
					<!-- l replaced -->      } else {$copycount++}<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  print OUT "$istc_number\t$locations\t$copycount\n";<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
				<p>As explained above, the IISTC contains some records that are
					truly ambiguous as to the number of copies in question, and for
					some locations the formatting is inconsistent. In the case of
					inconsistent formatting, some further refinement can help reduce
					the inaccuracies. It is undoubtedly useful for the staff of the
					British Library for their copies to appear at the head of the
					list of libraries in the British Isles rather than with other
					London libraries, and with signatures of all their copies;
					however, for attempting a count based on this data, it is
					distinctly annoying. Consider the following data:</p>
				<ab type="code">
					<!-- l replaced -->ia00017000  London BL, 167.f.13 = IB.27036; Chatsworth;
						Edinburgh NLS (Inc.207); Oxford Bodley (2), Magdalen, Pembroke
						(2) Colleges; Stonyhurst College<lb/>
					<!-- l replaced -->ia00018600  Cambridge, Trinity Hall; Oxford Bodley, All Souls
						College<lb/>
					<!-- l replaced -->ia00020500  London BL, IC.28708; Barnard Castle, Bowes
						Museum<lb/>
					<!-- l replaced -->ia00021000  Cambridge, Trinity Hall; Oxford, New College<lb/>
				</ab>
				<p>How is a computer to know that <quote>Oxford Bodley, All Souls
						College</quote> refers to two copies, while <quote>Oxford, New
						College</quote> refers to just one? The assumption that a comma
					divides a city and its libraries must be modified with an
					explicit statement that Oxford is a city, as the following script
					attempts to implement. Because British Library entries place a
					comma between the library and its signature, each such entry
					is overcounted by one, and the script subtracts one from the
					total to compensate.</p>
				<ab type="code">
					<!-- l replaced --># script to process library-output files<lb/>
					<!-- l replaced --># for British Isles.<lb/>
					<!-- l replaced --># Remove nested parentheses first, to get<lb/>
					<!-- l replaced --># rid of semicolons within comments, and<lb/>
					<!-- l replaced --># then split on semicolons; if there is<lb/>
					<!-- l replaced --># a 'London BL', remove one from the total count<lb/>
					<!-- l replaced -->$in="libs-brit.txt";    # input file <lb/>
					<!-- l replaced -->$out = "bricount.txt";    # output file <lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->open IN, $in or die "Cannot open $in for read:$!";<lb/>
					<!-- l replaced -->open OUT, "&gt;$out" or die "Cannot open $out for
						write:$!";<lb/>
					<!-- l replaced -->print OUT "istc_number\tlocations\tcount\n";  # add column
						heads<lb/>
					<!-- l replaced --> <lb/>
					<!-- l replaced -->while (&lt;IN&gt;) {<lb/>
					<!-- l replaced -->  undef @cities;<lb/>
					<!-- l replaced -->  $copycount=0;<lb/>
					<!-- l replaced -->  /^(i.\d{8})\t(.*)$/;<lb/>
					<!-- l replaced -->  $istc_number=$1;<lb/>
					<!-- l replaced -->  $locations=$2;<lb/>
					<!-- l replaced -->  <lb/>
					<!-- l replaced -->  # get rid of (digit) in BL signatures<lb/>
					<!-- l replaced -->  $fixlocations=$locations;<lb/>
					<!-- l replaced -->  while ($fixlocations=~/[^ ]\(\d{1,2}\)/) {<lb/>
					<!-- l replaced -->    $fixlocations=~s/[^ ]\(\d{1,2}\)//g;<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  <lb/>
					<!-- l replaced -->  # get rid of nested parentheses; the lookahead keeps<lb/>
					<!-- l replaced -->  # notes that open with a copy count, e.g. (3) or (12)<lb/>
					<!-- l replaced -->  while ($fixlocations=~/\((?!\d{1,2}[,\)])[^\(]*?\)/) {<lb/>
					<!-- l replaced -->    $fixlocations=~s/\((?!\d{1,2}[,\)])[^\(]*?\)//g;<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  <lb/>
					<!-- l replaced -->  @cities=split /;/, $fixlocations;<lb/>
					<!-- l replaced -->  foreach $city (@cities) {<lb/>
					<!-- l replaced -->    # replace (3, 1 torn) with (3)<lb/>
					<!-- l replaced -->    $city=~s/\((\d{1,2})[^\(]*\)/($1)/g;<lb/>
					<!-- l replaced -->    $city=~s/\(\d{1,2} lea[^\(]*\)//g;<lb/>
					<!-- l replaced -->    # eliminate e.g. (3 leaves)<lb/>
					<!-- l replaced -->    if ($city=~/London BL,[^,]* and /) {$copycount++}<lb/>
					<!-- l replaced -->    # correct for multiple BL signatures without<lb/>
					<!-- l replaced -->    # comma dividers<lb/>
					<!-- l replaced -->    $city=~s/(London|Oxford|Cambridge|Manchester|Dublin|Durham|Hereford|Edinburgh|Cashel|Guernsey|Coleraine|Barnard
						Castle|Parkminster|Northampton|Reigate|Birmingham|Canterbury|Harpenden|Brasenose|Killiney),/$1/;<lb/>
					<!-- l replaced -->    # eliminate commas after city names<lb/>
					<!-- l replaced -->    undef @libraries;<lb/>
					<!-- l replaced -->    @libraries=split /,/, $city;<lb/>
					<!-- l replaced -->    foreach $library (@libraries) {<lb/>
					<!-- l replaced -->      if ($library=~/\((\d{1,2})\)/) {<lb/>
					<!-- l replaced -->        $copycount+=$1;<lb/>
					<!-- l replaced -->      } else {<lb/>
					<!-- l replaced -->        $copycount++;<lb/>
					<!-- l replaced -->      }<lb/>
					<!-- l replaced -->      if ($library=~/London BL/) {<lb/>
					<!-- l replaced -->        $copycount--;<lb/>
					<!-- l replaced -->      }<lb/>
					<!-- l replaced -->    }<lb/>
					<!-- l replaced -->  }<lb/>
					<!-- l replaced -->  print OUT "$istc_number\t$locations\t$copycount\n";<lb/>
					<!-- l replaced -->  #print "$istc_number\t$locations\t$copycount\n";<lb/>
					<!-- l replaced -->}<lb/>
				</ab>
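				<p>The city-name step can be seen in isolation in the following
					sketch. The subroutine name <code>count_segment</code> and the
					shortened, anchored list of cities are assumptions made for
					illustration; the script above uses a longer list and also
					handles parentheses and British Library signatures.</p>
				<ab type="code">
```perl
use strict;
use warnings;

# Hypothetical short list of city names; the full script uses a longer one.
my $cities = 'London|Oxford|Cambridge';

# Count the libraries in one semicolon-delimited segment of a British record.
sub count_segment {
    my ($segment) = @_;
    # Drop the comma after a leading city name, so that it is no longer
    # mistaken for a divider between two libraries.
    $segment =~ s/^\s*($cities),/$1/;
    my @libraries = split /,/, $segment;
    return scalar @libraries;
}

foreach my $segment ('Oxford Bodley, All Souls College', 'Oxford, New College') {
    my $n = count_segment($segment);
    print "$n\t$segment\n";
}
```
				</ab>
				<p>The sketch counts two libraries for <quote>Oxford Bodley, All
						Souls College</quote> and one for <quote>Oxford, New
						College</quote>, which is the distinction the explicit list
					of cities is meant to capture.</p>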
				<p>Some sample output from the British Isles script illustrates
					that Perl can cope with considerable discrepancy in formatting
					and still arrive at a correct count, while the question of
					what ownership of a <soCalled>copy</soCalled> of an incunable
					really means is a separate issue entirely:</p>
				<ab type="code">
					<!-- l replaced -->ia00425700  Oxford Bodley  1<lb/>
					<!-- l replaced -->ia00426000  London BL, IB.21897 (Acquisition 1985, not in BMC.
						Bound with Nicolaus Perottus, Rudimenta grammatices, Lyons,
						anonymous press (IB.21897) and Aelius Anthonius Nebrissensis,
						Introductiones Latinae, Logroño, Arnao de Brocar, 1510. In a
						Spanish binding); Oxford Bodley  2<lb/>
					<!-- l replaced -->ia00426300  Cambridge, St John's College (2 ff.)  1<lb/>
					<!-- l replaced -->ia00426500  London BL, IB.21851  1<lb/>
					<!-- l replaced -->ia00426600  London BL, Harl.5918(2) = IA.49742 (Colophon only,
						in the Bagford Collection)  1<lb/>
					<!-- l replaced -->ia00426700  Cambridge UL (imperfect, wants a2-7 and all after
						K6); Oxford Bodley (fragment consisting of ff. f1,6, quire E
						and ff. I2-5)  2<lb/>
					<!-- l replaced -->ia00428000  London BL, IA.20854 (Imperfect, wanting leaf g7
						and sheets h4, i4)  1<lb/>
				</ab>
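				<p>Once count files exist for every location, the per-location
					counts can be summed by ISTC number. The following is a
					minimal sketch of that step, not a script from the original
					workflow. To stay self-contained it holds a few rows in memory
					instead of reading the ten count files; the German rows repeat
					the sample output above, while the American and Spanish counts
					were derived by hand from the sample library lists and are
					assumptions made for illustration.</p>
				<ab type="code">
```perl
use strict;
use warnings;

# Rows in the format written by the counting scripts:
# istc_number, locations, count (tab-separated, with column headings).
my $header  = "istc_number\tlocations\tcount";
my @usa     = ($header, "ia00008000\tCPhL; Harv(M)L; PML\t3");
my @germany = ($header, "ia00008000\tBamberg SB; München BSB\t2",
                        "ia00009100\tGotha ForschLB; Tübingen UB\t2");
my @spain   = ($header, "ia00008000\tAvila BP\t1");

# Add up the counts for each ISTC number across all locations.
my %total;
foreach my $line (@usa, @germany, @spain) {
    next if $line =~ /^istc_number\t/;    # skip column headings
    my ($istc_number, $locations, $count) = split /\t/, $line;
    $total{$istc_number} += $count;
}

foreach my $istc_number (sort keys %total) {
    print "$istc_number\t$total{$istc_number}\n";
}
```
				</ab>
				<p>With these three partial lists the sketch reports 6 copies for
					ia00008000 and 2 for ia00009100; a real total would require
					the rows for all ten locations.</p>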
			</div>
			<div>
				<listBibl>
					<bibl xml:id="britishlibrary1998">The British Library. 1998.
							<title level="m">The illustrated ISTC on CD-ROM</title>. 2nd
						ed. London: Primary Source Media, in association with the
						British Library.</bibl>
					<bibl xml:id="coppingerwa1895-1902">Copinger, Walter Arthur.
						1895-1902. <title level="m">Supplement to Hain's <title
								level="m">Repertorium bibliographicum</title>: or,
							collections toward a new edition of that work</title>.
						London: H. Sotheran.</bibl>
					<bibl xml:id="daviesmandgoldfinchj1992">Davies, Martin and John
						Goldfinch. 1992. <title level="m">Vergil: a census of printed
							editions 1469-1500</title>. <title level="s">Occasional
							Papers of the Bibliographical Society</title> 7. London: The
						Bibliographical Society.</bibl>
					<bibl xml:id="gesamtkatalogderwiegendrucke1925-"><title level="m"
							>Gesamtkatalog der Wiegendrucke</title>. 1925-. 10 vols. to
						date. Stuttgart: Hiersemann.</bibl>
					<bibl xml:id="hainl1826-1838">Hain, Ludwig. 1826-1838. <title
							level="m">Repertorium bibliographicum, in quo libri omnes ab
							arte typographica inventa usque ad annum MD. typis expressi,
							ordine alphabetico vel simpliciter enumerantur vel adcuratius
							recensentur</title>. 2 vols. Stuttgart: J. G. Cotta.</bibl>
					<bibl xml:id="neddermeyeru1998">Neddermeyer, Uwe. 1998. <title
							level="m">Von der Handschrift zum gedruckten Buch:
							Schriftlichkeit und Leseinteresse im Mittelalter und in der
							frühen Neuzeit. Quantitative und qualitative
							Aspekte</title>. <title level="s">Buchwissenschaftliche
							Beiträge aus dem deutschen Bucharchiv München</title> 61. 2
						vols. Wiesbaden: Harrassowitz.</bibl>
					<bibl xml:id="needhamp1999">Needham, Paul. 1999. <title level="a"
							>Counting incunables: the IISTC CD-ROM.</title>
						<title level="j">Huntington Library Quarterly</title> 61:
						457-529.</bibl>
					<bibl xml:id="ohlykandsackv1966-1967">Ohly, Kurt and Vera Sack.
						1966-1967. <title level="m">Inkunabelkatalog der Stadt- und
							Universitätsbibliothek und anderer öffentlicher Sammlungen in
							Frankfurt am Main</title>. Frankfurt: Klostermann.</bibl>
					<bibl xml:id="reichlingd1905-1911">Reichling, Dietrich.
						1905-1911. <title level="m">Appendices ad Hainii-Copingeri
							Repertorium bibliographicum</title>. 7 vols. Munich:
						Rosenthal.</bibl>
					<bibl xml:id="reskec2000">Reske, Christoph. 2000. <title
							level="m">Die Produktion der Schedelschen Weltchronik in
							Nürnberg</title>. Wiesbaden: Harrassowitz.</bibl>
				</listBibl>
			</div>
		</back>
	</text>
</TEI>
