Digital Medievalist 7 (2011). ISSN: 1715-0736.
© Morgan Kay and Maryanne Kowaleski, 2011. Creative Commons Attribution-NonCommercial licence

Developing an Online Database on a Shoestring: Growing Pains at the Online Medieval Sources Bibliography


Peer-Reviewed Article

Accepting Editor: Christine McWebb, University of Waterloo.
Recommending Reader: Nadia Altschul, Johns Hopkins University.
Received: September 11, 2011
Revised: November 19, 2011
Published: February 7, 2012


Abstract

The Online Medieval Sources Bibliography (OMSB) is a database of modern editions of medieval primary sources. This paper discusses the computing, financing, and logistical challenges we faced in creating the database, as well as our solutions. The OMSB is aimed at a wide audience, from high school students to professors, so we have had to tailor our data to the needs of many different types of researcher, and to keep in mind all of the different ways someone might search for sources. Working with a very limited budget, we have made use of graduate and undergraduate students to provide both programming and data entry, a solution that has provided excellent research experience (and often much-needed funding) for the students involved.

Keywords: Database; Ruby on Rails; Primary sources; Data entry.



§ 1    Most of the large bibliographic databases available to medievalists are aimed at research scholars and sponsored by universities and other institutions. We have all benefited tremendously from these sites, but this paper is about the creation and development of a bibliographic database on a shoestring budget, aimed at students as much as at researchers. Inspired in part by the success of Fordham University's Internet Medieval Sourcebooks (<http://www.fordham.edu/halsall/sbook1.html>), which provide easy online access to thousands of copyright-free translations of primary sources for classroom use, the Online Medieval Sources Bibliography (OMSB, available online at <http://medievalsourcesbibliography.org>) began in 2003 as a desktop Microsoft Access database created by Maryanne Kowaleski. It aimed to help familiarize graduate students with more primary sources for the study of the Middle Ages and to provide them with some course credit and/or income as they worked as cataloguers on the bibliography. It has grown into a large database of annotations about modern editions of medieval primary sources, and a useful resource for students, teachers, and researchers. This paper discusses the aims, structure, and content of the OMSB, but it focuses in particular on the lessons we learned in solving the problems that have confronted us in the first seven years of the database.

§ 2    The OMSB is a searchable database of modern editions and translations, both printed and online, of medieval primary sources. It aims to provide annotated entries of all sorts of primary sources for the study of the Middle Ages. The database now catalogs over 4,000 works, particularly historical records and literary texts, but we have taken steps in recent years to include more sources in art history, philosophy, and theology. The OMSB is meant to help identify primary sources on particular subjects, in certain periods or regions, in a specific language, or in a particular genre. Those looking for which modern edition of a medieval text best suits their needs will find the annotations on the site useful, as will those looking for a facing-page translation, or a text with an extensive glossary or facsimiles, or records written in a particular language. Figure 1 shows the search screen for the database, and all of the search terms researchers can use to find records.

Figure 1: OMSB search screen

§ 3    Figures 2 and 3 show the entry for Skeat's edition of Piers Plowman. The user is given information about a wide variety of the book's features (for example, this edition has a full apparatus), although the main strength of the OMSB is the useful information contained in the "Comments" field, which, in this instance, offers considerable context. Not all of our entries meet this high standard, but it is our goal.

Figure 2: Record details for Skeat's edition of Piers Plowman, top of entry

Figure 3: Record details for Skeat's edition of Piers Plowman, bottom of entry

§ 4    The search fields provide some useful and powerful tools for searching the contents of the database. Getting them to this stage was not easy, especially since we want the database to be employed by a wide audience, from advanced high school students with little knowledge of the Middle Ages to doctoral students and professors. The first major problem we confronted was which fields to include and what to include in them. For example, one particularly difficult decision was the "Geo-political region" field. On the surface, it seems simple: we want users to be able to search for records that relate to France or England or whatever country they study. We started with a simple list of medieval countries, but quickly ran into complications: what about Flanders? Should we use modern names, or medieval ones? We thought about this from the point of view of end-users: how would they search for records? We realized that all country names, whether medieval or modern, had to be included to make locations easier to find. We also recognized that our choices could have political implications since some modern countries are very possessive of their medieval heritage, even if some of it originated outside of their modern borders, so we decided to name the field "Geo-political region" instead of "Country." We also include some broader regional names, such as "Iberian Peninsula" and "Scandinavia," for areas where borders changed throughout the Middle Ages.

§ 5    Another field that seemed relatively simple at first, but turned out to be quite complicated, was "Medieval Author." Initially, this was just a text field where catalogers typed in the author's name. We soon realized that we would be wise to standardize authors' names: medieval authors have enough variant spellings and aliases that we wanted to make sure all catalogers were using the same name. So we decided to turn that field into a drop-down menu, where catalogers could choose from a list of authors' names. Standardizing and alphabetizing authors' names turned out to be very difficult: should we list Thomas Aquinas under Thomas or under Aquinas? We decided to use the Dictionary of the Middle Ages as our standard. We talked to the DMA's editors about how they decided to standardize authors' names, and discovered it could be somewhat arbitrary: Aquinas is under "A" in the DMA because they wanted to have an important, well-known author in the first volume that was published. So Morgan Kay was put in charge of the author list: every time a work by a new author is put in the database, she looks the author up in the DMA and adds him or her to our list of authors. If an author isn't in the DMA, we use WorldCat as our standard. Back in the summer of 2004, we thought there could not be more than a few hundred named medieval authors who would show up in our database, but now we are at almost 1200 and still adding more. Eventually, we realized that just listing the author's name isn't enough: we needed to list alternate spellings, aliases, and dates to make sure we did not get authors confused. Our hit statistics also told us that a lot of people were finding OMSB by doing web searches for obscure authors, but when they got to our website, all they would see was a list of medieval authors; a web search wouldn't even take them to the works by that author.

§ 6    So in 2008, we created a new table for authors that includes their name, aliases and alternate spellings, title, and dates. On the website, each author receives his/her own page, showing all of this information and linking to all of the records by that author (<http://medievalsourcesbibliography.org/authors>). We have also added a field for a brief biography of the author: just a few sentences to place the author in context, along with a short list of useful reference works about the author. The addition of a linked "Medieval Author" field saves us from having to repeat information about authors in the annotations for sources; it also now takes people who arrive from a web search for a specific medieval author straight to that author's page. There are enough authors that it is unlikely that we will ever be able to write biographies for all of them, but the author pages are a good resource. This author list has taken a lot of time and effort, and has grown in response to catalogers' and end-users' needs. We did not anticipate that this would be such a big aspect of our work, but it has turned out to be one of the more distinctive and useful areas of the website.
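To make the idea of an author table with aliases more concrete, the following is a minimal sketch in plain Ruby, not the OMSB's actual schema or code. The field names (name, aliases, title, dates, biography), the AuthorIndex class, and the normalize_name helper are all assumptions introduced for illustration, and the two sample authors are illustrative data only. The sketch shows how indexing every variant spelling lets a search for any form of a name resolve to a single standardized author record.

```ruby
# Hypothetical author record; the field names are assumptions based on the
# fields described in the text, not the OMSB's real table definition.
Author = Struct.new(:name, :aliases, :title, :dates, :biography, keyword_init: true)

# Normalize a name for comparison: lowercase, drop punctuation, collapse spaces.
# This simple version only keeps ASCII letters and digits; accented characters
# would need extra handling in a real application.
def normalize_name(name)
  name.downcase.gsub(/[^a-z0-9\s]/, "").split.join(" ")
end

# Maps every known spelling of an author's name to the same Author record.
class AuthorIndex
  def initialize(authors)
    @by_key = {}
    authors.each do |author|
      ([author.name] + author.aliases).each do |variant|
        @by_key[normalize_name(variant)] = author
      end
    end
  end

  # Returns the matching Author, or nil if no spelling matches.
  def find(query)
    @by_key[normalize_name(query)]
  end
end

# Illustrative sample data.
AUTHORS = [
  Author.new(name: "Aquinas, Thomas",
             aliases: ["Thomas Aquinas", "Thomas of Aquino"],
             title: "Saint", dates: "c. 1225-1274", biography: nil),
  Author.new(name: "Abelard, Peter",
             aliases: ["Peter Abelard", "Petrus Abaelardus"],
             title: nil, dates: "1079-1142", biography: nil)
].freeze

index = AuthorIndex.new(AUTHORS)
match = index.find("thomas  AQUINAS")
puts match.name if match
```

Keeping the aliases on the author record, rather than in each source annotation, means a new variant spelling only has to be added in one place.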

§ 7    OMSB has also had to cope with financial challenges. Kowaleski wrote the first database using Microsoft Access in 2003; a graduate student who wanted to do a summer research project for credit tested the database by entering annotations of records she was using in her M.A. thesis. The following summer, Kowaleski applied to Dr Nancy Busch, the Dean of Fordham's Graduate School, for funding for four summer graduate assistants (of whom Kay was one) to work twenty hours a week over the summer entering different types of sources. The Dean agreed to fund the project, and has continued to provide funding for graduate assistants for the past six summers, because she had seen how the Internet Medieval Sourcebooks had promoted Fordham's reputation as a center for the study of the Middle Ages, and because she realized the value of the research experience that students would gain working on the project. (We are also planning to apply for external grants, which is another reason we have secured support from the Dean.) Fordham does not have a lot of summer funding opportunities for graduate students, so the OMSB has provided a significant source of income and research experience for a wide range of students in the last seven years.

§ 8    We have also faced a variety of technological challenges, especially since we have had almost no outside IT help in creating, updating, or maintaining the database and website. One of the first summer graduate assistants was a master's student in computer science who was supposed to help write a program to make our data from Microsoft Access available online, but our efforts here were stymied by Fordham's lack of a MySQL server at that time. We eventually found a programmer who was willing to work at a discount and write the website search engine in a web development framework called Ruby on Rails (<http://rubyonrails.org/>). We were still using the Access database for data entry, so he had to write a script to get the data out of Access and into MySQL and onto the website. As the Access database got bigger and bigger, however, the task of calling in and merging all the catalogers' databases four times a year became more and more complicated. The process generally took several days, and eventually generated so many errors that it became impossible.

§ 9    So in the summer of 2007, Kay decided to learn Ruby on Rails and write an entirely new web-based application. The project has benefited tremendously from having a medievalist familiar with its development and problems who is also a programmer with the IT skills to build a new program from the ground up. This happy confluence has also saved the project the considerable cost of hiring outside programming help. The application is now entirely web-based, so catalogers can enter data from any computer with internet access and they do not have to install any software. There is an option to hide records from public view, so works in progress do not clutter what the end users see. Catalogers can look at a list of their own records, and can search on more fields than users who are not logged in.
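The visibility rules described above can be sketched in a few lines of plain Ruby. This is only an illustration under stated assumptions, not the OMSB's Rails code: the Record fields, the two field lists, the cataloger names, and the helper methods are hypothetical, and the text does not say which extra fields logged-in catalogers can search, so the "cataloger" field here is a stand-in.

```ruby
# Hypothetical record; field names are assumptions for illustration.
Record = Struct.new(:id, :title, :comments, :cataloger, :hidden, keyword_init: true)

# Assumed field lists: which extra fields catalogers may search is not
# specified in the text, so :cataloger is a placeholder.
PUBLIC_FIELDS    = %i[title comments].freeze
CATALOGER_FIELDS = (PUBLIC_FIELDS + %i[cataloger]).freeze

# Anonymous visitors see only unhidden records; logged-in catalogers see all.
def visible_records(records, user: nil)
  user ? records : records.reject(&:hidden)
end

# Case-insensitive substring search over the fields the user may search.
def search(records, term, user: nil)
  fields = user ? CATALOGER_FIELDS : PUBLIC_FIELDS
  needle = term.downcase
  visible_records(records, user: user).select do |record|
    fields.any? { |field| record[field].to_s.downcase.include?(needle) }
  end
end

# The list of a cataloger's own records, hidden or not.
def own_records(records, user)
  records.select { |record| record.cataloger == user }
end

# Illustrative sample data; the second record is an invented work in progress.
RECORDS = [
  Record.new(id: 1, title: "Piers Plowman, ed. Skeat",
             comments: "Edition with full apparatus.",
             cataloger: "cataloger_a", hidden: false),
  Record.new(id: 2, title: "Draft entry",
             comments: "Work in progress.",
             cataloger: "cataloger_b", hidden: true)
].freeze

puts search(RECORDS, "plowman").map(&:title)
```

Filtering hidden records in one place, before any search runs, is the design choice that keeps works in progress from leaking into public results.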

§ 10    Another challenge has been dealing with student cataloguers, who, in 2005, expanded to include the four graduate assistants appointed with stipends to the Center for Medieval Studies each year. This change both increased the number of cataloguers and gave the grad assistants additional research experience that they could note on their academic CVs. On occasion, we have also had undergraduate volunteers who wish to gain some research experience, particularly if they hope to go on to graduate school. The effectiveness of particular catalogers has a lot to do with whether they find data entry tedious, or whether they have systematic minds and take naturally to data entry. Not surprisingly, the advanced PhD students usually make the best catalogers because of their experience as well as their ability to write the precise and efficient prose needed for annotations. Many of the MA students and all of the undergraduate interns need more extensive editing and coaching to learn how to make proper entries. We have learned more about what traits to look for in potential catalogers, and we have tightened up the editorial procedure so that Kowaleski reads and suggests edits for each record before it is unhidden and goes live online. We have also learned more about how to train catalogers to do a good job (we have developed a 24-page cataloging guide, for example), and how much information we should include in the annotations.

§ 11    The project and its catalogers have come a long way over the years. The records that we thought were complete the very first summer now look short and inadequate, and usually only help people who are already familiar with the material. More recent records (like that in Figures 2 and 3) provide a wealth of information that is written to help users of all levels of expertise. Keeping old records up to our rising standards has been a challenge: Kay has been spending her summers combing through old records and improving them. Most of the student cataloguers who stick around long enough to enter over thirty works will acknowledge how much they can gain from working on the project, from learning more about specific primary sources and specific series (such as EETS or the Anglo-Norman Text Society or the Selden Society) to training themselves to write the informative but precise prose needed for annotations. All of them have also improved their critical skills, particularly in being able to discern the differing value of particular translations or editions. We have also learned that cataloguers must keep in mind the needs of not only students, but also researchers. Some students have to be reminded frequently that the annotations are about what the end user finds helpful. As they create an entry, they should be thinking about how the data entered relates to the source, as well as how the user is going to find it and interact with the information (will they know who Abelard is? Or what a manorial court roll is?). Choosing the appropriate subject headings has turned out to be the hardest part of the entry for the student cataloguers. We now tell them to ask three questions: Does this subject heading relate to the text? Does the subject heading reflect text you have provided in the Comments or Introduction Summary fields? Would someone searching for this subject heading find this text interesting or useful?

§ 12    In the last seven years, OMSB has grown from a humble little desktop Access database to a 4,000-record online application that is becoming more widely known and useful (the site now has around 150 visitors a day, three times more than a year ago), at the same time as it has helped provide research experience and funding for an increasingly large number of graduate and undergraduate students. Along the way, we have found inexpensive ways around the structural, financial, technical, and training conundrums, although we do not claim to have conquered all of the problems we face. We are also adjusting to other changes in the world of digital media. For example, we are rapidly entering online sources, which have grown exponentially over the past few years: over 40 per cent of the records in the database are available online. There was a huge explosion of online texts when Google started digitizing books, and now that Archive.org and some other organizations are also digitizing books, we have a lot of catching up to do. We also want to make more of an effort to welcome additions to the database from fellow researchers (we can give anyone cataloging privileges or a simple form to enter sources they think we should include) and have been providing more and more training for undergraduate students, including a junior from Wheaton College who was awarded a research internship stipend by her College to work up to 200 hours on OMSB in the summer of 2010. We strongly believe, in fact, that more cataloguers from more institutions will be a crucial factor in the continued growth, scope, and usefulness of the Online Medieval Sources Bibliography.

Works cited

Halsall, Paul, ed. 2006. Internet medieval sourcebook. Accessed Dec. 10, 2010. <http://www.fordham.edu/halsall/sbook1.html>.

Kay, Morgan and Maryanne Kowaleski, eds. 2010. Online medieval sources bibliography. Accessed Dec. 10, 2010. <http://medievalsourcesbibliography.org/>.

Ruby on Rails. 2010. Ruby on rails. Accessed Dec. 10, 2010. <http://rubyonrails.org/>.