According to David J. Brewer, the following persons appear in this 1851 painting by Scottish painter John Faed, entitled Shakespeare and his Friends at the Mermaid Tavern.

"The standing figure on Shakespeare's left is that of Sir Walter Raleigh. He leans on the shoulder of the Earl of Southampton. Seated in the right foreground with his back to the spectator is Sir Robert Cotton, with Decker on his right. The figures seated immediately behind Shakespeare are Ben Jonson, Donne, and Daniel. The figure seated at the rear of the table is Bacon, and with him are seated Fletcher, Dorset and Camden. In the rear of the seated group stands Beaumont with his hand extended and next to him stands Selden with Sylvester on the spectator's extreme left."

Works by Beaumont, Daniel, Fletcher, and Jonson appear in the Shakespeare His Contemporaries collection.

This site is now obsolete. Please visit the current site at https://shc.earlyprint.org.


Introduction

The centrality of Shakespeare in the picture above tells you more about the nineteenth century than about his place or reputation in the world in which he lived. He certainly made a decent living as a playwright. Remember that in the 1590's he bought the second-largest house in Stratford, which had been previously owned by the Queen's physician. That is a little like a modern playwright in his thirties buying a house from the President of the American Medical Association. That counts as success.

You can read the picture, which has an honoured place in the Folger Shakespeare Library, as a reasonable approximation of a collegial relationship. After all, the somewhat melancholy figure in the centre does not dwarf his contemporaries in the way in which the annual number of publications about Shakespeare dwarfs, by at least an order of magnitude, the number of publications about his contemporaries.

These contemporaries are the focus of “Shakespeare His Contemporaries” or SHC, a project devoted to the collaborative curation of non-Shakespearean plays from Shakespeare's world. The current corpus includes 510 plays, from Udall's 1552 comedy Ralph Roister Doister to the dramatic sketches of Margaret Cavendish, the Duchess of Newcastle, published in 1662 but probably written over the course of the previous decade.

Texts co-existing in different states of (im)perfection

The SHC corpus includes most, but not all, plays written during those roughly 100 years. The texts come from the transcriptions of the Text Creation Partnership (TCP). The quality of those transcriptions, which varies a lot, is mostly a function of the quality of the digital scan of the microfilm of the printed page that the transcribers had before their eyes. Quite a few of the plays were transcribed from different editions. In those cases we picked the transcription that seemed to be in better shape. That may be a bibliographically or historically dubious decision, but it works well enough as a first step, and in a digital environment it is easy to add or replace items as time goes on.

The SHC texts differ from their TCP source files in several ways that are described in the section on The texts and their linguistic annotation. They have been curated by two generations of undergraduates who fixed some 50,000 words that had been incompletely or incorrectly transcribed, including quite a few obvious printers' errors. You get a good sense of the cumulative effect of that editorial activity if you compare the defect rate per 10,000 words at the beginning and end of two years of editorial work.

Table 1: Defect rates per 10k words for 510 SHC texts

Text stage                    25%     Median    75%     90%
Uncurated TCP source texts    5       14        62      126
2016 SHC texts                0       1.3       6.4     47.2

For three quarters of the texts, the rate of known defects dropped by an order of magnitude. If you assume that the text of a playbook has on average 300 words per page, an uncurated text at the 75th percentile would on average have about two defects per page. In its current state, a text at the 75th percentile has about one defect every five pages (6.4 defects per 10,000 words is roughly 0.2 defects per 300-word page). That is a noticeable difference. The table also shows that defects cluster in 10% of the texts, and there has been less progress with these texts.
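The per-page figures can be checked directly against Table 1. The following is a minimal sketch of that arithmetic, assuming the 300 words per page stated in the text; the function and variable names are illustrative only and are not part of the SHC site.

```python
# Defect rates per 10,000 words at the 75th percentile, taken from Table 1.
RATES_PER_10K = {
    "Uncurated TCP source texts": 62.0,
    "2016 SHC texts": 6.4,
}

# Page-length assumption stated in the text, not a measured value.
WORDS_PER_PAGE = 300


def defects_per_page(rate_per_10k: float, words_per_page: int = WORDS_PER_PAGE) -> float:
    """Convert a defect rate per 10,000 words into expected defects per page."""
    return rate_per_10k / 10_000 * words_per_page


def pages_per_defect(rate_per_10k: float, words_per_page: int = WORDS_PER_PAGE) -> float:
    """Average number of pages a reader turns before meeting one defect."""
    return 1 / defects_per_page(rate_per_10k, words_per_page)


for stage, rate in RATES_PER_10K.items():
    print(
        f"{stage}: {defects_per_page(rate):.2f} defects per page, "
        f"one defect every {pages_per_defect(rate):.1f} pages"
    )
```

Changing WORDS_PER_PAGE shows how sensitive the per-page impression is to the assumed density of a playbook page.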

It is a distinctive feature of the SHC corpus that the texts in it co-exist in various states of (im)perfection. Because most visitors to this site will be familiar with conventional academic grading schemes, we have given each text a grade between A and F, so that readers have a sense of the relative quality of each text in the corpus. Consult the section on Curation and Quality Assurance for an account of how the grades were determined.

Curation by undergraduates

As mentioned before, most of the textual corrections in the SHC corpus were made by undergraduates in summer internships after their freshman or sophomore years. Teams of five students worked at Northwestern in the summers of 2013 and 2015. A team of three Amherst students under the supervision of Peter Berek worked on the SHC corpus in January 2016, consulting printed originals at Smith College and the Houghton Library. They are listed in the credits section of this site, and the contributions of each of them are listed in the header for each play.

Unsurprisingly, with very little training and light supervision the students did excellent work. Spending a summer in the trenches of Lower Criticism, digital or otherwise, is not everybody's cup of tea. But those who are drawn to it approach it with a lot of care and energy.

There are three students whose work deserves to be singled out. In the summer of 2015 Hannah Bredar (BA 2015, Northwestern) and Kate Needham and Lydia Zoells (both 2016 graduates of Washington University in St. Louis) engaged in a curation sprint that separately or together led them to the Bodleian, Folger, and Newberry Libraries as well as the Rare Book Libraries of Northwestern and the University of Chicago. Between them they corrected some 12,000 textual defects in several hundred plays. All three of them were veterans of sorts: Hannah had been a member of the 2013 Northwestern team, while Kate and Lydia had learned much about the philological trade in Joe Loewenstein's Spenser Lab.

Castlists

The current release includes a still embryonic feature that provides the foundation for much future elaboration. At the end of each text there is a machine-generated castlist. All speeches (<sp> elements) have been marked with "who" attributes that map to unique role IDs. The castlist displays each roleID followed by the count of speeches with that ID. The castlist is sorted in descending order of speeches. The roleID is a very convenient hook on which to hang a lot of properties associated with a particular character. It provides a foundation for "dynamic castlists". More about them in future releases.
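The castlist logic described above (count the speeches per role ID, then sort in descending order) can be sketched in a few lines. This is an illustration and not the code that generates the castlists on this site: the sample markup, the role IDs, and the function names are invented, and the sketch assumes that a "who" attribute may hold several whitespace-separated IDs.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample markup: role IDs and speech content are placeholders,
# not taken from an actual SHC file.
SAMPLE = """<text>
  <body>
    <sp who="roister"><speaker>Roister</speaker><l>first speech</l></sp>
    <sp who="merygreeke"><speaker>Merygreeke</speaker><l>second speech</l></sp>
    <sp who="roister"><speaker>Roister</speaker><l>third speech</l></sp>
    <sp who="custance"><speaker>Custance</speaker><l>fourth speech</l></sp>
    <sp who="merygreeke"><speaker>Merygreeke</speaker><l>fifth speech</l></sp>
    <sp who="roister"><speaker>Roister</speaker><l>sixth speech</l></sp>
  </body>
</text>"""


def local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{http://...}sp' -> 'sp'."""
    return tag.rsplit("}", 1)[-1]


def build_castlist(xml_text: str) -> list[tuple[str, int]]:
    """Return (role ID, speech count) pairs, most speeches first."""
    root = ET.fromstring(xml_text)
    counts: Counter[str] = Counter()
    for element in root.iter():
        if local_name(element.tag) != "sp":
            continue
        # Assumption: a shared speech lists several IDs separated by spaces.
        for role_id in element.get("who", "").split():
            counts[role_id] += 1
    # Sort by descending count; ties are broken alphabetically by role ID.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


for role_id, speech_count in build_castlist(SAMPLE):
    print(f"{role_id}\t{speech_count}")
```

Because every count hangs off the role ID, further properties of a character could be attached to the same key later without changing how the speeches are counted.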

Future work: How you can help

The SHC site in its current form provides access to much improved versions of TCP playbooks, but much work remains to be done. We have followed a policy of "Release early and often", common in the software world. What you see today (April 2016) is somewhere around Version 0.4. Come September, the site will include an Annotation Module that will let anybody, anytime and anywhere, contribute to making the transcriptions more accurate or complete than they are now. Because under the hood of this site every word is a discrete digital object with a unique identifier, you can attach or send annotations to each word “address” (or a range of them). Such annotations can be corrections submitted for review, or they can be comments you keep for yourself. Corrections do not overwrite the original but become separate digital objects, with the ‘who’, ‘what’, ‘when’, and ‘where’ of each logged as “stand-off markup”. Working with a text in this environment will be almost as easy as “reading with a pencil”, but the results are more easily sharable.

If you do textual work on anything in the TCP corpus, we hope that you will find this site the easiest place to do that work. That is a “win-win” scenario: what is easiest for you is readily sharable with others. This site or its successor will be expandable: if a text you want to work on is part of the TCP corpus or meets its structural conditions, it will be easy to add it. Step by step we may move towards a “Book of English”, defined as

  • A large, growing, collaboratively curated and public domain corpus
  • Of printed English from its earliest modern form
  • With full bibliographical detail
  • And light but consistent structural and linguistic annotation.
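The stand-off correction model described earlier in this section, in which a correction is a separate object addressed to a word ID and never overwrites the transcription, can be sketched as follows. This is a hedged illustration, not the design of the planned Annotation Module: the class, the field names, the "pending"/"accepted" review states, and the sample word IDs are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Annotation:
    """One stand-off correction; every field name here is hypothetical."""
    word_id: str             # "where": the address of the word being corrected
    who: str                 # "who": the contributor
    what: str                # "what": the proposed reading
    when: datetime           # "when": time of submission
    status: str = "pending"  # assumed review states: "pending" or "accepted"


def corrected_reading(words: dict[str, str], annotations: list[Annotation]) -> dict[str, str]:
    """Build a reading view with accepted corrections applied.

    The transcription in `words` is left untouched; later accepted
    corrections to the same word win over earlier ones.
    """
    view = dict(words)  # copy, so the original transcription is never overwritten
    for annotation in sorted(annotations, key=lambda a: a.when):
        if annotation.status == "accepted" and annotation.word_id in view:
            view[annotation.word_id] = annotation.what
    return view


# Invented sample data: two word tokens, one with an unreadable letter.
transcription = {"w1": "the", "w2": "lo●e"}
log = [
    Annotation("w2", "editor-a", "loue",
               datetime(2016, 4, 1, tzinfo=timezone.utc), status="accepted"),
    Annotation("w1", "editor-b", "thee",
               datetime(2016, 4, 2, tzinfo=timezone.utc)),  # still under review
]

print(corrected_reading(transcription, log))
print(transcription)  # the original reading is still available
```

Keeping the log separate from the text means a rejected or superseded correction can simply be ignored when the reading view is rebuilt, and the full history of who proposed what remains inspectable.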