They’re Crowdsourcing Papyrus Transcription!

July 28, 2011

One of the hot time-wasting-at-work activities for underemployed and geeky office workers this summer has been the New York Public Library’s What’s on the Menu? project, which asks the public to help transcribe historical restaurant menus from a very large collection.  Menus can’t be reliably transcribed automatically by Optical Character Recognition (OCR), because they tend to use unusual fonts and layouts. In further evidence that there’s a passionate user group out there for nearly any topic, volunteers at the menu transcription project have so far transcribed 475,731 dishes  from 8,821 menus.

What else can’t reliably be transcribed by OCR? Papyri! (Or anything written by hand; for the ancient world this mostly means papyri.)  The Ancient Lives project invites the public to help transcribe items from the Oxyrhynchus Papyri, whose excavation is described at some length.  The project has gotten a lot of press, and there has also been discussion on academic list-servs, with some skepticism about whether the public will be willing and/or able to crowdsource ancient Greek handwriting, and some concerns about the ethics of asking the public to contribute to a project while giving nothing in return.

Ancient Lives is hosted by Zooniverse, which describes itself as a “citizen science” website and hosts multiple crowdsourcing projects, the majority related to astronomy – participants are asked to look at images of space, many from the Hubble telescope, and identify anomalies, classify galaxies by shape, etc.  The site states it has had 445,501 volunteers (a free login is required to participate), and if the testimonials at the site are reflective of this population, the volunteers are largely enthusiastic and feel they are being rewarded, for example by learning more about astronomy. One keen-eyed amateur astronomer discovered a new phenomenon, now named after her (Hanny’s Voorwerp is the original; such objects are now a known and soon-to-be-formally-published phenomenon called Voorwerpjes!)

Could Ancient Lives be a teaching tool in the classroom for you?  Could introductory Greek students get practice recognizing Greek letters by transcribing papyri (or would non-standard handwriting confuse them)? Would an assignment to explore the site fit into a general Greek Civilization class, or a literature class that reads works whose documentation is affected by the finds at Oxyrhynchus (Menander, for example)?  Or might it be a fun way to procrastinate from that syllabus-writing you should be doing this week?


E-Books for Learning Greek

April 4, 2011

I have started looking more seriously at texts for elementary Greek that can be used on the Kindle (and/or other e-book readers), in advance of a possible trial in a class this summer.  Here’s a list of resources I have found useful – do you have any to add? The following include texts available in Kindle format, and texts available as .pdfs – most e-book readers can deal with simply-formatted .pdf files, although their treatment of footnotes or multi-column pages can be, frankly, terrible. I have NOT included online-only texts (as at Perseus, TLG, etc.)

Hathi Trust

  • A scholarly e-book repository, it includes most out-of-copyright works (pre-1923) digitized by Google Books, plus additional titles post-1923 where Hathi staff have worked with publishers and authors to make works available to the public.
  • Search interface is very much like a library online catalog, so it’s easier to find a known title than when searching Google Books.
  • Note one can create a free account and make lists (“public collections”) of texts.  It would be useful to have such a list for important classical works, no?  Maybe in my copious free time (or yours).

Google Books

  • An alphabetical list of works selected by Crane and Babeu – Google Books Ancient Greek and Latin Texts Available as downloadable .pdf files.
  • Ditto, but US-access only. Requires a Google account to log in, and you must be in the US.
  • You can also search Google Books for specific titles, but good luck getting what you want in the first page of results – I’d try Hathi Trust first, myself, as the search interface is more sophisticated.


  • Requires creation of an account (free), after which one can download .pdf files.
  • Includes out-of-copyright texts – this site dates to 2001, so the texts were hand-scanned before the advent of Google Books.
  • Greek texts library. There’s also Latin.


  • Best website name ever? Links to downloadable .pdf versions of out-of-copyright editions from the Loeb Classical Libraries.

Project Gutenberg

For purchase at Amazon (prices listed – they are generally modest).

One problem I have run into is that the Kindle cannot convert any documents larger than 25MB, and many .pdf files are larger than this.  The solution is to use Adobe Acrobat and break up the .pdf files into smaller units, which requires a) possession of Adobe Acrobat (the production software, not just the reader) and b) more work on the user end – a lexicon that’s divided into several chunks alphabetically is not as easy to use.
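
For anyone who would rather plan the split before opening Acrobat, here is a minimal sketch in Python of how one might work out the page ranges. It is an illustration only, not a tested workflow: the function name `plan_chunks` and the list of per-page sizes are hypothetical, I am assuming "25MB" means 25 × 1024 × 1024 bytes, and summing per-page sizes is only a rough estimate of the resulting file size (it is probably closest for scanned books, where each page is mostly one image).

```python
from typing import List, Tuple

# Assumption: the 25MB conversion limit, read as 25 * 1024 * 1024 bytes.
KINDLE_LIMIT_BYTES = 25 * 1024 * 1024


def plan_chunks(page_sizes: List[int],
                limit: int = KINDLE_LIMIT_BYTES) -> List[Tuple[int, int]]:
    """Group consecutive pages into chunks whose estimated size fits the limit.

    page_sizes is a hypothetical list of per-page size estimates in bytes.
    Returns (first_page, last_page) pairs, 1-indexed and inclusive.
    """
    chunks: List[Tuple[int, int]] = []
    start = 1      # first page of the chunk being built
    running = 0    # estimated size of the chunk being built
    for page_number, size in enumerate(page_sizes, start=1):
        if size > limit:
            raise ValueError(f"page {page_number} alone exceeds the limit")
        if running + size > limit:
            # Close the current chunk just before this page and start a new one.
            chunks.append((start, page_number - 1))
            start = page_number
            running = 0
        running += size
    if page_sizes:
        # Close the final chunk.
        chunks.append((start, len(page_sizes)))
    return chunks
```

The page ranges it returns would still have to be handed to Acrobat (or whatever PDF tool you have) to do the actual splitting. For a lexicon, you would probably want to nudge the boundaries by hand so that each chunk ends at a sensible point in the alphabet.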


Good Summary Article on Digital Classics/cist

March 22, 2011

Yesterday I read with interest Simon Mahony’s article “Research communities and open collaboration: the example of the Digital Classicist wiki,” thanks to a recommendation from @paregorios (Tom Elliott).  It’s a fairly quick read, and I feel like I have a better understanding of the Digital Classicist wiki’s history and of what I might find it useful for in the future – better than I acquired after some random poking around on the site last summer, anyway.

One of the big topics the article raises is whether digital humanities is inherently collaborative and what technological structures can do to foster community.  This issue interests me in general, especially because I see academia generally, and classics within the academy in particular, as very hierarchical disciplines that value tradition, and disciplines where much of the serious work is done solo (archaeologists are somewhat exceptional in this regard).  I thought about this idea when I talked about social networking for academics; I thought about this idea when we discussed crowdsourcing at THATCamp SE.  I’m thinking about this today, as my goal for this week is to get the wiki piece of the Ancient World Open Bibliographies project up and running, and the goal of that project is the building of a collaborative bibliography for the use of scholars and students.  How can I get collaborators?

As an aside, I was curious enough about the gender balance in digital classics – especially because of the recent spate of articles about gender imbalance among Wikipedia editors – to count the number of members listed at the Digital Classicist Wiki by gender.  (For first names I was uncertain about, I assumed they were female.)  The tally was 120 listed members, 80 of whom are male and 40 of whom are female; the four editors are male.  Not too shabby; recent reports suggest classics PhDs currently awarded are largely split 50-50 by gender, for context, but computer science remains a male-dominated field.


THATCamp SE 4: Making Digital Collections Work for the Scholar

March 10, 2011

THATCamp SE got started in earnest on Saturday.  We all met in a lecture room in the Emory Library (which is a wonderful space in general, and had guest wireless that made me incredibly jealous – it remembered me when I came back the 2nd day, and automatically gave me access! On my own campus, I get kicked off the wireless network repeatedly even when I’m sitting in the same place for an hour.)

THATCamps are “unconferences” – that is, there is no set agenda and no pre-planned papers.  The conference attendees post at the conference web site about what issues they want to discuss, and start to generate interest, and then the morning of the first day they write their sessions on a huge whiteboard and others make tickmarks if they plan to attend.

[Image: THATCamp session whiteboard. This is actually from a THATCamp in Australia. Hence the appearance of "speedos."]


The tickmarks allow the organizers to assign sessions to appropriately sized rooms. We had a brief rundown of the THATCamp ground rules (below) and we were off.

  1. THATCamp is FUN – That means no reading papers, no powerpoint presentations, no extended project demos, and especially no grandstanding.
  2. THATCamp is PRODUCTIVE – Following from the no papers rule, we’re not here to listen and be listened to. We’re here to work, to participate actively. It is our sincere hope that you use today to solve a problem, start a new project, reinvigorate an old one, write some code, write a blog post, cure your writer’s block, forge a new collaboration, or whatever else stands for real results by your definition. We [are] here to get stuff done.
  3. Most of all, THATCamp is COLLEGIAL – Everyone should feel equally free to participate and everyone should let everyone else feel equally free to participate. You are not students and professors, management and staff here at THATCamp. At most conferences, the game we play is one in which I, the speaker, try desperately to prove to you how smart I am, and you, the audience member, tries desperately in the question and answer period to show how stupid I am by comparison. Not here. At THATCamp we’re here to be supportive of one another as we all struggle with the challenges and opportunities of incorporating technology in our work, departments, disciplines, and humanist missions. So no nitpicking, no tweckling, no petty BS.

The first session I attended was proposed by Andy Carter (@cartera) of the Digital Library of Georgia, and described by him as “Big digital piles and the classroom.” As an archivist, he wants to know how scholars use digital collections in both teaching and research, and how the collections he manages can make these tasks easier for them.  Shawn Averkamp, who attended, had also asked a similar question.  We had more librarians than faculty in the room (this was to become a theme…) but the faculty talked about technological hurdles (real and/or perceived) and needing a guide to resources.  My main takeaway was articulated by Paul Fyfe (@pfyfe):

digital libraries/archives session theme of missed connections: who is mediating between resources and researchers/teachers?

For me, this was a stand-up moment: I am.  In my work as a Reference librarian, I am doing this for students who walk up to the desk or ask me a question via chat reference.  In my work as a subject liaison, I want to make it my goal to do this, not just for the faculty at my institution, but for the discipline of classics as a whole, at this blog.  I’ve struggled with understanding digital humanities projects so I can explain to the average classicist – what is this, and how might it be relevant to your research or teaching?  Is there an undergraduate assignment lurking in this project?

Some practical ideas for digital collections or dh projects that we discussed were educator guides, sample assignments, or digital “sandboxes” for playing with content, all hosted at the project sites.  These can come from the librarian or project head, but project hosts would  also welcome feedback from scholars who use their collections: email the project letting them know your students used the materials, and what the assignment looked like, or noting access hurdles and making suggestions to overcome them.  For faculty, the takeaway should be that digital collection and project hosts want the materials to be used, and need your help to see how that can be accomplished most profitably.



Google Art Project: Cool, but Little Ancient Material

February 2, 2011

The buzz on twitter yesterday was Google Art Project, a new Google project (what will those people think of next!) that takes the Google Street View idea indoors – one can “walk” through the galleries of about 15 wonderful museums worldwide, getting a sense of entire rooms, and then focus in on specific artworks.  It’s a must-see if you teach any kind of art history class, and well worth looking at for archaeologists, historians, and humanities folks generally.

What’s good:

  • Great list of museums, with good international coverage: Versailles, Hermitage, Uffizi.
  • Incredible detail for some of the artworks.  Try looking at a van Gogh on maximum zoom – never mind the brushstrokes, you can see the weave of the canvas!
  • Especially valuable for museums like the Frick (and many others) where the room itself, the experience of works of art in an architectural setting, with furniture and decor, is a big component of the visitor experience.

What I’d improve if it were mine:

  • For museums you don’t already know, it’s not easy to figure out where to look for art you’re interested in.  In many museums, the galleries have room numbers that tell nothing about their contents, so you have to browse through all of them to find out whether anything you want to see is there. (There is a “room description” but it’s at least 2 clicks to get there.)
  • Not all the art is included in the super-zoom (which is understandable, but seeing the art on the walls makes you want to zoom!)
  • Heavy focus on western European (and American) museums and western European painting (medieval-modern).  I’d love to see a more worldwide focus, and a broader time horizon. More, more, more!
  • I find “walking” through the rooms a little tricky – it’s like a drunk person is operating my cursor.  This may be user error (though I promise I am not drunk.)

Specifically of interest to classicists/archaeologists:


