Archive for the ‘Digital Humanities’ Category


Update on Google Art Project / World Wonders / Cultural Institute

November 19, 2013

I posted some time ago about Google Art Project, in which Google did a “street view”-like walk-through of international museums. They have also done this at archaeological sites, in a set of locations now called Google World Wonders.  Here’s a list of museums and sites relevant to the classical world that have detailed access through these projects, which are now collected under the umbrella of Google Cultural Institute:

World Wonders

I may have missed some European cities with Roman-era stuff – there are a lot of “Old City of X” (especially in Spain) and I don’t know my Roman Europe well enough to know all the cities that may have visible architecture (if I’ve missed a doozy, please say so in comments!) There are a LOT more, from multiple parts of the world; if you teach world history or art history at all, it’s well worth a scan for classroom tools. Makes me want to plan some trips!

Art Project museums:

Note that not every display or object in a given museum is included; these are generally selections from the collections. There are 290 museums in total and I haven’t looked at all of them for relevance – there are lots of large city and national museums that probably include a few items from the ancient Mediterranean.  Coverage is thoroughly international, with especially good coverage of Europe, North America, and Asia. Have a look!


TOCS-IN at Zotero: A Project That Didn’t Work

September 20, 2012

So, blogging a project that didn’t work – good idea or not?  Let’s see…

The project was to get the content of the TOCS-IN citation database into the free, open-access bibliographic software Zotero (which David Pettegrew discusses today; his post kicked me over my hesitation about blogging this project). I wanted to do this for two reasons: to draw increased attention to TOCS-IN, which is an excellent, open-access bibliographic resource for Classicists, and make it especially accessible to Zotero users; and to make the TOCS-IN content potentially available as Linked Open Data, because Zotero can export files in BIBO, a linked open data format for bibliographic citations.

My steps were:

1. Get permission from P.M.W. Matheson of the University of Toronto, the manager of the volunteer-driven TOCS-IN project, to use the available data files for this purpose.  She was helpful and supportive – thank you!

2. Write a Python script to convert the data file formatting from a custom SGML markup to RIS format, a common format for bibliographic citations (used by Zotero as well as EndNote, which created it). I am not a programmer, but happily my husband is; this piece would not have been possible without his help, although I did big chunks of it All By Myself.

3. Add the RIS-formatted citations to a Zotero Group library. This turned out to be the problem.  In theory, there is no limit to the number of bibliographic citations that can be stored by a Zotero user.  In practice, once I had uploaded about 40,000 (of the ca. 80,000) citations, my Zotero standalone software began freezing every time I attempted to do anything (like stubbornly adding another several thousand citations) and refused to sync with the online Group Library.  A question posted in the Zotero forums got the swift and helpful confirmation that the sync process simply cannot handle such large datasets well, and that I would not be the only one affected; any users who tried to use this large group library would start crashing their Zotero instances as well.
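The conversion in step 2 can be sketched roughly as follows. This is a minimal illustration, not the script actually used: the element names (`rec`, `au`, `ti`, `jo`, `vo`, `yr`, `pg`) and the sample records are invented for the example, since the real TOCS-IN markup is not shown in this post. Only the RIS tags (TY, AU, TI, JO, VL, PY, SP, EP, ER) are standard.

```python
import re

# Invented sample data in a hypothetical SGML-like markup (not real TOCS-IN records).
SAMPLE = """
<rec><au>Smith, Jane</au><ti>On Roman Roads</ti><jo>Example Journal</jo><vo>12</vo><yr>1995</yr><pg>1-20</pg></rec>
<rec><au>Doe, John</au><au>Roe, Ann</au><ti>Greek Vases</ti><jo>Example Journal</jo><vo>13</vo><yr>1996</yr><pg>45-60</pg></rec>
"""

# Map each assumed SGML element to the corresponding RIS tag.
TAG_MAP = {"au": "AU", "ti": "TI", "jo": "JO", "vo": "VL", "yr": "PY"}


def record_to_ris(record: str) -> str:
    """Convert the inner markup of one <rec> element to an RIS entry."""
    lines = ["TY  - JOUR"]  # every entry is treated as a journal article
    for tag, value in re.findall(r"<(\w+)>(.*?)</\1>", record, flags=re.S):
        value = value.strip()
        if tag == "pg":
            # Split a page range such as "45-60" into start and end pages.
            start, _, end = value.partition("-")
            lines.append(f"SP  - {start}")
            if end:
                lines.append(f"EP  - {end}")
        elif tag in TAG_MAP:
            lines.append(f"{TAG_MAP[tag]}  - {value}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


def sgml_to_ris(text: str) -> str:
    """Convert every <rec>...</rec> block in the text to RIS, one entry per record."""
    records = re.findall(r"<rec>(.*?)</rec>", text, flags=re.S)
    return "\n\n".join(record_to_ris(r) for r in records) + "\n"


if __name__ == "__main__":
    print(sgml_to_ris(SAMPLE))
```

Writing the output to several smaller .ris files rather than one large one would make it easier to import a few thousand citations at a time, though that would not get around the sync limit described in step 3.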

What now?

It’s possible that Zotero, which is actively under development, will make it possible to create very large citation libraries. Zotero used to not be able to handle a couple of thousand citations in one library, and now it can do that with ease (as, for example, the ASCSA Group Library of 2553 items demonstrates). But it may not be a priority for Zotero’s developers to move in that direction; most people use Zotero for personal citation libraries, not as de facto mirror sites for large bibliographic indices.

I have looked at BibSoup/BibServer, related projects that allow the open-access presentation of bibliographic data online, deal with a wide variety of formats (bibtex, MARC, RIS, BibJSON, RDF), and are relevant to the Linked Open Data goal of this project (full RESTful API).  I really liked Zotero simply because it is already very popular with humanities-oriented users and likely to become more so (it seems especially popular among graduate students). BibSoup is geared toward STEM academics, and currently only has about 17,000 citations total (and I’m a little hesitant about breaking things after my Zotero experience!); BibServer requires a server and IT chops which I lack. I do think these applications have a lot of potential, but I don’t think they will work for my project right now.  I’d welcome an argument on this point, or any other suggestions.

Finally, I’d like to add a quick recap and appreciation of what TOCS-IN is and comprises.  TOCS-IN is a bibliographic database  that is fully open-access (searchable at Toronto and at Louvain) and entirely crowd-sourced – that is to say, made possible by the contributions of volunteers who transcribe or copy and paste journal tables of contents and format them for inclusion in the database.  A list of volunteers is available at the site, as is a list of journals currently needing a volunteer.  Do consider joining us; I am currently covering three journals, and the time burden is minimal, especially if the journal publishes its table of contents online (much less typing!)

The basic portion of TOCS-IN is about 80,000 citations, comprising the tables of contents of about 180 journals, all among those indexed by the subscription database L’Année Philologique. The project began in 1992, so chronological coverage mostly starts there.  A comprehensive list of titles, volumes, and issue numbers is available at the Toronto site. TOCS-IN at Toronto and Louvain currently also searches an additional ca. 56,000 citations, including tables of contents of some TOCS-IN journals dating before 1992 (listed at Louvain), and edited volumes, festschrifts, etc. (listed at Toronto).


LAWDI 3: Good Linking Practices for Bibliographic Stuff

June 13, 2012

While the following were informed by conversations and presentations at LAWDI, they should be considered my opinions only, and I welcome any (polite!) discussion of why my ideas are wrong-headed  in comments.

So, you’re a scholar putting up information online, and you don’t have the time or IT chops to start learning how to implement RDFa or learn a specialized linked open data vocabulary. The following are some ideas of things you can do that are linked open data friendly, with an emphasis on providing links to stable, authoritative, easy to use URLs. This post covers bibliographic items (secondary scholarship).

I want to emphasize that doing all this linking is work; it takes time. I’ve been trying to link more thoroughly in my blog posts about LAWDI, and it does add to the time burden of writing blog posts. I urge readers to strive to include more (good-quality) links in the things they post online, but please don’t feel guilty if you can’t do it all. Do what you can; every bit is a piece toward our common goals.


  • Link to a WorldCat record using the OCLC number. Permalink URLs are linkable from records and can be created using the format .
    WorldCat is my top choice because 1) it welcomes links, 2) it’s the largest and most international open linkable library catalog. Note: sometimes if you look a book up by title you’ll find multiple OCLC records with multiple OCLC numbers, even though you’re looking at the same book, not even different editions. OCLC and its members are probably working to tidy this sort of thing and merge (or at least cross-reference) duplicate records. For now, pick the one that has the largest number of libraries showing in the list in your home/target country (there will often be one US record and one European record, for example.)
  • Link to the US Library of Congress using an LCCN (Library of Congress Control Number).  Permalink URLs are shown in records and can be created using the format (useful, since many books have the LCCN in print on the inside.)
    Using the Library of Congress is a fine choice; it’s my second choice because it is US-centric (while WorldCat is working on becoming more international) and the Library of Congress records don’t have the enhancements that WorldCat records do (ability to display holdings in libraries near you, ability to provide a link to online booksellers, etc.)
  • I would not bother linking to, for example, Amazon using an ISBN. WorldCat links using OCLC are more useful in my opinion, and as easy to create.
  • Including the ISBN in a citation can be useful; there are some great browser plug-ins that can identify ISBNs in web pages and link users to libraries or online booksellers (for example, LibX or Book Burro).

Digital Books

  • If a book is available in an open-access digital edition, by all means include a link to that, preferably in addition to a link to a WorldCat record for the print edition. For open-access digital books you have two strong choices, neither the clear winner yet in my opinion.
  • Link to the Open Library record. URLs look like this:
    Open Library is the more linked data friendly solution; each record can be downloaded in RDF and JSON. Records also include linked OCLC numbers and LCCNs. The full-text books can be downloaded in a bunch of different formats, from .pdf to MOBI, and are also readable online.  Open Library is part of the Internet Archive, and is a “born-open” project. They currently only have about 1 million open-access books, though, and their records aren’t as scholar-friendly – they don’t have all the features of library catalog records (though they are based on them.)
  • Link to the Hathi Trust record. URLs look like this:
    Hathi Trust’s records have library-provided bibliographic data and they have a large collection (3 million plus) of open-access volumes (as well as many more digital volumes not open-access; availability of formats can also be an issue). They are backed by a bunch of big academic libraries and are likely to stick around. They have an API, but are not as linked-data friendly as Open Library.
  • I would not bother linking to a Google Books record unless you can’t find a match at either of the previous places. Google Books has great content, but their metadata is lacking, and they are a for-profit company who cannot guarantee a future commitment to free open-access products.

Book Chapters

  • For print-only book chapters, right now you’d do best to link to the whole book.
  • Ditto for book chapters available in full-text digitally, unless you can track down .pdfs at the author’s web site or, for example.

Journal Articles

  • Link to the DOI of the article – a long unique number appended in even print citations – using the format . Participating publishers have committed to maintaining access to articles via DOIs in perpetuity, even as their online platforms may change. (Remember, though, a lot of the articles are available by subscription only; many who follow the link will get an abstract but not full-text if their institution does not subscribe.)
  • Available digitally but doesn’t have a DOI? Look for a stable URL or permalink at the page with the article citation. JSTOR does a good job with these, but so do many other large commercial article databases.
  • Available digitally but not directly linkable? (This might be the case with an article published in a 19th century journal that has been digitized by the volume, but without the individual articles indexed, or an online-only journal with poor linkability.)  Link to the record for the journal in a repository like Hathi Trust or Open Library (above), or to the home page of the online journal, if articles cannot be directly linked.
  • Print-only? (Lots of journal articles still are, especially older, smaller, or foreign ones). Link to the WorldCat record for the whole journal, using the OCLC number or ISSN if there is one: .
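
The identifier-based links recommended in this post (OCLC number, LCCN, DOI, ISSN) all follow the same pattern: a fixed prefix plus the identifier. A minimal sketch, assuming URL prefixes that these services commonly use; the formats originally shown in this post are not preserved here, so the prefixes below are an assumption and should be checked against each service before relying on them. The identifiers in the usage lines are made-up placeholders.

```python
from urllib.parse import quote

# Assumed URL prefixes (not taken from this post); verify against each service.
PREFIXES = {
    "oclc": "https://www.worldcat.org/oclc/",
    "issn": "https://www.worldcat.org/issn/",
    "lccn": "https://lccn.loc.gov/",
    "doi": "https://doi.org/",
}


def permalink(kind: str, identifier: str) -> str:
    """Build a link from an identifier type and value."""
    kind = kind.lower()
    if kind not in PREFIXES:
        raise ValueError(f"unsupported identifier type: {kind}")
    cleaned = identifier.strip()
    if kind in ("oclc", "lccn"):
        # Drop internal spaces; hyphenated LCCNs are NOT normalized here.
        cleaned = cleaned.replace(" ", "")
    # Leave "/" unescaped because DOIs contain it.
    return PREFIXES[kind] + quote(cleaned, safe="/-.()")


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(permalink("oclc", "12345678"))
    print(permalink("doi", "10.1234/example.5678"))
```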

Questions? Quibbles? Cases I missed? Ask in comments.

Previous posts here on LAWDI:

Collection of blog posts and other  resources from LAWDI:


Library-Related Presentations at LAWDI

June 6, 2012

LAWDI was set up with half-hour presentations by ‘faculty,’ and 15-minute presentations by the rest of the attendees.  Links to slides for all presentations that used them are being collected here.  In this post I discuss those presentations most relevant to librarians and the issues they love best (bibliographic citation, authority control, scholarly publishing) as well as recapping my own presentation.

Friday we began with a talk by Chuck Jones of the ISAW Library (links he discussed collected at AWOL) and then a powerhouse tour of library linked data and metadata issues by Corey Harper of NYU’s Bobst Library.  His slides are here.   (For librarians wanting to get up to speed or keep up to date on the issues Corey covers I also strongly recommend following the blog of Ed Summers of the Library of Congress. Half of what I know about linked open data I learned there.)

So, I had a tough act to follow; I think I actually said, “And now for something completely different.”  First I described the goals of and demonstrated the Ancient World Open Bibliographies. Its origins are covered in a post titled “The Beginning” at that blog, and you can follow the links to the Wiki and Zotero library for the project yourself. In the context of LAWDI, it was important to note that Zotero allows the export of bibliographic citations automatically marked up using the BIBO (Bibliographic Ontology) vocabulary, so keeping bibliographies there gives you a leg up on becoming part of the linked open data world.  I also demonstrated an online bibliography on Evagrius Ponticus by Joel Kalvesmaki of Dumbarton Oaks as an example of what can be done with a bibliography based in Zotero, but presented as an inherent part of a digital project.

The second point I wanted to make was that bibliographic information is linked open data friendly.  (Libraries have worked hard to make it so!) Library catalogs are structured data files on books, and while the current structure is out of date, we’re working on that (see Corey Harper’s talk). Most books have a standard number that represents them: an ISBN, an OCLC number (accession number into the OCLC catalog, now online as WorldCat) or a Library of Congress Control Number (LCCN).  Many books have all three!  Articles, book chapters, or other things  scholars want to cite are more problematic.  Many journal publishers now use DOIs (digital object identifiers) for specific articles, but these have not been universally adopted. I demonstrated the DOI resolver at (which also lets you create stable URIs for DOIs; I’ll cover this in more explicit detail in a future post.)

My third point was to try to think more broadly about how existing open-access online bibliographic indexes for ancient studies could move in the direction of being linked open data compliant.  At 8am the morning I spoke, without any prompting from me, Tom Elliott posted a manifesto on this same topic at his blog: Ancient Studies Needs Open Bibliographic Data and Associated URIs. So, let me say, what he said, and amen.

Saturday we had two talks that were very exciting to me as a librarian, even though they were actually about scholarly publishing. Sebastian Heath of ISAW talked (without slides I think) about publishing the ISAW Papers series using linked open data principles.  Andrew Reinhard of the American School of Classical Studies (ASCSA) publications office brought forward one of the more resonant metaphors of the conference, that the current scholarly publishing enterprise is essentially steampunk, 21st century work with 19th century models. (This got retweeted a lot!) He was bursting with ways ASCSA plans to change this. Slides are here.

Next up: my recommendations on choosing good links for bibliographic stuff.

Previous post here on LAWDI:


LAWDI Conference on Linked Open Data for Ancient Studies

June 4, 2012

This week I was very fortunate to attend the Linked Ancient World Data Institute (LAWDI) conference held at the Institute for the Study of the Ancient World (ISAW) at NYU, in New York City and sponsored by the NEH Office of Digital Humanities.  This is intended to be the first of a few posts in which I discuss the conference topics and some practical outcomes I hope to participate in. (I say this publicly so I have to actually write them!)

LAWDI was a wonderful conference.  The very active twitter feed (#lawdi) was followed by 400 people, and towards the end I began to worry that they would start to think we were all a bit touched in the head, given the levels of enthusiasm that approached a lovefest.

Much credit goes to our ISAW host and general fount of visionary optimism, Sebastian Heath, as well as his co-hosts Tom Elliott of ISAW and John Muccigrosso of Drew University (where a second LAWDI will be held in 2013.) They fostered an atmosphere of collaboration and support that was truly welcoming to attendees at all levels; this is a rare enough feat at any conference, but especially so at one dealing with fairly high-level technological and semantic discussions.  My fellow conference attendees were also a fascinating, bright, energetic and truly nice group of people.  I feel as if I’ve made a bunch of new friends. Thank you all.

So, LAWDI is about Linked Open Data. I am sure I have a lot of general readers who may be wondering what the heck that is.  Here’s my attempt at a basic recap in terms that should be fairly accessible (I just actually tried to explain this to my neighbors, who are neither IT nor ancient studies people).  The internet is all about linking; one of the best ways to draw attention to resources is by linking to them. Links that are stable and short(ish), like or are a lot easier to deal with than 100+ character linksoup with characters like % and ? or websites where you can only link to a landing page but individual documents must be searched for every time you go there. So, people who manage information online should work on making their links resemble those above, for ease of use by everyone.

Second, where possible, links should go to authoritative sources. Pick a place to link to that will be around for a while – forever if possible! There are actually now international authorities for some things – VIAF is a big one for personal names, for example – so if I want to refer to 19th century Classicist Basil Gildersleeve I can link to and be pretty sure that that’s understandable to both people and computers internationally and will be around for a good long time.  (I’ll make a list of “good places to link to for classical bibliographies” in a subsequent post.)

Beyond that, however, there are some background technologies – not necessarily visible to the human viewer of a web page – to allow computers to figure out links between things.  The Wikipedia article linked above gives you a lot of acronyms and links to explain them, but for the non-coder, the gist is as follows.  One uses a special markup language to tell any computer that looks that “Basil Gildersleeve” is a human person, and that the URL is a description of him.  The computer can then find other references to the human person Basil Gildersleeve described at elsewhere, see that they are the same person, and automagically make a link.  This is the ultimate goal.  Examples of projects in ancient studies that are using this technology to, for example, search across disparate data sets include Pelagios and  CLAROS.
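
For readers who do want to see the idea in code, here is a minimal sketch of the matching step described above, using plain Python rather than a real markup language or RDF library. The person URI and the two "sites" are hypothetical placeholders (the actual authority link is not reproduced here); the vocabulary URIs for "is a person," "name," "subject," and "creator" are the standard RDF, FOAF, and Dublin Core terms.

```python
# Hypothetical identifier standing in for an authority-file URL.
PERSON_URI = "http://example.org/authority/basil-gildersleeve"

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
DC_SUBJECT = "http://purl.org/dc/terms/subject"
DC_CREATOR = "http://purl.org/dc/terms/creator"

# Each statement is a (subject, predicate, object) triple.
# Both data sets are invented for illustration.
site_a = [
    ("http://example.org/site-a/page1", DC_SUBJECT, PERSON_URI),
    (PERSON_URI, RDF_TYPE, FOAF_PERSON),           # "this URI is a human person"
    (PERSON_URI, FOAF_NAME, '"Basil Gildersleeve"'),
]
site_b = [
    ("http://example.org/site-b/letters/7", DC_CREATOR, PERSON_URI),
]


def mentions(triples, uri):
    """Return the resources whose statements point at the given URI."""
    return sorted({s for s, _, o in triples if o == uri})


def link_datasets(a, b):
    """Find URIs referenced by both data sets and list the resources citing them."""
    objects_a = {o for _, _, o in a if o.startswith("http")}
    objects_b = {o for _, _, o in b if o.startswith("http")}
    shared = objects_a & objects_b
    return {uri: mentions(a, uri) + mentions(b, uri) for uri in shared}


if __name__ == "__main__":
    for uri, pages in link_datasets(site_a, site_b).items():
        print(uri, "is referenced by", pages)
```

The point of the sketch is that the two sites never need to know about each other: because both use the same URI for the same person, a program can connect them by simple comparison.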

Coming next: 1) a recap of my presentation at LAWDI and 2) thoughts about best practices for Linked Open Data related to bibliographies and bibliographic citations specifically, at 2 levels: the low-tech and the higher-tech.


Disciplinary Meetings, Technology and Self-Reflection

January 10, 2012

This weekend the AIA/APA Annual Meetings took place in Philadelphia.  Several other major disciplinary academic conferences take place the first weekend in January, taking advantage of the semester break (for most), including AHA 2012 in Chicago (American Historical Association) and MLA 2012 in Seattle (Modern Language Association).

I didn’t go to any of them, but I checked in on my Twitter feed periodically, and was struck by the differences in the conversations that went on around each of these three meetings.  Tweets from #aiaapa actually appeared this year – a stunning difference from the 2011 meetings in San Antonio.  (I looked for Twitter messages about AIA/APA in January 2011, and there were literally fewer than 5).  This year, several doughty reporters tweeted the conference panels they attended, including Francesca Tronchin (@tronchin), Tom Elliott (@paregorios), and Kristina Killgrove (@BoneGirlPhD, who made a nice summary of her Twitter work at her blog, Powered By Osteons, and also wrote a valuable post about the Lessons from Live-Tweeting), and others I probably missed.  Many thanks to them – it’s fun and enlightening to be able to drop in on a conference remotely.

In contrast, twitter took over MLA in December 2009, and the #mla2012 and #aha2012 hashtags both ticked forward so fast one could barely follow them this weekend.  At MLA, tweeters started adding second hashtags for the session numbers, so those following along could separate the streams coming from each room and topic.  Some of the difference in volume between these three conferences can be attributed to size – AIA/APA had about 3000 registrants, while AHA had about 3700, and MLA twice that (no 2012 numbers yet, but 2011 in L.A. attracted 7745).  But much more of the difference has to do with disciplinary cultures.  (And it’s not age of the practitioners – I’m 39, the average age of a Twitter user, and Twitter is actually most popular among the 26-44 demographic, not among undergrads or early grad students.)

There weren’t just differences between AIA/APA and its sibling conferences in their use of technology for conference conversation.  There were differences in what the conversations were about, as well.  Blogs at the Chronicle of Higher Ed and Inside Higher Ed reporting on AHA and MLA tell the story that was unfolding on my twitter feed – most of the “news” was about the future of higher education, the job market for PhDs and what a dissertation should even look like, and whether or not digital humanities and/or public history will save us all.  The titles tell the story:

In contrast, I could find no reports at Inside Higher Ed or The Chronicle about what was discussed at AIA/APA, and the tweets from those on the ground were about actual archaeological, historical, or philological (I suppose, though I don’t think I saw any) content.

Does this mean AIA/APA – and by association classicists and archaeologists – are better or worse off than historians and modern language scholars?  There’s an argument that all the tweeting and blogging and navel-gazing and raging about the future of the humanities in the press are just a distraction from the actual purpose of a scholarly meeting – the dissemination of scholarship.  On the other hand, some self-reflection on the part of disciplines is healthy, no?  Especially in light of political and economic trends that threaten the values of academia generally, and the humanities in particular?  I don’t think public history or digital humanities will save us all, but I do think they are ways to engage the public – the college-attending, state-legislature-lobbying public – in the scholarly topics that matter to us all.  Is there a way to achieve the reflectivity and growth without the “anguish”?

I should note there were certainly potentially self-reflective sessions on the AIA/APA program: “Cultural Heritage Preservation in a Dangerous World,” “Presenting the Past,” “Discussions and Strategies Regarding Applying for Grants, Fellowships & Post-Docs,” “The Politics of Archaeology,” “Beyond Multiculturalism: Classica Africana…,” “Authors Meet Critics: Race and Reception,” “Intertextuality and its Discontents,” “Teaching About Classics Pedagogy in the 21st Century,” “Classics in Action: How to Engage with the Public,” and more. Maybe it’s just that nobody tweeted about them?