Posts Tagged ‘zotero’

TOCS-IN at Zotero: A Project That Didn’t Work

September 20, 2012

So, blogging a project that didn’t work – good idea or not?  Let’s see…

The project was to get the content of the TOCS-IN citation database into the free, open-access bibliographic software Zotero (which David Pettegrew discusses today; his post kicked me over my hesitation about blogging this project). I wanted to do this for two reasons: to draw increased attention to TOCS-IN, which is an excellent, open-access bibliographic resource for Classicists, and make it especially accessible to Zotero users; and to make the TOCS-IN content potentially available as Linked Open Data, because Zotero can export files in BIBO, a linked open data format for bibliographic citations.

My steps were:

1. Get permission from P.M.W. Matheson of the University of Toronto, the manager of the volunteer-driven TOCS-IN project, to use the available data files for this purpose.  She was helpful and supportive – thank you!

2. Write a Python script to convert the data files from a custom SGML markup to RIS, a common format for bibliographic citations (used by Zotero as well as EndNote and other reference managers). I am not a programmer, but happily my husband is; this piece would not have been possible without his help, although I did big chunks of it All By Myself.

3. Add the RIS-formatted citations to a Zotero Group library. This turned out to be the problem.  In theory, there is no limit to the number of bibliographic citations that can be stored by a Zotero user.  In practice, once I had uploaded about 40,000 (of the ca. 80,000) citations, my Zotero standalone software began freezing every time I attempted to do anything (like stubbornly add another several thousand citations) and refused to sync with the online Group Library.  A question posted in the Zotero forums got the swift and helpful confirmation that the sync process simply cannot handle such large datasets well, and that I would not be the only one affected; any users who tried to use this large group library would start crashing their Zotero instances as well.
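
For readers curious what the conversion in step 2 involves, here is a minimal sketch in Python. It is not the actual script: the TOCS-IN markup is not reproduced in this post, so the tag names (`author`, `title`, `journal`, `volume`, `year`) and the sample record are invented for illustration. The RIS side is standard: each record opens with a `TY` line, closes with an `ER` line, and every field is a two-letter tag followed by two spaces, a hyphen, and a space.

```python
import re

# Hypothetical mapping from SGML-like tag names to RIS tags.
# The real TOCS-IN tag names are NOT shown in the post; these are assumptions.
TAG_TO_RIS = {
    "author": "AU",
    "title": "TI",
    "journal": "JO",
    "volume": "VL",
    "year": "PY",
}

# Matches a simple, non-nested <tag>value</tag> pair.
FIELD = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def record_to_ris(record):
    """Convert one SGML-like citation record into an RIS journal entry."""
    lines = ["TY  - JOUR"]  # every RIS record starts with its type
    for tag, value in FIELD.findall(record):
        ris_tag = TAG_TO_RIS.get(tag.lower())
        if ris_tag:  # silently skip tags we have no mapping for
            value = " ".join(value.split())  # collapse stray whitespace
            lines.append(f"{ris_tag}  - {value}")
    lines.append("ER  - ")  # every RIS record ends with ER
    return "\n".join(lines)


# Made-up sample record, for illustration only.
sample = (
    "<author>Doe, J.</author><title>A Sample Article</title>"
    "<journal>Example Journal</journal><volume>12</volume><year>2001</year>"
)
print(record_to_ris(sample))
```

A real converter would also need to handle page ranges, multiple authors, and whatever irregularities the source files contain, which is where most of the work tends to go.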

What now?

It’s possible that Zotero, which is actively under development, will make it possible to create very large citation libraries. Zotero used to not be able to handle a couple of thousand citations in one library, and now it can do that with ease (as, for example, the ASCSA Group Library of 2553 items demonstrates). But it may not be a priority for Zotero’s developers to move in that direction; most people use Zotero for personal citation libraries, not as de facto mirror sites for large bibliographic indices.

I have looked at BibSoup/BibServer, related projects that allow the open-access presentation of bibliographic data online, deal with a wide variety of formats (bibtex, MARC, RIS, BibJSON, RDF), and are relevant to the Linked Open Data goal of this project (full RESTful API).  I really liked Zotero simply because it is already very popular with humanities-oriented users and likely to become more so (it seems especially popular among graduate students). BibSoup is geared toward STEM academics, and currently only has about 17,000 citations total (and I’m a little hesitant about breaking things after my Zotero experience!); BibServer requires a server and IT chops which I lack. I do think these applications have a lot of potential, but I don’t think they will work for my project right now.  I’d welcome an argument on this point, or any other suggestions.

Finally, I’d like to add a quick recap and appreciation of what TOCS-IN is and comprises.  TOCS-IN is a bibliographic database  that is fully open-access (searchable at Toronto and at Louvain) and entirely crowd-sourced – that is to say, made possible by the contributions of volunteers who transcribe or copy and paste journal tables of contents and format them for inclusion in the database.  A list of volunteers is available at the site, as is a list of journals currently needing a volunteer.  Do consider joining us; I am currently covering three journals, and the time burden is minimal, especially if the journal publishes its table of contents online (much less typing!)

The basic portion of TOCS-IN is about 80,000 citations, comprising the tables of contents of about 180 journals, all among those indexed by the subscription database L’Année philologique. The project began in 1992, so chronological coverage mostly starts there.  A comprehensive list of titles, volumes, and issue numbers is available at the Toronto site. TOCS-IN at Toronto and Louvain currently also searches an additional ca. 56,000 citations, including tables of contents of some TOCS-IN journals dating before 1992 (listed at Louvain), and edited volumes, festschrifts, etc. (listed at Toronto).

Library-Related Presentations at LAWDI

June 6, 2012

LAWDI was set up with half-hour presentations by ‘faculty,’ and 15-minute presentations by the rest of the attendees.  Links to slides for all presentations that used them are being collected here.  In this post I discuss those presentations most relevant to librarians and the issues they love best (bibliographic citation, authority control, scholarly publishing) as well as recapping my own presentation.

Friday we began with a talk by Chuck Jones of the ISAW Library (links he discussed collected at AWOL) and then a powerhouse tour of library linked data and metadata issues by Corey Harper of NYU’s Bobst Library.  His slides are here.   (For librarians wanting to get up to speed or keep up to date on the issues Corey covers I also strongly recommend following the blog of Ed Summers of the Library of Congress (http://inkdroid.org/journal/); half of what I know about linked open data I learned there.)

So, I had a tough act to follow; I think I actually said, “And now for something completely different.”  First I described the goals of and demonstrated the Ancient World Open Bibliographies. Its origins are covered in a post titled “The Beginning” at that blog, and you can follow the links to the Wiki and Zotero library for the project yourself. In the context of LAWDI, it was important to note that Zotero allows the export of bibliographic citations automatically marked up using the BIBO (Bibliographic Ontology) vocabulary, so keeping bibliographies there gives you a leg up on becoming part of the linked open data world.  I also demonstrated an online bibliography on Evagrius Ponticus by Joel Kalvesmaki of Dumbarton Oaks as an example of what can be done with a bibliography based in Zotero, but presented as an inherent part of a digital project.

The second point I wanted to make was that bibliographic information is linked open data friendly.  (Libraries have worked hard to make it so!) Library catalogs are structured data files on books, and while the current structure is out of date, we’re working on that (see Corey Harper’s talk). Most books have a standard number that represents them: an ISBN, an OCLC number (accession number into the OCLC catalog, now online as WorldCat) or a Library of Congress Control Number (LCCN).  Many books have all three!  Articles, book chapters, or other things  scholars want to cite are more problematic.  Many journal publishers now use DOIs (digital object identifiers) for specific articles, but these have not been universally adopted. I demonstrated the DOI resolver at http://dx.doi.org/ (which also lets you create stable URIs for DOIs; I’ll cover this in more explicit detail in a future post.)
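
To make the DOI point concrete, here is a small sketch in Python of how a DOI turns into a stable, linkable URI: you append it to the resolver's address. The function name `doi_to_uri` and the sample DOI string are only illustrative assumptions; the resolver address is the one mentioned above.

```python
from urllib.parse import quote


def doi_to_uri(doi, resolver="http://dx.doi.org/"):
    """Build a resolvable URI from a DOI string (illustrative helper)."""
    doi = doi.strip()
    # Citations often write DOIs with a "doi:" prefix; drop it if present.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode characters that are unsafe in a URL, but keep the
    # slash that separates the DOI prefix from its suffix.
    return resolver + quote(doi, safe="/")


# Sample DOI string used purely for illustration.
print(doi_to_uri("doi:10.1000/182"))
```

The resulting address can be used anywhere a link to the article is needed, and it keeps working even if the publisher moves the article to a different page.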

My third point was to try to think more broadly about how existing open-access online bibliographic indexes for ancient studies could move in the direction of being linked open data compliant.  At 8am the morning I spoke, without any prompting from me, Tom Elliott posted a manifesto on this same topic at his blog: Ancient Studies Needs Open Bibliographic Data and Associated URIs. So, let me say, what he said, and amen.

Saturday we had two talks that were very exciting to me as a librarian, even though they were actually about scholarly publishing. Sebastian Heath of ISAW talked (without slides I think) about publishing the ISAW Papers series using linked open data principles.  Andrew Reinhard of the American School of Classical Studies (ASCSA) publications office brought forward one of the more resonant metaphors of the conference, that the current scholarly publishing enterprise is essentially steampunk, 21st century work with 19th century models. (This got retweeted a lot!) He was bursting with ways ASCSA plans to change this. Slides are here.

Next up: my recommendations on choosing good links for bibliographic stuff.

Previous post here on LAWDI:

Are You A Bibliography Nut?

April 15, 2011

Michael E. Smith posted on his blog that he has 18,000 bibliographic references in his EndNote database! Which got me wondering, how do others stack up?  Anyone got him beat? Have you actually read everything in your bibliographic file, or do you, like Smith, add things you “are likely to use”?

I have an interest in the technicalities of scholarly workflow, so I love to read blog posts like this that track the technological changes that have shaped a scholar’s workflow over decades:

It all started early in graduate school, when Clark Erickson showed me his library card catalog drawers full of references written on 3×5 index cards. How cool was that! I immediately started my own program of price supports for the index card manufacturers. Clark and I would make cards for each other when we came across appropriate references. I think I had between 15,000 and 20,000 cards in all. In the 1980s I got up to 1,000 or so citations into the Minark database. What a klunker! OK for very early PC days, I guess, but I soon switched to a bibliography program (I forget which one).

I think I had a Filemaker Pro database for citations on my Mac laptop in the late 1990s, and I definitely remember when my classmate pioneered an early version of EndNote in the department (I think this was about 1999).  I currently use EndNote, RefWorks, and Zotero, though none of them heavily.  But plenty of scholars – from undergraduates to faculty members – still use pieces of paper or Word documents to keep lists of citations. How does a citation management program affect the way scholars work? If you have 18,000 references in a database, are you more or less likely to turn up the right article for the project at hand? I don’t know that anyone’s studied this, and I can’t really conceive how one would do so quantitatively, but I find it as interesting as the transition (or not) from print to digital texts for scholarly work.

Annotated Bibliography Assignments

November 11, 2010

For another project (Ancient World Open Bibliographies) I’ve created a Zotero account and am playing around with it, trying to figure out how to easily generate annotated bibliographies using the software (which is a web site plus a Firefox plugin).  Unfortunately right now the answer is that it’s not very easy to annotate bibliographies in a way that lets them be printed or shared, although there are some workarounds.

My search for information on this topic led me to an interesting assignment by Brian Croxall, who teaches English Literature.  He had his students find a scholarly resource outside the required reading for the class, and write an annotation (in this case, the annotations were fairly long, more like a short review, in some cases several paragraphs).  His description of the virtues of bibliographic annotation is:

Annotated bibliographies get students experience with some of the important steps of literary scholarship: finding secondary criticism and digesting it. While I could (and might!) just assign the standard end-of-term research paper, the unintended consequence of doing so often results in students looking around for any quotations they can throw in to meet the arbitrary requirement of sources. I hope that annotated bibliographies provoke students to read the other sources more carefully: reading for the source’s own argument rather than how it can fit into one’s paper that is due in 12 hours. An annotated bibliography requires you to take more time, giving you a chance to see what kinds of conversations go on amongst scholars of contemporary literature.

One of my colleagues who is a teaching model for me is a big fan of annotated bibliography assignments.  She works with one class, for example, in which instead of a paper, the students produce an annotated bibliography of 10 items.  They are required to use no more than 3 popular sources (anything from Glamour to web sites) and no more than 3 of what the professor calls “popular scholarly sources” (I think this term is made up, but what is meant is journalistic sources – newspapers, Time, The New Yorker, etc.).  The class is in the anthropology of food, so for a topic like high fructose corn syrup, a mix of scientific resources explaining what it is and how it is processed by the body, journalistic treatments, and fluffy stuff like diet magazines allows the student to turn in a well-rounded exploration of the topic and how it is treated in varying types of sources.  For a different subject, requiring only scholarly sources might be a better fit.  The student gets the experience of doing research for a term paper: finding and evaluating sources.  For the faculty member teaching a large class, grading 75 10-item annotated bibliographies is possible in a way that grading 75 10-page papers is not.  Would this kind of an assignment work for you in your larger classes – or even as a preliminary step in the process of writing a term paper?  My guess is it would lead to better papers if required and graded.

As an aside, I would love to see more online discussions and repositories of good assignments, assignments that worked, for classics and ancient studies.  Some faculty blog about their teaching but I found, for example, when I was trying to think of a good assignment for a graduate class that would incidentally teach them to use TLG, that Google was a howling wasteland in this area.  I don’t teach regular classes, only one-shot library instruction sessions, but I do see part of my role as working with faculty to help brainstorm about what assignments work for what learning goals.  I hope to be able to put more ideas on this blog.

Social Networking and Academia

July 26, 2010

Now that everyone and (no joking) her dog is on Facebook, has the time come for social networking to have an effect on academia?  There have been academic and research-related “social” sites for some time now – Connotea.org (part of Nature Publishing) broke big in 2005, and CiteULike happened at about the same time.  Both of these allow the bookmarking of web pages, like del.icio.us, but have a special focus on online academic journal articles.  They pull metadata from the articles to create an accurate mini-citation in your list of resources, and allow lists of articles or web pages to be tagged, shared, and fed out via RSS; you can also explore others’ lists and discover new research articles that way.  I explored Connotea pretty thoroughly in early 2007, but haven’t used it since then.

Mendeley is a more recent entrant into the arena (2007), with a desktop as well as a web application, although like Zotero (2006)  it bills itself most prominently as a reference management software (like EndNote or Refworks) that just happens to have a social dimension.  More truly social, with a goal of promoting campus research and fostering intra-campus collaboration is BibApp, an application developed by librarians and technologists at two university campuses.  Here, the researcher is the focus of the site (not the individual paper or citation) and the institution is the impetus for organizing and collecting the published works of the researcher.

Right now, I’m most interested in Academia.edu, though.  I’ve had an account at academia.edu for a couple of months now, and I think it’s a new idea that could encourage some interesting changes in academic culture.

Phoebe Acheson's page at academia.edu

1.  It increases the visibility of an academic career.  The site comes up quite high in a Google search; higher, I find, than one’s departmental web page.  Also in contrast to a departmental web site, I have instant control over what is on my academia.edu profile; if I update my resume I can simply upload the new version, without having to work through a web administrator.  (I toy with turning off the feature that emails me when someone searches for me on Google and lands at the site.  While one knows, rationally, that people do search for one on Google, it is a little disconcerting to hear about it I find!) In the difficult employment environment academics face, any tool that lets you promote your academic work and manage your own academic ‘brand’ for free is a good one.

2.  It serves as a de facto high visibility repository for open-access papers.  Researchers can easily upload copies of their published or unpublished works to the site.  Scholars should, of course, only upload papers they hold the copyright of, so do read your publishing contracts carefully to make sure you hold the copyright in your own work if you want to post your papers or book chapters online.

A couple of things I wonder about:

1.  If I switched university affiliations, how would that be handled?  Right now the url for my page starts with “uga,” but academics do move around, especially at the early stages of their careers, and I suspect this site is aimed at scholars writing PhDs or assistant professors (the Facebook generation?) rather than senior faculty. (I seem to see a lot of UK-based grad students on the site especially – note the site itself is UK-based.)

2. Will people make connections using academia.edu that turn into academic collaborations?  While I treat Facebook as a public forum, and don’t post anything there I wouldn’t say to my postal carrier (or my boss!), my “friends” are almost all people I have actually met and interacted with.  I am, on the other hand, comfortable “following” the work of scholars I don’t know at academia.edu.  Will graduate students and early-career faculty reach out to each other and turn “following” into collaborating?  If so, could this have an effect on the culture of academia, which (at least in the humanities) is not very collaborative?

On Information Management

June 1, 2010

When I started this blog a couple of months ago, and created an accompanying twitter account (which, I know, I am not using much yet, sorry), I didn’t realize I would be adding the straw that broke the camel’s back in terms of Too Much Information. But in addition to remembering yet another username and password, I am now struggling with keeping up with both producing content and consuming content online.  Disentangling the private and professional online is also a bit of a mess in my head right now.   So I’m starting a blog-project on information management that I hope will help me define things a bit, and will also help me better serve the faculty and students I work with, who are probably facing many of the same problems.

Here’s where I stand as a content producer/distributor:

I have one Facebook account, that is mostly an expression of my private life (i.e. I interact(ed) socially in real life with almost all of my Facebook friends).  But since many of my ‘friends’ are current or former coworkers, or former fellow-students, I also have a dimension of my professional life on Facebook.

I have two twitter accounts (@phoebeacheson and @classicslib), and contribute to a group account (@ugalibsref).  I am most muddled about what is personal and what is professional in this arena right now.

I have three blogs: this one, a private personal one, and a public one that is rather boring unless you are deeply excited by plumbing and gardening.  I have also been invited to contribute to the Ancient World Bloggers’ Group (though have not done so yet) and I contribute to the UGA Library blog.  Here my boundaries and scopes are delightfully clear. Whew.

I also recently set up an academia.edu page, and am on Linkedin (not very actively).  I explored Connotea a couple of years ago, and my account is still there.  Did I mention I have four email addresses?

Offline, I have published one article as a librarian so far (okay, that’s online too, but they sent me paper offprints! I marveled), and just last Friday gave my first presentation at a small regional conference (Atlanta Area BIG 2010).

In terms of content consumption, I subscribe to 8 professional email list-servs (aside from ones limited to my workplace), and read around 80 work-related blogs (it’s hard to say exactly, as I only have one Google Reader account and also sometimes it’s hard to decide what is personal and what is professional – is Language Log or Uncertain Principles something I read because I enjoy them personally, because they inform my work (they don’t, directly, but do get me thinking…), or both?) I continually feel like I’m not keeping up in some areas, but I also feel like I’m devoting too much time to this sort of keeping up as it is.

I don’t do nearly as much original research as active scholars in Classics do, but I have both a RefWorks account and an EndNote library (the latter really only so I can teach it to others; I use RefWorks for my research projects) and keep meaning to do something with Zotero.

Note I haven’t mentioned non-internet methods of acquiring and distributing information, like talking to colleagues, teaching classes and one-on-one sessions, browsing the stacks of the library, and keeping up with print periodicals.  I do almost all of those, too.

Am I managing all this well?  In terms of consumption, am I finding what I need and keeping up with areas I am most interested in, while weeding out irrelevant-to-me information?  In terms of production, am I reaching my target audience (have I defined a target audience?) with the information and messages I want to convey?  Am I useful to my target audience?  How can I tell?

To further my thinking in these areas I’m going to start having a series of conversations with scholars – hopefully people of different generations who are in different places in their careers – about how they manage information related to their scholarly work and identity.  With their permission, I’ll write up accounts of our chats and post them here, in hopes that I can start answering some of these questions for myself, and help those around me solve any information management problems they may be having.