Panel 2 Discussion
I have thoughts on all three of your preceding questions. On preservation, our university archives now uses a web archiving tool (Archive-It, https://archive-it.org/) to capture online resources. This will ultimately include this site. It's a convenient way to capture and preserve these kinds of projects as they evolve. This is separate from preserving the underlying data and code.
Re: a virtual library space, that's something we've toyed around with a bit. It's important to remember, as Christine has noted, that the library was a physical space and that mattered. I work with a colleague whose shop, the Institute for Digital Intermedia Arts (IDIA: https://idialab.org/), does virtual reality and 3D animation design work (they put together this website too). We explored doing a library space but ran into the same prohibitive cost issue. There is also the challenge of credibly recreating these historical spaces. Nonetheless, this is still an ambition of ours. Note that this team is a set of 3D artists, not computer science students, so their design chops are probably stronger. They operate at the intersection of art and coding.
Finally, Frank and I had great success employing student researchers to complete almost all of the data collection/transcription. We had an evolving team of grad students and undergrads, with the most experienced grad students supervising undergrads. We also had a pretty rigorous system of cross-checking the data entry, with at least two sets of eyes on each entry, which helped a great deal with accuracy.
@mikesanders A .pdf is better than nothing, but it's not all that useful for creating the kind of living, collaborative legacy your cathedral metaphor is looking towards. It's considerable work to turn a flat .pdf back into data a machine can use well, and digital projects should be looking to create reusable data so that the work they do doesn't need to be repeated. As a community, we should be working towards a rich archive of openly-available sources that can be used all over the world to facilitate forms of scholarship previously restricted to those able to access major research institutions and/or large-scale grant funding.
The model I'm keener on is the one that both Books and Borrowing and C18th Libraries are aspiring to: drawing on good templates and conventions from existing sources (such as the ESTC) to ensure interoperability where possible (working towards the ideal of linked open data, while recognising the difficulties this presents in practice), and thinking from the outset about creating a data set and an interface that are separable (with the data set archived at key stages for the use of others). The cathedral metaphor falls down a bit for digital objects because once they exist, they're quite easy to duplicate and use in different places at the same time as their original locations. Kyle's presentation is really good on the ways that adding to Dissenting Academies Online has benefits in terms of power, but also drawbacks in terms of increased complexity. I think what we want with Books and Borrowing is to create a really good custom interface for exploring the Scotland-focused data we create, but also to create openly-available data that can be taken up in whole or part by other projects and institutions. It would be great, for example, if our partner libraries could use the data we're producing and mapping against their existing collections to enrich their catalogues. We're also in continuing dialogue with Mark's project (on which I'm one of the Co-Is) to ensure that the data sets the two are producing can talk with one another, so that there's potential for creating an overarching borrowing database using the two data sets as starting points, but building on these using shared conventions. This won't mean taking down the separate databases, necessarily - indeed, it might be advisable to keep them up for the purposes they were intended for, as (again as Kyle's presentation shows) adding data together often requires reworking and changes of focus.
Ideally, though, it might allow the two projects to be used as the transepts of a mighty borrowing cathedral while still existing as things-in-themselves.
@msangster "As a community, we should be working towards a rich archive of openly-available sources that can be used all over the world to facilitate forms of scholarship previously restricted to those able to access major research institutions and/or large-scale grant funding."
Really interested in this proposal - but am wondering does this require some kind of co-ordinating body or can it emerge more spontaneously?
@mikesanders I think part of this follows from the expectations now placed on grant holders by funding bodies, i.e. to develop robust data management plans that reassure assessors that the data will be sustainable and open access. But it also follows from the sort of conversations and plans we're developing in this great workshop (thanks again Jim and team!), and in other such venues. Something like Matt's proposal was one of the main aspirations that I took from an AHRC-funded network on community libraries many of us were involved with earlier in the 2010s, and it's been really exciting how the conversations we had then continue to connect many c18 library projects now (both those that were already around then, and those that have been funded much more recently).
A couple more follow-up questions to Katie & Matt, & Kyle:
@katiehalsey @msangster I wonder if I could ask you to say a little more about the possible tensions between recording data in a way that is meaningful for the C18 and the C21? I imagine this is a problem which arises frequently in historically focused research, so if there's anybody else out there with examples from their own projects please join in too!
@kyleroberts6 you mentioned that the involvement of a librarian in phase 2 of your project brought another way of thinking about the data & I wondered if you'd given any thought to the ways in which other disciplines might bring similar critical yields? Again, opening this question up more generally, is there an 'ideal' combination of disciplines that need to be involved in projects?
@mikesanders: I'm not sure that a single co-ordinating body would work - there'd always be things they'd miss, scholarly projects don't necessarily fit neatly with rigid systems, and there are a lot of issues with linguistic and cultural variation that any given constitutive body might struggle with. I think what's emerging at the moment, though, is a looser body of conventions and good practices that will help with this. Libraries have a lot to teach us in this area - academics considering projects like those we're discussing would be well advised to take a look at the assumptions underpinning standards like MARC, General International Standard Archival Description (ISAD(G)) and Dublin Core.
@mikesanders: Trying to maintain fidelity to eighteenth-century knowledge systems while also creating data that will perform the functions C21st users will expect is a tricky balance - there is a tension there, and as a mediator of the data, you have to be aware you're taking a stand (@Julieanne-Lamond has some great observations on this in the thread for Panel 1). I think the key thing is a kind of triangulation between the fact that your team will develop especial expertise with your data that you need to communicate; facilitating access for new users by using schema they can quickly pick up; and making sure that you fully document what you're doing so others know what assumptions have underpinned data creation and what your system will find or ignore when certain filters are used. Ideally, you should also make your data available for reconfiguration by others, so those your assumptions won't serve well can amend them, or draw in other systems of organisation using your data's openness to pattern matching.
@mikesanders Thanks for the questions and apologies for the delay in responding. The beauty of library projects is that they open up to so many different disciplines. Factoring in the end goal is important, of course. Dissenting Academies Online was created to write a new history of dissenting academies and the co-PIs were a historian and a literary scholar, so perhaps not surprisingly, the postdocs on the project mirrored that. Given that the data we worked with (MARC records) came out of the world of library and information sciences, it was good to have a co-PI who was a librarian as well, but in the first generation, the focus really was always on humanities end users. A decade later, there are many more digital humanists out there, who I think can hit that balance between content and structure. New projects starting today would benefit from having at least one self-identified DHer on the project.
I'd love to add my two cents to your question to Katie and Matt as well. It's such an important question. Our data is from the 18thC but our users live in the 21stC. We want to keep the integrity of the former, but if we want actual users then we have to make the project accessible to them. This has really come up for Ben Bankhurst (Shepherd University) and me on a project which came out of a joint-taught course on the American Revolution. The Maryland Loyalism Project (loyalismproject.com) is a digital archive and database of the Black and White Marylanders who identified as Loyalists during the American Revolution. We built the project with our students, for whom we felt it was essential that they have access to digital surrogates of original 18th century documents (in this case, the Loyalist Claims Commission and the Book of Negroes) but also that they be able to navigate the information in those records in the ways that 21st century folks do. So the digital archive has scans of each document with diplomatic transcriptions, structured on the format of the original manuscripts. The database, on the other hand, extracts all manner of information about individuals named in those records and allows people to search and make connections between them. Right now we are working on restoring the names of the enslaved women and men who are listed simply as unnamed property in the original records but each of whom has an individual database record, as one small step to restore their humanity.
I do worry that, in general, we do not bring enough non-scholars into the conversation when we design these projects, and that we lose vast swaths of potential audiences by creating projects that are simply not accessible to them.
@msangster A good point in response to @mikesanders's question. It would be an interesting exercise to map out the primary data fields for the virtual library projects out there, both for holdings and borrowings data. While the data, as I think you are saying, is going to reflect the original source data, it would also be helpful for individuals who are going to launch future projects to see where the common ground is, both in terms of fields employed and in terms of how many are basing their work on MARC records (or subsets, like ESTC records for the 18thLibrariesOnline project). This mapping would also reveal how interoperable - or not! - the current projects would be with each other.
@mikesanders Sorry for the extra long delay here! All of the above. With Dissenting Academies Online, there were a few postgraduates who were given access to the data or hired on an hourly basis to do specific tasks, but they weren't really part of the project team. I took the opposite approach when I launched the Jesuit Libraries Provenance Project when I got to Loyola (https://jesuitlibrariesprovenanceproject.com/). I started that project in a graduate seminar where 16 graduate students and I reconstructed the original library catalogue for St Ignatius College (precursor to Loyola). Syllabus for that class here: http://libblogs.luc.edu/archives/ We launched the second phase, to track down all of the original books from the library, at the initiation of several students, who served as project managers, researchers, digital technicians and the like. It was a fantastic experience for me and the students, who have gone on to a range of careers - and who get credit on the website for their contributions. The thing to keep in mind is that all of the supervision on the project was completely unpaid work for me, while I tried to ensure students received credit, and sometimes stipends, for their digital work. My students responded very positively to this opportunity and I'd encourage others to open up their projects - and to really let students play a central role in design, implementation, and analysis. I loved hearing how much @msangster and @katiehalsey had opened up Books and Borrowing to students. It's a lot of work, but it's worth it.
@kyleroberts6 - Concur on fields. For borrowing projects, there are quite a small number of fields that would be essential for interoperability: with just the identity of the book and the date borrowed (as proxy for the time of engagement) you can do a lot of the comparative work, although identifying information is more powerful if standardised across fields (i.e. disaggregating author, title and edition information, or doing this work by proxy by linking to a standard reference like the ESTC). Comparing borrowers is a bit more tricky, which is one of the reasons why for the two AHRC projects, we're working with a shared occupation taxonomy to try and make this easier to do.
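To make the point about minimal fields concrete, here is a rough sketch of how two borrowing data sets might be compared using only a shared book identifier and a borrowing date. Everything in it is an assumption for illustration: the `Borrowing` record, its field names, the helper functions and the sample records are invented, and the placeholder IDs simply stand in for a standard reference such as an ESTC number. It does not reflect the actual schema of any of the projects discussed here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Borrowing:
    """Hypothetical minimal borrowing record (field names are assumptions)."""
    book_id: str       # shared identifier, standing in for e.g. an ESTC number
    borrowed_on: date  # date borrowed, used as a proxy for time of engagement
    source: str        # label for the project the record came from


def shared_books(a, b):
    """Return the book identifiers that appear in both data sets."""
    return {r.book_id for r in a} & {r.book_id for r in b}


def loans_per_year(records, book_id):
    """Count borrowings of one book per calendar year."""
    return Counter(r.borrowed_on.year for r in records if r.book_id == book_id)


# Invented sample records for two hypothetical projects.
project_a = [
    Borrowing("ID-001", date(1770, 3, 1), "A"),
    Borrowing("ID-001", date(1771, 5, 9), "A"),
    Borrowing("ID-002", date(1770, 6, 2), "A"),
]
project_b = [
    Borrowing("ID-001", date(1770, 11, 20), "B"),
    Borrowing("ID-003", date(1772, 1, 15), "B"),
]

common = shared_books(project_a, project_b)
combined = loans_per_year(project_a + project_b, "ID-001")
print(common)
print(combined)
```

Comparing borrowers would need more than this, since occupation and similar fields only line up across projects once a shared taxonomy is in place.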
@kyleroberts6 I think you raise a vitally important point about the need to engage with people beyond the academy. Clearly, there are very many good reasons for doing this, but one which I suspect is going to become increasingly (and depressingly) important is the need to explain the value of Humanities research and if we're working with extra-academic communities from the outset, then this will be much easier.
A general question for people involved in this panel - is there a difference between extra-academic engagement with particular institutions (often those we collaborate with on a given project) and engaging the 'general public'? Anybody like to share examples of these activities?
I'm not sure if this directly answers your question, but one advantage of digital humanities work is that almost all of the work you produce reaches both academic and public audiences. In the case of WMR, we've had plenty of academic (research and teaching) usage but we've also drawn people just curious about reading habits, or even people using it as a genealogy resource. That can help with grant funding by allowing you to stress both the broad value for public audiences and the research value, sometimes stressing one or the other depending on the funding agent and the project's goal.
I do think there is a difference between work geared generally for a public audience and that which involves collaborating with a specific group or community. In the latter case, there are often concrete benefits to both sides of the partnership, academic and community, that can help as we seek funding. We've been involved with various projects where we've worked with labor unions, churches, nonprofits and of course the Muncie Public Library. These groups appreciate the recognition that comes with these kinds of community-based work while at the same time we've been able to work with them in ways that advance various parts of our research agenda.
@jim-connolly I'm particularly interested in the second paragraph of your response, about how we might work more with community groups. The Colored Conventions Project (not a library project, but a DH one) has done a really nice job of working with community. While Protestant Dissent looks different in the 21st century than it did in the 18th century, it would be nice to draw more connections between them and the project. With the Jesuit Libraries Project, one objective was to get members of the current day Loyola community to connect with their past, which worked very well with the student interns. Getting beyond Humanities disciplines (like to Catholic Studies) was a bit harder, given how much other competition there is within the university for students' attention.