Panel 1 Discussion
Discuss Panel 1 presentations here. Feel free to start a new topic.
Thanks to both Frank and Julieanne for their presentations. I thought I’d post a couple of observations and questions while we await comments from Rafael.
First, Julieanne, we all want to congratulate you (and Tim Dolin) for creating “the world’s largest database.” That is a remarkable achievement. I also appreciate your reference to the Endings Project. I realize now that we should have thought more carefully about the long-term sustainability and preservation of the project from the start of WMR. Of course we were so caught up in building it that it was hard to see beyond our noses.
I appreciate the way you (Julieanne) raise the question of what to do with our “discovery” (in a sense) of ordinary readers through making circulation records publicly accessible. The Charlton example, as well as the Felski and Naish Gawen references, are good reminders of the complexity involved in making claims based on otherwise obscure reader/borrowers. A couple of questions come to mind. First, do you see any patterns in Charlton's overall borrowing record that suggest a characterizable approach to reading, or does it simply seem pointlessly eclectic? Second, how much does the nature of the collection in Lambton, as well as his access to other sources of print in a small mining community, matter? Do you think he was eclectic because there was not enough material in the collection and elsewhere in town to sustain a more studied course of reading? (I’m thinking here of some of the “infrastructure” issues Christine Pawley raised in her presentation [Panel 3]).
Frank’s presentation reminded me of other work he’s done that stresses the significance of books as objects, a point also raised in the presentations here by Mark Towsey, Matt Sangster and Katie Halsey, Edmund King and Shaf Towheed. When we were constructing WMR we made room for details about the physical character of individual books. My question for Frank (and others) is whether there is a way to link the information gleaned from texts to their circulation records in ways that facilitate systematic and even quantitative analysis.
I'll have more to say but this will do for starters.
I think these two talks work so well together. Thanks so much Julieanne and Frank for sharing your long-term experience with virtual library systems and how you are thinking about your future. Julieanne's comment at around the 9:50 mark about the importance of centering the user(s) is crucial, and perhaps something we come to appreciate even more as time goes on. For many of us those users are scholars, educators/students, and the longed-for, but often vague "broader public." Each has very different wants and needs. Because many of us designing these projects fall in the first user group, we often end up designing for what we know we want / need and then end up retrofitting for the other audiences. I'm struck by how we have two ways to go in re-invention: with the interface and with datasets. With interface rethinking, are we still designing for scholars? Are we thinking about the types of interfaces that our target group feels comfortable with? If we want a broader public, should our projects look more like Goodreads or Amazon and less like a university ILS? One of the things that has struck me in moving to the American Philosophical Society is how in 2021 we are much more focused on open data than I think we were a decade ago. This affects not only how people expect to access data but also the ethics around its creation. (I'm still struck that scholars 100 years from now can't do the projects we have done because privacy laws have stripped away so much of the borrowing information of contemporary readers, although Jennifer's work gets us around that a bit, but I digress.) At the APS, we create the open data set first and then, sometimes, we make an interface for it. But the first and foremost deliverable is the csv file, not the website. How has that new reality conditioned the expectations of users who come to our projects now?
And while I love the success that Julieanne has seen with the open data / data visualization tool reboot, I still fear the data literacy required and learning curve for it.
So very glad to be able to have these conversations with fellow practitioners facing common issues!
Sorry to join in as discussant a bit later than I hoped. My partner had her second vaccine dose on Sunday and was knocked out a bit longer than we expected, so I was on intensive child duty.
I thought both presentations were really interesting, and my first question is how each speaker settled on their book or user as a subject of research. More precisely, I wonder whether the affinity analysis led Frank to The Tinkham Brothers' Tide-Mill or whether the affinity analysis took place because of his interest in the book. Likewise, I found Mr. Charlton to be a very interesting research subject. The main point I'm angling towards is this: how do you make sense of the information that comes from particular individuals from times long gone, and what strategies have you followed to parse through your users?
For Julieanne, I have another question: do you think database preservation would benefit from separating the dataset (which could be flatpacked or otherwise designed for preservation) from the tools used to extract information from such a dataset (which would then be different according to the user base or changes in technology)?
I don't want to take up much more space, but thought this might help us spark a couple of lines of conversation. Thanks a lot to both of you for such interesting talks, I really look forward to this chat.
I'm quite interested in Zora Clevenger, whom Frank mentioned in his talk. I'd look him up in your book, Frank, but it's on the shelf in my office, from which I have been banished all these many months. But I seem to recall that he was quite the schoolboy athlete, who grew up to be a legendary athletic director at Indiana University. We at WashU have not paid much attention to the occupation data in the database; however, your talk prompted me to go back and look, and to reconsider what I had concluded regarding Clevenger's boyhood reading, which I had dismissed as typical boyhood stuff, Alger and all of that.
Now, realizing after looking at the census data that Clevenger rose above his father's station in life, I'm tempted to construct some sort of narrative which says, "See, he read all that hortatory be-a-good-boy-and-get-ahead stuff, and it worked!" Which temptation I'm going to resist, for all sorts of reasons, all of which are obvious.
Still, it does suggest the larger problem of drawing inferences from this data. What sort of histories will this data support convincingly, and which conclusions are too speculative to bear appropriate scrutiny? I believe we received a caution in this regard from our keynote speaker, but I'd also be interested in hearing your thoughts.
Thanks for the pointer to The Endings Project, Julieanne. I think it will prove useful for an unrelated project or two, where we've been struggling with such matters.
I did poke around on the Common Reader site, although in an unstructured and informal way. I was curious to see which of the authors in our Southern corpus, or in Muncie more generally, were in one of the libraries in the Common Reader. It's a lovely interface, and quite convenient!
The point Kyle raises is a key one for us too. Totally agree on the open data point. I've come around to the idea that an accessible .csv file is the most useful output for scholars. And maintaining a flat file as Rafael suggests seems like it is, and should be, a basic step for all projects. But questions of how best to design an interface for other groups of users (students, the public) are challenging, especially given how labor intensive and costly that can be. Since we want to reboot WMR I'm interested in hearing other thoughts on these questions.
Rafael -- Thank you for your comments. The choice of The Tinkham Brothers' Tide-Mill as an exemplar emerged from the 2020 annual conference of the College English Association where I was a speaker. The generic title of the conference was "Tides", a strange choice to say the least! "Tide" or "Tides" had to appear in the title of each of the selected talks. And so, I fed the word into the WMR mine shaft and, heigh ho, off to work I went with Trowbridge's novel as my source to dig dig dig! The idea had been that I would try to elicit responses from my CEA audience and then report these to the WMR Workshop the following month. Because of the Covid pandemic both events were cancelled or postponed.
As it turned out, The Tinkham Brothers proved to be an excellent choice since it allowed me to review the findings of the database, and query further some of our results. It was the first time that I explored a single book from the MPL in such detail. I found that new resources that had not been available when we compiled the database made it possible to extend our hit rate of identifiable library patrons (I gave the example of John Russey). In addition, Trowbridge's memoir My Own Story had become accessible online via Google Books, and his remarks on the writing process and reception of his boys' stories, which I quoted in my talk, helped link writer to reader. That The Tinkham Brothers had a smallish borrowing record of fewer than one hundred hits made my mining of the book manageable within the context of my presentation to the Workshop.
You also ask about "whether the affinity analysis took place because of [my] interest in the book." Here, I am indebted to Steve Pentecost for demonstrating to us how this feature can enhance the usefulness of the database by allowing users to see at once rather than through a tedious linear search what other books a borrower of Trowbridge's novel also borrowed. The bubble sizes will also denote the overall "popularity" of these other books, and we should come away with a broader and deeper understanding of readership trends. I do hope that the updated WMR will include this important feature.
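For readers who have not seen the feature, here is a minimal sketch of the idea behind an affinity analysis as described above. It is not the WMR implementation: the loan records, patron ids, and the function name `affinity` are all hypothetical. For a target title it reports, for every other title, how many of the target's borrowers also borrowed it and how many borrowers it had overall (the figure a bubble size could encode).

```python
from collections import defaultdict

# Hypothetical (patron_id, title) loan records; not real WMR data.
loans = [
    ("p1", "The Tinkham Brothers' Tide-Mill"),
    ("p1", "Book A"),
    ("p2", "The Tinkham Brothers' Tide-Mill"),
    ("p2", "Book A"),
    ("p2", "Book B"),
    ("p3", "Book A"),
    ("p3", "Book B"),
]


def affinity(loans, target):
    """Map each co-borrowed title to (shared_borrowers, total_borrowers)."""
    borrowers = defaultdict(set)
    for patron, title in loans:
        borrowers[title].add(patron)

    target_patrons = borrowers.get(target, set())
    result = {}
    for title, patrons in borrowers.items():
        if title == target:
            continue
        shared = len(patrons & target_patrons)
        if shared:  # keep only titles with at least one borrower in common
            result[title] = (shared, len(patrons))
    return result


related = affinity(loans, "The Tinkham Brothers' Tide-Mill")
for title, (shared, total) in sorted(related.items()):
    print(f"{title}: {shared} shared borrowers, {total} borrowers overall")
```

Counting distinct borrowers rather than raw checkouts is a design choice made here for simplicity; a project could equally weight by number of transactions.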
Finally, you make reference to "Mr. Charlton", and I am left wondering whether that relates to Julieanne's presentation rather than to mine. Let me know if I missed something here.
Julieanne, Frank and Jim, thank you for a wonderful panel; you have given us lots to think about, and in a very enjoyable way. I’ve particularly been pondering the meaning of “eclectic” reading. My immediate reaction to the use of this term was that today many of us might think of ourselves as eclectic readers, and that this is a reflection of our many-faceted lives. Research as well as our own experience tells us that people read in and for many different circumstances: for entertainment, information and education certainly, but for lots of other reasons, too. We read as therapy of various kinds (in sickness and bereavement, for instance), for devotional purposes, to put ourselves to sleep at night, to pass the time on long journeys and in doctors’ waiting rooms, to escape our current circumstances, at the beach, in the bath and so on and so on. To me, “eclectic” has a random sound, but given the many different aims and circumstances for individual reading, over the course of their lives people could choose a great variety of reading materials with distinct purposes in mind. Far from aimless and random, their choices could be very pointed.
We also read for social purposes, as sociologists (especially Elizabeth Long) have pointed out, but maybe less now than in the past. These days we probably don’t often read aloud to or with other people unless they are children, or ill. But in the C19 and until at least the 1920s, reading with and to others was quite common: husbands to wives as they sewed at night, friends or siblings together reading the same novel, even people reading to workmates. A hundred years ago someone might check a book out of the library with just such a social purpose in mind, and this too might explain why their choice doesn’t seem to fit our preconceived idea of what they were likely to be reading, given their demographic characteristics. In addition, some public libraries in the past restricted borrowing cards to a certain number per household rather than per individual. So again, patrons might choose books for their maximum appeal to a group.
Final thought here: people choosing a variety of different books might actually reflect the fact that they were faced with more, rather than less choice.
Christine -- As I might have hoped, you extend the thought processes of our discussion. Thank you for doing that so adeptly. Reading aloud or reciting of poetical works has a long history dating back at least to Homer. I recall when I was a schoolboy in London listening to a wonderful BBC reading of Paradise Lost by a host of major actors, including Michael Redgrave. Once a week and through the poem's twelve books, they brought the epic alive in a way that was almost impossible or at least far more difficult through silent reading. I still have a smile on my face when I think back to those live readings, and, as a teacher, I would encourage my students to read aloud, particularly in the case of Milton but also with other authors, whether waiting at the bus stop or in their digs. That tradition of reading aloud was carried beyond poetry through books that were borrowed from public libraries. Likely, in the nineteenth century, there would have been one room (a sitting room or parlor) in an everyday home where there would have been sufficient lighting for reading aloud to other family members in the age before electric lighting. We cite the example of Robert Maggs's family in What Middletown Read (the book rather than the database!). David E. Nye's still useful book, Electrifying America (1992), which concentrates on Muncie, gives a context to this. It's too easy for us to take the light switch for granted, and to forget that a little over a hundred years ago, and for many communities far more recently, that facility was not yet available. One of the attractions of the old Muncie Public Library, which stayed open late, was that it was quite adequately lighted. I'd be most interested to learn from other participants whether there is more recent research on reading aloud and the development of lighting, particularly in the context of library borrowing and use.
Thank you all for these thoughtful questions, and Frank for your excellent presentation, and my apologies for the delay in responding. I have dug my way out from under a huge pile of undergraduate essays and can finally contribute. Frank, it is wonderful to see the continuing work happening on the WMR database, especially in relation to the affinity analysis. The possibility of linking the database to full text editions of works will be fantastic, too - I wonder if it’s possible to find a way to easily export, say, a corpus of works borrowed by a particular patron, or demographic? And it’s wonderful that you are able to improve the demographic data in terms of identifying patrons. We simply don’t have the resources to do this for the ACR as it stands and I’m very aware of the limited extent of demographic data in our database.
Jim, in response to whether Charlton’s reading was characterizable, I would agree with Christine in that it appears that Charlton may have been borrowing for many different purposes - some of them familial and sociable. But if I had to describe it, I would say that he tended towards novels that we would describe today as popular - he borrowed 9 different horse racing novels by Nat Gould - but that these popular novels, as I mention in my paper, were often tackling serious topics. There is a lot of adventure and romance in there (often both in the same works). He was borrowing authors that a lot of his peers also borrowed, and during a similar time period. There are 1,200 titles listed in the database for the Lambton Institute, so some infrastructural constraint is likely at play here, too.
Christine, I love your thinking around eclecticism. I think we often regard past readers’ habits as eclectic because of shifts in how we categorise books. Charlton’s reading perhaps looks more eclectic from our perspective now than it would have at the time, because the ways that we distinguish literary works from one another in terms of value (popular vs serious), genre (romance vs adventure) and gender (books for boys and girls, men and women, as your own work, Christine, has shown) have solidified since the late 19th c. I agree that many of us remain eclectic readers - especially literary studies academics, who all have our guilty pleasures (mine is hiking memoirs and guidebooks).
Rafael, I came to Matthew Charlton by accident. I was using cluster analysis to look at works patrons at Lambton borrowed in common and found an interesting cluster surrounding a voluminous borrower, Martha Charlton. I then looked at other people connected to her, came across Matthew, and a Google search revealed his interesting political career. That’s what’s so wonderful about such databases - they can be explored, rather than just searched in a targeted way.
In relation to separating the csv data from the interface: this is effectively enabled by the capacity to export to csv, which was my number 1 priority in reworking the ACR interface. But you’re right in that we should perhaps think of our interfaces as ‘for now’ rather than ‘forever’.
Frank, I was interested to note your comments about privacy. The last time I gave a public talk about The Australian Common Reader there was some concerned discussion from librarians present in relation to the ethics of looking at past readers’ habits - they weren’t aware that their borrowing would be dissected by researchers so far in the future! Apparently when the ACR was relaunched there was some consternation in library circles online about this issue. I explained that library borrowing registers were a little more ‘public’ then than now, but the question of privacy hadn’t really occurred to me to that point and I’m curious to hear what others think about this.
One of the strengths of What Middletown Read from the beginning has been that it encourages users to understand and explore the primary sources through linked digital images of manuscript ledger books. This is worth celebrating and sustaining in future work.
I have been thinking about what I tend to want as a data consumer of resources like WMR, while also thinking about Melanie Walsh's question in Panel 3 about data-work versus close analysis, and Jim Connolly's invitation to consider which is the starting point and focus and which the background. I've also been thinking about Christine Pawley's wise advice in her presentation, which in general terms I understood to remind us that we ought to be careful to keep looking out for what we don't yet know we don't know.
Although I tend to prefer not to think there is (or should be) a sharp divide around computation, quantification, and data, there is perhaps a characteristic set of temptations peculiar to seeing something as data. The kind of standardizing that's necessary for analysis -- or even for access -- can obscure the messy heterogeneity that sometimes is our best hope for learning that we have been missing something, when the surprises of unexpected context are allowed to make themselves known.
Standardization has a lot to offer; we're usually ultimately interested in people rather than their names, and it's helpful to get distinctions between "Geo." and "George" out of the way so that we can aggregate information about the same person. This standardization isn't just for the sake of quantitative analysis; it's also about access points for "close analysis," helping researchers not to miss things that might otherwise be missed. And in fact the abstraction involved in turning the identity of a book or a patron into a mere number isn't new; they were doing that in the 1880s and 1890s. The WMR database was built on a data bureaucracy a century before it became a historical digital project.
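To make the "Geo." versus "George" point concrete, here is a minimal sketch of one way to standardize a name for aggregation while keeping the ledger spelling alongside it. The abbreviation table and the function name `normalize_name` are illustrative assumptions, not anything taken from the WMR codebase; a real project would curate its own table from its sources.

```python
# Hypothetical abbreviation table for illustration only.
ABBREVIATIONS = {
    "geo": "George",
    "wm": "William",
    "chas": "Charles",
    "jas": "James",
}


def normalize_name(raw):
    """Return (standardized, raw) so the original ledger form is never lost."""
    expanded = []
    for token in raw.split():
        key = token.rstrip(".").lower()
        # Expand known abbreviations; leave every other token untouched.
        expanded.append(ABBREVIATIONS.get(key, token))
    return " ".join(expanded), raw


for entry in ["Geo. Smith", "George Smith", "Wm. Jones"]:
    standardized, original = normalize_name(entry)
    print(f"{original!r} -> {standardized!r}")
```

Returning both forms is the point of the sketch: the standardized name serves as an access point, while the raw spelling stays available as evidence.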
When we first aggregate data to look for larger patterns, we often newly notice discrepancies and inconsistencies, and understandably want to regularize and standardize some more. There is value in this.
Yet not everything can be standardized. One of the most common kinds of data is missing data. Sometimes we have excellent evidence that something is missing, but not-so-good evidence exactly what it was that went missing.
I can offer an example that was marginally relevant to our presentation at this conference. It's interesting to me that Albion Tourgée's fictionalized accounts of Reconstruction were read in Muncie in the late 19th century -- we should read him more today! -- but there's good evidence in the Muncie records that resists being made easily into data. If we use the default WMR web search to find "Tourgee," at first it looks like five books (three titles) were in circulation with just 110 checkouts altogether. He's present, but doesn't seem all that significant.
But if we search for "Tourgée" (with the e-acute) or for "Albion," we'll see eight books (five titles) accounting for 191 checkouts. If we then check "include non-circulating records" in the web interface and search "Tourgee," we will see accession records for 12 books. Between 1880 and 1901 the library acquired three copies of A Fool's Errand, two copies of Bricks without Straw, and two copies of An Appeal to Caesar. This suggests Tourgée was at least a steady presence for librarians making acquisition decisions from the 1880s, and that perhaps he was read enough to justify purchasing additional copies while earlier copies were still in service.
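The gap between the "Tourgee" and "Tourgée" results is the kind of thing accent folding can address. The sketch below is an assumption about one possible approach, not a description of how the WMR search works: it decomposes each string with Python's standard `unicodedata` module, drops combining marks, and compares case-insensitively. The catalog strings are hypothetical stand-ins.

```python
import unicodedata


def fold(text):
    """Strip combining marks and case so 'Tourgée' and 'Tourgee' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def search(records, query):
    """Return every record containing the query, ignoring accents and case."""
    folded_query = fold(query)
    return [record for record in records if folded_query in fold(record)]


# Hypothetical catalog entries with inconsistent spellings.
catalog = [
    "Tourgée, Albion W. A Fool's Errand",
    "Tourgee, Albion. Bricks without Straw",
    "Fleming, Mary Agnes. A Mad Marriage",
]

print(search(catalog, "Tourgee"))
print(search(catalog, "Tourgée"))
```

Both queries return the same two entries here. Folding at search time, rather than rewriting the stored records, leaves the original spellings intact.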
The web interface for Acc #8082, one copy of Bricks without Straw, lists the number of checkouts as "0 (indeterminate)." Following the handy "View page" link takes us to an image of the accession catalog page where we can see that when that copy of Bricks without Straw was acquired in 1892, accession #8082 was mistakenly assigned to Tourgée's novel and again to Mary Agnes Fleming's A Mad Marriage immediately after it.
At this point, the current interface doesn't offer us any further leads. We have an older version of Ball State's data that indicated 253 transactions had been attributed to #8082. That's a high number -- if this were a single book, it might rank in the top 25 most popular over the 1890s.
I don't see a way to disentangle things further. There is very little overlap between readers of Tourgée and Fleming otherwise; the uncertain records do overlap with both, which suggests both books may have been circulating with the same accession number.
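A small sketch of how a case like #8082 might be surfaced rather than hidden: find accession numbers attached to more than one title, then label the transactions that point at them as ambiguous while keeping both candidate titles. The row layout, the neighbouring accession number, the patron ids, and the function names are hypothetical; only the #8082 pairing comes from the accession catalog described above.

```python
from collections import defaultdict

# Hypothetical accession rows: (accession_number, title).
accessions = [
    ("8081", "Some Other Title"),
    ("8082", "Bricks without Straw"),
    ("8082", "A Mad Marriage"),
]


def shared_accessions(accessions):
    """Return {number: [titles]} for numbers assigned to more than one title."""
    titles = defaultdict(set)
    for number, title in accessions:
        titles[number].add(title)
    return {n: sorted(t) for n, t in titles.items() if len(t) > 1}


def label_transactions(transactions, shared):
    """Attach a status and candidate titles to each (patron, number) loan."""
    labelled = []
    for patron, number in transactions:
        candidates = shared.get(number, [])
        labelled.append({
            "patron": patron,
            "accession": number,
            "status": "ambiguous" if candidates else "resolved",
            "candidate_titles": candidates,
        })
    return labelled


shared = shared_accessions(accessions)
# Hypothetical loans for illustration.
for row in label_transactions([("p1", "8082"), ("p2", "8081")], shared):
    print(row)
```

Nothing here resolves the ambiguity; it only keeps the 253 uncertain transactions visible as a distinct category instead of dropping them or silently assigning them to one book.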
Sometimes the first thing we want to do with data is to look for ways to drop the parts that don't fit, and can't be made to fit. But the missing data is potentially evidence, too, and it's important that the records offer us the chance to become aware of it.
So one thought for future work is to aim to build on the solid foundation of WMR as a history-conscious project, and look for ways to preserve and communicate the missingness of missing data and the uncertainty of uncertain data. It would be a mistake to offer just one convenient web interface and one convenient download interface that are consistent with each other as data, at the expense of accurate representation of the self-inconsistent historical record. Instead, helpful inconvenience might be a better goal. Or helpfully planted obtrusive pointers that acknowledge context -- through interfaces, file formats, and data fields that try to alert researchers to the inconveniences that they probably won't be looking for, but that will be essential to understanding what is and is not in the record, and the limits of what can be inferred from it.
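As one illustration of data fields that alert researchers to uncertainty, here is a sketch of a flat-file export in which an indeterminate count is left empty and explained by a separate status column and note, rather than being written as 0. The column names, the second row, and the function `to_csv` are assumptions made for the example, not an existing WMR or ACR export format.

```python
import csv
import io

# Hypothetical column layout: the count and its status are separate fields.
FIELDS = ["accession", "title", "checkouts", "checkouts_status", "note"]

rows = [
    {
        "accession": "8082",
        "title": "Bricks without Straw",
        "checkouts": "",  # left empty on purpose: unknown is not zero
        "checkouts_status": "indeterminate",
        "note": "accession number also assigned to A Mad Marriage",
    },
    {
        "accession": "0001",  # hypothetical row for contrast
        "title": "Hypothetical Title",
        "checkouts": "12",
        "checkouts_status": "recorded",
        "note": "",
    },
]


def to_csv(rows):
    """Serialize rows to CSV text with an explicit status column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(rows))
```

A researcher summing the `checkouts` column would hit the empty value and have to decide what to do with it, which is the "helpful inconvenience" in miniature.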
Even the parts that resist easy quantitative analysis are going to be necessary for good quantitative analysis sometimes.
Thanks Doug, for a very helpful comment that will inform our thinking about next steps for our work in developing WMR. I remember when we saw Tourgee's book coming up as widely circulating, a somewhat surprising--and exciting--development given its racial politics. Discovering that error was a reminder of how messy the records could be.
As we developed WMR, there was a tension between the need for efficiency, which sometimes meant not dwelling on the contextual material too much, and collecting details and evidence from outside the records that provide a sense of the context. That's a tension we hope to lessen, though not eliminate, as we revise the project. One ambition we have for this workshop is that it will help us sharpen our thinking about how best to do that.
Steve -- Belatedly following your post from 4/22, we have outline details of Zora Clevenger's life in our book (p. 202). Looking at the record of his reading, we concluded that little of this gave "any sense of a reader prepared to explore beyond what was conventional among his peers." In other words, his borrowings from the MPL don't point in any significant way to his later career as an athlete. We speculate that summer gaps in his borrowing record suggest that "he was probably engaged [elsewhere?] in summer sports." Very few of the MPL's common readers achieved national eminence, though one can cite Clevenger (sports), J. Otis Adams (artist, who likely used the library before our records begin), Benjamin V. Cohen (politician in FDR's admin. but born in 1894 so too young to enroll in the old MPL, though his father, Moses, was an active reader), and now Harreld Kemper (musician; concert pianist) as evidence that -- to quote your post -- "all that hortatory be-a-good-boy-and-get-ahead stuff,...worked!"
@douglasknox I wholeheartedly agree, and the question of how to retrofit this understanding into The Australian Common Reader is a vexed one for me, because unlike WMR we don't even have links to image files of loan records etc. I have found it interesting, doing collaborative work on other kinds of data with colleagues in literary studies, how high their expectations are about how far the data can be made to be clean and exact - to reveal the truth about the thing. Because I learned to work with data using library loan records, a certain level of messiness and uncertainty has always been baked into the process for me. Your point here is akin to what Lisa Gitelman argues: that data is not neutral, but rather is ‘always already “cooked” and never entirely “raw”’. Katherine Bode has some good things to say about this too - that a dataset is a form of argument, and we should be aware of our own aims and prejudices in constructing them.