Towheed, From UK-RED to READ-IT (2018-2021): Insights and Challenges for the Future

Shafquat Towheed
The Open University

In this paper, I will present a ‘work in progress’ update on READ-IT (2018-2021), the pan-European successor project to UK-RED, funded by the JPICH and led from Le Mans University (The Open University is a core partner). UK-RED was developed entirely as an electronic version of a 19th-century census form, with over 150 data fields and with manual entry and corroboration. The challenge for a new 21st-century European reading database is to speed up and scale up the kind of data gathering done by UK-RED through automation (we know there are millions of unharvested records out there, waiting to be aggregated and analysed), and also to account for new kinds of evidence source (e.g. data scraped from digital sources). To this end, READ-IT has been working with a test corpus rich in evidence of reading to develop machine learning tools that identify records of reading rapidly and accurately. We have been developing standardized annotation tools to facilitate rapid and more granular mark-up of texts, and we have also developed more finely tuned ontologies for the accurate analysis of reading experiences, designed to work across a long time period (18th century to the present day) and across a diverse range of languages and forms. READ-IT is being developed as an entirely open-source, open-access set of tools, with code, ontologies and resources being made available via GitHub and through project websites. In a similar vein, we continue to draw upon and develop the public crowdsourcing element so successfully established by UK-RED, creating a public contribution portal and QR-enabled postcards that allow members of the public to contribute their thoughts on and responses to reading directly and anonymously to the project.