On Radio 3’s Between The Ears

I was delighted to be invited to contribute to a BBC Radio 3 programme, Between the Ears, with an episode called “The Virtual Symphony”, celebrating 30 years of the Internet and its impact on our lives. I was interviewed for over an hour by the producer, Laurence Grissell, reflecting on my use of the internet and how it has shaped my professional and personal life, my memories of the early days of going online in the physics and computer labs, and my thoughts on how it is changing society. Kieran Brunt, the composer, wove four such interviews in with archive material and new musical pieces to produce a documentary that is also an artwork, showing how our relationship to and with the net is changing.

The Between the Ears logo, from BBC Radio 3.

It was first broadcast on Radio 3 on 18th July 2021 at 19.45; you can listen to it online, or here’s the MP3:

The official blurb goes like this! It would be great to hear what you think of it:

The joys and horrors of the internet, evoked by stories, sounds and an exciting new electronic and vocal work composed by Kieran Brunt. Opens with an introduction by the composer.

30 years ago, Tim Berners-Lee created the very first website. This powerful edition of Between the Ears explores how the internet has dramatically reshaped our lives over the following three decades.

In 1990s Glasgow, a young woman in a physics computer lab glimpses a different future for the world – and herself. In Luton, the web awakens a young man’s Sikh identity – a few years on, it will bring him riches. In 2001, a young mother in France finds escape through Wikipedia. Ten years later, an Austrian law student is horrified when he requests his personal data from Facebook…

Over four movements of music and personal stories, the Virtual Symphony moves from sunny optimism to deep disquiet, as our relationship to the internet shifts. Around these stories, composer Kieran Brunt weaves electronic and vocal elements in an exhilarating new musical work commissioned by BBC Radio 3.

Kieran Brunt and documentary producer Laurence Grissell worked in close collaboration to produce a unique evocation of the way in which the internet has fundamentally changed how we experience and understand the world.

Composer: Kieran Brunt

Producer: Laurence Grissell

Interviewees:

Melissa Terras, Harjit Lakhan, Florence Devouard and Max Schrems

Electronics performed by Kieran Brunt

Vocals performed by Kieran Brunt, Lucy Cronin, Kate Huggett, Oliver Martin-Smith and Augustus Perkins Ray of the vocal ensemble Shards

Programme mixed by: Donald MacDonald

Additional music production: Paul Corley

Additional engineering: Ben Andrewes

New Paper: Identifying the future direction of legal deposit in the United Kingdom: the Digital Library Futures approach

I’m delighted that a paper from the Digital Library Futures project has come out in the Journal of Documentation:

Gooding, P., Terras, M. and Berube, L. (2021) Identifying the future direction of legal deposit in the United Kingdom: the Digital Library Futures approach. Journal of Documentation. (doi: 10.1108/JD-09-2020-0159)

Until this paper, there had been next to no research into how users are approaching and utilising the digital collections now being amassed by our legal deposit libraries (colloquially known as “copyright libraries”) following the Legal Deposit Libraries (Non-Print Works) Regulations 2013, which enable and mandate them to collect digital copies of publications, as well as or instead of print. This paper addresses that gap by presenting key findings from the AHRC-funded Digital Library Futures project. Its purpose is to present a “user-centric” perspective on the potential future impact of the digital collections that are being created under electronic legal deposit regulations. Through our user study, we show that contemporary tensions between user behaviour and access protocols risk limiting the instrumental value of these digital library collections, which – although they have high perceived legacy value – are not being used in the way that they could be, due to access and legal restrictions.

I’ve stuck the authors’ last copy up here, so you can read it if you can’t get beyond the paywall:

Gooding, P., Terras, M. and Berube, L. (2021) Identifying the future direction of legal deposit in the United Kingdom: the Digital Library Futures approach (authors’ last copy, PDF).

Fully funded AHRC SGSAH CDA Studentship: “Slavery and Race in the Encyclopaedia Britannica (1768-1860): A Text Mining Approach”

I’m delighted to say I’ve been awarded a fully funded PhD studentship (open to international applicants!) with the National Library of Scotland, as an AHRC-funded Collaborative Doctoral Award, working with Professor Diana Paton (William Robertson Professor of History, University of Edinburgh), Dr Sarah Ames (Digital Scholarship Librarian, National Library of Scotland) and Robert Betteridge (Rare Books Curator (Eighteenth-Century Printed Collections), National Library of Scotland). Please do share this opportunity with potential students in History, Digital History, and/or Digital Humanities. An official advert will appear soon on UoE digital real estate, but I’m posting here first for expediency!

Fully funded AHRC SGSAH CDA Studentship: “Slavery and Race in the Encyclopaedia Britannica (1768-1860): A Text Mining Approach”

Application deadline – 5pm on Monday 17th May

Award – Annual stipend of £15,690 and tuition fees for 3.5 years (FTE). Open to Home and International students. (The successful candidate should reside within a reasonable distance of the University of Edinburgh during the course of their studies.)
PhD – English Literature

The University of Edinburgh and the National Library of Scotland are seeking a doctoral student for an AHRC-funded Collaborative Doctoral Award, “Slavery and Race in the Encyclopaedia Britannica (1768-1860): A Text Mining Approach”. The project has been awarded funding by the Scottish Graduate School for Arts and Humanities (SGSAH) and will be supervised by Professor Melissa Terras (College of Arts, Humanities and Social Sciences, University of Edinburgh), Professor Diana Paton (William Robertson Professor of History, University of Edinburgh), Dr Sarah Ames (Digital Scholarship Librarian, National Library of Scotland) and Robert Betteridge (Rare Books Curator (Eighteenth-Century Printed Collections), National Library of Scotland).

The studentship will commence on 13th September 2021. We warmly encourage applications from candidates who have a grounding EITHER in text and data mining/Digital Humanities, with proven knowledge and understanding of the history of slavery and/or race, OR in UG/PG study of the history of slavery and/or race, with demonstrable technical skills and an interest in Digital Humanities/Digital History methods. This is an extraordinary opportunity for a strong PhD student to explore their own research interests while working closely with a major cultural heritage organisation, on important issues regarding the legacy of slavery in our information environment.

The student will be based in the School of Literature, Languages and Cultures, at the George Square campus of the University of Edinburgh, but will also spend considerable time in the School of History, Classics and Archaeology at the University of Edinburgh, and at the National Library of Scotland. There will be a period of funded work placement at the National Library of Scotland, which will be co-determined with the student: for example, highlighting authors of articles relating to slavery and race in the Encyclopaedia Britannica, and exploring how these link to Library Collections in innovative ways.

The award will include a number of training opportunities offered by SGSAH, including their Core Leadership Programme and additional funding to cover travel between partner organisations and related events. This studentship will also benefit from training, support, and networking via the School of History, Classics and Archaeology, the Edinburgh Centre for Data, Culture and Society, and the Edinburgh Futures Institute. The student will be invited to join National Library PhD cohort activities.

Project Details

“Slavery and Race in the Encyclopaedia Britannica (1768-1860): A Text Mining Approach”

How are the impact and outcomes of Atlantic slavery represented or alluded to in historical information sources? What is the legacy of slavery in our printed information environment? What text-mining approaches can be used to identify, analyse, and visualise these diverse and problematic histories? This research will use advanced digital approaches to understand how race and slavery feature in the Encyclopaedia Britannica (EB). The first eight editions of the EB, published 1768-1860 – from the height of the UK’s involvement in the transatlantic slave trade, through the abolition of British slavery in 1838, to ongoing subsequent debates about slavery and race – contain rich content related to Atlantic slavery and to forms of racialisation that developed from it. Utilising data from the newly digitised 143 volumes of the EB from the National Library of Scotland’s Data Foundry (comprising 167m words), this research will not only provide insight into the explicit and implicit representation of slavery, the slave trade and race in this key reference material, but also develop a best-practice methodology for others wishing to use text mining to analyse race and slavery within other historical information sources.

This CDA will involve learning well-established text and data mining approaches and applying them to the EB, in a unique corpus analysis that will need to consider the intellectual and cultural context in which eighteenth- and nineteenth-century encyclopaedias were produced and published, as well as linking and cross-referencing to other information sources available within the National Library of Scotland collection. By searching, analysing, and visualising the ways in which terms related to slavery appear in this essential reference material – using a variety of methods including GIS, accurate geoparsing, and following concepts and their relationships diachronically – we will both understand more about how Atlantic slavery was understood or instantiated within our information sources, and develop a methodology for research into other similar primary reference material, and the ideas that it disseminated.
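The diachronic core of the approach described above (tracking how slavery-related vocabulary appears across editions) can be sketched in a few lines of Python. To be clear about assumptions: the edition texts and term list below are invented placeholders, and the real analysis runs over the full 167m-word corpus with scalable tooling, not a dictionary in memory.

```python
from collections import Counter
import re

# Invented miniature corpus: edition number -> sample article text.
# The real corpus is the 143 digitised volumes on the NLS Data Foundry.
corpus = {
    1: "The trade in sugar and tobacco depended on enslaved labour.",
    3: "Debates on slavery and the slave trade intensified in this period.",
    8: "After abolition, discussion of slavery turned to its legacies.",
}

# Hypothetical tracked vocabulary; the project would derive its term
# lists from historical research, not hard-code them like this.
TERMS = {"slavery", "slave", "enslaved", "abolition", "sugar", "tobacco"}

def term_counts(text: str) -> Counter:
    """Lower-case, tokenise on alphabetic runs, and count tracked terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t in TERMS)

# Diachronic view: tracked-term frequencies per edition, in edition order.
trajectory = {ed: term_counts(text) for ed, text in sorted(corpus.items())}
for ed, counts in trajectory.items():
    print(f"Edition {ed}: {dict(counts)}")
```

The same per-edition counting generalises to following concepts and their relationships over time, which is where the geoparsing and GIS methods mentioned above come in.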

This is a timely topic, of significant relevance, given increasing interest in decolonising academic and cultural institutions. This project will have scholarly impact in Digital Humanities, History, and Library and Information Science, as we consider how to analyse, deconstruct and decolonialise historical information sources using computational methods, as well as contributing to discussions and policies at the National Library of Scotland on this topic.  

Eligibility

At the University of Edinburgh, to study at postgraduate level you must normally hold a degree in an appropriate subject, with an excellent or very good classification (equivalent to first or upper second class honours in the UK), plus meet the entry requirements for the specific degree programme.

In this case, applicants should offer a UK masters, or its international equivalent, with a mark of at least 65% in a dissertation of at least 10,000 words.

The AHRC also expects that applicants to PhD programmes will hold, or be studying towards, a Masters qualification in a relevant discipline; or have relevant professional experience to provide evidence of your ability to undertake independent research. Please ensure you provide details of your academic and professional experience in your application letter.

Experience in the study of the history of slavery and/or race, prior experience of digital tools and methods, an understanding of digitisation and the digitised cultural heritage environment, and use of quantitative research methods including text and data mining of historical sources, will be of benefit to the project.

The AHRC requires that students reside within a reasonable distance of their HEI as a condition of funding, although Covid disruption could be taken into account in the short term.

Application Process

The application will consist of a single Word file or PDF which includes:

– a brief cover note that includes your full contact details together with the names and contact details of two referees (1 page).

– a letter explaining your interest in the studentship and outlining your qualifications for it, as well as an indication of the specific areas of the project you would like to develop (2 pages).

– a curriculum vitae (2 pages).

– a sample of your writing – this might be an academic essay or another example of your writing style and ability.

Applications should be emailed to pgawards@ed.ac.uk no later than 5pm on Monday 17th May. Applicants will be notified if they are being invited to interview by Tuesday 25th May. Interviews will take place week commencing Monday 31st May via an online video meeting platform.

Queries

If you have any queries about the application process, please contact: pgawards@ed.ac.uk

Informal enquiries relating to the Collaborative Doctoral Award project can be made to Professor Melissa Terras, m.terras@ed.ac.uk and Professor Diana Paton, Diana.Paton@ed.ac.uk

Further Information

How are the impact and outcomes of Atlantic slavery represented or alluded to in historical information sources? What is the legacy of slavery in our printed information environment? What text-mining approaches can be used to identify, analyse, and visualise these diverse and problematic histories? This research will use advanced digital approaches to understand how race and slavery feature in the Encyclopaedia Britannica (EB). The first eight editions of the EB, published 1768-1860 – from the height of the UK’s involvement in the transatlantic slave trade, through the abolition of British slavery in 1838, to ongoing subsequent debates about slavery and race – contain rich content related to Atlantic slavery and to forms of racialisation that developed from it. Utilising data from the newly digitised 143 volumes of the EB from the National Library of Scotland’s Data Foundry, this research will not only provide insight into the explicit and implicit representation of slavery, the slave trade and race in this key reference material, but also develop a best-practice methodology for others wishing to use text mining to analyse race and slavery within other historical information sources.

The early EB was produced and published amidst the development of colonisation, globalisation and the transatlantic slave trade, and from its first edition it contained entries on slavery. Although the EB’s early success was facilitated by London book trading networks, it had distinctively Scottish roots, appealing to national sentiment.  In this context, examination of the early EB offers the possibility of discerning contemporary Scottish attitudes to slavery. The EB’s eventual popularity provides a useful case study concerning the representation and dissemination of ideas about slavery (and its abolition), but also the implicit legacies of the slave trade, such as the transmission of knowledge, culture, and products, as well as people. 

There is, to date, a dearth of scholarship on the representation of chattel slavery in encyclopaedias. The limited studies that do exist amount to pieces of contextual evidence or small case studies that serve larger arguments. Much of the scholarship concerning the EB only examines it in terms of its publication history or epistemological approach. Studies of the early EB have omitted examination of change in particular entries across various editions. Investigation of the EB’s entry on slavery over time would in itself make a valuable historiographical addition. This doctoral project will go well beyond that, analysing the 167 million words contained in the 143 volumes of the first editions, using advanced Digital Humanities methods, particularly to look for implicit legacies of slavery, regarding products traded (eg cotton, sugar, tobacco, coffee), places mentioned (eg Haiti, Guyana, Saint Domingue, Calabar), individuals (eg Toussaint Louverture, William Wilberforce), or peoples (eg Igbo, Ashanti/Asante/Ashantee, Carib).

Vincent Brown has argued that the nature of the slavery archive – riddled with gaps and silences – demands that historians move away from an approach that seeks straightforward ‘historical recovery’ to one that focusses on ‘rigorous and responsible creativity’ (Vincent Brown, ‘Mapping a Slave Revolt: Visualizing Spatial History through the Archives of Slavery’, Social Text 33 (2015), p. 134). There are existing, innovative digital humanities (DH) approaches to the study of slavery. Projects have used computational methods to explore large-scale corpora of slavery-related literature, examining the size of the English lexicon, the evolution of grammar and the frequency with which certain words or phrases were used over time, or studying emotions in narratives written by enslaved people. There is a broader range of DH projects that examine slavery in the Atlantic world, which have made novel historiographical contributions: perhaps most notably the broad databases Slave Voyages (https://www.slavevoyages.org/) and Legacies of British Slave-ownership (https://www.ucl.ac.uk/lbs/), recently brought together with other projects as Enslaved (enslaved.org), but also the more focused Runaway Slaves in Britain (https://www.runaways.gla.ac.uk/) and the Early Caribbean Digital Archive (https://ecda.northeastern.edu/home/about/decolonizing-the-archive/). What we describe is the utilisation of well-established text and data mining approaches, applied to the EB, in a unique corpus analysis that will need to consider the intellectual and cultural context in which eighteenth- and nineteenth-century encyclopaedias were produced and published, as well as linking and cross-referencing to other information sources available within the National Library of Scotland collection.
By searching, analysing, and visualising the ways in which terms related to slavery appear in this essential reference material – using a variety of methods including GIS, accurate geoparsing, and following concepts and their relationships diachronically – we will both understand more about how Atlantic slavery was understood or instantiated within our information sources, and develop a methodology for research into other similar primary reference material, and the ideas that it disseminated.

The University of Edinburgh is an ideal place to carry out this research. The Edinburgh Centre for Global History, which Paton directs, has Migration, Slavery and Diaspora studies as one of its three thematic hubs (https://www.ed.ac.uk/history-classics-archaeology/centre-global-history). The Centre for Data, Culture and Society has recently established text and data mining as a core research interest, with accompanying training events and materials (https://www.cdcs.ed.ac.uk), aligned with support from the Edinburgh Parallel Computing Centre’s research software engineers (https://www.epcc.ed.ac.uk). We have already mounted the EB on EPCC systems, and run preliminary searches on a selection of terms, as a pilot study to establish that there would be enough content upon which to build a PhD, in the analysis and visualisation of results. The candidate will be trained in both R and Python, and have access to our in-house text-mining-at-scale platform, defoe (see “defoe: A Spark-based Toolbox for Analysing Digital Historical Textual Data”, Filgueira Vicente, R. et al., 2019, https://doi.org/10.1109/eScience.2019.00033).

This is a timely topic, of significant relevance, given the Black Lives Matter movement and increasing interest in decolonising academic and cultural institutions. The University of Edinburgh has recently established the Institute for Advanced Study in the Humanities Institute Project on Decoloniality (2021-24) (https://www.iash.ed.ac.uk/institute-project-decoloniality) and the candidate can engage with this. This project will have scholarly impact in Digital Humanities, History, and Library and Information Science, as we consider how to analyse, deconstruct and decolonialise historical information sources using computational methods.  

New article: The value of mass-digitised cultural heritage content in creative contexts

One of the projects I’m working on right now is Creative Informatics (2018–2023), which aims to enhance data-sharing and innovation across the creative sectors throughout the City of Edinburgh and local regions, to develop ground-breaking new products, businesses and experiences, as part of the Creative Industries Clusters Programme (2020). I’m pleased to share our first team paper, which has just come out in Big Data and Society, in its special edition on Heritage in a World of Big Data: re-thinking collecting practices, heritage values and activism, edited by Chiara Bonacchi (which is a fab set of papers, btw). Our paper is fully open access, so I’ll paste the abstract and the full citation in here.

How can digitised assets of Galleries, Libraries, Archives and Museums be reused to unlock new value? What are the implications of viewing large-scale cultural heritage data as an economic resource, to build new products and services upon? Drawing upon valuation studies, we reflect on both the theory and practicalities of using mass-digitised heritage content as an economic driver, stressing the need to consider the complexity of commercial-based outcomes within the context of cultural and creative industries. However, we also problematise the act of considering such heritage content as a resource to be exploited for economic growth, in order to inform how we consider, develop, deliver and value mass-digitisation. Our research will be of interest to those wishing to understand a rapidly changing research and innovation landscape, those considering how to engage memory institutions in data-driven activities and those critically evaluating years of mass-digitisation across the heritage sector.

Terras, M., Coleman, S., Drost, S., Elsden, C., Helgason, I., Lechelt, S., Osborne, N., Panneels, I., Pegado, B., Schafer, B. and Smyth, M., 2021. The value of mass-digitised cultural heritage content in creative contexts. Big Data & Society, 8(1), p.20539517211006165.

It’s worth stressing that we problematise the act of considering such heritage content as a resource to be exploited for economic growth before people set the pitchforks upon us.

It was a great paper to write with the team, and I can recommend working with the BD&S editors and peer reviewers – this one had a few turns around the block, and it is all the better for it.

#DHGoesViral, a year on

I haven’t talked much on here about the pandemic. A year ago today, the #DHgoesViral twitter conference happened, swiftly organised by Agiati Benardou at the outbreak of Covid-19 across Europe. By then we were a few weeks into a rapid change in how we were all living, locked down at home with minimal contact with the outside world. Only a few weeks before – and the day before the UK lockdown started – I remember talking to a senior administrator who was convinced universities wouldn’t close. We were closed down 24 hours later. Everything was stress, uncertainty, and a huge cognitive load to deal with.

DH in the time of Virus played out entirely over twitter. It saw Digital Humanities experts, both academics and practitioners, as well as Digital Research Infrastructures and Initiatives from across Europe, give their thoughts on what was happening to our field and our professional areas at the time of the sudden lockdowns. I was asked to give mine, and honestly, finding the mental ability to concentrate on preparing these 10 tweets was hard; it took me nearly a day (when in normal life I could bash this out in 10 minutes or so, although what is normal anymore…?). I thought I would park them here, to think about what has changed – and what is the same – at the end of our second lockdown in the UK, and as central Europe goes into its third.

You can see the starting point for the other #DHgoesViral twitter stream “talks” on this blog. Here was mine. I can see now we’re not so panicked, but still restricted. We’re still depending on infrastructures that are under-resourced. There are still loads of people doing a tonne of work behind the scenes. And we’re dependent on digital, given that the libraries and archives are (at the moment) still closed…

Look after yourselves, everyone.

New Paper: Text mining Mill: Computationally detecting influence in the writings of John Stuart Mill from library records

Image of handwritten library register
London Library Issue Book No. 3, p. 529, showing John Stuart Mill’s intensive borrowing record during 1845. The horizontal lines indicate the return of individual books. The vertical lines indicate that all the books listed on the page have been returned. Image reproduced with the kind permission of the London Library. © The London Library

How can computational methods illuminate the relationship between a leading intellectual, and their lifetime library membership? I’m pleased to say that a paper, derived primarily from the work Dr Helen O’Neill conducted for her PhD thesis in Information Studies at UCL, on The Role of Data Analytics in Assessing Historical Library Impact: The Victorian Intelligentsia and the London Library (2019), supervised by myself and Anne Welsh, has just been published. The interesting thing about this paper is that it started life as a tweet:

Replies from both David A. Smith (at Northeastern) and Glenn Roe (now at the Sorbonne), who took the time to detail and explain their previous work in detecting textual reuse, led to a collaboration. In O’Neill’s doctoral work, we explored the interrelation between the reading record and the publications of the British philosopher and economist John Stuart Mill, focusing on his relationship with the London Library, an independent lending library of which Mill was a member for 32 years.

Building on her detailed archival research of the London Library’s lending and book donation records, O’Neill constructed a corpus of texts from the Internet Archive, of the exact editions of the books Mill borrowed and donated, and the publications he produced. This enabled natural language processing approaches to detect textual reuse and similarity, establishing the relationship between Mill and the Library. With Smith and Roe’s assistance, we used two different methods, TextPAIR and Passim, to detect and align similar passages in the books Mill borrowed and donated from the London Library against his published outputs.
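TextPAIR and Passim implement sophisticated passage alignment; the underlying intuition, that two passages sharing unusually many word n-grams suggests reuse, can be illustrated with a minimal sketch. This is not the project’s actual pipeline, and the example passages below are invented purely for demonstration.

```python
import re

def shingles(text: str, n: int = 5) -> set:
    """Return the set of word n-grams ('shingles') in a text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def reuse_score(source: str, target: str, n: int = 5) -> float:
    """Fraction of the target's n-grams that also occur in the source.
    A high score suggests the target reuses wording from the source."""
    src, tgt = shingles(source, n), shingles(target, n)
    return len(src & tgt) / len(tgt) if tgt else 0.0

# Invented snippets: a source passage, a near-verbatim quotation of it,
# and an unrelated sentence.
borrowed = ("the greatest happiness of the greatest number "
            "is the measure of right and wrong")
quoting = ("he held that the greatest happiness of the greatest number "
           "is the measure of right and wrong")
unrelated = ("the weather in london was unseasonably cold "
             "throughout the spring of that year")

print(reuse_score(borrowed, quoting))    # high: near-verbatim reuse
print(reuse_score(borrowed, unrelated))  # zero: no shared 5-grams
```

The real tools go much further (handling OCR noise, word order variation, and alignment of long passages), but shingle overlap is a reasonable mental model for what “detecting textual reuse” means here.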

So what did we show? The collections of the London Library influenced Mill’s thought, were transferred into his published oeuvre, and featured in his role as political commentator and public moralist. O’Neill’s work had already uncovered how important the London Library was to Mill, and how often he used it, but we can also see this in the texts he wrote, given the volume of references to material in the London Library, particularly around certain times and publications.

The important thing about this, really, is that we have re-conceived archival library issue registers as data for triangulating against the growing body of digitised historical texts and the output of leading intellectual figures (a historical Turnitin, huh). This approach, though, is dependent on the resources and permissions to transcribe extant library registers, and on access to previously digitised sources. Because of complexities in privacy regulations, and the limitations placed on digitisation due to copyright, this is most likely to succeed for other leading eighteenth- and nineteenth-century figures. Still cool, though.

On a personal note – this is the last paper I’ll be publishing from work that I started while employed at UCL. It was important to me to see through the PhD supervisions I had committed to, and I was delighted when Helen O’Neill (who did her PhD part time, while working full time!) passed her viva with no corrections at all (yay!) in 2019. Happy end to an era, for me, and yes, it has taken a full three years to finish up the transition of research to Edinburgh! But all good.

Here’s the reference to the paper proper:

O’Neill, H., Welsh, A., Smith, D.A., Roe, G. and Terras, M., 2021. Text mining Mill: Computationally detecting influence in the writings of John Stuart Mill from library records. Digital Scholarship in the Humanities. https://doi.org/10.1093/llc/fqab010

I’ve parked an open access version for you to read without the paywall. Enjoy!

New paper: Digital cultural colonialism: measuring bias in aggregated digitized content held in Google Arts and Culture

Map showing the dominance of content with origins in the USA, in Google Arts and Culture.

In February 2011, Google launched its Google Art Project, now known as Google Arts and Culture (GA&C), with an objective to make culture more accessible. The platform (and the content on its app) has grown dramatically since then, and currently hosts approximately six million high-resolution images of artworks from approximately 2,500 museums and galleries, in almost every country on the UN member list. Although Google does not publish user statistics for the resource, it is understood that virtual visitor numbers increased dramatically during 2020, when many leading arts institutions had to close their doors because of the COVID-19 pandemic. There has, to date, been very little research published on the platform, and our newly published research in Digital Scholarship in the Humanities (which was a Digital Humanities class project with Inna Kizhner’s students at Siberian Federal University) interrogates GA&C to understand its scope, scale, and coverage.

Our scraping of the site content (in summer 2019) shows that Google Arts and Culture is far from a balanced representation of the world’s cultures: 

  • The major proportion of the holdings features content that resides in the USA;
  • Only 7.5% of the content is from institutions beyond the USA, UK, Netherlands and Italy;
  • Very few African cultural institutions have contributed to the platform, and very little African culture is present;
  • The culture of some countries (such as Kazakhstan) is represented entirely by pictures of American and Russian culture via the space programme;
  • Artworks from capital cities dominate the collections, while art from the provinces is underrepresented;
  • Art from the 20th century dominates.

Pie chart showing that 82% of individual items in Google Arts and Culture are presented by USA institutions.

This leads to some extreme examples in the platform. There are next to no entries in GA&C featuring content originating in large parts of the African continent. American culture dominates Canadian culture. The culture of Kazakhstan (in 2019) was represented entirely in GA&C by 4,000 pictures of American and Russian astronauts from the NASA archives: no Kazakhstani institution had uploaded any content.
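Country-level imbalances like those reported above come down to a simple aggregation over scraped item metadata. Here is a minimal sketch of that counting step, using a handful of invented records in place of the real 2019 scrape of millions of items:

```python
from collections import Counter

# Invented toy records standing in for scraped GA&C item metadata;
# the real dataset covers ~6 million items from ~2,500 institutions.
items = [
    {"title": "Painting A", "country": "USA"},
    {"title": "Painting B", "country": "USA"},
    {"title": "Print C", "country": "UK"},
    {"title": "Sculpture D", "country": "Netherlands"},
    {"title": "Textile E", "country": "Kazakhstan"},
]

counts = Counter(item["country"] for item in items)
total = sum(counts.values())

# Share of the corpus attributed to each country, largest first.
for country, n in counts.most_common():
    print(f"{country}: {n / total:.1%}")
```

With the full scraped dataset, the same tally is what surfaces the 82% USA share shown in the pie chart.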

In 2016, Amit Sood, the Director of GA&C, claimed that it would introduce a new era of accessible art. However, we have shown that GA&C is a corpus in which a number of cultures are underrepresented or marginalised. It maintains dominant cultural systems, including promoting the cultural holdings of the United States of America above all others. Google Arts and Culture is therefore an example of “digital cultural colonialism”, which amplifies the cultural holdings of one particular country, and reinforces the conventional traditions of art collection and interpretation that dominate museum displays in larger Western cities.


The biases we have discovered in GA&C may have long-lasting effects. If larger quantities of objects, images, and stories related to a particular idea or representation of selected knowledge are present in aggregators of cultural content, these ideas and concepts will be promoted, accessed, disseminated, and studied, becoming the foundation of the new digital canon: one that can be appropriated for Artificial Intelligence and machine learning.

We argue that the choices that have gone into the content featured on GA&C should be made clear, as well as the biases contained within the platform. Our research is a first step towards understanding that this major platform is not neutral, and contains biases that will impact its large user base, as well as data-led approaches that draw upon it, both now and in the future. We end with a challenge: what will GA&C do to make its processes for ingesting and showcasing ‘arts and culture’ transparent, and how will it deploy its resources to expand the reach and spread of the digital content it features?

For more, see: “Digital cultural colonialism: measuring bias in aggregated digitized content held in Google Arts and Culture” by Inna Kizhner, Melissa Terras, Maxim Rumyantsev, Valentina Khokhlova, Elisaveta Demeshkova, Ivan Rudov, Julia Afanasieva. Digital Scholarship in the Humanities, https://doi.org/10.1093/llc/fqaa055

Given that the copyright in DSH belongs to the authors, I’ve placed a copy of the journal article here on this blog, for easy and open access.

We are following up this research with a study to understand the GLAM sector’s views on Google Arts and Culture, including the process of becoming a partner in the platform. If you would like to give your anonymous opinion, our online survey is open until 15th February 2021: we would appreciate all insights.

Arise Sir Generative: When AI Met the Queen

Example of one of Rudolf Ammann’s ImprovBot illustrations CC-BY-NC if you want to reuse…

Over the past few months, I’ve had a lot of fun with generative AI. Last summer, I put on ImprovBot (with my colleagues Rudolf Ammann and Gavin Inglis), which was the world’s first AI-generated Arts Festival Programme. Taking 2.5 million words of material from the past 8 years of Edinburgh Festival Fringe Society listings, we trained our neural network, The Bot, to generate “new” show blurbs, which we put out hourly over the expected period of the Fringe (which couldn’t happen in person last year), with custom-generated imagery, and a live improv show every day from the Improverts, the Edinburgh University student improv society. We got fantastic coverage worldwide, including a 4-star review from The Stage! – and the whole thing was a month-long elegy for a Fringe, and its associated industries, that were decimated this year. We hoped to walk the line of poignant, fun, creative, and everything-being-mediated-by-tech these days. Incidentally, you can see all the outputs of the ImprovBot text over at Zenodo, and if you are after any artwork to represent AI, we’ve licensed reuse of 376 ImprovBot illustrations CC-BY-NC: fill your boots with ones like the one above. An easy-to-download selection, also available under a less restrictive licence that allows for commercial use and does not require attribution, is available over at Pixabay.

Just before Christmas, the Alan Turing Institute (of which I am a Fellow) was asked by Wired Magazine if there was anyone who would like to explore generating a Christmas Speech for the Queen, using AI? Why yes! David Beavan and I were delighted to hack together a response, which went out on Christmas Eve:

To train the system, he had to combine the two datasets, one of the Queen’s previous broadcasts and the second of WIRED Covid-19 stories, into a single document to ensure both were equally considered. “You give it [GPT-2] the beginning of a sentence and the idea is it guesses the next word,” he says. After examining the results, the temperature is dialled up or down.

The system churned out thousands of words, which were then passed to Terras to edit down. She started by taking out anything negative or controversial – “the computer put together some dark stuff,” Terras says, especially around race, the commonwealth and war – and then selected relevant passages, keeping the sentences whole but altering order and placement. Some AI systems can analyse documents for structure, but not this one, so a helping hand was required. “I took a box of tiles and put them in a mosaic,” she explains. “There’s a lot of human editing.”

You can read the whole article over on Wired, and I’ll post our generated Queen’s Christmas Speech below. It was a lot of fun – but it also raises issues of ethics, the amount of human interaction that is needed for the “softer” things (humour! power structures! respect!), and also the role of these low-hanging-fruit, quick-win, playful things in the public engagement with AI and algorithmically generated content. Neither of these projects was technically doing anything new, but by pointing the power of generative text at a new, playful application – well, the results help us consider AI afresh, in a way which is explainable to others.

Enjoy the Windsor-o-Tron‘s output!

Christmas is a time for reflection on the past and making new friends. On the first day of the year, however, things began to look a bit more grim. I remember meeting Joseph and Mary at the Inn in Sandringham. We were both looking forward to the future and looking forward to our visit to Oxford this autumn. I shall never forget the scene in Windsor, where the Covid-19 outbreak was reignited. In the first lockdown, all tourists were restored to normal, adults were ordered to stay at home and children under five were allowed to stay at home.

I have spent the last couple of weeks listening to some of your radio and television interviews, which has touched me deeply. I have thought to myself whether it is time to send you my best wishes for Christmas and the New Year. The NHS has faced a real and growing challenge in the years ahead. It’s been a difficult few months for many people living alone. But with so much to build on and many exciting opportunities to be found in the nature of our relationship, this year I think it is safe to say that we are all looking forward to a new year.

We are also living in a time of social distancing: the less we live together the more distanced we become. I am thinking of those now living with their parents or caring for them at home. These people are now their families. That motherly instinct has helped to shape my own views of the world, thoughts on life and my own beliefs. I remember the first time I was asked by a kindly visitor, a man of few words, what year was Jamaica.

The world has to face its challenges and confront its problems with courage, patience and fortitude. A vaccine for Covid-19 hinges on the delivery of a drug, so that new antibodies can be triggered. But the real power lies in the invisible hand that draws the world in. When invisible hands come to the task, it’s often the invisible workers at the machinery who are making that change. It is through their example and willingness to show the world that they deserve our respect that we can make a real difference.

One of the things that has remained constant throughout the Commonwealth, I believe, is the effort to reconcile the differences between nations and between countries. That spirit of brotherhood which has survived the most serious challenge of the present century can be best expressed in the British Commonwealth and the Commonwealth international formula. Every year I look forward to opening the letters, parcels and telegrams that come to me from the Commonwealth. I can think of no better time than now to say a big thank you to all the people who have given so much to this country and all around the Commonwealth. Every one of them has given so much to me.

This year I’ve spent a great deal of time and effort in various fashions and colours, some of which are familiar to many of you. Naturally I would like to draw attention to the fact that my family and myself have enjoyed a very happy and prosperous past year. We are fortunate to have a home and some children.

Like many other families, we gathered to watch the bubbling fountains of humanity rise above the evil. In the meantime, members of my own family are celebrating Christmas with their families and we shall see further developments as I set out to see which side of the Atlantic the peace will be in the coming year.

In January 2021, after we’ve all lined up patiently for our jabs and the threat of the virus has receded, we may finally start to count the damage the novel coronavirus has wrought on our lives. The Prince of Wales also saw first hand the remarkable resilience of the human spirit. Yes, there are many of you unhappy families, but there are also millions of ordinary people who are helping keep our country and our Commonwealth together. They are making a real contribution to our society. There may be small signs of recovery, but in the meantime, we must all keep an eye out for signs of a slowing or complete return to the days when King James was a political and economically powerful man. 

The real value of Christmas lies in the message and the spirit that it brings. Christmas is a very human offering, and it speaks to the needs of all people. So, as it passes through our thoughts are diverted to other planets, and to the struggles beyond our control. 

The Christmas story reminds us that it is not only about one man, but about many. We have a message for you all: hope, peace, brotherhood and a happy Christmas. Whether you are talking to a friend, or a relative, or a stranger, or a visitor from another world, the message of Christmas is ever more relevant than ever. I would like to see a message of encouragement, as I go about my business in the rain.

Our lives are shaped by our past, and as we live out our future together we should know each other best. It is difficult for us to know far into the future as our families gather round us, but it is better that we have some sense than that we have any sense at all. I wish you all, together with your children and grandchildren, a blessed Christmas.

New Book Out Now! Electronic Legal Deposit: Shaping the Library Collections of the Future

I’m delighted that my latest edited collection, with Paul Gooding, is out now with Facet Publishing: Electronic Legal Deposit: Shaping the Library Collections of the Future. Stemming from our Digital Library Futures AHRC funded project, which looked at the impact of changes to electronic legal deposit legislation upon UK academic deposit libraries and their users, we’ve pulled together this collection from contributing experts worldwide to look at issues and successes in preserving our digital publishing environment.

For those who don’t know what electronic legal deposit legislation is, let’s back up a bit. It is of course related to legal deposit, and as we say in the introduction:

Legal deposit is the regulatory requirement that a person or group submit copies of their publications to a trusted repository. First introduced by France in the sixteenth century… legal deposit has since been adopted around the world: as of 2016, 62 out of 245 national and state libraries worldwide either benefited from legal deposit regulations or participated in legal deposit activities… Regulations permitting legal deposit of printed publications have played a vital role in supporting libraries to build comprehensive national collections for the public good… In the last two decades, the scope of legal deposit has grown to formally incorporate ‘electronic’ or ‘non-print’ publication; those published in digital and other non-print formats. (Gooding and Terras 2020, p.xxiv).

We believe that this is the first book to attempt to draw together an overview of contemporary activities in major organisations and institutions trying to preserve our digital publishing world, which of course includes the world wide web, and how it is archived. We do so from a user perspective, looking at the implications this will have for users of the collections, both now and in the future. And we poke a big stick at the intersection of copyright and legal deposit legislation, which often conspire to make user access so limited and tricky to negotiate that end users are presented with a series of obstacles to even get basic access to electronic legal deposit content. You can find a breakdown of the chapters and contributors here, including those from the National Library of Sweden, Biblioteca Nacional de México, National Archives of Zimbabwe, etc etc!

We’ll be holding a book launch on 5th November 2020, online, for those who want to hear some excellent speakers on the topic, including from the National Library of Scotland, and Universidad Nacional Autónoma de México. And I’m particularly taken with the cover of this one, which is an artwork created from an actual LiDAR scan of the National Library of Scotland stacks, by Edinburgh College of Art PhD student Asad Khan. I love it when a plan comes together.

For those who want a sneak peek of the content, under Facet’s Green Open Access rules, I’m allowed to share the author’s last copy of a single chapter from an edited collection. So here, from Paul and me, is our chapter on how the digital turn has affected legal deposit legislation, showing that “print era notions that influence the NPLD access and reuse regulations are increasingly out of step with broader developments in publishing, information technology, and broader socio-political trends in access to information”. Have at it, and enjoy.

Gooding, Paul and Terras, Melissa (2020). ‘An Ark to Save Learning from Deluge’? Reconceptualising Legal Deposit after the Digital Turn. In Gooding, Paul and Terras, Melissa (Eds). Electronic Legal Deposit: Shaping the Library Collections of the Future. Facet: London, 203-228.

New paper: Understanding multispectral imaging of cultural heritage: Determining best practice in MSI analysis of historical artefacts

What do people actually do when they undertake multispectral imaging of cultural heritage? I’m really pleased that our latest paper has now been published, which helps answer this question: it provides a literature review of heritage digitisation projects that have used multispectral imaging, comparing and contrasting their methods. This formed part of Dr Cerys Jones’ PhD research, and I was really delighted to supervise this with Adam Gibson and Christina Duffy:


Jones, C, Terras, M, Duffy, C & Gibson, A 2020 “Understanding multispectral imaging of cultural heritage: Determining best practice in MSI analysis of historical artefacts”. Journal of Cultural Heritage.

You can see the journal version online here – but the link above will take you to the authors’ submitted copy. Enjoy!