Reuse of Digitised Content (4): Chasing an Orphan Work Through the UK’s New Copyright Licensing Scheme

An (award winning!) regularly updated, now concluded, blog post in which I document trying to get a license to reuse an item for which no copyright information exists, under the UK Government’s new legal framework.

Diary Entry 1: Weds 29th October, 1.16am UK time

Well, today’s the day! Wednesday 29th October, the day the UK law changes to allow licenses to be granted for “orphan works” – items whose copyright owners cannot be located. This is a thorny problem – as a recent government publication explains:

If an individual wants to use a copyright work they must, with a few exceptions, seek the permission of the creator or right holder. If the right holder – or perhaps one of a number of right holders – cannot be found, the work cannot lawfully be used. This situation benefits neither the right holder, who may miss opportunities for licensing, nor potential users of those works. This is not a situation peculiar to the UK; other countries face the same issues [source].

A new framework was announced as part of the Enterprise and Regulatory Reform Act 2013, and has been implemented after a consultation process earlier in 2014. The UK government also introduced regulations to ensure that they comply with the EU Directive on Orphan Works. This new scheme will be administered by the UK’s Intellectual Property Office.

In my previous blog post I introduced a lovely orphan work, from the mid 1960s: a “lantern” interval slide tempting patrons to buy an ice lolly, used at the Odeon Cinema, Eglinton Toll, Glasgow.

The image is used here with permission from the Scottish Screen Archive, National Library of Scotland. It’s part of a collection of Lantern Slides, with no individual collections record. It has a small identifying note to say it was made by Morgans Slides Ltd, which is no longer trading. The Odeon cinema were contacted, but their design records don’t go that far back, so they cannot prove they own copyright. They gave me permission to use it if they do, with the caveat that a copyright owner, whom they cannot speak for, may come forward at some future date. It’s an orphan.

I want to adopt it, so I can use it widely, but also, to investigate how easy? hard? costly? problematic? easy? it is to get a license for orphan works under this new scheme.

So let’s go! This blog post will expand over time as I update on the process. I have all the data I can get on the item gathered, and am ready to roll. I contacted the IPO last week via email, and they promise that “all the relevant information on how to apply for a license and the due diligence needed will appear on the website on Wednesday morning”. There’s nothing up there yet (I’m on Australian time writing this, in Melbourne, and it’s only just past midnight in the UK)… but I will check back later, come UK business hours.

Diary Entry 2: Weds 29th October, 11am UK time

It’s online! Here we have the process and the system of how to apply for a license for an orphan work. It’s an online form – I’ll get cracking with it…

Diary Entry 3: Weds 5th November

It’s taken me a week to update this – not because I’m not interested, but because I’m over in Australia just now on a lecture tour, and it’s been a jolly whirlwind of lectures, lunches, masterclasses, flights, trains, dinners, and kangaroo spotting. Excuses excuses. I’m now in a hotel room in Sydney with a few hours spare of an evening and an actual internet connection: let’s get to it.

The first thing to report is that the IPO read this nascent blogpost and contacted me! The Head of Copyright Delivery, from the Intellectual Property Office, thought my blog was interesting, and we’ve already exchanged an email or two: they are interested in the needs of our sector, and want to assist in this. Isn’t that interesting (and slightly scary – that’s social media for you) – and I assured them that I wasn’t doing this out of “let’s see what is wrong with the process” but in the spirit of genuine exploration. The process is clearly flagged as in Beta, so let’s all proceed in a manner of mutual respect, and give some feedback as we go. I’m happy to be a guinea pig. (I’ll flag up recommendations to the IPO with a bold IPO: for easy scanning).

The process itself seems relatively straightforward (check list from the IPO website):

  • check that the work you want to copy is still in copyright because if it isn’t, you don’t need a licence to use it
  • check whether the use falls within one of the copyright exceptions
  • read the guidance on orphan works and take this short questionnaire to find out if you are eligible to use an orphan work under the EU Directive
  • carry out a diligent search for right holders in accordance with IPO published guidance
  • complete the diligent search checklist(s) and convert these to one PDF document to upload as part of the application process

Then, work out what license you want. So let’s go through this check list, to get that first phase out of the way.

1. Check that the work you want to copy is still in copyright because if it isn’t, you don’t need a licence to use it.
TICK! we have an orphan item right here.

2. Check whether the use falls within one of the copyright exceptions
UNTICK! exceptions are for Non-commercial research and private study; Text and data mining for non-commercial research; Criticism, review and reporting current events; Teaching; Helping disabled people; Time-shifting (eh? Dr Who a go-go); Personal copying for private use; Parody, caricature and pastiche; Certain permitted uses of orphan works (which allows the GLAM sector to digitise orphan works and make them available online); Sufficient acknowledgment, and Fair Dealing.

Now, I’m no lawyer, but none of these are me: what do I want to do? I want to take an orphan work, and make some fabric with it using it as the basis for a pattern, and perhaps make a few items of things, which I might want to sell on etsy, or put the pattern up on spoonflower, etc. Yes, there is a teaching and research element to this, but by the same token, the teaching and research is in the fact of making it available, and chasing through the process. It’s not commercial a la huge superstores, but it’s certainly not uncommercial, even though the chance of making a profit is slim. I’d say that there were no exceptions to copyright for me in this case, so we (un)tick the item, and proceed to the next in the checklist.

3. Read the guidance on orphan works and take this short questionnaire to find out if you are eligible to use an orphan work under the EU Directive.

Read it (interesting read). The checklist for the EU Directive is the first thing where I go… wait a minute, Gov. See screenshot.

Problem number 1: I’m not an organisation. I’m just a person. If I wanted to start trading and selling this stuff, I’d be starting off as a sole trader, which is still not really an organisation. So I already don’t know if I’ve met this check list or not, given I’ve been asked what categories my organisation falls under (so, recommendation 1: IPO I’d be putting in something about being an individual person on that screen). I’m very much “none of the above” so let’s tick that, which gets you to:

I’m not exempt, as I thought, although who knows because of the organisation thing… so let’s crack on with the licensing scheme.

4. Carry out a diligent search for right holders in accordance with IPO published guidance
TICK! I have all my info here, good to go!

5. Complete the diligent search checklist(s) and convert these to one PDF document to upload as part of the application process.

The form is a bit of a funny beast. First, you have to select from a list of different forms, choosing one which may apply to you. As ever, it’s really hard to make up a list that covers everything, so it took me a bit to decide that I should file my request under “still visual art” which is the closest there is to it. Then you get to filling out the form, which is mostly a list of places you should check to see whether the item is listed or found – first page screenshot:

The list of places to check goes on and on: British Society of Underwater Photographers (BSUP), Bureau of Freelance Photographers (BFP), Chartered Institute of Journalists (CIOJ), Editorial Photographers UK (EPUK), Master Photographers Association (MPA)… I make it 55 different places you have to check to claim that it is an orphan work, which is fine: you want to be diligent about this (the clue is in the title). But, recommendation 2: IPO it would really help if you said exactly where you wanted people to check (giving URLs would help a lot) and how – for example, “water mark search or image recognition software” is quite a broad church. Another thing, recommendation 3: IPO, is that a lot of the places are societies, and I’m not sure what they thought you could do there, or what they have that is relevant. For example, if you go to the Professional Cartoonist’s Association website, you can browse portfolios, but I can’t seem to search for “lolly time”, and it’s much more a membership organisation than a repository for content. So I’m not sure what I’m being asked to check: that I had a quick look at the website? that there was nothing of relevance because they don’t have a repository? that I was supposed to email them and ask? Guidance on that would be super useful.

But I worked through the list diligently – and I know how difficult it must have been to put together that list of things, as one size won’t fit all, and I tried not to get frustrated by demonstrating why It’s Lolly Time probably isn’t going to be relevant to the Society of Wedding and Portrait Photographers, but it is what it is. MS Word was probably the easiest way to go, but it would be nice, recommendation 4: IPO, if there could have been a slightly more interactive way to choose which evidence you want to present (where do I attach my emails from the National Library of Scotland, or the Odeon, for example?)

Now on to choosing a license that you are applying for! And it is here that the fun starts! Because that’s where you get to play with the cost calculator to buy the license.

First, you name the item, and choose carefully my friends! as that will go in the register:

Then you choose a category:

Again, it’s hard to come up with these categories, and “my” objet doesn’t really fit anything, but let’s go with still image (recommendation 5: IPO you may want to expand on that list a little?) And then you choose subcategory:

Umm… I don’t know. Illustration? Maybe?

And then comes the killer question… DA DA DAAAA!

Well. I want to get a license that means I can put something up on etsy or spoonflower, so technically is that commercial? I don’t know (IPO: I’ll be emailing you about this!). Their further information would seem to think so:

So even selling one thing = a commercial license needed. Ok. Let’s go for the nuclear option, and choose monetary compensation! For…

Let’s go with retailing and merchandising? And with checking that, I know that one does not simply walk into Mordor. As individuals who are interested in making one or two things are suddenly being lumped into the same categories as large chains that want to sell tens of thousands of items…

There’s some more granularity and some more drop down menus – what is the exact use of this work? erm… On apparel? that’s scarves, right? and for spoonflower? (further gulp. We are not in Kansas now, Toto). There is a prompt to contact the IPO if the exact use is not listed, and I think we need to come back to this idea of individuals and small businesses using content, versus massive clothes manufacturers because once you see the costs, in a minute…

But first, I have to choose how much surface the item will cover?

IPO recommendation 6: I’d suggest this is a nonsensical list. Why? Well, repeating patterns are pretty common. How do they relate? My scarf is bigger than a “page”. So what does this mean? And what is a page? A4? A3? A0? You might want to revisit this. My orphan image will cover as much fabric as you can print out, as it’s repeated. But let’s go for the nuclear option, and tick “more than a full page” for a commercial license, for apparel!

You are then asked to choose how many things you are going to make to sell. I’m thinking of making two or three, just to test the waters (starting up as a sole trader, remember!) but the minimum item load you can make, in the licensing structure,  is…. 5000!

IPO recommendation 7: it’s clear that these structures are geared towards large corporations, rather than individuals or sole traders. I’d revisit this: you want to help bootstrap creative making, not treat everyone like Walmart? But I’ll click 5000 or less for now.

Then you have to choose how long the license is going to last for.

Shortest is 3 months, longest is 7 years… hmmm. Well, selling these two or three items is going to pay for my early retirement, so let’s go for 7 years (IPO recommendation 8: it would be good to know if once you have licensed an item, no one else can? Or if multiple people can license the same thing? just a thought)

So here we are. A 7 years license to use an image on a repeating pattern for apparel that I can sell, less than 5000 items. And the cost comes to…. drumroll!!!!!

Did your eyes go as big as mine did when that figure popped out?

Now, it’s clear that I’ve stumbled into the commercial sector line here, and this is probably a fine amount for an image license if you are a major clothing retailer. But there is no way that I can stump up over £2.5k to allow me to put some scarves on etsy, or fabric up on spoonflower. I need to email the IPO and ask about this, which I will do next, and report back, but before I do … I wonder who is benefiting from these license costs? Where does the money go to? I’ll ask the IPO about that and update the blog with the info they supply.

Of course, we could go back to the start of the process, and I could decide that I’d make things and donate any profits back to the NLS, in which case I would check the no commercial gain box. The process (and costs) are then significantly reduced:

A mere ten pence for non commercial use (hurrah!). But I’m not sure if what I want to do is covered under this list. Perhaps “personal use” – it could be argued that I am just making the digital files available for people via spoonflower (or I could make the digital files available to print up a scarf, and tell people where to go and get one if they do), but my dreams of etsy stardom reusing digitised content have been dashed.

It’s clear that I need to talk to the IPO about this and ask the best way to proceed – and I’m aware that I started a very public conversation right from the get go – but it does seem to me that the licensing structure, as it stands in Beta, allows reuse for freebies but doesn’t really allow people to make products out of orphan material, unless they are for personal use only – and to be fair, I got that license from the NLS in the first place, so I’m not sure what is to be gained here, apart from the fact I could now share the source files of the item I made for personal use with everyone else. The licensing costs do hamper anyone wanting to start making and selling at a cottage industry scale, which is the majority of the online marketplace over at places like etsy, and probably a major source of reuse for this content: it should be opened up for everyone to use? Or am I being naive or utopian?

So I’ll take this back to the IPO, and ask them to think about how this helps or hinders uptake of this material so you can actually do something with it if you are not a corporation, and report back. I’ll also ask for the table of licensing costs from the IPO, which must exist somewhere, so that I don’t have to spend ages with the online tool manually drawing that up myself. The online tool is pretty straightforward (save the baffling question about pages?) and usable, given the complexity of the range of uses people must be asking them for, and it’s great that it went up on the day the legislation changed.

The orphan works scheme – especially the 10p non-commercial license – is fantastic for the heritage sector as it does clear up a lot of issues with reusing content for not for profit usage, but I was hoping I could do something more…  I’ll be right back when I have more to report, travel and poor internet access notwithstanding.

Diary Entry 4: Friday 28th November

The delay to updating is all my fault, due to catching up on life and work (and sleep) upon my return from Australia. The IPO were really very quick to respond to my query, and it has sat in my inbox for a couple of weeks (sorry for being so tardy IPO, and thanks for being so speedy yourself).

Here is what I asked, and how they responded. I quote directly from an email to me from the Head of Copyright Delivery at the IPO.

1. I asked: Where does the money from the licenses go?

They said:

The application fee covers the cost of processing your application.  The licence fee (minus the VAT of course, as that will have been transferred over to HM Treasury) is held in case the right holder for the work comes forward.  They have a period of 8 years to do so, and if we are satisfied that they are the right holder, we will pay out the licence fee.  If they do not come forward in 8 years, we can use the money to pay for the set up of the licensing scheme, and also for ‘social, cultural and educational activities’ (regulation 14 of The Copyright and Rights in Performances (Licensing of Orphan Works) Regulations 2014 – http://www.legislation.gov.uk/uksi/2014/2863/contents/made).

2. I asked: Even if you only sell one or two things, do you need a commercial license?

They said:

As you set out in your blog, if there is any money changing hands for the use you are making, it counts as commercial.  If you are going to put the pattern on Etsy or similar for free, it would be non-commercial.

3. I asked: Can I get a copy of all the licensing costs?

They said:

Our pricing structure runs to literally thousands of possibilities for licences, so it is not possible to send you all costs for commercial content – it really depends on the type of orphan work and the use you want to put it to.

4.  I asked: How were the licensing costs worked out? Were commercial costs used as a model?

They said:

We undertook a detailed analysis of prices for non-orphan works, taking into consideration both the type of work and the use.  This used information from a wide range of sources, both commercial and non-commercial providers.  We had always committed to ensuring that orphan works do not distort the market unduly against non-orphan works, which provides a certain level of reassurance for right holders and licensors in the market.

5. I asked: Is a license exclusive? So if you pay all that money, could someone else also get a license?

They said:

The licence is non-exclusive, so someone else could apply to use the work for a similar or different use at the same time as you using it.  This is because we have no way of knowing whether the right holder would negotiate an exclusive licence or not.

They also directed me (and you, dear readers!) to the scheme overview guidance on gov.uk: https://www.gov.uk/government/publications/orphan-works-overview-for-applicants.

This makes it much clearer for me, and I can see the reasoning behind such decisions, although I’m going to ask them to offer a significantly smaller run of items (the minimum you can get a license for is 5000 at the moment, which really does stop small businesses experimenting: a minimum license of 20? items, with a license fee reduced in ratio to the number of items would bring the license costs way down, whilst allowing our kitchen-table makers to use orphan works in items offered for sale at a small scale). I’ll ask them, and brb, honest.
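To make that “reduced in ratio” idea concrete, here is a rough sketch in Python. It is purely illustrative: the calculator quote was only reported above as “over £2.5k”, so the round £2,500 is a stand-in, and `pro_rata_fee` is a hypothetical helper that assumes fees scale linearly with the number of items, which is not something the IPO has said they do.

```python
from decimal import Decimal, ROUND_HALF_UP


def pro_rata_fee(fee_for_minimum: Decimal, minimum_items: int, wanted_items: int) -> Decimal:
    """Scale a licence fee linearly with the number of items (an assumed model)."""
    if minimum_items <= 0 or wanted_items <= 0:
        raise ValueError("item counts must be positive")
    scaled = fee_for_minimum * wanted_items / minimum_items
    # Round to whole pence.
    return scaled.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Assumption: a round 2,500.00 stands in for the "over 2.5k" quote for 5000 items.
quote_for_5000 = Decimal("2500.00")

for items in (20, 50, 500):
    print(items, "items:", pro_rata_fee(quote_for_5000, 5000, items))
```

Under those assumptions a 20-item license would come to about a tenner, which is the sort of figure a kitchen-table maker could actually risk.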

I also have an issue about orphan works being offered for sale at the same licensing rates as commercially available art is… although I can see why you don’t want to flood the market with low cost design materials and put today’s beleaguered designers out of work (it’s a tough enough gig out there at the moment). However, this still means that there will be so much really fab cultural and heritage material locked up in institutions that people can’t afford to use. Harrumph.

I also consider that there must be a model that generates all the costs: it’s not like computers spit out all this stuff themselves without any human input. I do know a little about that, in my line of work. So I’ll ask again about getting hold of the underlying model that generates costs.

And what am I going to do, myself? Well, I’ll wait till I get a response about lowering the minimum number of items a license covers, and if they do, I’ll buy a commercial license for 20 units, and get my make on. But if they won’t, all I can do is apply for the non-commercial license, and make source files available for others to use in a not for profit manner. There’s very little in that for me (for the price of £20.10, including processing fee) but I feel I should see this process right through to the end, even if it isn’t the magic, transformative process we had all hoped for, for the cultural and heritage world.

Diary Entry 5: Monday 5th January 2015

It’s worth pausing for a moment here, to think about what the alternatives to chasing such a license are. Of course, there were a lot of changes to the law in October last year regarding copyright and orphan works: not just the orphan works scheme, but also new exceptions. Just before Xmas I had the delight of taking one of Naomi Korn’s one day courses on Digital Copyright. She was keen to point out that for many institutions these exceptions may be more than adequate for their needs, and compares the exceptions to the licenses in a table below, which I’ve been given permission to reuse from a forthcoming piece by Korn called “The Orphan Works Dilemma”.

In many cases, provided due diligence is undertaken (and there’s a guide from the UK government on how to do that) then reusing items under the Orphan Works Exception is preferable, easier, and more cost effective than pursuing a license via the orphan works scheme. It’s therefore worth exploring this option first, before thinking that a license is necessary for every reuse case of an orphan work.

For my use case – small run commercial printing, by an individual – the exception doesn’t count, so it’s not a route that is open to me for this particular case. But for many libraries, archives and museums, wishing to display orphan works, or use them in a non-commercial way, the orphan works exceptions are more practical than trying to obtain licenses.

Diary Entry 6: Tuesday 6th January 2015

I have a response in from the IPO, and I think this is as much as we are going to get pursuing this. I’m copying it here, as it is self explanatory:

1.    Minimal commercial licences

You asked for a licence for 20 items or less, rather than the 5000 items which is the minimum for using a still image.  This is something we will consider going forward, which is why we have a contact form in the application process to allow people to say that their use is not listed.  I obviously cannot guarantee that we will offer it, as it depends on us getting sufficient evidence of its need.  Do you have any evidence of licensing which allows such small numbers?  We would be able to take that into account in making a decision.

2.    Fee calculations and sources

You asked about the underlying mathematical model for the fee calculations.  I am afraid we are not in a position to share this with you.  As I have mentioned before, we have used publicly available licence information which we researched and averaged (commercial and non-commercial).  We then took into account relevant factors like the fact that we only offer up to 7 years for the licence whereas non-orphan licences might be available in perpetuity or for other lengths, non-exclusivity versus exclusivity and territoriality.  As I said before, this means we do not have precisely the same prices as Gettys, for example, although they were one of the sources of information because they have publicly available information.

3.    Unclaimed licence fees

After the 8 years, as you know the unclaimed fees are first use to defray the setting up and running costs of the scheme.  Then any excess will be used at the discretion of the Secretary of State for social, cultural and educational activities.  This could be a wide variety of activities, which might include cultural heritage research, digitisation projects or benevolent projects for right holders.  It is some time in the future, so it is difficult to say what might qualify under any given Secretary of State’s opinion, but I hope that gives you a better sense of what might be included.

Interesting. I think (1) is rather the wrong way to think about it: they have compared commercial licensing costs for stock photography with the licensing of orphan works that are mostly of historical importance, within institutional contexts… It is rather like applying rules for apples to those for oranges. There is an opportunity here to do something different that will help the use of content in the library and archive sector, not just to ape what is happening in industry (with a licensing structure that is set up to maximize profits and minimize time wasting experimentation). The model really doesn’t apply here. But that said, the lowest licensing number that the IPO offers is for 5000 items, but a quick look at the main stock photography licensing sites shows that iStockPhoto licenses a maximum of 2000 items of apparel created with an image. If I understand it correctly, BigStock’s license allows you to print on apparel with no limit, with the cost of each image being significantly less than the quote given here.

You could spend weeks working out the licensing structures for stock photography, which are often not comparable across websites, and I don’t think that the costing structure here has really been worked out in such a way to take the needs of the library and archive sector into account. I understand the need to balance the needs of commercial artists, photographers, and illustrators, but I think concern for them has outweighed the needs of the cultural and heritage sector. Given the costs, and the restrictions, if I were in an institutional context such as a library or archive, I wouldn’t advocate going down this licensing route at all, but I would try and do what you could do within the exceptions, as detailed above. The orphan works licensing scheme is good in theory, but in practice it seems overly concerned with the models for commercial stock photography, and not at all concerned with the needs of the gallery, library, archive and museum sector.

Regarding (2): should I pay an RA to sit for a day or two and work out all the licensing fees? We can then retrospectively calculate the model they use. Sounds like fun? Is anyone interested in that bar me? Let me know and we’ll crack on. It may be, though, that the sector has already accepted that these costs simply aren’t bearable (and you try and license 300 different items, individually, even for non commercial reuse – it would take weeks, and cost ££££.)

I think (3) sounds fair – 8 years is at least two governments away, so planning ahead for the state of culture and libraries and museums is like chucking darts blindfolded, anyway. But remind me in 8 years to file a FOI request asking about income and revenue from this scheme – if FOI requests still exist then ho ho ho! (hollow laughter).

So that brings me to the end of my quest, I think. I’ll continue to engage with the IPO about the licensing number question, and fill you in if there are any changes, but it’s clear that at the moment, as it stands, the scheme excludes people like me: sole traders, wanting to use orphan works in small print runs. And the scheme is too unwieldy, time consuming, and costly, to allow any more than the odd item to be licensed: those wanting a non-commercial license are better trying to look at the exceptions, first.

What shall I do re Lolly Time? I want to share decent images of it, and it’s free standing artwork, so it’s not covered by the exceptions. So I’ll file the paperwork for getting a non-commercial license to use this, so I can share the big files of the image online legally, just to wrap this up, and to see that process through. Should anyone take the image once it’s put online and do anything with it, well, I can’t stop them. But you shouldn’t go near Etsy with this one to sell…

Diary Entry 7: 14th March 2015

So where is the license? I’ve tried to get a non commercial license, I really have. I spend 15 minutes carefully filling in all the details in the online system, then I fail at the last hurdle, as it won’t let me upload the due diligence form. I’ve tried different browsers, but no. I can’t save that 15 minutes of data entry, and need to start again every time. It seems like not only the process is broken, but the online tool too. I have emailed the IPO about it, but I think I’m done. I tried, I really did.

I keep getting an error message asking me to upload a pdf file – when I have tried to upload a pdf file.

What has the take up been across the sector, given the licensing scheme has been running for the past 4 months? In total there have been 228 licenses issued to date, with 200 of those from one museum alone (Museum of the Order of St John) and a further thirteen to the Rawk Agency (“one of the UK’s leading providers of boutique and marketing events”).
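A quick back-of-the-envelope check on how concentrated that take up is, using only the three figures quoted above. The “everyone else” bucket is my own derived remainder, not a number the IPO reported.

```python
licences_total = 228

# Counts taken from the figures quoted in this post.
by_applicant = {
    "Museum of the Order of St John": 200,
    "Rawk Agency": 13,
}
# Derived remainder: whatever is left after the two named applicants.
by_applicant["everyone else"] = licences_total - sum(by_applicant.values())


def shares(counts: dict, total: int) -> dict:
    """Return each applicant's share of all licences, as a percentage to 1 d.p."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


for name, pct in shares(by_applicant, licences_total).items():
    print(f"{name}: {pct}%")
```

That works out at roughly 87.7% for the one museum, 5.7% for the agency, and 6.6% (15 licenses) spread across everybody else.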

According to the “In from the Cold” Jisc report on Orphan Works, “across UK museums and galleries, the number of Orphan Works can conservatively be estimated at 25 million, although this figure is likely to be much higher” [link]. The Orphan Works licensing scheme is not being used to make these available in any numbers worth considering.

Addendum, 31/03/15. 

I couldn’t let this one lie, could I? It took a chat to the IPO, 3 different browsers, and me tinkering up the back of class while invigilating an exam over a 3 hour period, and look what I just did! I’ll update here with further information, once a license is granted. Victory of sorts, but I still maintain it shouldn’t be this hard...


And finally, in April 2015, I got the license. THE END. Now what?

This blog post was nominated for, and won, the “Best Exploration of DH Failure” category in the Digital Humanities Awards 2014 following public voting. Thanks! 

Addendum, 18/09/15.
This just in from the Orphan Works Licensing Team at the IPO:
We have been considering the issue, particularly in relation to the
use of an image on apparel, and have looked at the publicly accessible
pricing information available. This has allowed us to introduce new,
lower quantity amounts of 500, 1000 and 2,500 for orphan works
licences – you will be able to check these on the application system.
However, licence fees for the very small quantities you were
suggesting are not universally available, which has meant that it has
been difficult for us to source data for comparison. Where we have
come across pricing information relating to certain lower quantities,
we have added this to the online application system.
W00t. I can haz impact? 🙂  I think this is a move in the right direction (but I’d still prefer to have an even smaller license of 50 or so for those kitchen-table-makers, like me). Great that they are licensing, though!

Reuse of Digitised Content (3): Special Festive Halloween Image Give-away Edition

In my first blog post about reuse of digitised content, particularly images, I suggested that institutions could think about batching up some good images, for people to take and reuse, so they could find them easily. They could also be prepared for people to reuse. But what would this mean, in reality? I decided to have a try, myself. Halloween is approaching – let’s look for 5 really cute, public domain images about Halloween, and see if we can make them “more” reusable, whatever that may mean. Like this one:

Isn’t she handsome? An illustration tagged with witch, over at the British Library book images photoset, Flickr. Originally taken from “Life & Finding of Dr. Livingstone”, 1897.

But bother about all that writing, which makes it unusable on my Halloween party invitations. It would be better if there wasn’t all that writing, just the image, right?

Or even, make the background transparent. Ta da! take it and do with it as you like, please do.

Nice, huh? and all this took me was time. An hour or so of grubbing about on Flickr, an hour or so of messing around in Photoshop (I’m rusty). And as we all know, time is precious, and institutions don’t have that level of time to devote to this kind of thing. Hmmm.

I also wonder what I’m really doing here. Turning images into clip art? erm, yay? Is that what we mean by reuse? But why else are we making images available, if it’s not for people to take them and do something with them? Does this make them more “useable”? It’s certainly easier to take the image and dump it into a poster, or webpage, etc. We need to ask ourselves what we mean by use and reuse, if we can’t conceptualise what that really means in the first place.

But I said 5 images, right? I’m time pressed at the moment (shortly off on a big work trip), so – being honest here – I signed up for the first time to Fiverr, where you can get a myriad of small tasks done for $5, and bought some photo retouching, and within an hour, I had four other Halloween images, this time from the Internet Archive Flickr Pool, converted into black and white, with transparency too. A set of Halloween images! But Fiverr made me feel icky – even though this fixing up would be a relatively simple task for someone with better PhotoShop chops than I to do, and even though I chose someone who said they were a student in a first world country, it just seems such a small amount to pay someone. (I did try to engage them in conversation about that, and offered the going hourly rate I would pay a student: they didn’t reply). I am happy with the images provided, but I wouldn’t advocate institutional use of this type of service if it can be avoided: something about it feels exploitative to me. It was interesting to try. (Perhaps it’s part of my penance that I share these images here for everyone but… shudder. Is that how we value skills now? Sorry, world. I know it’s the market economy, but that doesn’t mean I have to pay people less than I believe a job is worth).

So now what.

I parked this, and a selection of others I found that I’ll put at the bottom of this post, on a group over at Flickr. There’s been obvious interest in them, with a total of 50 views or so in 24 hours, even though I didn’t tell anyone where they were, yet. So I’ll leave them up there, and take them if you like! I think they are cute. Do something, they are in the public domain! They are free! Use them at will! It only cost me time and perhaps some student’s time and $5 and the electric that drives the internet and the heavy metals that are in our computers etc etc! and if you fancy telling me how you used them, on here or on Twitter, that would be great, but you don’t have to because it’s public domain! woohoo! (I may do some reverse image lookup in a while and see where they got to).

This is a minor experiment – especially compared to my last blog post, which was much more of an investment in both time and money – but it goes back to what I was saying previously about the time and skill needed to use the image content available successfully. It’s not all just “there” yet, you need time to sort, and time to manipulate, and resources to do so. It also makes me think of what you read about in pre-print times, when artists’ workshops had teams of people working for them who just painted silk, or hair, or skin or whatever, and the whole thing was a production line, where you farmed jobs out to other painters – sure, it’s a makers’ revolution, but it’s one that involves getting a student to do a quick job on PhotoShop for you, or a print shop to do some formatting and printing. You can take the content and do something with it, if you have the resources to both pay for and manage the process. The stuff is in the public domain, and is free. But doing something with it isn’t, not really.

Except, of course, I’m not Raphael, I’m just messing about with images taken offline and turned into slightly cleaned up versions of themselves for clip art. I’d like to see a “real” collection do a longitudinal study on the benefits of this, releasing some of their content in different graphic formats, and tracking interest… hmmm, a potential MA student dissertation for this year, perhaps? It’s a worthy topic, and one that should be pursued in more than a couple of hours, and a hurried blog post.

Still, Happy Halloween, and feel free to reuse these in any way you like, should you want to. The full size I have is up here, made smaller to fit in blog format, you know what to do to grab the larger file. Black and white jpgs first, then transparent png.

Originally taken from the Internet Archive Book Images Flickr Pool.

This originally had only a couple of previous views, and isn’t it delightful? ripe for putting at the top of any manner of Halloween related paraphernalia…

Originally taken from the Internet Archive Book Images Flickr Pool.

It started off pink, mind!

Originally taken from the Internet Archive Book Images Flickr Pool.

And last but not least, my favourite:

Originally taken from the Internet Archive Book Images Pool. Brilliant!

All of them over at Flickr, too, if you’d prefer. Have fun! And don’t have nightmares.

Reuse of Digitised Content (2): Here’s One I Made Earlier, or, It’s Lolly Time

Following on from my previous post in which I bemoan how hard it is to reuse digitised content as a source for creating something, I reuse a digitised image of an item in the National Library of Scotland, discovering how tricky it is to reuse images of “orphan works”, but producing something that, well, I like!

After a few months of exploring digitised collections looking for A Thing to Make and Do, something caught my eye. Ironically, I found it whilst flicking through a print catalogue of an exhibition I hadn’t had the chance to attend: Going to the pictures: Scotland at the cinema, which had run at the National Library of Scotland* in the summer of 2012. A quick google showed it had been digitised at least in low resolution, appearing on the website:

A 1960s “lantern” interval slide tempting patrons to buy an ice lolly, used at the Odeon Cinema, Eglinton Toll, Glasgow. Image used here with permission from the Scottish Screen Archive, National Library of Scotland. [source page]

Look at that! How cheerful is it? And right up my street. I kept going back to it and going… ahhhhh! But was it digitised in high enough resolution, and could I get permission to do anything with it, given it is quite clearly still in copyright?

The folks at the Scottish Screen Archive, and the Intellectual Property Officer at the National Library of Scotland, couldn’t have been more helpful. Yes, they had previously digitised it at high resolution (all 69MB of it), and I could get permission to use it for my own use (and to feature the image(s) here on my blog) for the princely sum of ten of your British Pounds for the license. I also contacted the Odeon: their records don’t go that far back for design so they cannot prove they own copyright, but they gave me permission to use it if they do, with the caveat that a copyright owner, whom they cannot speak for, may come forward at some future date (and hey, stranger things have happened once you put things into the blogosphere, if anyone knows anything about the illustrator, please get in touch). This lantern slide is officially an “orphan work”, then. This means it isn’t in the public domain, and I can’t reuse the high resolution image provided from the SSA willy-nilly (such as making a pattern for anyone to use with it, or giving away the source files, or putting it up on a third-party website such as Spoonflower), under the terms of the license agreed. But it means I can use it for personal use. I’ll come back to that later, but let’s crack on.

Getting My Make On

The process of turning this into something was straightforward. Once I had the high res file, I spent a few hours tidying up the image, removing some scratches and marks from the slide: this is a fragile, opaque, archival item, and it’s no wonder that, close up, there were some marks that may detract from print quality. It’s a line to walk, though: you don’t want to make it too cleaned up. It still wants to look original.

Before and after, with a bit of cleaning up in PhotoShop.

This resulted in a cleaned up version of the lantern slide, ready to go:

It’s not a huge difference from the original (and I haven’t put the full resolution file up here that I have, I’m not allowed to), but it just makes the whole thing a bit fresher for printing.

Then it was just a case of more PhotoShop jiggery pokery, measuring up, tiling, choosing my printer (I went with BagsofLove, a UK company which seems to offer quite a range of printing: people online say that if you order from Spoonflower, a company based in the USA, import duty can really make the costs mount up for shipping to the UK).

The Big Reveal

Ta da! A pure silk scarf, with repeating motif. Cute, huh? BagsofLove offer silk printing plus hemming, given a lot of people want silk scarves to test patterns. (I got quite a large one made, and the whole thing cost £100 all in, ready to wear).

And here I am wearing it! While we are talking about copyright, etc, this photo was taken by my 6 year old who has given me permission to post it here (which also might explain why all the scarf is cropped out on the left! but you get the drift).

Thoughts On the Process

Do I like the resulting item? Well, I chuckled when it came in the post, so yeah. I do feel as if I’ve made it – a few hours navigating licensing issues, about 5 hours total in PhotoShop, a few hours choosing where to get it made and what to get it made into, so it feels like I’ve had to invest time (and some brain effort, in working out tiling sizes, etc, and what I actually wanted size wise: this was a significant investment in time and cash, so it’s good to get it right). It’s already made me think about the next digital printing project, which means the whole thing must’ve been fun. Working, as I do, with so much digital data, it’s nice to actually have a product at the end of the day. Going with silk was expensive, and there are cheaper options available, but I’ve got a high quality item (that would probably cost around the same on the high street – I’m not going to make a fortune if I choose to sell these on Etsy, unless I go for a cheaper supplier!).

The one frustration I have is that I can’t share the files with anyone, and I can’t say, if you like it, here it is, get it printed up yourself, and I can’t, at the moment, stick it up on Etsy for sale even if I wanted to, due to the orphan works copyright restrictions. I talked at length with the NLS’s Intellectual Property Officer, and we walked through why it’s just not legal, at the moment, for them to allow someone else to “publish” something that is in their collection and still in copyright without getting the holder’s permission, and I understand this – although it doesn’t mean I’m not frustrated by that. (You could get a license from the NLS yourself, if you wanted to use it for personal use).

But of course, the law on the licensing of orphan works in the UK is changing very soon. The upcoming orphan works licensing scheme (coming into force on the 29th October 2014) will allow that a person can obtain a license for commercial or non-commercial use of an orphan work on payment of a nominal fee and demonstration of a ‘diligent search’. (There’s a PDF summary of this new scheme over at the Intellectual Property Office’s website, with more on diligent search here). At time of writing, there is very little up there about how the process will work, or what the “nominal” fee would be (one person’s nominal is another person’s how-bleedin’-much?) but that’s one to watch. Come the end of October, I’ll start a blog post chasing this image through the Orphan Works Licensing Scheme: who knows, within a few months, you may be able to make some It’s Lolly Time! merchandise yourself, should you care to.

It’s been a fun journey, chasing something from idea to conception to manipulation to production. I’ve learned a lot about how we are delivering digital content to end users in the gallery, library, archive and museum sector, and also how frustrating it can be at times. But look, I’ve eventually ended up with a bespoke thing that I love, just for me. And once I’ve published this blog post, I’m going to start wearing the scarf that I made, just in time for winter-a-comin’ in.

*One final thing to say: eagle-eyed regular readers may know that I’m currently serving on the board of the National Library of Scotland, but I applied to use this image from my civilian, non-work, unidentifiable email account, so as not to get any special treatment in the process of licensing. It has to be said though, that being on the board was the reason I was flicking through past catalogues of their exhibitions in the first place! And I’m personally glad I found something in the NLS collections that so tickled me: a little bit of Scotland to remind me of where I’m from, and an emotional attachment to a piece of digitised cultural heritage.

Reuse of Digitised Content (1): So you want to reuse digital heritage content in a creative context? Good luck with that.

Although there is a lot of digitised cultural heritage content online, it is still incredibly difficult to source good material to reuse in creative projects. What can institutions do to help people who want to invest their time in making and creating using digitised historical items as source material?

The Garden of Earthly Delights, repurposed over at Etsy

Over the last few months I have become increasingly obsessed with creative reuse of digitised cultural heritage content. We live at a time when most galleries, libraries, archives and museums are digitising collections and putting them up online to increase access, with some (such as the Rijksmuseum, LACMA, The British Library, and the Internet Archive) releasing content with open licensing actively encouraging reuse. We also live at a time where it has become increasingly easy to take digital content, repurpose it, mash it up, produce new material, and make physical items (with many commercial photographic services offering no end of digital printing possibilities, and cheaper global manufacturing opportunities at scale being assisted with internet technologies). What relationship does digitisation of cultural and heritage content have to the maker movement? Where are all the people looking at online image collections like Europeana or the book images from the Internet Archive and going… fantastic! Cousin Henry would love a teatowel of that: I’ll make some xmas presents based on that lot!

I’m not the only person interested in this: The British Library is currently tracking their Public Domain Reuse in the Wild, looking to see where the 1 million images they released into the public domain, and on Flickr, end up being used. At the moment, they manually maintain a list of creative projects of what people have got up to with their content. And people are using digitised stuff: pop over to a commercial fabric printing service like Spoonflower and you can see people grabbing creative commons images off Wikipedia and providing the means to print them on a whole range of materials for creative reuse. At Spoonflower, people are remixing images, providing opportunities for creative projects, designing and playing with available heritage content, using it as a design source and inspiration, although many don’t quote the source of their hopefully out of copyright images used as a basis for fabric design. Pop over to Etsy, and you can see (as the illustration above shows) high res images of historical art and culture turned into coasters, corsets, bangles, pillows, phone cases, jewellery, etc – and mashed up and remixed into further creations, all of which are for sale (although, again, where they got the source images from isn’t usually made clear, and there are obvious copyright infringements happening in some cases). But overall, I’m left wondering why more use isn’t made of online digital collections – and why we haven’t seen the “maker’s revolution” where everyone is walking around going “this old thing? I cobbled it together from public domain images on Wikimedia and had a tailor on Etsy run it up for me!” – or even see more commercial companies start to use this content as the basis for their home and fashion collections on the high street.

There are now funding programs and efforts to try and help the exchange between the “multiple sub-sectors of the creative industries and the public infrastructure of museums, galleries, libraries, orchestras, theatres and the like” and funds for “collaboration between arts and humanities researchers and creative companies” etc etc – in this new “impact” world, allowing reuse of your content will probably score huge brownie points – but what can institutions be doing off their own back to make sure the digitised content they spent so much time creating is used, and reused, further?

I was really impressed, at DH2014, to see Quinn Dombrowski have an entire wardrobe made with fabric designed using heritage content images in the public domain, and this inspired me to think: I should have a go at this. I should find something which is digitised and online, that I like, that I can access, that I can repurpose, and make something that I want and will use from it. What larks! But the rest of this blog post is an expression of sheer frustration at the current state of play of delivering digitised content online, for people who want to take digitised content, and reuse, and repurpose it.

Before I get started: let me make clear that I’m entirely supportive of folks like the Rijksmuseum, LACMA, The British Library, and the Internet Archive making their out of copyright images freely available for folks to use. It’s absolutely the right thing to do, and I’m not going to start railing against them (there are, of course, many institutions who haven’t made their digitised content available and they deserve railing against.) But with that caveat in place, let’s broach some frustrations of someone looking through digitised heritage content, wanting to get a decent image of something they want, to reuse in a way that they would like (whether or not that involves paying for the privilege – this isn’t just about getting stuff for free, it’s about getting it at all). It isn’t pretty.

1. So much stuff, such poor interfaces.

Yay! so much stuff online! Europeana now has over 30 million items online from 2000 institutions! Flickr Commons has a tonne of stuff online! Flickr is now being used, independently of the commons, to host tens of millions of digital cultural heritage objects, by thousands of institutions! But for a user, browsing through this stuff, it is nigh on impossible to navigate or search Flickr in any meaningful way, and sift through this, simply because Flickr’s interface is so poor (and often the content isn’t tagged very well, so isn’t very findable). What if institutions don’t use Flickr? Don’t get me started on content management systems, and their “user friendly” interfaces, such as Aquabrowser, or Digitool: shudder. Unless you know exactly what you are looking for, it’s incredibly difficult for a user to browse and view content – and there is a lot of dross out there to sift through. Finding decent images that are interesting from a design perspective is a time consuming, utterly frustrating task. I speak from a few months of chuck-my-computer-across-the-room frustration in trying to navigate (mostly unsuccessfully) what the cultural heritage sector has spent millions of pounds putting online.

Suggestion: Institutions should use a few resources to get folk with any sort of graphic or design background to help sort through the thousands or millions of images and present to their users a curated collection of a few hundred really good things which are ripe for using. Heck, put together some downloadable packs of images of art, logos, boats, trains, etc. Here are 10 great images of witches you may like to play with! At the moment you are making users work too hard to sort through the digital haystack to find the interesting, usable needle. No wonder much of the content isn’t used – people simply can’t find it, or they walk away from your rubbish interface before finding that digitisation diamond.

2. The shackles of Copyright, part 1: aesthetic.

The copyright free images which are put online free to use are out of copyright (duh) which means they are from a particular time period: generally pre-1920s (depending on the country’s copyright laws). There’s a lot of stuff up there, but an incredible amount of it is Victoriana, which has a particular aesthetic. This is great if you are into Steampunk (check out the first few pages of the Internet Archive book images Flickr stream and you’ll see what I mean) but… having scrolled through oodles of this stuff, it just doesn’t float my boat. I’m into mid-20th-century design, so that puts me into an entirely different category of user: one who is going to have to sort out permission for reuse for items still in copyright, if the institution hasn’t sorted out copyright before publishing online. B*gger. This isn’t going to be as easy as it first appeared for me, then.

Suggestion: Institutions should cherry pick a few in-copyright items that are really very reusable, and preemptively clear copyright under various licenses. Here are 10 fabulous 1950s illustrations which we have arranged for you to use under a creative commons license! (There is some of this stuff up on Flickr Commons, but it is in the minority). I understand the resources which are required for this, but really, institutions could be leading the way in making images of selected in-copyright items available and usable for people, to encourage uptake and creativity. Or – at least – make processes for chasing copyright clearance a bit clearer to users. Information on that is very sketchy, to say the least, and it’s often impossible to even find out who in the institution to email about rights clearances.

3. The shackles of Copyright, part 2: cowardice.

Let’s put aside the wonderful work of those who are bravely making their collections available for reuse, and arranging licensing for folks to do so, and address the majority of institutions who don’t do this. Say you think… I’d like to make some of my own stationery! I know, I’ll pop over to Europeana, and grab some cool images of old envelopes, and print up some notecards with those on (not to sell! just for my own use!). There’s 6563 images labelled “envelope” currently in Europeana. The licensing for these – what you can and can’t reuse – is incredibly confusing. Only 60 of these items have been put into the public domain. I have no issue with institutions wanting attribution when their images are reused – of course not – and you can do that with 592 images (although… how are you going to provide attribution on fabric or a cushion or a corset or a bracelet, etc). My beef is with the quarter of these digitised items which allow access but no further reuse of the images. Seriously, why not? What are you scared of? That someone is going to pop over to Photobox (other commercial photo printers are available) and make up some notelets? That someone will make a corset out of those images and sell them on Etsy? Quite frankly, if your stuff is out of copyright, and if you don’t have the nous or can’t afford to employ a graphic designer to turn your images of envelopes into going commercial concerns, good luck to anyone who can. I don’t get why you would put images of old stuff online and say to the users “You can’t use it. At all”. What are you afraid of? (I also presume here that people won’t use digital images when they don’t have permission to do so. Which is nonsense. People will take it and use it anyway).

Oh yeah, you are saying, but copyright is complex, envelopes are manuscripts, manuscripts never go out of copyright, blah blah, till the cows come home. But just let people reuse digital content, and good luck to them. Seriously, what is the worst that could happen? That something archival takes off and becomes another “keep calm and carry on” meme? But really – wouldn’t your institution love to be the source of one of those, in perpetuity?

Yes, I did find a really good image of an envelope I wanted to use on some notecards, but couldn’t get permission to do so (hence choosing it as an example). I’ll address licensing and paying for image licenses in another blog post (I’m not averse to that either. At the end of the day, just let me reuse that cool image, even if I have to pay license costs to do so).

All over the world, institutions are digitising cultural heritage content and putting it online with restrictive licensing which means that users cannot do anything at all with it (at least not without jumping through lots of begging hoops, or using it illegally). Not use it on a blog post. Not print it on a home made birthday card. Not make their granny a key ring with it on. Not make a scholar who is an expert in this field a mug with it printed on for their retirement present. This seems absolutely bonkers to me – and a complete waste of limited resources in the sector. What “access” do you think you are actually providing, if it’s only of the “look but don’t touch” variety?

Suggestion: if you aren’t going to monetise it yourself, just make it available for others to reuse, with a generous license. Go on!

4. Image quality

All I want is a clear, 300dpi (or higher) image of the digitised item. It’s no use saying “this is in the public domain!” if you only provide 72dpi: you can’t do anything with that, except stick it up on another webpage. Just give me a reasonably high resolution image, and let me go and play with it. Cheers! So, so much of the “public domain” material is quite low resolution, which stops people from using the images for creative purposes. Maybe that was your plan all along (ha ha! we’ll put this online but only at low resolution! that’ll thwart those corset makers!) but seriously, 300dpi. Let folk have at it.
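To put rough numbers on the 72dpi versus 300dpi complaint, here is a minimal sketch of the arithmetic in Python. It assumes the usual rule of thumb that pixels needed = print size in inches × dots per inch; the 6 × 4 inch notecard is a made-up example, not a measurement from any real collection.

```python
def pixels_needed(width_in, height_in, dpi=300):
    """Pixel dimensions needed to print at a given size (inches) and resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def max_print_size(width_px, height_px, dpi=300):
    """Largest print (in inches) that an image of this pixel size supports at `dpi`."""
    return width_px / dpi, height_px / dpi


# Hypothetical 6 x 4 inch notecard.
print(pixels_needed(6, 4, dpi=300))        # pixels a print shop would want
print(pixels_needed(6, 4, dpi=72))         # pixels a screen-resolution scan provides
print(max_print_size(432, 288, dpi=300))   # how small that scan prints at 300dpi
```

Under those assumptions, a scan made for the screen at notecard size only prints cleanly at roughly postage-stamp size, which is the whole problem.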

One other point: if you are using algorithms to crop lots of stuff before sticking it up on Flickr, please make sure that it works, and isn’t cropping things too tightly. I understand that it’s all about efficiency and storage capacity – you don’t want to be storing tens of millions of blank pixels and paying for hosting for empty content – but if you crop things too closely, it’s just unusable. Another reason I stopped looking for images in the Internet Archive Book Images Flickr pool was all the ones I wanted were shaved off. I know! I’ll make a montage of ye olde fruit and veg! except this apple is cut off at the bottom, these carrots are missing part of their top, this apple is sliced right through, as are these peaches. Thanks for offering to give me all this stuff free, but it’s unusable for creative purposes unless you give me a whole illustration, not one that has been chopped off around the edges.
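I have no idea how any particular institution’s cropping pipeline actually works, so treat the following as a hypothetical sketch of the general idea rather than anyone’s real code: whatever detects the illustration’s bounding box, pad that box by a small safety margin before cropping, and clamp it so it never runs off the page. The function name and the 5% default margin are my own assumptions; the box is written as (left, top, right, bottom), the same order that image libraries such as Pillow use for cropping.

```python
def padded_crop_box(box, image_size, margin_frac=0.05):
    """Expand a (left, top, right, bottom) box by a margin, clamped to the image.

    box         -- detected bounds of the illustration, in pixels
    image_size  -- (width, height) of the full page scan, in pixels
    margin_frac -- extra margin as a fraction of the box's width and height
    """
    left, top, right, bottom = box
    width, height = image_size
    pad_x = round((right - left) * margin_frac)
    pad_y = round((bottom - top) * margin_frac)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


# A detected box in the middle of a hypothetical 2000 x 3000 pixel page scan.
print(padded_crop_box((100, 200, 500, 600), (2000, 3000)))
```

A few per cent of extra blank pixels costs very little storage compared with an apple that has lost its bottom.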

Suggestion: 300dpi, at least. Cheers, love.

5. Checking the maker privilege

It’s worth just remembering that you may be making some content freely available, but it’s still actually quite costly for people to do anything creative with it where digital printing is concerned, especially in small print runs, making individual items, etc. It takes significant investment of time and resources to take an archival tiff and turn it into, say, a cushion (or a corset). I’m not really sure what I’m trying to say here in making that point (isn’t that what ranty blog posts are for?)… perhaps it offsets the feeling that institutions are giving this stuff away for nothing: people reusing digital images are putting in significant time and often money to turn them into something else. It becomes co-creation, rather than mere duplication. Or something. It’s certainly not an activity that is available to those without the skills to do image manipulation (despite many publication features being available on these commercial digital image printing websites: if you want to do anything that deviates from very simple printing, it still takes time and effort to set up). It still takes skill and resources and sometimes training and probably talent to make something nice and that people will want from something someone else has digitised, and it often takes a huge amount of time. It certainly surprised me how long the selection and preparation of items takes before you get to the stage of sending something to the print shop. So let’s all proceed in a realm of mutual respect and adoration, yeah? Love the provision of high quality digital heritage imaging online: love the people who have the sewing chops to make the corsets. (There are also ethical considerations if people start sending high resolution images of items to be made into products in “cheaper” international production contexts, but I’m not sure realistically how that can be broached by image licensing).

Suggestion: Wonderful things can happen when individuals work with institutional digitised content! sometimes.

Conclusion

Overall, here is what institutions can do if they want people to really use digitised content:

  • Put out of copyright material in the public domain to encourage reuse. Go on! what are you scared of?
  • Provide 300dpi images as a minimum.
  • Curate small collections of really good stuff for people to reuse. Present them in downloadable “get all the images at once” bundles, with related documentation about usage rights, how to cite, etc.
  • Think carefully about the user interface you have invested in. Have you actually tried to use it? Does it work? Can people browse and find stuff? Really?
  • Make sure the image quality is good before putting it online. Don’t chop bits off illustrations.
  • Make rights clearer. Give guidance for rights clearance for in-copyright material, and perhaps provide small collections with pre-cleared rights, to allow some 20th Century Materials to be reusable.

What do we want! Curated bundles of 300dpi images of cultural heritage content, freely and easily available with clear licensing and attribution guidelines! When do we want that? Yesteryear!

So what about me, and my task? Did I find something that I like, that I can access, that I can repurpose, and make something that I want and will use from it? After a few months trawling digitised collections online, I eventually stumbled across something which I adore, which got sent off to the print shop last week. I’ll be waiting by the postbox over the next few days, in the hope that my investment in time and resources has paid off: I can’t wait to see it IRL. But that, my friends, is for another blog post. And in the meantime, I leave you with this conclusion: institutions can be doing so, so much more to help those wanting to use digitised content creatively.

Want to be taken seriously as a scholar in the humanities? Publish a monograph

(This is the unedited version of a piece published yesterday over at Guardian Higher Ed.)

A decade ago, in my first year as lecturer in a Humanities department, an eminent Professor helped me secure a book contract with a top university press for my recently completed doctoral thesis. Another senior colleague stopped me in the corridor: “This is very rare,” she said. “And this is what gets you ahead in this game.” The book itself is a lovely object, of which I’m still very proud (it took me four years of doctoral research, plus another two years of preparation). It only sold a few hundred copies: enough to make the press happy, and to give me annual royalties of a fiver. There is an ebook, comparable in price to the physical version, but no Open Access version. Despite little proof that it is well read, it has been cited just enough to give me another elusive point on the dreaded H-index. We don’t write Humanities monographs for riches, we may do for an attempt at academic fame, but the career kickback for me was rapid promotion. In the Humanities, the monograph’s the thing.

Today, the Humanities publishing landscape is, of course, changing alongside every other. We must work through the potentials and issues that digital technologies bring. With digital publishing comes the uncoupling of content from print: why should those six years of work (or more) result in only a physical book that sits on a few shelves? Why can’t the content be made available freely online via Open Access? Isn’t this the great ethical stance: making knowledge available to all? Won’t opening up access to the detailed, considered arguments held within Humanities monographs do wonders for the reputation and impact of subject areas whose contribution to society is often under-rated?

Research councils are prescribing Open Access requirements for outputs which will be submittable in the next REF, and there are now nods towards monographs being included in those requirements at some elusive point in the future. The Humanities’ dependency on the monograph for the shaping and sharing of scholarship means that scholars, and publishers, should be paying attention. How will small print runs of expensive books fare in this new “content should be available for free” marketplace? How will production costs be recouped? Predatory models are already emerging, with established presses offering Open Access monographs alongside the print version for an all-inclusive £10,000 charge to offset a presumed (but not proven) fall in revenue: out of reach for most individual academics, and for many institutions. I certainly couldn’t have afforded those costs as a junior academic fresh out of the doctoral pod, with student debt hanging around my neck.

The latest JISC survey on the attitudes of academics in the Humanities and Social Sciences to Open Access monograph publishing makes an interesting contribution to this debate, showing how central single author monographs still are to the Humanities, and how important the physical – rather than digital – copies are. People still like to read, and in many cases buy, them. The survey suggests monographs are fairly easy to access even in physical form (inter-library loan, anyone?). Open Access is welcomed, and is seen to increase readership, but the physical object is still central to the consideration of the monograph: something which should allay fears of publishers wondering how any change in the REF requirement will affect their bottom line.  The most difficult problem seems to be securing a book contract in the first place, whether that has an Open Access option or not: the survey clearly shows that ECRs need help and guidance to do so.

Will I publish another monograph without an associated Open Access version? No, but getting published in the first place is the important thing. What advice do I have for early career researchers looking to publish their doctoral thesis, especially if they have the chance to do so with a strong, established academic publisher? The monograph is still the thing: anyone who wants to be taken seriously as a scholar in the Humanities should work towards having one. Open Access requirements are on the horizon, so broach them with the publisher. Don’t accept £10,000 costs. Brandish this survey, say People Still Buy Books. Ask those further along the academic path to help you navigate the pre-contract stage. Even with the changing publishing environment, some things stay the same: the importance of the physical single author monograph, and the importance of academic patronage.

Inaugural Lecture: A Decade in Digital Humanities

This is the crux of what I planned to say – or hoped to say! – at my professorial inaugural lecture at UCL on the 27th May 2014. I’m not one for reading off a script though, so may have deviated, hesitated, or expanded on the night. A video of my talk on the night is now available. No, I haven’t watched it myself!

I decided to call my inaugural lecture “A Decade in Digital Humanities” for three reasons.

1. The term Digital Humanities has been commonly used to describe the application of computational methods in the arts and humanities for 10 years, since the publication, in 2004, of the Companion to Digital Humanities. “Digital Humanities” was quickly picked up by the academic community as a catch-all, big tent name for a range of activities in computing, the arts, and culture.  A decade on from the publication of this text, I thought it would be useful to reflect on the growth, spread, and changes that had occurred in our discipline, and my place within them.

2. This year sees me in my 10th year of being in an academic post. I joined UCL in August 2003, my first academic post after obtaining my doctorate, and since then have worked my way up the ranks from probationary lecturer, to senior lecturer, to reader, and now full professor. The professorial lecture gives me a rare chance to pause and look behind me to see what the body of work built up over this time represents, and what it means to be undertaking research in this area.

3. You’ll have to wait for later in the lecture to see the third reason…

Who here would be comfortable defining what is meant by the term Digital Humanities? In this, the week of UCL Festival of the Arts, celebrating all things to do with the Arts and Humanities, let’s go back to first principles. In UCLDH and 4Humanities’ award-winning infographic “The Humanities Matter” we defined the humanities as “academic disciplines that seek to understand and interpret the human experience, from individuals to entire cultures, engaging in the discovery, preservation, and communication of the past and present record to enable a deeper understanding of contemporary society.” It stands to reason, then, that the Digital Humanities are computational methods that are trying to understand what it means to be human, in both our past and present society. But it may be easier if I give some brief examples to demonstrate the kind of work we Digital Humanists get up to.

One of the easiest things we can do with computers is count things. For data to be computationally manipulated, it has to be in numeric form. If we can get text into a computational form, we can easily count and manipulate the language, showing trends across time. For example, if we take a million words of conference abstracts from my discipline from the ALLC/ACH conference across various years, we can easily see how mentions of one technology (XML) become more popular, while another (SGML) is in decline. Much of the work in DH is in manipulating and processing and analysing text – our iOS app Textal is just part of that trajectory. Much of my work, though, has been in digital images, starting with developing systems to try and read damaged documents from Hadrian’s Wall, and more recently working on multispectral and 3D manipulation of damaged texts. We’ve also worked with museums on large scale 3D capture of cultural and heritage objects. The important thing about all of this is that as well as implementation, we’re also interested in use and usage of these technologies, and what impact they have on those working in culture and heritage, and on the ability to study the past and present human record. We often innovate new systems, or adopt concepts and apply them to humanities projects, such as the crowdsourcing of Jeremy Bentham’s handwriting by volunteers, or working with visitors to the Grant Museum of Zoology at UCL to encourage debate about zoological collections. We build, we test, we reflect back on what using these technologies means for the humanities, giving recommendations which can be useful across the sector. From these projects, it’s difficult to pin down what Digital Humanities actually is, but that sums up the difficulty of our discipline’s title: it encourages thinking about computational methods in the arts and humanities, and then into culture and heritage, in as broad a sense as possible.
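
For anyone curious what that kind of counting looks like in practice, here is a minimal sketch in Python. It is an illustration only, not the code behind the conference abstracts analysis or Textal: the handful of "abstracts" are invented stand-ins, and the function name count_term_by_year is my own. It simply counts whole-word mentions of a term for each year, so you can watch one term rise as another falls.

```python
import re

# Hypothetical stand-in data: invented snippets, NOT the real ALLC/ACH abstracts.
abstracts_by_year = {
    1996: ["We encode the corpus in SGML.", "An SGML editor for manuscripts."],
    2000: ["Moving from SGML to XML.", "XML tools for text encoding."],
    2004: ["XML and XSLT for digital editions.", "Publishing XML on the web."],
}

def count_term_by_year(abstracts, term):
    """Return {year: number of whole-word, case-insensitive mentions of term}."""
    # \b word boundaries stop "XML" from matching inside a longer token.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return {
        year: sum(len(pattern.findall(text)) for text in texts)
        for year, texts in sorted(abstracts.items())
    }

for term in ("SGML", "XML"):
    print(term, count_term_by_year(abstracts_by_year, term))
```

These are raw counts. With a real corpus, where some years have far more text than others, you would probably want to divide by the number of words per year before comparing.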

What made Digital Humanities spring, fully formed like Athena from the Head of Zeus, as an academic field in 2004? Was it because that was the first time quantifiable methods had been used in the Arts and Humanities? (remember – all computational methods require quantification). Well, of course that is nonsense. When you look back across the history of Humanities scholarship, quantifiable methods have been used in the Arts and Humanities since the birth of Universities. If we think of the book as technology, from its inception scholars took it to pieces to see under the hood: concordances and indexes of works were manually created, such as this “Concordance or table made after the order of the alphabet” from 1579 which lists how many times concepts such as “abomination” appear in the New Testament. Or the work of Joseph Scaliger who in the early 1600s plotted the different periods in time in which different civilizations must have existed, through quantifiable methods. Or the work of August Schleicher in the 1850s who showed, by quantifiable methods, that the languages of Europe must have had a common historical root. All of these texts are available from UCL Library, none of which I have to leave my sofa to see because YAY! Digitisation! Changing humanities scholarship! – but the point is that quantifiable methods are part of established methods in the humanities, and have been for as long as the Humanities have existed. So when I undertook my first project at UCL, looking at whether we could use the high performance computing facilities at UCL to analyse historical census data – this was part of a quantifiable humanities academic tradition which harks back 500 years, just at a grander scale.

So what made Digital Humanities spring, fully formed like Athena from the Head of Zeus, as an academic field in 2004? Perhaps in 2004, this was the first time people had used computational techniques in the arts and humanities? But of course, that is nonsense too. When you look back at the history of computing – and not even digital computing, but the very first computer – the very first computer programmer, Ada Lovelace, hints at the possibilities for art, music, and understanding human knowledge and culture in her earliest writings. She understood that there was something more to the mathematical calculations afforded by this machine than science, and they called her a madwoman for it. Well, this madwoman has an (as yet unproven) theory that if you look at the history of the first 100 electronic programmable computers in the 1950s, 1960s and 1970s across the world, you will see humanists eyeing them up and asking “how can I use, or develop this tool for use, in my research”? It’s certainly true of Father Busa, working with IBM in the 1950s on the concordance of the works of Thomas Aquinas (counting, indexing, and manipulating words, as part of the historical trajectory of humanities methods stretching back 500 years, just a change in scale…) but also of Roy Wisbey, in Cambridge, who set up the Literary and Linguistic Computing Centre there in the 1960s. When the first computers arrived at UCL, the artists from the Slade School of Fine Art were over there like a shot to establish the Experimental and Computing Department. We should also mention Susan Hockey, who led various initiatives in text encoding, text analysis, and digital libraries. Susan, incidentally, gave me my first academic job here at UCL in 2003: UCL had included a Digital Resources in the Humanities module as part of its MA offering for librarians and archivists in the School of Library, Archive and Information Studies (now the Department of Information Studies) from 2000, under Susan’s auspices.
But the point is, considering how best to use computing in the arts and humanities is not something which started in the 21st Century, nor in 2004, and Humanists have been looking at available tools, and how best to use them, since computation began. So when we undertook one of the latest projects at UCLDH, which came from looking at an iPhone, thinking “how can I use, or develop this tool for use, in my research in the Humanities” and developed an iOS app for text analysis, this was part of a longer trajectory of considering available computational tools, and how they may be appropriated, adopted, and adapted for our means in the humanities, just at a grander scale, as processing technologies increase in speed.

So why Digital Humanities, in 2004? Firstly, the coalescing of interested scholars into an identifiable field is an understandable academic response to societal changes. The speed of computing rises, the price of computing plummets, the information available on the internet (and the possibility to create new information) increases, use and usage of internet technologies becomes commonplace. Remember, it’s up to Humanities scholars to look at the past and present record to enable a deeper understanding of contemporary society: quite frankly, it would be more alarming if an academic movement hadn’t emerged looking at what using computational methods could do for our understanding of human society, both past and present, and how best we can grab the technical opportunities which fly by and appropriate them for our means, to inform both ourselves and others about the prospects of using computing in this area. The discipline of Digital Humanities is inevitable, and would have appeared whatever the title it was given.

Secondly, Digital Humanities is a handy, all inclusive, modern title which rebrands all the various work which has gone before it, such as Humanities Computing, Computing and the Humanities, Cultural Heritage Informatics, Humanities Advanced Technology… DH has a ring to it, and boy, what a rebranding it was. We tend to call it “Big Tent Digital Humanities” meaning: roll up! roll up! everyone using any computational method in any aspects of the arts and humanities is welcome! but really, Big Wave Digital Humanities may be more appropriate, as we countenance the sudden swell, dissipation, and speed of the activities of the discipline. Taking a peek at the mention of Digital Humanities on Google Ngrams we can see its sudden growth, and the fact that it is now used as a proper noun, with Capital Letters (although remember that this, counting words, is part of a long tradition of humanities scholarship, Google simply have more books to include in their count). We can see how DH has trended over time, appearing in headlines in the media. Many, many textbooks in DH appear, some of which I am responsible for myself. Journals appear, such as Digital Humanities Quarterly (of which I’m one of the general editors), and the ALLC/ACH conference renames itself Digital Humanities (this year, for my sins, I’m the Program Chair for DH2014 which will be held in Lausanne, Switzerland. We have seen over 700 proposals, from more than 2000 people, vying for a space to present). There are many more DH conference presentations and workshop slots, worldwide, year on year. In 2010, I gathered together all the available evidence I could on DH in an infographic called Quantifying Digital Humanities, showing that there were 114 DH Centres in 24 countries. Today, not even four full years later, there are 195 DH Centres in 27 countries.
Those knowing how long it takes to set up a research centre know that this is phenomenal growth in the university and GLAM sector, and that institutional support must be strong behind each and every one of these.

UCL Centre for Digital Humanities is one of those recently founded centres. We officially launched four years ago to the week of this lecture, in the same lecture hall where this lecture is being presented. We don’t talk about the launch much – it’s not often I’m part of something at work which ends up featured in the political pages of the newspapers – but you’ll have to google that to find out more (YAY! digital media! the internet never forgets!) but in those four years since launch we’ve undertaken a phenomenal number of projects, covering many aspects of Humanities and Arts research, and considered Digital Humanities in its broadest sense. This isn’t all me – there is an amazing team who are part of the Centre, and we’ve won various awards for our academic projects and collaborations, published many books, papers, and book chapters, and been part of successful funding bids from research councils worth tens of millions of pounds. One wonders what makes a Digital Humanities Centre attractive to universities that don’t have one. Nope, I can’t see what makes that level of activity attractive, at all.

So what proportion of Humanities scholars are now digital humanists? Back in 2005, participants in the Summit on Digital Tools in the Humanities at the University of Virginia estimated that “only about six percent of humanist scholars go beyond general purpose information technology and use digital resources and more complex digital tools in their scholarship” (p.4 of this PDF). By 2012, N. Katherine Hayles, in her chapter “How we think: transforming power and digital technologies” in David M. Berry’s edited text “Understanding Digital Humanities”, estimates that 10 per cent of Humanists are now digital humanists (p.59). Now, in 2014, a forthcoming study from Ithaka S+R (with the working title of Sustaining the Digital Humanities: Institutional Strategies beyond the Start-up Phase) includes surveys of faculty at four American universities. In the departments surveyed at each institution, nearly 50% of faculty members indicated they have “created or managed” digital resources. Granted, the departments were chosen by campus staff (often at the library) who felt there was some significant activity taking place there. The percentage of these “creators” was consistent across all universities (Brown, Columbia, University of Wisconsin, Indiana University), and most of the creators also felt that their creation was intended for public use (not just their own research aims), and would require ongoing development in the future.

50% of humanists are involved in digital activity, are digital humanists. How can this possibly be? And how can we conceptualise what it means to be a digital humanist, amongst this spread of activity and range of available technology: is creating or managing digital resources the same as being a digital humanist? At a time when (nearly) every library catalogue is digitised and available online, and (nearly) every book manuscript written on a word processor, and many historical documents digitised and available for consulting from your own sofa, does that make everybody working in the humanities a digital humanist? How can I begin to conceptualise my contribution, and my place, and where my work sits within Big Wave Digital Humanities?

I find it useful here to turn to Rogers’ Innovation Adoption Curve, a sociological model that looks at how technology spreads through society. This is a bell curve, and right at the start of adoption of technology are a few innovators, experimenting with (and developing) new technology. These innovators sometimes persuade a larger number of early adopters to take up the new technology on offer, and only once a sufficient mass of users is achieved does the technology “cross the chasm” and become used by the majority of individuals in a society (who are split into an early majority and a late majority). Finally, we have adoption by the “laggards”, who are slow in taking up technologies, but do so once they have permeated throughout society. (Hard not to think, here, of my elderly grandmother who recently got her first mobile phone). Now, this model is useful as we can plot along it some of the technologies which are available to a humanist. Things like word processing, and searching for references online, and even looking up the digitised texts which I showed at the start of this lecture: even the technologically laggard humanists can do it now, and although these technologies are changing scholarship, it’s a question of scale (better! faster! more!) rather than of approach or technique, in the main. Technically facilitated tasks like updating websites, using and updating wikis, using social media: even the late majority of humanists can do it now. Online tools are available, such as Voyant, which allow you to do text analysis, and manipulate texts to see the underlying patterns: so the early majority of humanists can use these tools should they want to.
But the most difficult, intellectual work of applying technology in the humanities still occurs before the chasm has been crossed, in the phase of innovation, and early adoption, where we are looking at the technologies that cross our path and saying “how can I use, or develop this tool for use, in my research?”, much like those in the 1950s or 1960s who were coming across university mainframes and asking how best to apply them in the literary and linguistic arena. It’s important to note, of course, that this wave of technology keeps on coming at us, and the place where a technology sits along the curve changes: 20 years ago, had you been making a website for your humanities project, you would have been an innovator, rather than a late majority, and the same holds for word processing 40 years ago. The technology keeps coming: we have to respond to this, innovate, adopt, and see what is useful or useable for, or used by, the majority of people in our discipline.

Now (and this is the most contentious thing I’m going to say in my whole lecture, for those attending who are dyed-in-the-wool Digital Humanists) one of the problems that we have as a movement is that we tend to get caught up and fixated upon a certain technological solution. For example, every DH program I’ve come across teaches XML – that technology which took over from SGML in the conference abstracts – as the best practice way to encode text. And there’s no doubt that XML provides the framework with which we can both explore theoretically what it means to describe texts computationally, in such a way that they retain the information in their printed or manuscript form, and also build and test prototypes. But XML as a technological standard has been around for 16 years, and technology moves on, but DH doesn’t seem to be doing so. In many ways, DH’s relationship to XML is similar to the AI community’s relationship with LISP: the means of computational expression in the language or format suit the questions which need to be asked by the field, so there is no need to use other technologies which come on stream, which may be more efficient from a computational point of view, as we explore what it means to work with our question in this computational way. And that’s ok, but we shouldn’t be blind to the fact that, hey! technology is advancing all the time and, also, XML is not a technology that crossed the chasm: it may be in use for technical systems, but it’s not one that you see a lot of the general populace using. This, in turn, means that DH has permanently hitched its wagon to an aging technology, which is hard to explain to others, including other non-XML humanists, whilst other things are happening in the technological world around us. Just something we have to watch out for, when building teaching programs, or looking at the scope of outputs in our field. We don’t want to be left behind as the digital in digital humanities rolls on without us.

I find it useful to plot my research on the Innovation Curve, to see where what I am doing sits. So, the work on counting terms across a corpus – very much sits in the early majority, given the availability of tools to do so. But the work on building an iPhone app to do so – very much innovation: it took a lot of pure programming in a relatively new space to achieve it. The work in image processing I do is either innovation (we are publishing here in pure computer/engineering science venues, as well as in humanities venues, which I’m very proud of), or we adopt technologies our academic colleagues in the engineering sciences have generated and roll them out to a humanities or heritage application. Our work on user studies is something completely different though: here we are generally looking at how the majority of people are using an extant text, or (in the case of something like Transcribe Bentham, or QRator) we are conducting reception studies, where we innovate and build a technology, launch it, and study its uptake across the whole cycle. We can see, then, a range of DH activity across the innovation cycle, but the majority of the work I do is certainly at the start of the innovation curve. Is this where DH sits? I like to think so, but more to the point, I’m confident it’s where I sit best, when doing DH.

I need here to show you another curve, though. This time, the Gartner Hype Cycle, which looks at how technologies are launched, mature, and are applied (so people know when to invest). The premise of this is that when technologies are first triggered, everyone thinks they are going to be the Next Big Thing, and so they reach “the peak of inflated expectations”, before crashing down into a “trough of disillusionment” when those adopting them realise they aren’t that great at all. It’s hard work to get technologies up the “slope of enlightenment” where useful, useable applications are found, and few technologies make it to the “plateau of productivity” where they become profitable. It’s a useful curve – this year’s predictions show Big Data right at the top of the peak, which chimes in with media coverage of how it will solve everything, for example. So where would I put DH as a movement, if I had to, on this curve?

I’d put it at the top. At the top of the Peak of Inflated Expectations. We’ve got a lot of pressure on us to prove our johnny-come-lately benefit to the world of academia, to demonstrate our worth, to show that the investment made in us over the past few years is worth it (whilst also bringing in further investments in research funding, to meet institutional expectations). After a peak, comes a crash, and we have to be prepared for the tide to turn and the backlash to begin, after the years of media hype and raised expectations. So how do we get to the plateau of productivity of Digital Humanities?

First, I would argue that we have to understand our lineage: that the current manifestation of DH is a logical progression of quantitative methods used in the humanities for the past 500 years. That the current manifestation of DH is a logical progression of humans wondering what the potential is for applying computational methods to humanities problems, which has been going on in the digital space for the past 60 years. These combined trajectories aren’t going away, and despite what funding cuts and media backlash may come at us, it is the role of the digital humanist to understand and investigate how computers can be used to question what it means to be human, and the human record, in both our past and present society. Secure in our mission, we can carry on whatever the storm throws at us.

Second, I would argue we have to ignore naysayers who are unsure about this new Digital Humanities lark (and believe me, there are plenty, even in my own department) and just do good work. The way to demonstrate our worth is through doing good work. We have to keep asking questions about computational methods, computational processes, and the potentials that they offer humanities scholars, as well as the pitfalls, to explore this changing information environment from the humanities viewpoint. It’s not just about building websites, or putting information online, it’s about innovating and adopting, and questioning while we build about the ramifications of doing this, the impact on the humanities, the issues using technology raises, and the answers it provides that you couldn’t otherwise generate, to do good work in Digital Humanities. I realise this is very Calvinist of me – you can take the lass out of Scotland – but I do see that we have to be engaging with theories and questions of what it means to be doing this work in this way, as well as updating a website or creating a digital file. A continuation of what it means to be a humanities scholar, in the digital space.

I’m not one for looking back, and despite the title, I deliberately didn’t want this inaugural to be a survey of all the projects I have undertaken over the past ten years – then I did this, then I talked to that person, then I visited there – but when I look back over the variety and range of projects, publications, and outputs that I’ve worked on, either on my own, or as part of a team (there’s a lot of teamwork that has gone on here) I’m firstly surprised at how much of it there is and the range of topics we’ve covered, and the opportunities we’ve pounced on. I see a body of work which explores various aspects of what it means to be applying digital technologies in the humanities space, and facilitates both those in engineering science and those in the humanities to explore issues which are important to them. I’ve learnt things along the way about the nature of interdisciplinary work, the nature of teams, the nature of the academic publishing and peer review process, the nature of the grant funding process, but I’ve written about that elsewhere. There are things, also, that I am proud of that are physical rather than purely digital: over the last few years I’m most proud of building the UCL Multi-Modal digitisation suite, which is a shared space between the UCL Library Services, UCL Faculty of Arts and Humanities, and UCL Faculty of Engineering Science, contributing to the infrastructure of UCL in a collaborative endeavour. But what I see here, as a common thread, is that the work I do tends to sit right at the beginning of the technology adoption cycle, aiding and abetting the application of technology within the arts, humanities, and heritage, and I’m comfortable with that. There’s a strength in knowing your place, and your remit, and what you do best.

So the third reason for calling my talk “A Decade in Digital Humanities” is that I didn’t say which decade we were talking about, and it is time also to look towards the future, and what the next ten years holds for both DH, as the field turns into a teenager, and for me, as I go into my next decade here at UCL. I’m not one for crystal balls, so I’ll keep my scrying brief. I see an inevitable fragmentation of the DH community and DH focus – it was never conceived of as a homogeneous entity anyway, and it is the nature of waves and swells that they will dissipate. We’ll see (we are already seeing) more focussed groups of scholarly work around, say, Geographical Information Systems and literature, as people specialise and work on specific technologies and specific methods. The technology will keep coming, and it’s up to individual humanities scholars to respond to what is appropriate to their research question: the effects of DH scholarship will continue to ripple out across the humanities as technologies go along the adoption cycle, and certain aspects of digital research will just become normal for humanities scholars, as time goes on. But I do see that there will always be a place, right at the start of the technology innovation uptake curve, for specialists in Digital Humanities to sit, watching out for these changing and emerging technologies, setting up pilot projects to experiment with different aspects of these technologies, feeding back recommendations and the potential ramifications for other humanities and engineering scholars and those within the wider cultural and heritage sector, and exploring what it means to be doing humanities research in that area. I’m happy to remain there, and I see that this will remain my place working with other humanists, and engineers and computer scientists, over the next decade.
I’m delighted to be a co-investigator on the doctoral training centre for Science and Engineering in the Arts Heritage and Archaeology, which is the EPSRC’s largest ever investment in Heritage Science, and for the next 8 years we’ll be training up a range of doctoral students in this cross section of the arts, heritage, humanities, and engineering and conservation science. (Perhaps what I really do is Heritage Science, but that’s another talk entirely, and DH has work to do with the Heritage Science community in future). That said, we do have work to do, in keeping an eye to making sure people know about the successes, outputs, and impacts of DH work. Given the expectations foisted upon us, we have to learn to be more vocal about our objectives, our remit, and our results. It’s our job to be thinking what it means to use digital technologies in humanities research, and just research, full stop. As a result, our insights can benefit a range of other fields, if we communicate them effectively.

Digital technologies are not going away any time soon: and although DH has had a rapid swell, it will remain essential that we investigate, use, and experiment with technologies over the coming decade. There is a new Companion to Digital Humanities coming out in late 2014, showing how the technologies used in humanities research have developed since the first edition (I’m delighted to have written a chapter on our public engagement work for it), and our field, as well as knowing where it has come from, has to understand that the technological wave on which we sail is continually on the move. I hope I’ve shown here that our uptake of technologies in the humanities is, and will continue to be, a moving target, and that as part of a longer trajectory of investigation into humanities methods, DH is a modern but necessary, and even inevitable, part of the Humanities, and even computational, landscape. I look forward to what adventures the next Decade in Digital Humanities holds. There is so much to do!

Now, that is where I’d normally pause and say thank you for your attention, but hey, it’s my inaugural, so I’ll cry if I want to. I have a few brief thanks to make – it’s quite a lick to go from probationary lecturer to full prof in ten years, and so I have to thank those who have supported me. Thanks go to my family up in Scotland for all their support, and my family of my own: many of you know that in the past few years I’ve had three children, so biggest thanks of all go to my husband Os, aka Expert Sleepers, for his forbearance and baby juggling skillz. I’ve been blessed with an amazing support network of friends, who have supported me enormously over this period. My first academic supervisor was Professor Seamus Ross, who kick started my interest in this area, and his support and interest at the start of my career really set me up for the work I do today. Likewise, my PhD supervisor Professor Alan Bowman remains a fantastic mentor: thank you, Alan. My other PhD supervisor, Professor Sir Mike Brady, made me promise (when I got my doctorate in engineering) not to go near any nuclear power stations or bridges, a promise I have kept – thanks Mike. I’ve already mentioned that Professor Susan Hockey gave me my first academic job: but her work remains an inspiration on what is possible in computing in the arts and humanities. I work with an amazing team of people at UCLDH and I thank them for their input both for the centre and on our various projects. Special thanks go to Rudolf Ammann, our designer at large, who helped prepare the graphics for this lecture.

But in this week of UCL’s Festival of the Arts and Humanities, it’s good to pause and see how embedded Digital Humanities research is now throughout college, and how much we work, in the Humanities, with those around us. The projects I’ve shown, albeit briefly, today, are carried out in league with various other faculties (UCLDH reports to both the Arts and Humanities and Engineering Faculties here). Colleagues come from a range of different departments including not only those across the Arts Faculty, but the Bartlett Centre for Advanced Spatial Analysis (in the UCL Bartlett Faculty of the Built Environment), and across the UCL Faculty of Engineering (I have joint projects with Medical Physics, Computer Science, and Civil, Environmental, and Geomatic Engineering). We are dependent on input from both our colleagues in UCL Library Services, and UCL Museums and Collections, and work very closely with items in all the collections across college. The success of DH at UCL is then dependent on the institutional context we have here. Digital Humanities is now embedded into college life at UCL, and in this week of the Festival of the Arts, my final thanks go to UCL as a community for its institutional support in encouraging us to ride the DH wave: for without being at UCL, my decade in digital humanities would have been completely different.

Roy Wisbey, and Literary and Linguistic Computing, 1965 style


I recently got in touch with Professor Roy Wisbey, who set up the University of Cambridge’s Linguistic Computing Centre in 1960, to invite him to my inaugural lecture. He is not able to attend (but passes on his regards to those who know him!) and he also briefly loaned me this newspaper article, from 24th September 1965, from the Cambridge News. A very early piece of Humanities Computing history! It’s in very fragile condition – I’ve spliced it together here to give the whole piece in one image (and the blog stylesheet is not my friend here – will sort out later – but…) – enjoy!

The use of computers will save the scholar years of mindless drudgery! Indeed!

Siberian Digital Humanities Adventure

The Siberian Federal University

Greetings from Krasnoyarsk, Siberia, where for the past week I’ve been hanging out at the Siberian Federal University, the largest university in the Siberian region, which is in the top rankings in Russia. I’ve been giving some guest lectures on digital humanities, meeting various staff and students, and plotting with them on how to support their work and how to make connections to the wider digital humanities community.

How did I end up here? It’s all down to the wonderful Inna Kizhner who approached me nearly two years ago, in my guise then as secretary of what is now the European Association for Digital Humanities. After helping source some teaching materials, in English and Russian, for their taught courses, Inna remarked to me “no-one ever comes to Siberia…” and I immediately said “ask me!”. And finally, after much preparation, here I am.

Siberian Federal University are establishing a solid Digital Humanities presence. In the Institute of Humanities they currently offer digital humanities modules at both undergraduate and postgraduate level, and also an undergraduate module in the subject area of digital history (which next year will be taught by Inna). They have a digital lab (door sign, above!) and a digitisation lab. They have a range of projects they have been working on with both researchers and students, many of them led by Maxim Rumyantsev who is now the university’s deputy head, so there is positive institutional support here. These projects are mostly in the area of multimedia and digitisation. For example, working with the Museum of Geology of Central Siberia to create the simply stunning companion to their minerals collection (it is no easy task to capture minerals in this detail, at this quality); capturing, virtually exploring, and explaining regional heritage architecture (which is fast disappearing under new developments in this region) from the nearby town of Yeniseisk; documenting regional art shows and youth art shows; capturing high resolution images of the art contained within the Surikov Museum (life size copies of which adorn the university’s walls at every turn); working with Gigapan capture methods and the State Russian Museum to create zoomable images of large art works (can you spot Pushkin?); and creating an interactive model of the Siberian Federal University campus itself. They are keen, now, to be making connections with others across the world, and I’m delighted to be helping them, and introducing them to various figures, and associations, in Digital Humanities. There is much work to be done, we have plans set out, and they are keen to make new relationships and new collaborations.

It’s not all been work! I’ve been welcomed into colleagues’ homes for meals (often meeting their families), treated at friendly restaurants (the food is wonderful), and toured round museums and supermarkets (Inna patiently put up with me pointing and exclaiming at various products we don’t have in the UK, such as dried fish, and tinned horse). Today we went to the Krasnoyarsk Dam, 30km upstream from the city, on a glorious spring day which showed off this remarkable feat of engineering (which is so exceptional it features on banknotes across Russia). There is a heavy security presence, and no photos allowed, but I did manage this sneaky selfie…

It’s been a fantastic trip, and I’ve been made very welcome here. Thanks to Inna, Maxim and Marina for their hospitality, and I look forward to further opportunities, visits and introducing anyone who wants to be introduced (if I can be of help, drop me an email and I will forward it on). I have to admit I was nervous about my trip here – but instead of stress I’ve found friendly connections, and much opportunity to help further establish DH in this region, and throughout Russia. Now to pack, and begin the long trip home, where my three small boys are missing their mummy on the other side of the world (and I them). до свидания!

Digitisation’s Most Wanted

What are the most commonly accessed digitised items from heritage organisations? Even asking the question leads to further understanding about the current digitisation landscape.

Have you seen this Dog? Last spotted on the Flickr account of the National Library of Wales. Dog with a Pipe in Its Mouth, Taken by P. B. Abery, 1940s.

Last month, at a meeting at the National Library of Scotland, an interesting fact flew by me. The NLS has hundreds of thousands of digitised items online, so what do you think is the most popular, and most regularly accessed and/or downloaded? (It is difficult to make the distinction regarding accessed or downloaded on most sites.) Is it the original Robert Burns material? The last letter of Mary Queen of Scots? Or any of the 86,000 maps held in this, one of the best map collections worldwide? No. It is “A grammar and dictionary of the Malay language : with a preliminary dissertation” by John Crawfurd, published in 1852. This is accessed by hundreds of people every month – mostly from Malaysia, partly because it is featured on many product pages providing definitions of Malaysian words – demonstrating the surprising reach and potential in digitising items and then making them freely available online, reaching out to a worldwide audience far beyond the geographical locale of the library itself. Wonderful.

This left me pondering… what are the other most downloaded items at major institutions in the UK? So I sent out some feelers, and here are the results, demonstrating both the hidden complexity of the question, and the relationship of digitised heritage content to the current online audience landscape.

At Cambridge University Library, the most accessed collection overall is the Newton Papers, which was the first major digitised collection launched by the Library in 2010, and promoted widely. Within that, there is one particular notebook (which Newton acquired while he was an undergraduate at Trinity College and used from about 1661 to 1665 for his lecture notes) which is the most popular, featuring heavily in the initial promotion of the collection, and also in an In Our Time special series hosted by Melvyn Bragg on Radio 4. But within that notebook there is one page that is accessed more than the others, with most of the traffic coming from Greece. Why? This page was picked up in the Greek press and pointed to on many websites, blogs, newspaper reports, and in social media as evidence that Newton knew Greek. The links that remain still direct thousands of users to view Newton’s jottings from his Greek lessons at the front of the book, showing the fascinating relationship between publicity, social media, linkage, and an item which reflects national pride, to a worldwide audience.

The most downloaded items at Cambridge also reflect the rapidly changing mentions of items on social media: in April 2014, an item downloaded/accessed more than 6000 times was the Breviary of Marie de Saint Pol, which went live that month. Why the sudden notice? On the 3rd of April, one of the Cambridge colleges with thousands of followers posted a link to it on Facebook, followed by the Cambridge Digital Library Facebook and Twitter feeds on the 4th of April. Retweeted a few times, these few postings led to the thousands of views of the document, demonstrating the growing importance of using social media to tell people about newly mounted digitised content.

Over at Trinity College Library, the most accessed item from their digital collection in general is the Book of Kells, which again was their first major digitised item, heavily promoted in the press, and attracting a level of viewing that is unique due to general tourism and cultural heritage interest. The second most accessed digitised item is the surprise: a book of lute music by William Ballet, from the 17th Century. There is much discussion of this item, and links to it online, posted by online communities of lute players, and those who blog about lutes worldwide. Interest in, and demand for, an item can therefore be encouraged if interested online communities hear about it, and share with their membership.

A similar tale about the importance of publicity and social media emerges from the British Museum. There are popular items about the Viking exhibition which are linked from their home page at the moment given the current exhibition, but since the 1st January 2014 til now, the most popular item accessed in the digital collection (no, wait, go on, guess…. Rosetta stone? Vindolanda Tablets? …) is the Landscape Alphabet by Joseph Hullmandel (no? me neither). These were discovered and shared on social media by type enthusiasts on Twitter in mid February, and promoted by the cool-hunter the Laughing Squid who has almost half a million followers on Twitter, which caused a sudden spike (I can’t see the British Museum actually tweeting them out themselves on their timeline). However, the initial swell of tens of thousands of hits has since dwindled to nothing, showing the fickleness of attention that comes with the social media stream. In 2013, the single most viewed item at the British Museum was… (go on, guess!)… a lead sling bullet, viewed 42,156 times in total. Why? It was picked up on Reddit, due to the sarcastic inscription “some ancient sling bullets excavated from the city of Athens, Greece were inscribed with the word “ΔΕΞΑΙ” (dexai), which translates to “catch!”” which generated a lot of online LOLs (“Halt gentlemen. Do not yet partake of the feast before us, for I must capture the image of it with instagram whereupon I shalt bequeath it to my herald upon Facebook for all to see.” here) and this encouraged – and still encourages – visitors to the British Museum website: some forms of posting on social media generate the long tail of usage more than others.

Things start to get more complicated when various digital asset management systems (DAMS) come into place – often institutions have more than one database of digitised content, from different suppliers, with different licensing restrictions and requirements, and so ascertaining the most viewed single item is not a simple question. Organisations also post and share content in various different places. The National Library of Wales are looking through their DAMS to see which items are the most accessed, but immediately know that the most popular item they hold that has been posted to Flickr (with no known copyright restrictions, contributed to Flickr Commons) is the photograph at the top of this post, Dog with a Pipe in its Mouth, from the P. B. Abery Collection. Again, this is an image which has been mentioned regularly on blogs, social media, and internet chats, as well as being a featured image on the 2013 anniversary of Flickr Commons: the fact that it has no copyright restrictions encourages its reuse – and therefore traffic towards its host institution’s site, if those users point back to it – online.

The libraries at Oxford University, including the Bodleian, have been digitising items for over twenty years, and so it is difficult to say what the most accessed or popular items are, due to the way the systems have been designed, implemented and integrated over the past two decades. Their most downloaded or accessed digitised book, scanned in collaboration with Google, is probably the “History of the Scott Monument, to which is prefixed a biographical sketch of Sir Walter Scott” by James Colston (published 1881) – a freely downloadable version is available from its library record (ignore the resellers offering printed versions generated from this at considerable cost on Amazon and eBay!). As far as images are concerned, the most popular at Oxford are among those listed on Early Manuscripts at Oxford University, partly because many of them have been up continuously for twenty years. Legacy data for the history of downloads of specific images are not available, indicating how difficult it is to access long term data about this: server logs get very big very quickly and so are generally periodically discarded, and it is only recently that reporting facilities such as Google Analytics have allowed a quick and easy overview of the usage of websites. Currently popular digitisation projects at the University of Oxford Libraries are the Polonsky Foundation Digitization Project, and the recently launched digitized First Folio of Shakespeare’s works, but there isn’t sufficient data available from all the digital collections to be able to say one way or the other which is the one most popular project, never mind item. It was also pointed out, though, that you would probably struggle just as much (if not more so) to identify which has been the most requested book in the Bodleian’s collections!

This trend of databases complicating the question continues at the British Library, where their digitisation outputs and projects are made available via multiple platforms and viewers, some managed by the British Library, and others by commercial partners, with some content available for free, other content via subscription, or paying a fee per image. These are only some of the most popular different sites: https://imagesonline.bl.uk, http://www.bl.uk/treasures/treasuresinfull.html, http://www.bl.uk/manuscripts/, www.sounds.bl.uk, https://www.flickr.com/photos/britishlibrary/, http://www.britishnewspaperarchive.co.uk/, http://find.galegroup.com/bncn/, http://gdc.gale.com/products/17th-and-18th-century-burney-collection-newspapers/ and the BL module on http://www.biblioboard.com/libraries.html. In addition, there are BL digitisation partnerships with other content providers, for example http://idp.bl.uk/ and http://eap.bl.uk/. Finding out the most accessed digitised item from within this is tricky (but not impossible – they tell me they are looking into it). The fact that they cannot say immediately demonstrates the complexity of running many large databases of digitised content.

These results, from very different institutions, invite discussions on shallow versus deep engagement with digital collections. Some examples of commonly accessed material are what we would think of as part of the Canon of Digitised Content: Shakespeare, Newton, Medieval Manuscripts. Some examples of commonly accessed material here can be taken as little more than clickbait – LOL! History! – or free reference material – it’s a free Malaysian Dictionary! Bonus! – but is getting people through the virtual door to digitised collections in this way, and through these items, such a bad thing? Come for the Dog with the pipe in its mouth! Stay for the genealogy, then the discussions on palaeographic method! One can also argue that some of the discussion surrounding these objects is exactly what we are trying to encourage – many of the hundreds of comments posted on the Reddit item about the British Museum sling bullet, although hilarious, show consideration of what it would mean to be human in the time of Ancient Greece, and relate their societal response to ours. Isn’t that the starting place (and in some cases, the ending place) of engagement with primary historical evidence?

Asking to see Digitisation’s most wanted opens up wider questions of public engagement, the impact of social networks on internet traffic to digitised collections (from highlights posted by the institution, to those identified and shared by others outside it, often quite unexpectedly), and the role of making images of primary historical sources open for others to discover, use and share. We also become aware of the complex and intertwined database systems which are in place in many large organisations undertaking digitisation and delivering digitised items to users, and the difficulties in reporting on individual items (be they physical or digital!) as a result. Digitisation’s most wanted is also a rapidly moving target, dependent on publicity, and changing interest and focus over time: social media can encourage large swings and changes in popular items very quickly. The act of posing this question has led to an interesting discussion on how we think about use of digitised content, and how we can build up evidence about usage. (I’d also like to thank the organisations listed above for responding to my query so promptly!)

Have you, or any organisation you work with, been affected by the discussion in this blog post? Do you have any evidence you can contribute to the investigation? Your help is needed to catch digitisation’s most wanted. Please do post your comments about your experiences below (comments are moderated so may take a few hours to appear), or email m dot terras at ucl.ac.uk for them to be integrated here. The internet is a place of busy traffic. Someone must have seen them…

Update 15/05/14: The British Library’s Endangered Archives’ most popular item is the St Helena Banns of Marriage, an item commonly pointed to on genealogy websites such as this and this.

Update 16/05/14:
– The National Library of Australia have a discussion of their 25 most viewed digitised newspapers, and why, here.
– The International Dunhuang Project at the British Library tell me that a redevelopment of their database and website is underway to improve reporting for them, their partners and users.
– Glasgow University Library Special Collections tell me that their most popular item is the Curious Case of Mary Toft, from 1726, who supposedly gave birth to a litter of rabbits. This was featured as a book of the month in 2009, but picked up by the social media site Mental Floss in January 2014, with that page being shared on Facebook more than 4000 times, and garnering 30,000 hits in one day alone, and has since been posted on various other social media platforms, including Reddit. Glasgow also say that there is a difficulty in measuring access counts as the content is held on various different servers, and it can be difficult to interpret Google Analytics in this case. They also point out that, from their perspective, there is a lack of benchmarks to compare usage of their items to that of other special collections.
– The National Archives tell me they point to the popular items as part of their navigation and as a result, these “most popular items” remain the most popular, in a virtuous circle. A very popular item at the moment is the Security Service: Personal (PF Series) Files KV2 which hosts the records of spies such as Mata Hari. These were embargoed until Thursday 10 April 2014, then launched with an accompanying press release, which garnered significant press coverage worldwide, driving traffic to the site. The only frequently accessed item which is not in these lists is the muster roll of HMS Victory for the Battle of Trafalgar, which is commonly referred to in military and naval history websites (although interestingly few people link through directly to the page where it can be downloaded from, so those who read about it must come to TNA’s website and search themselves).

Update 19/05/14
– The Estonian Folklore Archives at the Estonian Literary Museum tell me that their most popular item is a leaflet from 1937 on how to preserve sealskins, although I can see no other webpages pointing to this item (perhaps because my Estonian search skills are weak!).
– UCLA Digital Library tell me their most viewed item is a Lyrical Map of the Concept of Los Angeles,  a 23-foot long hand-drawn and hand-lettered map of Los Angeles, using the words and images of dozens of L.A. authors, which was on display in a museum in 2011, and was featured widely on blogs  both at the time of the exhibit and since, which points people to the digital version now the display is no longer live in the museum space. Another popular item is the complete set of the 1582 Corpus Juris Canonici, the “Body of Canon Law,” particularly the table of contents, which is commonly linked to from those interested in Canon Law, such as this, thus driving subject specialists to the site.
– The History of Computing in Learning and Education Virtual Museum tells me the most viewed items are the writing competition and Historic Newsletters from the People’s Computer Company.
– A hack carried out at the Zurich Hackathon 2014 looked at image analytics from the US National Archives and Records Administration’s contributions to Flickr Commons, looking at 200 million hits in a 3 month period and identifying the most common images: a description of that hack is here, which also gives examples of the most commonly looked at images. “There is a spike on March 24. Further analysis shows that the biggest referral on that day is Dorothy Height. Turns out this lady was featured on a Google Doodle on that day.” Popular subjects (and referrer pages, generally from Wikipedia) were John F. Kennedy, World War II, Japanese American Internment, Vietnam War. A full list is available on the project page. This shows the importance of institutions linking their content from Wikipedia, and what can happen if you are featured by Google.
– There is also a useful tool in BaGLAMA which shows view counts for pages using Commons images in GLAM-related category trees.

Update 20/05/14
– The Bodleian also make the very good point that “With most browsers now defaulting to ‘do not track’ combined with the EU cookies legislation it is difficult to find any sort of data that one can ‘stand behind’ these days.”
– The Jüdisches Museum Berlin’s most accessed items are the Sammeldatensatz: Orden, Ehrenzeichen und Embleme von Julius Fliess (1876-1955), but they say that most accesses come from searches for “jewish emblems”, and so there is a need to add “emblem” as a synonym for “symbol” to the thesaurus, to help users find what they are looking for. In this way, looking at search terms can help develop user paths through the system so they can find what they actually want.
– The University of Iowa Digital Libraries say that based on Google Analytics for the last year, the most popular item is a Dada book, and the most popular collection is Iowa Maps, but the access numbers for different objects in the database themselves are hard to count, and they’ll get back to me on that. Based on recent web searches reported by the web master, a surprisingly high number of people find them via searches for Peter Rabbit: the digital book of which is linked through to their site from the Wikipedia page and various other websites featuring Peter Rabbit.
– The National Library of Wales tell me the most popular article on http://welshnewspapers.llgc.org.uk is a 1916 Cambria Daily Leader advert for ‘blouses’ and ‘hosiery’. To find out more about why may take some digging, though!
– Hamlet Depot and Museums tell me that their most popular items are genealogical records, including railroad employees lists, and seniority records, and also historic pictures.

Update 22/05/14
– The New Zealand Electronic Text Collection tell me that reference works are their most used, including A Grammar and Dictionary of the Samoan Language, with English and Samoan vocabulary (which is linked to from thousands of different sources about New Zealand culture, and discussions on translation), New Zealand in the First World War (which is linked to from various history and genealogy sites) and The Official History of New Zealand in the Second World War (which is also popularly linked to online, including in reminiscing personal postings from soldiers who served, talking about the war on social media).
– The University of Otago Library provided me with a very detailed overview of the issues they face (thanks!). They are in the process of developing a repository to manage all of the digital collections that they want to curate, and the pilot will be live by November, but for the moment, they have a variety of different sites on which you can see digitised material, showing again the complex relationship of databases and content which many institutions have. For example, they have OUR Heritage which is a window across some collections. Some records are pulled from OUR Heritage and displayed via Special Collections Online Exhibitions. There is also Hocken Collections, who had their reader access collection digitised and made available online. They track this via Google Analytics, and also by watching their own server stats, and these do not in any way match up. Google does not capture when someone goes directly to a file, so Analytics reports just a fraction of the over a million hits in the past year that they can track on their server. They digitise on request, and respond to community demand, and are trying to prioritise the digitisation process. From Google Analytics, the most heavily used collections are the History of the University and Botanical charts (which belong to the Department of Botany at Otago, and some are still used in the Labs; they digitised these, provided a copy for their use and deposited the originals in Hocken Collections). The most popular items are “Key plan to Mr G.B. Shaw’s picture of Dunedin in 1851”, which is mentioned on various genealogical sites online; a painting, “Sangro, a rosary of olive trees, landscape of windswept manuka.”, which appears linked from some other major federated collections online; and a printed map of Rome, “Mappa della campagna Romana del 1547”, which is a commonly consulted map (there are various copies of it in libraries worldwide), so those searching online to see it must find the freely available copy here.
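
The gap between analytics reports and server statistics described above can be made concrete with a small example. What follows is only a minimal sketch in Python of counting the most requested items directly from a web server access log. It is not how Otago, or any institution mentioned here, actually does its reporting. It assumes the server writes Common Log Format lines, and it treats every successful GET request as one "hit"; the function name `most_requested`, the sample paths, and the sample log lines are all invented for illustration.

```python
import re
from collections import Counter

# Pulls the method, path and status code out of a Common Log Format line, e.g.
# 203.0.113.5 - - [22/May/2014:10:00:00 +1200] "GET /items/a.jpg HTTP/1.1" 200 5120
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def most_requested(lines, top=3):
    """Return the `top` most requested paths among successful GET requests."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match["method"] == "GET" and match["status"] == "200":
            hits[match["path"]] += 1
    return hits.most_common(top)


# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [22/May/2014:10:00:00 +1200] "GET /items/key-plan.jpg HTTP/1.1" 200 5120',
    '203.0.113.9 - - [22/May/2014:10:01:00 +1200] "GET /items/rome-map.jpg HTTP/1.1" 200 7300',
    '203.0.113.7 - - [22/May/2014:10:02:00 +1200] "GET /items/key-plan.jpg HTTP/1.1" 200 5120',
    '203.0.113.7 - - [22/May/2014:10:03:00 +1200] "GET /items/missing.jpg HTTP/1.1" 404 210',
    'this line is not a log entry',
]

print(most_requested(SAMPLE_LOG))
```

Because a count like this sees direct requests for files, which page-based analytics scripts never record, it will usually come out higher than the analytics figure; it will also include crawlers and bots unless they are filtered out, which this sketch does not attempt.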

Inaugural Preparations

So, my inaugural lecture is coming up in a few weeks, and I’m starting to write it now, nervously… The event has already sold out, but will be streamed online live, and there is also another lecture theatre at UCL that it will be shown live in (The Terras Terrace?).  Dr Rudolf Ammann, UCLDH’s designer at large, has kindly provided some visuals for me… here’s the promotional flyer.

I plan to write the lecture out longhand once it is done, and of course, you will be the first to know about it (after I’ve given it…)