British Library to archive the web

2013-04-05 08:46
Internet. (Duncan Alfreds, News24)


London - Capturing the unruly, ever-changing Internet is like trying to pin down a raging river.

But the British Library is going to try.

For centuries the library has kept a copy of every book, pamphlet, magazine and newspaper published in Britain. Starting on Saturday, it will also be legally bound to record every British website, e-book, online newsletter and blog in a bid to preserve the nation's "digital memory".

As if that's not a big enough task, the library also has to make this digital archive available to future researchers - come time, tide or technological change.

The library says the work is urgent. Ever since people began switching from paper and ink to computers and mobile phones, material that would fascinate future historians has been disappearing into a digital black hole. The library says firsthand accounts of everything from the 2005 London transit bombings to Britain's 2010 election campaign have already vanished.

Reference collections

"Stuff out there on the web is ephemeral," said Lucie Burgess, the library's head of content strategy. "The average life of a web page is only 75 days, because websites change, the contents get taken down.

"If we don't capture this material, a critical piece of the jigsaw puzzle of our understanding of the 21st century will be lost."

The library is publicising its new project by showcasing just a sliver of its content - 100 websites, selected to give a snapshot of British online life in 2013 and help people grasp the scope of what the new digital archive will hold.

They range from parenting resource Mumsnet to online bazaar Amazon Marketplace to a blog kept by a 9-year-old girl about her school lunches.

Like reference collections around the world, the British Library has been attempting to archive the web for years in a piecemeal way and has collected about 10 000 sites. Until now, though, it has had to get permission from website owners before taking a snapshot of their pages.

That began to change with a law passed in 2003, but it has taken a decade of legislative and technological preparation for the library to be ready to begin a vast trawl of all sites ending with the suffix .uk.

An automated web harvester will scan and record 4.8 million sites, a total of one billion web pages. Most will be captured once a year, but hundreds of thousands of fast-changing sites such as those of newspapers and magazines will be archived as often as once a day.


The library plans to make the content publicly available by the end of this year.

"We'll be collecting in a single year what it took 300 years for us to collect in our newspaper archive," Burgess said. That archive holds 750 million pages of newsprint.

And it is just the start. Librarians hope to expand the collection to include sites published in other countries with significant British content, as well as Twitter streams and other social media feeds from prominent Britons.

The archive will be preserved at the London institution and at five other British and Irish "legal deposit libraries" - the national libraries of Wales and Scotland, as well as university libraries at Oxford, Cambridge and Trinity College, Dublin.