Online Resources | Internet Archive Images on Flickr

Image from page 285 of “Science and literature in the Middle Ages and the Renaissance” (1878) via https://www.flickr.com/photos/internetarchivebookimages

The Internet Archive, already well loved by art historians as a source of high-quality online versions of many out-of-print books, has a new project to make images from the uploaded books available online via Flickr.

The BBC story reports that the process works by inverting the way the book-scanning software normally operates:

As part of the process (of scanning books as texts), the software recognised which parts of a page were pictures in order to discard them.

Mr Leetaru’s code used this information to go back to the original scans, extract the regions the OCR program had ignored, and then save each one as a separate file in the Jpeg picture format.

The software also copied the caption for each image and the text from the paragraphs immediately preceding and following it in the book.

Each Jpeg and its associated text was then posted to a new Flickr page, allowing the public to hunt through the vast catalogue using the site’s search tool.
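For readers curious about what "inverting" the OCR step might look like in practice, here is a minimal Python sketch of the idea as the BBC describes it. It is not Mr Leetaru's code. The `Block` labels ("paragraph", "picture", "caption"), the bounding-box layout, and every function name below are assumptions made for illustration, and the page is modelled as a plain grid of pixel values rather than a real scan. A real pipeline would use an imaging library to save each cropped region as a JPEG.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Block:
    """One page region, as a hypothetical OCR layout pass might report it."""
    kind: str              # "paragraph", "picture" or "caption" (assumed labels)
    text: str = ""         # recognised text; empty for pictures
    box: Box = (0, 0, 0, 0)


@dataclass
class ExtractedImage:
    pixels: List[List[int]]   # the cropped picture region
    caption: Optional[str]    # caption directly after the picture, if any
    before: Optional[str]     # nearest paragraph preceding the picture
    after: Optional[str]      # nearest paragraph following the picture


def crop(page: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the rectangle `box` out of a page stored as rows of pixel values."""
    left, top, right, bottom = box
    return [row[left:right] for row in page[top:bottom]]


def nearest_paragraph(blocks: List[Block], start: int, step: int) -> Optional[str]:
    """Walk from `start` in direction `step` (-1 or +1) to the next paragraph."""
    i = start + step
    while 0 <= i < len(blocks):
        if blocks[i].kind == "paragraph":
            return blocks[i].text
        i += step
    return None


def extract_images(page: List[List[int]], blocks: List[Block]) -> List[ExtractedImage]:
    """Keep what OCR would discard: each picture plus its surrounding text."""
    results = []
    for i, block in enumerate(blocks):
        if block.kind != "picture":
            continue
        following = blocks[i + 1] if i + 1 < len(blocks) else None
        caption = (
            following.text
            if following is not None and following.kind == "caption"
            else None
        )
        results.append(
            ExtractedImage(
                pixels=crop(page, block.box),
                caption=caption,
                before=nearest_paragraph(blocks, i, -1),
                after=nearest_paragraph(blocks, i, +1),
            )
        )
    return results
```

The point of the sketch is simply that the layout information the OCR software already produces is enough to recover the pictures. Nothing new has to be detected; the regions that were thrown away are kept instead, along with the caption and neighbouring paragraphs that later make each image searchable.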

So far more than 2.5 million images have been uploaded, and all are supposedly copyright-free (the project excludes images from books published after 1922). The copyright restrictions on some images are probably a bit more complex than this, but it nonetheless looks to be a useful resource, especially (based on what I have looked at so far) for research on book illustrations, prints, early photographs and so on. The images are searchable by keywords drawn from the image captions, book titles, dates and other metadata.

- Katrina Grant