WA Secretary of State Blogs

Author Archive

Digitizing Newspapers: Part I – Source material

Tuesday, March 10th, 2009 Posted in Articles, For Libraries, Technology and Resources


We began a post about changes we’d recently implemented in our post-processing of digital images of newspaper pages. Of course, we found this hard to talk about without delving first into the process of digitizing newspapers. So we’ve decided to cover the topic more thoroughly through a series of posts.

These posts are not meant to be a “Steps A-Z” type of tutorial but rather a discussion of things we consider when scanning and processing newspaper pages. Please feel free to add to the discussion, ask questions, or leave comments.

Part I – Considering your source material

Some things to consider:

1. Newspapers are difficult to organize. Newspapers have long been the historic record of a locale or group of people, so there are often a lot of them (in the sense of a sheer quantity of actual pages). Also, newspapers constantly evolve; they change owners, editors, names, publication dates, publication frequency, etc. Collating historic newspapers and the information about them can be as much work as (or more work than) capturing the image of the pages themselves.

2. Many newspapers are old (the ones we can legally digitize anyway). This is an obvious but very important consideration. Newspapers were not meant to last centuries. Fading, tearing, foxing (i.e. stains) and ink bleed-through are some of the many problems encountered when dealing with the pages themselves. We’ll talk more about how we try to combat problems posed by the quality of the originals in a later post.

3. Most historic newspapers aren’t paper anymore. Unfortunately (or maybe fortunately, depending on your perspective), when scanning newspapers, we rarely deal with the original pages. Nearly all of the NDNP material and most of our pioneer newspaper collection is scanned from microfilm (one type of microform) – and often old microfilm. The quality of early microfilm (film created before microfilm standards and guidelines existed) varies wildly, because the original photograph of a newspaper page depended as much on the quality of the source material as on the equipment used and the skill of the photographer.

Another consideration is that the film we scan is often not even the original film but rather duplicate negatives (i.e. duplicate masters). So we run the risk of working with a bad reproduction from a perfectly fine master reel. Or the film itself, like those old negatives you store in a cardboard box in the basement, can be damaged or scratched. Needless to say, any problems with the original paper materials compound with each generation away from the original and each time the image is reformatted (i.e. migrated from one medium or format to another).

And we haven’t even started scanning yet. Scanning is another form of reformatting, and it tends to magnify the problems mentioned above even further. Scanning transparent images has its own challenges. Film scanners require a more sensitive sensor – one that can capture the tonal values of a very small transparent image. Often, because the film image is so small and holds so much detail, scanners have to operate at their maximum levels, which results in artifacts and noise.

4. Newspapers are large (larger than your average document). Another obvious statement, but important when you consider the size of the page relative to the size of its smallest details and text. The problem becomes this: the larger the surface area of the page, the further the camera head must be from the original; the further away the camera, the larger the reduction ratio (the size of the original in relation to the size of its image on film); and the larger the reduction ratio, the smaller the text on film and the harder it is to recapture the detail during scanning.
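
To put rough numbers on the size problem, here is a minimal sketch that treats the reduction ratio as how many times smaller the film image is than the original page. The page width (16 inches), the reduction ratio (16x), and the target resolution (400 pixels per inch relative to the original) are made-up illustrative values, not figures from our project, and the function names are hypothetical.

```python
def film_image_width_in(page_width_in: float, reduction_ratio: float) -> float:
    """Width of the page's image on film, in inches."""
    return page_width_in / reduction_ratio


def scanner_ppi_on_film(target_ppi_on_page: float, reduction_ratio: float) -> float:
    """Resolution the scanner must resolve on the film itself in order to
    deliver target_ppi_on_page relative to the original page."""
    return target_ppi_on_page * reduction_ratio


# Assumed, illustrative values only.
page_width_in = 16.0
reduction_ratio = 16.0
target_ppi = 400.0

film_width = film_image_width_in(page_width_in, reduction_ratio)
needed_ppi = scanner_ppi_on_film(target_ppi, reduction_ratio)
pixels_across = film_width * needed_ppi

print(f"Image on film is {film_width} in wide")
print(f"Scanner must resolve {needed_ppi} ppi on the film")
print(f"Resulting image is {pixels_across} pixels across")
```

The pixel count across the page is the same either way; what changes is how tightly the scanner has to pack those pixels into a much smaller piece of film, which is why a larger reduction ratio makes the detail harder to recapture.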

And did I mention that we haven’t even started scanning yet?!? We’ll talk about the scanning process in our next post in this series.

What’s New in Digital Collections: a list of the latest newspapers, books, maps

Wednesday, February 25th, 2009 Posted in Articles, Digital Collections, For Libraries, For the Public, News


Historical Newspapers in Washington

Classics in Washington History

County and Regional History

  • Fort Colvile, 1826-1871 by U.S. Dept. of the Interior, National Park Service. Contents: This pamphlet summarizes the history of Fort Colvile, founded by the Hudson’s Bay Company in 1825.

Military History

  • 600 days’ service by Harold H. Burton. Contents: A history of the 361st Infantry Regiment of the 91st Division of the United States Army.
  • Camp Lewis. Contents: An early historical record of the Ninety-First Division at Camp Lewis.
  • Official history of the Thirteenth Division. Contents: The history of the 13th Division, organized at Camp Lewis, American Lake, Washington on July 16, 1918. The book contains photographs of troops and descriptions of their duties.

Native Americans

Natural History

  • Natural history of Washington territory and Oregon by George Suckley. Contents: Preface, including a brief narrative of the explorations from 1853 to 1857 – Errata, with additions and corrections up to 1860 – pt. 1. Meteorology [not included (see Notes)] – pt. 2. Botanical report – pt. 3. Zoological report.
  • Climate of the state of Washington by W.N. Allen. Contents: “A careful and elaborate treatise on the climatic conditions, with reference to temperature, winds, rainfall and snowfall.”

Miscellaneous

Maps

Washington Historic Newspapers Now Available in PDF

Thursday, February 12th, 2009 Posted in Articles, Digital Collections, For Libraries, For the Public, News


Washington State’s Historical Newspapers as digitized by the Washington State Library are now available in PDF format. This means that teachers, students, and public library users no longer need to download the DjVu viewer in order to use the historical newspaper collection online. (DjVu format is still available for those who prefer it.) To view and/or search the newspaper collection, go to the Historic Newspapers in Washington site or search our Washington electronic newspaper holdings in the Washington State Library Catalog.

Historical newspapers from Washington State’s territorial period (1853-1889) are excellent primary source documents to support the new Social Studies CBA requirements. Teachers and students will particularly appreciate Moments in History, the pre-selected groups of articles on popular research topics. Classics in Washington History, a digital collection of rare, out-of-print books, is also available in full text for searching and viewing in PDF format.

Image metadata tools

Monday, February 9th, 2009 Posted in Articles, For Libraries, Technology and Resources


If you’ve ever labored under the wish that you could easily extract, edit, or read that wonderful data embedded in your image files, you’re not alone.

There are lots of reasons to work with the embedded metadata in your image or other media files. For instance, you may want to keep records about your collections or display the file size and pixel dimensions of images in your collection. Much of this data exists in tags embedded in image files. Some data formats you might see in your images include EXIF, XMP, and IPTC-IIM. Each format has its own set of attributes, and a lot of those attributes overlap.

Screen shot of output from both tools (Exiv2 left, ExifTool right)

Part of our work with the National Digital Newspaper Program is to deliver valid metadata and image files to the Library of Congress. I recently used two command line tools to read and edit the embedded metadata in these files.

You can read and write image metadata using Photoshop but for various reasons I needed a command-line tool (you can email me if you’re interested in why). Both ExifTool and Exiv2 met my criteria:

  • free
  • well documented
  • Unix and Windows OS compatible
  • read and write multiple metadata formats
  • command-line operable

Generally both were useful and required a little patience to install. Exiv2 was a breeze to install on a Windows machine but a bit trickier to build and install on Mac OS X (using the “Source” pkg.). The full ExifTool install requires Perl, but it supports more metadata formats and I found that the commands were generally easier to understand and run. There is also a sort of “lite” feature where you just drag and drop a file onto the program file and it reads the metadata (I needed to read and write so I didn’t try this, but it sounds interesting).

Conclusion: both got the job done and the differences might be negligible to most, but I preferred ExifTool for the reasons above.
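
For anyone curious what command-line use looks like from a script, here is a rough sketch in Python that asks ExifTool for a few tags in JSON form. It is not the script I used. It assumes ExifTool is installed and on your PATH; the file name page_0001.tif and the helper names are hypothetical, and the sample JSON is hand-written for illustration rather than captured from a real run.

```python
import json
import shutil
import subprocess


def exiftool_command(path):
    # -json requests machine-readable output; the tag options limit
    # the output to file size and pixel dimensions.
    return ["exiftool", "-json", "-FileSize", "-ImageWidth", "-ImageHeight", path]


def parse_exiftool_json(text):
    # ExifTool's JSON output is a list with one object per file;
    # this sketch only looks at the first file.
    return json.loads(text)[0]


def read_metadata(path):
    # Only call this when ExifTool is actually installed.
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool was not found on PATH")
    result = subprocess.run(
        exiftool_command(path), capture_output=True, text=True, check=True
    )
    return parse_exiftool_json(result.stdout)


# Hand-written sample (an assumption about typical output, not a real capture).
sample_output = (
    '[{"SourceFile": "page_0001.tif", "FileSize": "12 MB", '
    '"ImageWidth": 6400, "ImageHeight": 8800}]'
)
record = parse_exiftool_json(sample_output)
print(f'{record["SourceFile"]}: {record["ImageWidth"]} x {record["ImageHeight"]} pixels')
```

Keeping the parsing separate from the call to ExifTool makes it easy to try out the parsing logic on a machine where ExifTool isn't installed.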

Say good-bye Seattle PI?

Sunday, January 18th, 2009 Posted in Articles, For the Public


As part of the State Library’s participation in the National Digital Newspaper Program (NDNP), we’ve been researching Washington’s historic newspapers. In the process we take a chronological snapshot of the life of a paper. It is a bit like doing genealogy work. For instance, we track when the paper was born, if it changed names, if it was sold or merged with another paper, and when it died – so to speak.
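
One way to picture this kind of life history is as a short list of dated events per title. The sketch below is only an illustration: the class names, the event kinds, and the “Example Gazette” with its dates are all made up, not our actual tracking system or data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TitleEvent:
    year: int
    kind: str  # e.g. "founded", "renamed", "sold", "merged", "ceased"
    note: str = ""


@dataclass
class Newspaper:
    title: str
    events: List[TitleEvent] = field(default_factory=list)

    def timeline(self) -> List[TitleEvent]:
        # Events in chronological order, regardless of entry order.
        return sorted(self.events, key=lambda e: e.year)

    def lifespan(self) -> Tuple[Optional[int], Optional[int]]:
        # (year founded, year ceased); None where an event is not recorded.
        years = {e.kind: e.year for e in self.events}
        return years.get("founded"), years.get("ceased")


# Entirely hypothetical paper and dates.
paper = Newspaper(
    "Example Gazette",
    [
        TitleEvent(1901, "renamed", "became the Example Daily Gazette"),
        TitleEvent(1889, "founded"),
        TitleEvent(1915, "merged", "absorbed a neighboring weekly"),
        TitleEvent(1931, "ceased"),
    ],
)

for event in paper.timeline():
    print(event.year, event.kind, event.note)
```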

Seattle’s Post-Intelligencer building, as seen from Olympic Sculpture Park, Seattle Art Museum, Seattle, WA. Photographed by afagen. Available on Flickr.com. Used under the Creative Commons license.

When news broke of the Seattle Post-Intelligencer’s demise – er, sale – we at the Washington State Library, like most other subscribers, heard about it on the news and read about it in the PI. Besides the discussion it raises about newspapers and their business models today, it’s an interesting look at how a news organization breaks news about itself. Over at Time.com, PI reporters become the interviewees and talk about the slim chance of the paper’s sale. They’ve begun to write their paper’s obituary.

A similar situation is happening with other newspapers around the US. An NDNP colleague took note of the Detroit Free Press news and asked if this was the “middle of the end” of newspapers. Perhaps so, in their current format anyway. As someone who works to digitize and make accessible the pages of historic newspapers, I’m not surprised when I hear people express a desire to get information online and to be able to do research across various titles and media. Whatever the format or delivery, people still want well-written, fact-checked, fast and professional information.

This isn’t the first time Seattle has lost a major newspaper. It’s been over 60 years so many people may not know that before there were two main papers, the Seattle Times and the Post-Intelligencer, Seattle had three major newspapers. The Seattle Star was considered one of the three big papers in Seattle until 1947, when its battle to be profitable ended.

Next time you visit your local research library, look at the drawers of microfilm (yes, microfilm still exists and continues to be created – for now anyway) bearing the names of newspapers of the past. They speak volumes about the history of newspapers and bear witness that while newspapers come and go, their importance lives on…

OCLC picked to digitize historic Washington newspapers

Tuesday, December 9th, 2008 Posted in Articles, Digital Collections, For Libraries, For the Public, News



The Washington State Library (WSL) recently awarded Online Computer Library Center, Inc. (OCLC) the contract to digitize 100,000 newspaper images from microfilm. The contract is part of WSL’s National Digital Newspaper Program (NDNP) grant, recently awarded by the National Endowment for the Humanities (NEH), to digitize historic Washington newspapers.

OCLC has a long history of working with historic newspapers and currently maintains the database of the U.S. Newspaper Program (USNP), an initiative to microfilm newspapers published in the United States from the 18th century to the present and the foundation for NDNP.

Titles published within the 1880-1922 timeframe will be selected by the WA-NDNP selection committee, and the microfilm will be evaluated by WSL staff before it is sent to OCLC for digitization and conversion to full-text, searchable files. The output files will then be evaluated by WSL staff to ensure quality before a copy of the files is sent to the Library of Congress and published on the Chronicling America website.

OCLC has worked with other NDNP awardees and offers experience with and knowledge of the rigorous NDNP grant specifications. We look forward to working with OCLC on a project of this scope and importance.

WSL Receives NEH Newspaper Digitization Grant

Friday, November 21st, 2008 Posted in Articles, Digital Collections, For Libraries, For the Public, Grants and Funding, News


The Washington State Library (WSL) recently received a National Digital Newspaper Program (NDNP) grant. Washington’s NDNP grant is managed by the Research & Development (R&D) team within the State Library and will fund the digitization of 100,000 newspaper pages from microfilm.

NDNP is funded by the National Endowment for the Humanities (NEH) and is managed in part by the Library of Congress (LC). The NDNP is an initiative that began in 2005 and “builds on the foundation established by an earlier NEH initiative: the United States Newspaper Program (USNP).”

Library of Congress: Chronicling America site

LC hopes to eventually have all historic American newspapers available online and searchable from their Chronicling America website.  

To carry out Washington’s grant, we are working in partnership with the University of Washington Libraries and other academic and public libraries around the state. The main goal of the grant is to make the newspaper pages full-text searchable using OCR technology. Another important goal is to develop a sustainable and collaborative model for newspaper digitization in Washington State that can continue and grow around the state beyond this initial grant.

To find out more about Washington’s involvement in NDNP visit WA National Digital Newspaper Program Wiki or contact Laura Robinson, NDNP Coordinator at the Washington State Library.

Hello World!

Thursday, November 6th, 2008 Posted in Digital Collections, For the Public


Some of our real estate on the State Library home page

The Research & Development team at the Washington State Library is dedicated to building online access to the library’s historical books, maps, newspapers, and other special collections. We also work to create sustainable access to Washington State’s government publications and make it easier for citizens to find government information (FindIt Washington and FindIt Consumer).

We’re here to reach out to the history-loving public and the library community as a whole. With this blog, our goal is to highlight historically significant and amusing items from our collections and to demystify the digitization process for libraries through occasional discussions about project planning, material selection, and digital imaging resources.

Please engage in this conversation with us and stay tuned for more….

A new adventure

Thursday, August 28th, 2008 Posted in Articles, Digital Collections, For Libraries


Laura digitizing from a ladder at the Kettle Falls library.

I’m writing to bid farewell to all I’ve worked with in the process of setting up the Washington Rural Heritage initiative. It has been a fabulous year and I’m proud of all the work we’ve accomplished. I’m taking on a new challenge here at the Washington State Library: coordinating the effort to digitize Washington’s newspapers with the National Digital Newspaper Program grant we recently received.

I’m pleased to welcome two great folks who will be taking over the Washington Rural Heritage Project.

Please welcome Evan Robb, the new project manager. He is a recent graduate of the University of Washington Information School, and comes to us from Cedar Mill Community Library (Washington County, OR) where he has been working in adult reference and circulation since 2005.

Also on board is Kirsten Furl as the project technical specialist. Kirsten just finished up her master’s degree in Library Science from the University of North Texas, and has spent the last year working for the State Library on the Emma Smith DeVoe Papers collection in partnership with the Washington Women’s History Consortium.

In the Prius after a late spring snow in Colville, WA - good memories.

I’ve been working with both Kirsten and Evan over the past three weeks and feel confident the project will successfully continue. You may contact Evan or Kirsten with questions regarding Washington Rural Heritage.

I also encourage them each to write and tell you all more about their interests and plans for the initiative.

Thanks again to all who have helped make a successful start to this great initiative. A shout out to the folks at the libraries and museums in Ritzville, Enumclaw, Kettle Falls, Grandview, Whitman County, Ellensburg, and Columbia County, and on the Lopez, Orcas, San Juan, Lummi and Vashon Islands. I’ve made some unforgettable memories. Onward and upward!

Lopez, Stevens County and Columbia County add collections

Wednesday, August 27th, 2008 Posted in Articles, Digital Collections, For Libraries


We’ve been busy publishing the collections that have been submitted from the 2007-2008 grant awardees.

Lopez Island Heritage - Cora Standley Graham and Frances Guard, dressed as hunters

The Lopez Island Public Library has published the Lopez Island Heritage collection, which features photographs that serve as windows into an earlier time in the Lopez Island farming and fishing community.

The Libraries of Stevens County has published the Stevens County Heritage collection, which contains images from the 1880s to the present that depict the region’s people and communities.

The Columbia County Rural Library District has published the Columbia County Heritage collection, which features its first addition: the Early Columbia County School Photographs Collection, 51 original photographs of schoolhouses of the Columbia County region as well as class pictures.

And the Whitman County Library has made two new additions to its Whitman County Heritage collection: the Colfax Postcard Collection of Sandy Jackson and the Photo Collection of Rural Library Service in Whitman County.

We’ll be adding new collections to the site in the coming weeks. To view these and other recently added collections please visit Washington Rural Heritage or subscribe to this blog to keep track of the project.