Digital Collections

New Stuffs on the Horizon...

Now that Historical Perspectives on Canadian Publishing is all finished up, we have time, albeit a small amount of time, to concentrate on other portions of the Digital Collections site, and other collections.

World War, 1939-1945, German Concentration Camps and Prisons Collection is nearly complete. Only a few boxes remain to be scanned. The next portion of the project is World War, 1939-1945, Jewish Underground Resistance Collection. This collection is predominantly from 1941-1944 and will contain 325 items. The finding aid for this collection is located here. These collections are two parts of a larger overall project, The Virtual Museum of the Holocaust and Resistance, which is to come much later. That site will be a separate site which pulls from the digital collections site.

Another project that will take a bit more time, but will be an excellent resource once complete, is the migration of the World War I Maps & Aerial Photography over to the digital collections site. This will also include approximately 900 more trench maps. The collection will retain the use of MrSID formatting, and the use of the LizardTech MrSID delivery server. But we will also be including JPEG2000 versions of each map & aerial photo, and those will be served up with a new djatoka image server that our team is working on implementing. Open source > Proprietary :D

The major background project that we will be working on is an upgrade from Drupal 5.x to Drupal 6.x, and cleaning up our code base. Moving to Drupal 6 will provide us with some major improvements, namely RDFa support, which I am the most excited about! We will also be working on a solution that will allow our catalogue to pull from our collections, thereby allowing users to search all of our collections at once from the library catalogue.

Keep an eye on the site. I will announce stuff once we have implemented it. Maybe there will be a site redesign in there too!


Historical Perspectives on Canadian Publishing - LAUNCH!!!

Oops, I was supposed to write about this last Thursday when we actually launched. Busy, busy week. So, without further ado - Historical Perspectives on Canadian Publishing!

So, here is the actual library news story. The site was a year in the making, and still has some content that will be added. An immense amount of hard work was put in by the team. I would like to give a special thanks for all the hard work put in by the project coordinator, Judy Donnelly, Bev Bayzat who handled the data management portion of the project, and Matt McCollow who took over the majority of development responsibilities on the site. Also, many thanks to all of our students who worked on the project - Belinda Hanson, Asiya Zareen, Sherry Sun, and Justina Chong.

Ok, now for the geeky stuff. There are 963 records in the site at the moment, covering approximately 3500 images, audio interviews, and a video tour of Coach House Press. Once again, this collection was built with Drupal and is a sub-site of the overall McMaster University Library Digital Collections site. Users can comment on records and case studies, and logged-in users can tag records.

During the development phase of the project, we decided to use the Faceted Search module a lot more than we had previously, most notably in the right-hand navigation. When users are in a record, a variety of fields are exposed to the Faceted Search module, thereby allowing them to discover other similar content based on the metadata from the record.

Finally, Matt put in some hard work during the last week of the project to get jPlayer working in the records which had audio, and Galleriffic for galleries in the "Themes."


Drupal/Digital Collections/Images

The Image API was recently released for Drupal 5, and changed a lot of things. During my updates (and redesign), I thought it might be a good idea to provide a sort of "how to" for images in Digital Collections. First things first, make sure you have CCK! Also of note, for each collection that I've set up with CCK, I created a new imagefield and provided it with a directory in the "files" directory. So for example, the Peace and War collection has its own image set, and the concentration camp correspondences have their own set.

Modules needed:

Eye Candy Modules:
ImageAPI Reflect

So, install and enable all of the modules listed above, then run update.php. After you update, browse to Administer > Site Configuration and select ImageAPI. From here you can choose to use GD or ImageMagick. Then select Image Toolkit under the same section to set your image quality.



Next you just set up your image field for your content type (Administer > Content Management), and select how you want to display it.


If you want to do the fancy eye candy like this:

Browse to your Imagecache settings (Administer > Site Building > Imagecache), then set up a scale and a reflection. If you have a white background, change the background RGB color to 255,255,255.



Drupal & Digital Collection Sites - 2

Ok, more Drupal stuff for the Digital Collections site. I'll yammer on about "must have" modules in this one. Hit the snooze button if you'd like. Oh, and this is in addition to the ones I mentioned in the previous post... Community Tags, Tagadelic, Service Links, Faceted Search, Views, Zen Theme, Quicktabs, and of course CCK.

First things first, CCK add-ons:

  • Audio Field - Defines audio field type for CCK content.
  • CCK Fieldgroup Tabs - Display CCK fieldgroups in tabs. Enables splitting up content onto tabs in both editing and display.
  • EMail - Defines an email field type for CCK.
  • File Field - Defines a file field type.
  • File Field Meta - Adds metadata gathering and storage to file field.
  • Image - Defines an image field type.
  • Media Field Display - Adds display options for media fields.
  • Node Reference - Defines a field type for referencing one node from another.
  • Number - Defines numeric field types.
  • Text - Defines simple text field types.
  • Video Field - Defines video field type for CCK content.
  • View field - Defines a field type that displays the contents of a view in a node.

Category - this and CCK are the two absolute must-have modules for doing digital collections. You *MUST* have some way to organize your collections - this is the module. Set up a container for each collection, then categories for your different "themes", sub-categories, and sub-sub-categories. It goes on and on, but this is the way to do it.

Devel - is a great development module for... you guessed it, DEVELOPMENT! Seriously, it is great. You get fed sooooooo much output. Arrays, beautiful arrays!

Feedback 2.0 - allows site visitors and users to report issues about this site. We use it internally for tracking issues with metadata creation, and technical issues during digital projects.

Node Import - this is perfect for those silly FileMaker Pro databases "some" people start out with. It requires quite a bit of patching to get it to work with the CCK image field, but it is well worth it! It ingests standard CSV and TSV files and allows you to map their columns to specified CCK fields.
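To make the column-to-field mapping idea concrete, here is a rough sketch. It is not Node Import or Drupal code, just a small stand-alone Python illustration. The column headers and the FIELD_MAP dictionary are made up for the example; the field_* names simply mirror the CCK field names used elsewhere in this post.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column headers to CCK field names.
FIELD_MAP = {
    "Title": "title",
    "Creator": "field_creator",
    "Source": "field_source",
    "Date": "field_date",
}


def map_rows(handle, delimiter=","):
    """Yield one dict per row, keyed by CCK field name instead of column header."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        # Columns that are not in FIELD_MAP are skipped rather than guessed at.
        yield {
            FIELD_MAP[col]: (value or "").strip()
            for col, value in row.items()
            if col in FIELD_MAP
        }


# A tiny in-memory CSV export standing in for a real file.
sample = io.StringIO(
    "Title,Creator,Source,Date,Notes\n"
    "Letter home,J. Smith,Box 4,1943,faded\n"
)
records = list(map_rows(sample))
```

For a TSV export the same function can be called with delimiter="\t". Unmapped columns (like "Notes" above) are dropped on purpose, so nothing lands in a field by accident.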

Nice Menus - works with the standard Drupal menu setup. It additionally provides the ability to have horizontal and vertical menus. It is mostly CSS, with a little JavaScript. The CSS is highly customizable, so you can easily make it work with your theme.

Front Page - allows you to go further than declaring a node in your settings.php file for the front page. If you are feeling all early 21st century and want to rock a Flash landing page, this is the way to do it. Or you can just use it to decide which users see which front page.

ImageCache - allows you to preprocess all your images on the site with ImageMagick.


Google Analytics - ...well, you probably know all about Google Analytics. If not, for god's sake sign up for an account! Very nice features: track certain users, certain content, restrict content, etc.

Printer-friendly pages - is perfect for all of those text heavy case studies. Destroy the environment and print it out to read it later.

Sections - is perfect for theming "sections" (collections) of the site. Each section can have an installed template, theme or style attached to it.

Pathauto - ...I'll just quote from the description, "The Pathauto module automatically generates path aliases for various kinds of content (nodes, categories, users) without requiring the user to manually specify the path alias. This allows you to get aliases like /category/my-node-title.html instead of /node/123. The aliases are based upon a "pattern" system which the administrator can control."
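As a rough illustration of what a "pattern" system like this does, here is a small sketch. It is not Pathauto's code and does not use Pathauto's real token names; the [collection] and [title] placeholders and both function names are invented for the example.

```python
import re
import unicodedata


def slugify(text, separator="-"):
    """Lowercase text, strip accents, and collapse other characters into one separator."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", separator, ascii_text.lower())
    return slug.strip(separator)


def build_alias(pattern, values):
    """Fill each [placeholder] in the pattern with the slugified value of that name."""
    def replace(match):
        return slugify(values[match.group(1)])

    return re.sub(r"\[([a-z_]+)\]", replace, pattern)


# Hypothetical pattern and record values.
alias = build_alias(
    "[collection]/[title]",
    {"collection": "Peace and War", "title": "Letter home, 1943"},
)
```

The point of the pattern is that the administrator decides the URL shape once, and every record then gets a readable alias without anybody typing one in.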

Automatic Node Titles - is perfect for all those thousands of records that you don't want to scribe titles for. You can pull from fields and create a title. So, something like this: Creator, source, date - which is done with a PHP script:

// The [tokens] are expected to be replaced with field values before this PHP is evaluated.
$token = '[field_creator-formatted]';
if (empty($token)) {
  // No creator on the record: build the title from source and date only.
  return htmlspecialchars_decode("[field_source-formatted], [field_date-formatted]", ENT_QUOTES);
}
else {
  // Creator present: lead the title with it.
  return htmlspecialchars_decode("[field_creator-formatted], [field_source-formatted], [field_date-formatted]", ENT_QUOTES);
}


Well, that is it. Next one will be on the custom/customized modules. Don't worry, I'm not going to beat a dead horse and talk about the OAI module again.

Drupal & Digital Collection Sites - 1

I have written about Drupal & the Digital Collections site a few times now, but haven't really explained how to make a digital collections site out of Drupal. So, without further ado...

What are the necessities of a digital collections site?

What are some additional features that have become necessary?

  • Tagging
  • Social Bookmarking
  • Faceted Searching
  • Visually rich environment
  • Profiles, internal site bookmarking
  • Contact forms, Image requests, Questions
  • Commenting
  • Content Recommendation

So how do you do all of this with Drupal - sans JPEG2000 support (working on that now)? Well, if you are familiar with Drupal, you should know that it is an open source, modular content management system with an amazing support & development community. A standard out of the box Drupal installation will not yield a digital collections site - additional modules are absolutely necessary. Time, effort, and some coding will have to be put in, but it is well worth it. The key to all of it is the Content Construction Kit (CCK). Briefly, CCK allows you to create your own fields for a node. So, here is where we get the ability to have all your standard Dublin Core fields, and any other unique metadata a collection will need to be able to present. What I have done with my site is set up a Content Type for each collection. Each content type shares the standard Dublin Core fields (very helpful for massaging an OAI module for digital collections out of an available OAI module), then they have their own unique additional metadata. For example, the World War II German Concentration Camp and Prison Camp Collection has metadata fields for Prison, Sub-Prison, Prison Block, etc.

I have written about the OAI module a couple of times, but essentially what I did is take the OAI-PMH Module, which is an interface for the Bibliography Module, and rework it so it interfaces with the CCK fields I created for the standard Dublin Core fields. I have not had the time to generalize it (I hope to in the future if time permits!), so it is hard coded to my collections right now.
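For anyone wondering what "interfacing with the Dublin Core fields" boils down to, here is a rough sketch of the idea. It is not the module's code (that is PHP inside Drupal); it is a stand-alone Python illustration that turns a record's fields into the oai_dc metadata block of an OAI-PMH record. The CCK_TO_DC mapping and the function name are assumptions for the example; the two namespace URIs are the standard oai_dc and Dublin Core ones. It only builds the metadata payload, not a complete OAI-PMH response.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from CCK field names to Dublin Core element names.
CCK_TO_DC = {
    "field_creator": "creator",
    "field_source": "source",
    "field_date": "date",
}


def to_oai_dc(title, fields):
    """Return an oai_dc XML string for one record; empty fields are left out."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("{%s}dc" % OAI_DC_NS)
    ET.SubElement(root, "{%s}title" % DC_NS).text = title
    for field_name, dc_name in CCK_TO_DC.items():
        value = fields.get(field_name)
        if value:
            ET.SubElement(root, "{%s}%s" % (DC_NS, dc_name)).text = value
    return ET.tostring(root, encoding="unicode")


# A made-up record; field_source is deliberately empty.
xml_text = to_oai_dc(
    "Letter home",
    {"field_creator": "J. Smith", "field_source": "", "field_date": "1943"},
)
```

Because every content type shares the same Dublin Core fields, one mapping like this can serve all collections, which is why sharing those fields makes the OAI work so much easier.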

Searching is a built-in feature of Drupal. Drupal does a pretty good job of creating a search index for itself, and it offers advanced searching features as well. With content types for each collection, users can limit their search to a specific collection or search site wide.

Browsing a collection can be done by setting up categories and containers for a collection, then placing each record under a specific collection when creating the records, or by doing a massive MySQL update query if you have imported a number of records to start with. Also, for custom browsable options I have used the Views module to create views for specific metadata fields, and limited them to a collection. Finally, the Faceted Search module allows you to list all of the fields you would like exposed to faceted search, thereby allowing a user to browse by a variety of field types.
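If faceted browsing is new to you, the underlying idea is simple: count the distinct values of an exposed field, show them as links, and filter the records when one is picked. Here is a minimal sketch of that idea. It is not the Faceted Search module's code, and the sample records and function names are made up.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records carry each value of the given field."""
    return Counter(r[field] for r in records if r.get(field))


def filter_by(records, field, value):
    """Keep only the records whose field matches the chosen facet value."""
    return [r for r in records if r.get(field) == value]


# Made-up records standing in for nodes with CCK fields.
records = [
    {"title": "Letter 1", "field_creator": "J. Smith", "field_date": "1943"},
    {"title": "Letter 2", "field_creator": "J. Smith", "field_date": "1944"},
    {"title": "Poster 1", "field_creator": "A. Jones", "field_date": "1943"},
    {"title": "Untitled", "field_creator": "", "field_date": "1943"},
]

creators = facet_counts(records, "field_creator")
from_1943 = filter_by(records, "field_date", "1943")
```

Records with an empty value simply do not show up in that facet, which keeps the browse list tidy.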

Not too much to say about JPEG2000 support right now. There are two possible scenarios that I am brainstorming. The first one is LizardTech. Before I started here, the Library had purchased a LizardTech Express Server license in order to display the MrSID images for the World War I trench maps. The new version of the LizardTech server supports JPEG2000, and has an API that I should be able to get Drupal to work with - fingers crossed! The other option is the aDORe djatoka open source JPEG2000 server. I planned on working on this at the Access 2008 Hackfest, but got distracted with SOPAC and Evergreen.

So, now for the rest - additional features...

Tagging is done with the Community Tags module, and tag clouds are created with Tagadelic.

Social Bookmarking is done with the Service Links module.

Faceted Searching is done with the Faceted Search module.

Visually rich environment is done with a variety of modules and custom template coding. Modules that assist in making this possible include: Views (and many Views sub-modules), Zen Theme, jQuery, Highslide, and Tabs & Quicktabs.

Profiles, internal site bookmarking... user accounts are a standard feature of any content management system. With Drupal we used a custom view and a user hook to allow registered users to bookmark any record to their account.

Contact forms, Image requests, Questions are done with the Contact Form module. Here users can ask questions about records, request images, and report any problems with the site or records.

Commenting is another built-in feature of Drupal. Comments are allowed on every record on the site. Unregistered/Anonymous users have to deal with a CAPTCHA, whereas registered users do not.

Content Recommendation is done with the Content Recommendation Engine (CRE). This module interfaces with a number of other modules. The main one that I utilized is the Voting API. The Voting API combined with the CRE allows for a Digg-like feature on each record. Each record has a Curate It! link, and items that have been "curated" are then featured on the Items Curated Page. Drupal also has a popular content feature that I utilize.

So, that is pretty much it for the bullet points listed above. I will have another post or two about Drupal in digital collections: one featuring all of the modules that I take advantage of, and another covering any questions anybody has.

PW20C Launch & Local Press Coverage

Here is the library press release:

The William Ready Division of Archives and Research Collections at McMaster University Library is launching the latest in a series of digital initiatives aimed at bringing its unique collections to a wider, online audience. The new site, Peace and War in the 20th Century, has been developed with the assistance of almost $100,000 in funding from the Department of Canadian Heritage, through the Canadian Memory Fund.

This website aims to create an immersive virtual environment which invites users to explore two of the most central and formative aspects of twentieth-century culture: peace and war. Foregrounding McMaster’s extensive, unique and world-renowned archival collections, incorporating advice from the best subject experts in the field and utilizing state of the art, robust digital technology, the site tells the compelling story of how these two contrary impulses have shaped our country and our world.

Organized into compact thematic modules, constructed to appeal to a wide range of users, content presented in digital form ranges from wrenchingly personal diaries, letters and photographs to the powerful public propaganda of recruiting posters, peace bulletins, and popular songs. The site includes some 3000 database entries and almost 50 individual case studies as well as audio and video segments, maps and an animation of a First World War trench raid, recreated from original archival documents.

The site is already winning praise. Dr. Ken Cruikshank, Chair of the Department of History says: “what makes this website exciting to me is that it introduces students to the exceptional archival resources available to them in their own backyard, at McMaster University Library. The online sources are an exciting addition to research materials currently available on the Internet, and will motivate students interested in studying efforts to make peace, or the social, political and cultural impact of war.”

The project is the first developed by McMaster University Library in collaboration with two community partners, Local History and Archives at Hamilton Public Library and Canadian Warplane Heritage Museum.

A launch event to celebrate the project is being held on Monday, September 29th from 10:30 am to 11:30 am in Convocation Hall.

If you are interested in attending the launch or for more information, please contact Kathy Garay.

And the Hamilton Spectator story - History lessons online: Major collections contribute to Peace and War website

Got another grant!!! History of Canadian Publishing

Just as we put the finishing touches on the Library and Archives Canada funded Peace & War in the 20th Century project, we received word from the granting agency that our newest grant application has been accepted. We have been awarded almost $100,000 to develop a state-of-the-art, interactive website on the history of Canadian publishing. The project will last for a year (the same amount of time as the PW20C project), and will focus on the history of Canadian publishing houses, people in publishing, authorship, and aspects unique to Canadian culture.

The William Ready Division of Archives and Research Collections houses one of the most prestigious collections on the subject of Canadian publishing. We will also be collaborating with the Thomas Fisher Rare Book Library at the University of Toronto and with Queen's University, both of which also hold extensive archives on the subject.


Here is the link to the McMaster Daily News story:

Judy Donnelly, project specialist, Rick Stapleton, archivist librarian, Nick Ruest, digital strategies librarian, and Carl Spadoni, research collections librarian, pose with some of the artifacts that will be available on a website about the history of Canadian publishing. Photo by Susan Bubak.

Kirtas Launch Event Photos & Local Press Coverage

We've added pictures from our Mass Digitization and Publishing Launch to the blog, including pictures of our brand new Kirtas scanner. The event was a huge success, drawing over 100 guests and press coverage. The Hamilton Spectator interviewed Catherine Baird, our Marketing, Communications, and Outreach Librarian, and me about the project. The article can be viewed here:

released into the wild

Ok, the Digital Collections website is ready for beta testing. Registered users can comment, vote on comments, and tag records - an updated version of the "bookbag"* will be added soon. Collections with content include: Peace and War in the 20th Century, Russell Library, and World War II Concentration Camp Correspondences. At this time, AICT and the Kirtas Book Collection are outlines for content to be added later.

For the technical nerds! The site runs on Drupal, and takes advantage of the CCK Module. Each collection has its own content type, allowing it to expose its own unique metadata. All of the collections share Dublin Core fields, which, combined with a modified version of the OAI2 module, provide OAI2 compliance. As of right now, there are approximately 165,000 nodes - with the great majority of those records being an experimental version of BRACERS (more on this some other time).

*the bookbag is a feature that allows registered users to bookmark records

Institutional Repository Update

ummmmm, I don't know how to make a post about the Institutional Repository funny or witty, so I'll go a little corporate sounding. bleh. I have been updating the repository quite a bit since the beginning of this year. We were a part of the first batch of upgrades from Bepress last month. We added categories and series for most of the departments on campus so faculty can begin submitting scholarly output. Three additional journals that reside at McMaster are also in the process of being set up on the IR. Global Labor Journal is a completely open access journal that will launch on our IR. Energy Studies Review will follow an access model similar to the Russell Journal, where the current four years are subscription based and the back catalogue is open access. 18th Century Fiction is in the works; the process of putting the journal in will begin in July. Finally, with the help of colleagues, I formed an Institutional Repository Steering Committee. The committee will:

1. Form a "collection" plan: a strategy for getting specific types of materials into the DC (subjects/formats: McMaster faculty output, university archives), and explore the possibility of creating a "subject" archive where none currently exists to serve a broad scholarly community
2. Build an infrastructure whereby scholars can easily contribute materials
3. Develop a communication plan for raising awareness about the IR on campus
4. Provide regular updates to stakeholders on progress (once/term)