Digital Collections

Fail, Fail, Fail, Success?

This past week I had the privilege of speaking at Access 2011 on a panel about failing, entitled "If you ain't failin', you ain't tryin'!" Amy Buckland moderated the panel, where we each took five minutes to tell a library tech fail story to encourage the audience to share their own failure stories. I think it went over great, and was cathartic to say the least.
 
I shared my story, and afterward I had that familiar feeling of "but, wait! I have even more to say!" There are so many lessons to be learned! So, I'll share the story again here, along with *all* of the lessons learned that I would have shared given the requisite time.
 
The story
 
Three years ago I was on an Access panel to speak about a project on which we had just hit a critical milestone. Ironically, I spoke at Access 2011 on a fail panel about that same project.
 
When I started at MPOW I was thrown to the wolves. We had received a Library and Archives Canada grant to digitize a large number of items from our collections and create a thematic, cutting-edge, web 2.0 website for it. Think tag clouds, a.k.a. the mullets of the internet (attribution c4lirc). Guess what? We had no infrastructure. No policies or procedures for digitization. No workflows. No metadata policies. No standards.
 
Given the short turnaround time of the grant - 1 year - and the grant requirements, a vendor-based drop-in solution would not cut it. So we did it all live!
 
We took a month to do some rapid prototyping and pulled off a pretty cool proof of concept with Drupal. It worked, and continued to work. It was the basis of our infrastructure moving forward, and at the time it was perfect!
 
In the background of working on the PW20C project, we had the foresight to begin creating an overall "repository" to pull content from - Digital Collections @ Mac. A Drupal 5 based repository infrastructure loosely based on best practices and standards at the time. A standard Dublin Core field set created with CCK for records with our own enhanced metadata fields for collections, a hacked-together OAI-PMH module and some really cool timeline visualizations using the SIMILE project.
 
Flash forward a year, and we have secured another LAC grant for Historical Perspectives on Canadian Publishing; another thematic digital collection site. Time crunch was in effect, and we pulled together another great project with probably 10x more case studies. My heart goes out to our project coordinator on this one for pulling all of those case studies together.
 
Flash forward another year, and we have what I believed was a pretty solid framework for digital collections. We have a main digital collections site, and two heavily customized thematic sites. We are also about 8 months into a major upgrade of our digital collections infrastructure: migrating everything from Drupal 5 to Drupal 6.
 
We upped our functional requirements. We wanted to hang with the cool kids: linked data, seamless JPEG2000 support, KML integration, and MediaRSS support. Yeah, MediaRSS.
 
Here is where the fail comes to fruition. Mistakes were made. Mistakes were made.
 
There is what I suppose could be called a koan in the Drupal community: "do it the Drupal way." Problem is, the Drupal way changes depending on who you are talking to, what time of day it is, and what version you are on. Heavily customizing Drupal themes is definitely not the Drupal way to do things. Those two thematic sites became an albatross, and have since been put out to pasture on their own virtual machines. (Note: Drupal 5 and PHP 5.3 really don't like each other.)
 
Lessons learned
 
Do *not* create custom thematic digital collections sites. To further clarify this, do not create custom thematic digital collections sites if you have limited personnel resources and actually have other *stuff* to do.
 
Do *not* create policy, procedures, workflows, best practices on the fly. However, given the title of the panel, sometimes you really need to fail to get those best practices down. So, how about: do *not* create policy, procedures, workflows, best practices on the fly for mission critical projects.
 
Your data is precious. Think a technology step ahead. For us, that means thinking past Drupal, and thinking past Fedora. We need to be able to move from platform to platform with ease. Thankfully we had the wherewithal to structure our data in such a way that it was pretty painless to extract.
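To make the portability point concrete, here is a minimal sketch (not what we actually ran) of dumping a record's Dublin Core fields to plain oai_dc XML, which pretty much any repository platform can ingest. The function name `record_to_oai_dc` and the sample record are hypothetical, made up purely for illustration; only the two namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for simple Dublin Core as used by OAI-PMH.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def record_to_oai_dc(record):
    """Serialize a dict of Dublin Core fields to an oai_dc XML string.

    `record` maps a DC element name (e.g. "title") to a list of values,
    since Dublin Core elements are repeatable.
    """
    # Use the conventional prefixes instead of auto-generated ns0/ns1.
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)

    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field, values in record.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record, invented for this example.
sample_record = {
    "title": ["Example trench map"],
    "identifier": ["example-0001"],
    "subject": ["World War I", "Maps"],
}

print(record_to_oai_dc(sample_record))
```

The point is less the code than the shape of the data: if every record can be flattened to something this boring, the platform underneath it becomes replaceable.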
 
Sometimes when you think you are *not* reinventing the wheel, you are in fact reinventing the wheel. Look to the community around you and get involved. Don't be afraid to ask stupid questions. Some of those questions that I thought were stupid and shouldn't be asked were in fact questions that were begging to be asked.
 
Also akin to reinventing the wheel: the hit-by-the-bus scenario. You build your really awesome-homegrown-fantastic-full-of-awesomeness thing, then you get hit by a bus, take another job, etc., and your place of work is so entirely screwed. At the very least, DOCUMENT, DOCUMENT, DOCUMENT.
 
The library tech community is pretty rad. We're all doing a lot of similar work that doesn't need to be replicated, or if it does, does not need to be completely reinvented. Again, engage, and interact.
 
Moving forward, making this fail into a success...
 
Over the past few months we have taken the time to sit down and write out our digitization/digital collections philosophy with stakeholders. What I thought might be a difficult and painful exercise turned out to be quite wonderful and we came up with a document that I am proud of. 
 
We also took the time to do a study of what digital preservation means at MPOW, and what we are capable of doing right now, what we can be doing in the near future, and what we should look to achieve in the long-term. This segued nicely into a functional requirements document for our repository infrastructure.
 
Right now, we are working on creating what I believe to be a solid infrastructure; heavily documented! Something we lacked all along, and what some of my colleagues know me for - that guy who walks around stamping his feet about infrastructure all the time. INFRASTRUCTURE. INFRASTRUCTURE. INFRASTRUCTURE.
 
Hopefully in a year or two I can come back to Access and present on a panel full of folks turning failures into success!


Node Import fails me | Hack the database!

Over on the dev version of our digital collections site we are working on lots of new features, one of them being JPEG2000 support for our World War I trench maps, World War I aerial photos, and World War II Italian topographical maps. Lightbox2 simply does not cut it when researchers would like to examine these wonderful images. Since we are pretty short-staffed here and don't have the wherewithal to whip up a Drupal module to do this "properly", we have come up with what I think is a pretty creative solution to adding the jp2 images to the records in Drupal.


Library day in the life - 5 - Day 4

Wow, day 4 already. This week seems to be going by fast. Worked from home for a bit this morning and then took the train in again. Email this week has been miraculously low. Probably from all the moves. The 5th floor is eerily empty, absolutely bizarre up there now.

Morning


MEETINGS ALL DAY - ?_?

Meetings all day. Will everything go better than expected, or will I rage?

Morning:

email - nope, I'm in meetings all day.

Got into work and discovered the contract worker for the giant 25,000 object digitization project started yesterday and nobody told me.

LOOK OF DISAPPROVAL

Checked in the worker and made sure that she was provided with proper documentation regarding file-naming convention, scanning requirements, and storage.


The ultimate question when working from home - When do I put pants on?

I should just write a script that pulls from all of these librarydayinthelife & #libday4 tags and have it write a post for me.

Morning:

Email. Surprisingly not that much for the morning. Hopefully the trend stays that way throughout the day.

Podcast Monday! TWiT, Spark, Quirks & Quarks. Anybody else find Calacanis really annoying when he is on TWiT?


Memes, Smemes, Email, SQL, and Galleries

Library Day in the Life - meme validation below

Theme of the day was SQL queries.

Morning:

Email, email, email, email.

Cleaned up some more raunchy code in an effort to make the theme migration to Drupal 6 less hectic. Lots of SQL queries to un-hack code, and write said code in a standardized fashion. Note to self, do not ever hire immature developers.


Concentration Camp Correspondences

After an entire year of scanning and metadata entry by a couple of amazing students, we have finished a portion of the World War, 1939-1945, German Concentration Camps and Prisons Collection. The entirety of the Concentration Camp Correspondences [http://digitalcollections.mcmaster.ca/concentration-camp-correspondence] - 1031 to be exact - are up online with full metadata records.


OMG! You Don't Need CONTENTdm!!!

So, I bet a lot of you are wondering what is up with my title? Well, I don’t plan on standing up here taking potshots at OCLC for 15 minutes, but I am sure some people in the crowd wouldn’t mind. Basically, the title should have had a very long sub-title, along the lines of Dr. Strangelove or: How I Learned to Stop Worrying and Embrace Open Source Software.

How many people here know what CONTENTdm is? Well, straight from the site, it "is a single software solution that handles the storage, management and delivery of your library’s digital collections to the Web."
