Right! That hackfest report I should have given...

When I was at Islandora Camp trying to wrap my head around all things Islandora and Fedora, I was thinking ahead about a possible project in archives and research collections - migrating our collection/fonds descriptions and finding aids over to ICA AtoM.
ICA AtoM does some pretty cool stuff in terms of access to collection/fonds descriptions, integrates very nicely with Archivematica for accessioning born-digital objects, and associates digital representations of item-level objects with their respective collection/fonds. My greedy little brain wanted more! I wanted ICA AtoM to be able to pull in Fedora objects automatically and associate them with their respective collection/fonds. So, this is the hackfest proposal I submitted.
So what happened? What'd we end up doing?
The amazing Peter Van Garderen made absolutely sure Artefactual Systems staff was highly represented at hackfest, and I had two amazing people from Artefactual trying to parse my sleep-deprived-scatter-brained-state reasoning/logic behind what I wanted to do. David Juhasz and Jesús García Crespo, you rock!
We spent the first hour or so working through the Fedora REST API documentation looking for the best way to approach the "problem." After about an hour or so of working through a few conditional queries that would need to be strung together, Jesús jumped in and said, "Why aren't we using SWORD for this!?" Good question!
ICA AtoM can speak SWORD, and Fedora can speak SWORD so long as you can get the module working. As things at hackfest generally go for me, it failed. I could not for the life of me get the module to build. I spent some time going through build.xml, and ant and I just weren't going to be friends that day.
Strike one - don't code conditional Fedora REST API queries - not sharable or scalable
Strike two - I couldn't get the SWORD module to build!
Strike three - ???
While brainstorming for other solutions to our "problem", David was looking for ways I could share records from our repository. Duh! OAI-PMH! ICA AtoM can harvest OAI. If we can map OAI sets to ICA AtoM collections/fonds, and set records to individual items in a collection/fonds, we're set. Oh my, another use case of OAI-PMH! Yay!
Did we succeed? Not actually. Turns out the OAI-PMH harvesting code wasn't quite up to snuff at the time, and David, bless his heart, worked on trying to get it up to par before the end of the day. We were not able to pull together a working version, but the framework is there. It was there all along! (Ed, yes we could have and totally should have used atom :P )
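For the curious, here is a minimal sketch of the mapping idea: take an OAI-PMH ListRecords response and group record identifiers by their setSpec, so each set could later be matched to a collection/fonds. This is not the code we worked on at hackfest. The sample response, the identifiers, the set names, and the function name group_records_by_set are all made up for illustration; only the OAI-PMH namespace and element names come from the protocol itself.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical ListRecords response (identifiers and sets are invented).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item-1</identifier>
        <setSpec>fonds-a</setSpec>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:item-2</identifier>
        <setSpec>fonds-a</setSpec>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:item-3</identifier>
        <setSpec>fonds-b</setSpec>
      </header>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:item-4</identifier>
        <setSpec>fonds-b</setSpec>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def group_records_by_set(xml_text):
    """Return {setSpec: [record identifiers]} for non-deleted records."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        # Deleted records carry status="deleted" on the header; skip them.
        if header.get("status") == "deleted":
            continue
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        # A record may belong to more than one set.
        for set_spec in header.iterfind("oai:setSpec", OAI_NS):
            grouped[set_spec.text].append(identifier)
    return dict(grouped)


RECORDS_BY_SET = group_records_by_set(SAMPLE_RESPONSE)
```

Each key in RECORDS_BY_SET would stand in for one collection/fonds, and each identifier under it for an item-level record. A real harvester would also need to fetch the response over HTTP and follow resumption tokens, which this sketch leaves out.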

Fail, Fail, Fail, Success?

This past week I had the privilege of speaking on a panel at Access 2011 about failing entitled, "If you ain't failin', you ain't tryin'!" Amy Buckland moderated the panel where we each took five minutes to tell a library tech fail story to encourage the audience to share their failure stories. I think it went over great, and was cathartic to say the least.
I shared my story, and afterward I had that familiar feeling of "but, wait! I have even more to say!" There are so many lessons to be learned! So, I'll share the story again here and *all* of the lessons learned that, given the requisite time, I would have shared.
The story
Three years ago I was on an Access panel presentation to speak about a project we had just hit a critical milestone on. Ironically, I spoke at Access 2011 on a fail panel about that same project.
When I started at MPOW I was thrown to the wolves. We had received a Library and Archives Canada grant to digitize a large number of items from our collections and create a thematic, cutting-edge, web 2.0 website for it. Think tag clouds a.k.a. the mullets of the internet (attribution c4lirc). Guess what? We had no infrastructure. No policies or procedures for digitization. No workflows. No metadata policies. No standards.
Given the short turnaround time of the grant - 1 year - and the grant requirements, a vendor-based drop-in solution would not cut it. So we did it all live!
We took a month to do some rapid prototyping and pulled off a pretty cool proof of concept with Drupal. It worked, and continued to work. It was the basis of our infrastructure moving forward, and at the time it was perfect!
In the background of working on the PW20C project, we had the foresight to begin creating an overall "repository" to pull content from - Digital Collections @ Mac. A Drupal 5 based repository infrastructure loosely based on best practices and standards at the time. A standard Dublin Core field set created with CCK for records with our own enhanced metadata fields for collections, a hacked-together OAI-PMH module and some really cool timeline visualizations using the SIMILE project.
Flash forward a year, and we have secured another LAC grant for Historical Perspectives on Canadian Publishing; another thematic digital collection site. Time crunch was in effect, and we pulled together another great project with probably 10x more case studies. My heart goes out to our project coordinator on this one for pulling all of those case studies together.
Flash forward another year, and we have what I believed was a pretty solid framework for digital collections. We have a main digital collections site, and two heavily customized thematic sites. We are also about 8 months into a major upgrade of our digital collections infrastructure; migrating everything from Drupal 5 to Drupal 6.
We upped our functional requirements. We wanted to hang with the cool kids: linked data, seamless JPEG2000 support, KML integration, and MediaRSS support. Yeah, MediaRSS.
Here is where the fail comes to fruition. Mistakes were made. Mistakes were made.
There is what I suppose could be called a koan in the Drupal community, "do it the Drupal way." Problem is the Drupal way changes depending on who you are talking to, what time of day it is, and what version you are on. Heavily customizing Drupal themes is definitely not the Drupal way to do things. Those two thematic sites became an albatross, and have since been put out to pasture on their own virtual machines. (Note: Drupal 5 and PHP 5.3 really don't like each other.)
Lessons learned
Do *not* create custom thematic digital collections sites. To further clarify this, do not create custom thematic digital collections sites if you have limited personnel resources and actually have other *stuff* to do.
Do *not* create policy, procedures, workflows, best practices on the fly. However, given the title of the panel, sometimes you really need to fail to get those best practices down. So, how about, Do *not* create policy, procedures, workflows, best practices on the fly for mission critical projects.
Your data is your precious. Think one technology ahead. For us: think past Drupal, think past Fedora. We need to be able to move from platform to platform with ease. Thankfully we had the wherewithal to structure our data in such a way that it was pretty painless to extract.
Sometimes when you think you are *not* reinventing the wheel, you are in fact reinventing the wheel. Look to the community around you and get involved. Don't be afraid to ask stupid questions. Some of those questions that I thought were stupid and shouldn't be asked were in fact questions that were begging to be asked.
Also akin to reinventing the wheel, the hit-by-the-bus scenario. You build your really awesome-homegrown-fantastic-full-of-awesomeness thing, then you get hit by a bus, take another job, etc., and your place of work is so entirely screwed. At the very least, DOCUMENT, DOCUMENT, DOCUMENT.
The library tech community is pretty rad. We're all doing a lot of similar work that doesn't need to be replicated, or if it does, does not need to be completely reinvented. Again, engage, and interact.
Moving forward, making this fail into a success...
Over the past few months we have taken the time to sit down and write out our digitization/digital collections philosophy with stakeholders. What I thought might be a difficult and painful exercise turned out to be quite wonderful and we came up with a document that I am proud of. 
We also took the time to do a study of what digital preservation means at MPOW, and what we are capable of doing right now, what we can be doing in the near future, and what we should look to achieve in the long-term. This segued nicely into a functional requirements document for our repository infrastructure.
Right now, we are working on creating what I believe to be a solid infrastructure; heavily documented! Something we lacked all along, and what some of my colleagues know me for - that guy who walks around stamping his feet about infrastructure all the time. INFRASTRUCTURE. INFRASTRUCTURE. INFRASTRUCTURE.
Hopefully in a year or two I can come back to Access and present on a panel full of folks turning failures into success!

TEDx LibrariansTO - Discussion group

On Saturday (June 25, 2011) I had the opportunity to attend TEDx LibrariansTO and was asked to facilitate an audience discussion group. I tried my best to guide our group through 9 discussion questions. We made it through 6. But! I think we were the only group that actually went through the questions. The group discussion was *really* great, and I figured I'd share the notes I took. So without further ado, here are my notes.

  1. What means should librarians choose to encourage their institutions to embrace change?

    * New formats are definitely an opportunity for change elements
    * Open Access -> change the publishing model
    * "Buy it and we'll have to use and adapt it!" [Not sure how I feel about this, but is an opportunity for change I suppose.]
    * "Just do it!" (Quoting Amy Buckland's talk)
    * Challenging and debating each other - we need to do this more, and stop being so stereotypically nice.
    * Embrace dissent
    * Challenge vendors, and take back/refine argument that access to information is a public good (curb external forces from defining libraries)
    * Advocate and share these stories (and events) to people outside of libraryland - stop talking to ourselves. [This aligns well with John Dupuis' Stealth librarianship manifesto]


  2. What are the similarities or characteristics of thought leaders that you know? Tell us about the attributes that your ideal thought leader would have.

    * The ability to extract and extrapolate common/shared ideas among people and push and drive other people [CHARISMA!]
    * Good thinker/motivator
    * Obsession with higher concepts
    * Dedicated, articulate, think a few steps ahead, and inspire
    * Good cat herder
    * Follow through with action
    * Natural curiosity
    * Radical collaboration combined with finding champions (relationship building)
    * Be able to sell an idea/project inside and outside the library (sell the benefit)
    * Leading edge *not* bleeding edge
    * Savvy -> ability to manage the opposition
    * Do the extra work
    * Thought leader for patrons and the community as well as in libraryland

  3. How can experience of failure contribute to making an effective thought leader?

    * You *must* fail before you succeed
    * Ability to learn from mistakes
    * Ability to recognize mistakes
    * Thinker who takes an unpopular view
    * Acknowledgement that failures can be as good as successes
    * Need a culture that will embrace failure
    * Analyze other failures to show it can succeed somewhere else
    * Build relationships
    * Learn to talk about mistakes openly

  4. What venues are available to us to constructively criticize each other's ideas?

    * Yelling at vendors
    * Learning how to argue with each other
    * Stop being so polite with each other
    * CLA great debate is a good idea, we should do this elsewhere and more often

  5. What should we expect/demand of our thought leaders?

    * Honesty with provocative statements
    * Sense of humour
    * Openmindedness
    * Dedication to life long learning
    * Push people out of the way
    * Thought armada
    * Don't burn out

  6. We can't all be thought leaders all the time. Often, by necessity we are followers. So, what does it mean to follow a thought leader well?

    * Make sure they don't burn out
    * Help out
    * Step up when things can be delegated
    * Thought collaborators

  7. Name one thing we could do right now in order to be perceived as thought leaders outside the profession.
  8. How do we recognize a thought leader?
  9. Are the loudest voices online actually representative of important thought currents?

Overall, there was some dissent in the group over the term "thought leader". Not everybody agreed with the term, and felt like it encouraged centralized power. There was an overwhelming outcry against centralized power. Also, our group came up with what I think is an excellent critique to the whole thought leader idea. I can't remember who exactly said it, but whoever it was, it was great! "The thought is the leader, we are the thought supporters." If anything, it would be another great discussion piece.
Should have started off with this, but a *BIG* thank you to Shelly and Fiacre for putting this together. It was a great day with great people! Also, a big thank you to all the speakers!

This profession is worth fighting for

I may work at an institution with arguably the worst morale among librarians in Canada, but I love my job. I love the people I work with. I love my profession. 
There are people *looks left, looks right* actively attempting to de-professionalize librarianship. Yes, the times are a changing, but that does not mean we are not effective anymore and are useless. Technology has radically changed medicine - there are still doctors, right? Unless I am missing something. If anything, we have illustrated an amazing ability to adapt to technical change and remain relevant. This isn't a time when we should be losing librarians, and divvying up work among the scattered remains. It is a time when we should be retooling and reinforcing our ranks. To quote Karen Schneider: "In the end, what matters, and what we are about, are the ancient truths of librarianship: organizing, managing, making available, preserving, and celebrating the word in all of its manifestations; helping our users build skill sets the fundamentals of which (if not the ephemeral details) will last a lifetime; and celebrating and defending the right to read, however that word is interpreted. This is what we do. This is who we are. This makes us librarians."

 I dedicate each day to the profession. Will you?

Pile on IE!

IE is a $@#*) *$)#)@% non-standards-compliant bastard of a browser. This is news to absolutely nobody who has had to develop a website or work on a website with standardized behaviour across browsers. Some say IE9 is a great move forward, but Microsoft has lost any benefit of the doubt in my book. Anyway, my buddy Dan and I were chatting about this today and during the discussion I visualized Renton's rant about being colonized by the English from Trainspotting being adapted to the situation. So, in a horribly hamfisted attempt, here it is:

"It's SHITE coding for IE! It's the lowest of the low, the scum of the fucking earth, the most wretched, miserable, servile, pathetic trash that was ever shat into civilization. Some people hate Microsoft, I don't. They're just wankers. We, on the other hand, are colonized by wankers. We can't even find a decent browser to be colonized by. We are ruled by effete arseholes. It's a shite state of affairs to be in..."

Podcasts for the nerd librarian

Every so often people ask me about what podcasts I listen to, and every so often I start listening to something new and get terribly excited about it and have to tell my colleagues all about it. Also, this past semester I taught my first course. It was an LIS course entitled, "Introduction to Technology." Instead of the normal plethora of weekly readings, I toned the readings down a little bit, and added a few podcasts as "suggested listening" for a learning experiment. It went over well, so I figured I'd post something about my favourites, why I like them, and why I think they are relevant. I'm sticking to podcasts that tie into my profession - FilmJunk, Quirks and Quarks, and Linux Outlaws, not this time. If you have any recommendations, please share!

Digital Campus - "A biweekly discussion of how digital media and technology are affecting learning, teaching, and scholarship at colleges, universities, libraries, and museums." I stumbled across this podcast late last spring, and have been an eager listener every other week. The podcast is in a way an extension of the work being done at the Center for History and New Media at George Mason University and moreover, it is an excellent source of information regarding projects and trends in the digital humanities. At MPOW we are currently in the infant stages of creating a digital scholarship centre that is going to be integrated with our library, so the podcast is a great way to stay fresh with what is going on in digital humanities.
The Changelog - is a weekly podcast that covers new open source software projects. I constantly find myself a little lost with the many different open source projects and trends. This podcast allows me to listen to an episode on a specific project so I can learn more about it. Sometimes it is exactly what I need, other times it is over my head, but hey, that happens.
This Week in Law - is a weekly podcast on the TWiT network hosted by Denise Howell and a panel which covers new issues in technology law. Before I entered the library profession, I had intended on becoming a lawyer. Well, we all see how that went. But, I never lost interest in law, or issues that have always interested me. This podcast, along with the next, allows me to keep up and stay fresh.
Free as in Freedom (previously, The Software Freedom Law Center podcast) - is a biweekly podcast featuring Bradley Kuhn and Karen Sandler covering legal issues and topics in the open source and free software world. This podcast again appeals to the legal issues side of me, but more so in so far that it deals mainly with legal issues around open source and free software. Dan Scott tipped me off to the podcast one day, and I have been a regular listener ever since.
The Command Line - is a weekly podcast consisting of a news episode, what the host Thomas Gideon calls a rant episode which is more or less an essay on a given topic, and the occasional interview. 
JISC - is an intermittent podcast put out by the Joint Information Systems Committee (JISC) touching on new issues and trends in library and information science. 
CBC Spark - is a combination of a weekly podcast, radio show and blog which centres around technology and culture, and is hosted by Nora Young. The show occasionally crosses over into information science territory, but I listen to it more so for the interesting topics each week.
TVO Search Engine - is a weekly podcast hosted by Jesse Brown which explores the Internet and technology's impact on culture and politics. This is one of my absolute favourite podcasts, and an interesting take on journalism in my opinion. 

The making of a god... THE GOD OF CAKE!

Node Import fails me | Hack the database!

Over on the dev version of our digital collections site we are working on lots of new features. One of them is JPEG2000 support for our World War I trench maps, World War I aerial photos, and World War II Italian topographical maps. Lightbox2 simply does not cut it when researchers would like to examine these wonderful images. Being that we are pretty short staffed here and don't have the wherewithal to whip up a Drupal module to do this "properly", we have come up with what I think is a pretty creative solution to adding the jp2 images to the records in Drupal.

First off we set up Adore-Djatoka and started converting all of our tifs to jp2 with the compress.sh utility that comes with Djatoka. We have all of the aerial photos and topographical maps converted. However, we are running into some heap space problems with the trench maps. I am assuming it is because of their sheer size - 400MB-2.5GB. Heap space errors aside, we set up the resolver and the IIPImage Viewer - sample.

Next we set up a CCK IFrame field for each content type and prepped our imports. This is where we ran into a bit of trouble - Node Import does not support CCK IFrame. Problem - time to get creative! We decided to import the records without the jp2 field and then update them in the database, which in turn presented us with a couple more problems... err, how about we say quirks. The update was fairly straightforward, just the following two MySQL queries:

UPDATE content_field_jp2 c JOIN content_field_identifier d ON (c.nid = d.nid) JOIN node n ON (c.nid = n.nid) SET field_jp2_url = CONCAT('http://digitalcollections.mcmaster.ca:8080/adore-djatoka/iipi/viewer.htm...', d.field_identifier_value) WHERE n.type = 'wwiap';

UPDATE content_field_jp2 c JOIN content_field_identifier d ON (c.nid = d.nid) SET field_jp2_attributes = 'a:4:{s:5:"width";s:4:"100%";s:6:"height";s:3:"768";s:11:"frameborder";s:1:"0";s:9:"scrolling";s:4:"auto";}';

Once the above queries were run, the nodes were not technically updated, because the hooks that actually update them had not been invoked. Or as I like to call them, Drupal's rules, regulations and procedures. Basically we had to batch re-save all of them. Hooray for Views Bulk Operations! However, we noticed a problem after we re-saved all of the nodes from a particular content type; they did not always reflect what we had updated them to. We discovered that the CCK cache was interfering. The solution was to wipe the 'content_cache' table, run our two update queries again, then batch re-save all of the records. The results are pretty nice: we have our embedded jp2 with our metadata. Now just to theme everything!
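A side note on the second update query: the value written to field_jp2_attributes is a PHP-serialized array, where every string is prefixed with its length (s:5:"width" and so on). Get a length wrong and PHP will refuse to unserialize the value. Here is a minimal sketch, in Python, of building that string instead of counting characters by hand. The helper name php_serialize_str_map is made up for illustration, and it only handles a flat map of string keys to string values, which is all this field needs.

```python
def php_serialize_str_map(attrs):
    """Serialize a flat {str: str} dict in PHP's serialize() array format.

    Hypothetical helper: covers only string keys and string values.
    """
    parts = []
    for key, value in attrs.items():
        for item in (key, value):
            # PHP counts the length in bytes, not characters.
            parts.append('s:%d:"%s";' % (len(item.encode("utf-8")), item))
    return "a:%d:{%s}" % (len(attrs), "".join(parts))


# The same iframe attributes used in the second UPDATE query.
IFRAME_ATTRIBUTES = php_serialize_str_map({
    "width": "100%",
    "height": "768",
    "frameborder": "0",
    "scrolling": "auto",
})
```

Running this gives back the same string that appears in the query, so changing the height or width later is a matter of editing the dict rather than recounting lengths.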

Library day in the life - 5 - Day 5

Here we are at the final day. Friday. Work from home. WIN. VPN, shell, type, type, type, forward ports, oh man, email.


Morning soundtrack - Four Tet - Remixes, Plaid - Parts in the Post

Finally finished all of the field merges. Now on to some batch metadata field editing for the World War, 1939-1945, Jewish Underground Resistance Collection. Metadata must be accurate, metadata must be correct! Sorry, no link for this collection for the public yet since it is being populated on the dev version of the site. Hopefully it will be public by some time in the fall. *fingers crossed*

Batch published another set of about 100 theses and dissertations on Digital Commons. Taught my student workers how to publish the theses and dissertations themselves since graduate studies will be using the same collection in Digital Commons to begin publishing new theses and dissertations.

Afternoon - nada. Flex. Holiday weekend in Canada.