Background
I have been doing a fair bit of scale testing for York University Digital Library over the last couple of weeks. Most of it has been focused on horizontal scaling of the traditional Islandora stack (Drupal, Fedora Commons, FedoraGSearch, Solr, and aDORe-Djatoka). The stack is traditionally run with Apache2 in front of it, reverse proxying the parts of the stack that are Tomcat webapps. I was curious whether the stack would work with nginx, and whether I would get any noticeable improvements just by switching from Apache2 to nginx.
The following is the text for a video that I was asked to record for the 2014 International Internet Preservation Consortium General Assembly Curator Tools Fair, on the Islandora Web ARChive solution pack.
My name is Nick Ruest. I am a librarian at York University, in Toronto, Ontario. I’m going to give a quick presentation on the Islandora Web ARChive solution pack. I only have a few minutes, so I’ll quickly cover what the module does, what areas of the web archiving life cycle it covers, and provide a quick demonstration.
Incorporating a suite of digital preservation tools into various Islandora workflows has been a long-term goal of mine and a few other members in the community, and I’m really happy to see that it is now becoming more and more of a priority in the community.
A couple years ago, I cut my teeth on contributing to Islandora by creating a FITS plugin for the Drupal 6 version of Islandora.
Community
Some pretty exciting stuff has been happening lately in the Islandora community. Earlier this year, Islandora began the transformation to a federally incorporated, community-driven soliciting non-profit, making it, in my opinion, a much more sustainable project. Thanks to my organization joining as a member, I’ve been given the opportunity to take part in the Roadmap Committee. Since I joined, we have been hard at work creating transparent policies and processes for software contributions, licenses, and resources.
Below is the text and slides of my presentation on the Web ARChive solution pack at Open Repositories 2013.
I have a really short amount of time to talk here. So, I am going to focus on the how and why for this solution pack and kinda put it in context of the Web Archiving Life Cycle Model proposed by the Internet Archive earlier this year. Maybe I shouldn’t have proposed a 7 minute talk!
What is it?
The Islandora Web ARChive Solution Pack is yet another Islandora Solution Pack. This particular solution pack provides the necessary Fedora objects for persisting and disseminating web archive objects; warc files.
What does it do?
Currently, the SP allows a user to upload a warc with an associated MODS form. Once the object is deposited, the associated metadata is displayed along with a download link to the warc file.
Hit a bit of a wall yesterday getting checksums working when ingesting content into Islandora, so I made a Gource video of the Islandora commits in my fork of the git repo.
Music by RipCD (@drichert) and myself.
How’d I do it?
I wanted to use the Gravatars, so I used this handy little Perl script. Then I hopped into the Islandora git repo and ran:
gource --user-image-dir .git/avatar/ -s 3 --auto-skip-seconds 0.
Digital preservationistas rejoice?
I managed to get FITS integration working in Islandora via a plugin. The plugin automatically creates a FITS XML datastream for an object when it is ingested through the Islandora interface for a given solution pack. Right now I have it working with the Basic Image Solution Pack, Large Image Solution Pack, and PDF Solution Pack. You just have to make sure fits.sh is in your Apache user’s PATH (thanks @adr).
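If you want a rough way to confirm that last requirement, something like the sketch below might help. It is not part of the plugin; `on_path` is a hypothetical helper name I made up for illustration, and it only reports whether a command resolves on the PATH of whichever user runs the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of Islandora or FITS): print whether a
# command name resolves on the current user's PATH.
on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: found\n' "$1"
    else
        printf '%s: missing\n' "$1"
    fi
}

# Check the script the FITS plugin expects to be able to call.
on_path fits.sh
```

To be meaningful it has to run as the account your web server uses, for example via `sudo -u www-data sh check-fits.sh`. Both the `www-data` user and the `check-fits.sh` file name are assumptions that will vary by system, and the PATH Apache itself runs with can differ from that user's shell PATH, so treat the result as a hint rather than proof.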