oai-pmh

DPLA Appfest Drupal integration

Below is the output of the little project I worked on today at the DPLA Appfest. It definitely isn't a perfect solution to the problem. It is not a drop-in module to just grab a collection from the DPLA API and "curate" it in your library's Drupal site. I hate reinventing the wheel, especially if there are existing modules that can solve the problem for you. Moreover, as one of the few people that still respects what OAI-PMH does, I think it would be worth considering using DPLA as an OAI-PMH provider. But I'm not sure if that is technically legal in OAI-PMH terms, given that they are most likely harvesting it via OAI-PMH. Don't want to get into an infinite regress of metadata providers. S'up dawg? All jokes aside, I think OAI-PMH would be a better solution than what I tossed together because it would make harvesting a "set" a hell of a lot easier. My 2¢.

I also have a live demo of it living on my EC2 instance. I've ingested 2000 items from the API, and decided to throw them into a Solr index just to demonstrate the possibilities of what you can do with the ingested content.

Finally, a big giant thank you to DPLA and Chattanooga Public Library for putting this on and for the wonderful hospitality. This was absolutely fantastic!

Idea

Drupal module or distribution

Your Name: Nate Hill

Type of app: Drupal CMS

Description of App: Many, many libraries choose to use Drupal as their content management system or as their application development framework. A contrib Drupal module that creates a simple interface for admin users to curate collections of DPLA content for display on a library website would be useful.

Workflow

Preamble:

I don't like recreating the wheel. So, let's see what contrib modules already exist, and see if we can just create a workflow to do this to start with. It would be really nice if DPLA had an OAI-PMH provider; then you could just use CCK + Feeds + Feeds OAI-PMH.

Example: bitly.com/VXMvMr

Requirements:

  • CCK

    drush pm-download cck

  • Feeds

    drush pm-download feeds

  • Feeds - JSON Parser

    drush pm-download feeds_jsonpath_parser
    cd sites/all/modules/feeds_jsonpath_parser && wget http://jsonpath.googlecode.com/files/jsonpath-0.8.1.php

Setup:

  • Create a Content Type for the DPLA content you would like to pull in (admin/content/types/add)
  • Create DPLA metadata fields for the Content Type (admin/content/node-type/YOURCONTENTTYPE/fields)
  • Create a new feed importer (admin/build/feeds/create)
  • Configure the settings for your new feed importer
    • Basic settings:
      • Select the Content Type you would like to import into
      • Select a frequency you would like Feeds to ingest
    • Fetcher
      • HTTP Fetcher
    • Processor
      • Node processor
      • Select the Content Type you created
      • Mappings (create a mapping for each metadata field you created)
        • Source : jsonpath_parser:0
        • Target : Title
    • Parser
      • JSONPath Parser
      • Settings for JSONPath parser
        • Context: $.docs.*
  • Construct a search you would like to ingest using the DPLA API
    • ex: http://api.dp.la/v1/items?dplaContributor=%22Minnesota%20Digital%20Library%22
  • Start the import! (node/add/YOURCONTENTTYPE)
  • Give the import a title... whatever your heart desires.
  • Add a feed url
  • Click on JSONPath Parser settings, and start adding all of the JSONPaths
  • Click save, and watch the import go.
  • Check out your results
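
For anyone curious what the Feeds setup above is doing under the hood, here is a rough Python sketch. It is not the Feeds or JSONPath parser code, just an illustration under a few assumptions: the sample response and its field names (`id`, `title`) are made up, and each mapping source is treated as a plain top-level key rather than a full JSONPath expression. The only things taken from the steps above are the API URL, the `dplaContributor` parameter, and the `$.docs.*` context.

```python
import json
from urllib.parse import quote, urlencode

# Base URL and parameter name come from the example search above.
API_BASE = "http://api.dp.la/v1/items"

# Hypothetical response body: only the top-level "docs" list is implied by
# the $.docs.* context; the fields inside each doc are invented.
SAMPLE_DPLA_RESPONSE = """
{"docs": [
  {"id": "item-1", "title": "Example item one"},
  {"id": "item-2", "title": "Example item two"}
]}
"""


def build_search_url(contributor):
    """Build a search URL for one contributor, quoted as in the example."""
    query = urlencode({"dplaContributor": '"%s"' % contributor}, quote_via=quote)
    return API_BASE + "?" + query


def select_docs(response_text):
    """Return the items that the context $.docs.* would iterate over."""
    return json.loads(response_text)["docs"]


def map_fields(doc, mappings):
    """Copy source keys of one doc onto target field names (simplified)."""
    return {target: doc.get(source) for source, target in mappings.items()}


if __name__ == "__main__":
    print(build_search_url("Minnesota Digital Library"))
    for doc in select_docs(SAMPLE_DPLA_RESPONSE):
        print(map_fields(doc, {"title": "Title"}))
```

In the real workflow the HTTP Fetcher would retrieve the URL, the JSONPath Parser would do the `select_docs` step, and the Node processor mappings would do the `map_fields` step.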

Right! That hackfest report I should have given...

When I was at Islandora Camp trying to wrap my head around all things Islandora and Fedora, I was thinking ahead about a possible project in archives and research collections - migrating our collection/fonds descriptions and finding aids over to ICA AtoM.
 
ICA AtoM does some pretty cool stuff in terms of access to collection/fonds descriptions, integrates very nicely with Archivematica for accessioning born-digital objects, and associates digital representations of item-level objects with their respective collection/fonds. My greedy little brain wanted more! I wanted ICA AtoM to be able to pull in Fedora objects automatically and associate them with their respective collection/fonds. So, this is the hackfest proposal I submitted.
 
So what happened? What'd we end up doing?
 
The amazing Peter Van Garderen made absolutely sure Artefactual Systems staff was highly represented at hackfest, and I had two amazing people from Artefactual trying to parse my sleep-deprived-scatter-brained-state reasoning/logic behind what I wanted to do. David Juhasz and Jesús García Crespo, you rock!
 
We spent the first hour or so working through the Fedora REST API documentation looking for the best way to approach the "problem." After about an hour or so of working through a few conditional queries that would need to be strung together, Jesús jumped in and said, "Why aren't we using SWORD for this!?" Good question!
 
ICA AtoM can speak SWORD, and Fedora can speak SWORD so long as you can get the module working. As things at hackfest generally go for me, it failed. I could not for the life of me get the module to build. I spent some time going through build.xml, and ant and I just weren't going to be friends that day.
 
Strike one - don't code conditional Fedora REST API queries - not sharable and scalable
Strike two - I couldn't get the SWORD module to build!
Strike three - ???
 
While brainstorming for other solutions to our "problem", David was looking for examples in which I could share records from our repository. Duh! OAI-PMH! ICA AtoM can harvest OAI. If we can map OAI sets to ICA AtoM collections/fonds, and set records to individual items in a collection/fonds, we're set. Oh my, another use case of OAI-PMH! Yay!
 
Did we succeed? Not actually. Turns out the OAI-PMH harvesting code wasn't quite up to snuff at the time, and David, bless his heart, worked on trying to get it up to par before the end of the day. We were not able to pull together a working version, but the framework is there. It was there all along! (Ed, yes we could have and totally should have used atom :P )
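
To make the set-to-collection idea a bit more concrete, here is a minimal Python sketch. It is not the ICA AtoM harvesting code and not what we wrote at hackfest. The repository base URL, the set names, and the record identifiers are all hypothetical. The only parts that come from the OAI-PMH protocol itself are the `ListRecords` verb, the `metadataPrefix` and `set` arguments, and the `header`, `identifier`, and `setSpec` elements.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical ListRecords response, trimmed down to record headers.
SAMPLE_OAI_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header>
      <identifier>oai:example.org:item-1</identifier>
      <setSpec>fonds-a</setSpec>
    </header></record>
    <record><header>
      <identifier>oai:example.org:item-2</identifier>
      <setSpec>fonds-a</setSpec>
    </header></record>
    <record><header>
      <identifier>oai:example.org:item-3</identifier>
      <setSpec>fonds-b</setSpec>
    </header></record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build a ListRecords request limited to a single OAI set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix, "set": set_spec}
    return base_url + "?" + urlencode(params)


def group_records_by_set(xml_text):
    """Map each setSpec (collection/fonds) to its record identifiers (items)."""
    root = ET.fromstring(xml_text)
    collections = defaultdict(list)
    for header in root.iterfind(".//oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        for set_spec in header.iterfind("oai:setSpec", OAI_NS):
            collections[set_spec.text].append(identifier)
    return dict(collections)


if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai", "fonds-a"))
    print(group_records_by_set(SAMPLE_OAI_RESPONSE))
```

Each key in the resulting dictionary would stand in for one collection/fonds, and each identifier under it for one item to attach to that collection/fonds.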


Drupal & Digital Collection Sites - 1

I have written about Drupal & the Digital Collections site (http://digitalcollections.mcmaster.ca) a few times now, but haven't really explained how to make a digital collections site out of Drupal. So, without further ado...

What are the necessities of a digital collections site?
