Below is the output of the little project I worked on today at the DPLA Appfest. It definitely isn’t a perfect solution to the problem. It is not a drop-in module to just grab a collection from the DPLA API and "curate" it in your library’s Drupal site. I hate reinventing the wheel, especially if there are existing modules that can solve the problem for you. Moreover, as one of the few people who still respects what OAI-PMH does, I think it would be worth considering using DPLA as an OAI-PMH provider. But I’m not sure if that is technically legal in OAI-PMH terms, given that they are most likely harvesting it via OAI-PMH. Don’t want to get into an infinite regress of metadata providers. S’up dawg? All jokes aside, I think OAI-PMH would be a better solution than what I tossed together because it would make harvesting a "set" a hell of a lot easier. My 2¢.
I also have a live demo of it living on my EC2 instance. I’ve ingested 2000 items from the API, and decided to throw them into a Solr index just to demonstrate the possibilities of what you can do with the ingested content.
Finally, a big giant thank you to DPLA and Chattanooga Public Library for putting this on and for the wonderful hospitality. This was absolutely fantastic!
Idea
Drupal module or distribution
Your Name: Nate Hill
Type of app: Drupal CMS
Description of App: Many, many libraries choose to use Drupal as their content management system or as their application development framework. A contrib Drupal module that creates a simple interface for admin users to curate collections of DPLA content for display on a library website would be useful.
Workflow
Preamble:
I don’t like reinventing the wheel. So, let’s see what contrib modules already exist, and see if we can just create a workflow to do this to start with. It would be really nice if DPLA had an OAI-PMH provider; then you could just use CCK + Feeds + Feeds OAI-PMH.
Example: bitly.com/VXMvMr
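To make the "harvest a set" idea concrete, here is a minimal sketch of what a selective-harvest request looks like in OAI-PMH. The `verb`, `metadataPrefix`, and `set` arguments are standard OAI-PMH; the base URL and the set name are purely hypothetical, since DPLA does not expose an OAI-PMH endpoint that I know of.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL limited to a single set."""
    params = {
        "verb": "ListRecords",              # standard OAI-PMH verb
        "metadataPrefix": metadata_prefix,  # oai_dc is the baseline format
        "set": set_spec,                    # selective harvesting by setSpec
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical provider URL and set name, for illustration only.
    print(list_records_url("http://example.org/oai", "minnesota"))
```

A URL like this is the kind of thing you would hand to Feeds OAI-PMH as the feed source.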
Requirements:
- drush pm-download cck
- drush pm-download feeds
- drush pm-download feeds_jsonpath_parser
  cd sites/all/modules/feeds_jsonpath_parser && wget http://jsonpath.googlecode.com/files/jsonpath-0.8.1.php
Setup:
- Create a Content Type for the DPLA content you would like to pull in (admin/content/types/add)
- Create DPLA metadata fields for the Content Type (admin/content/node-type/YOURCONTENTTYPE/fields)
- Create a new feed importer (admin/build/feeds/create)
- Configure the settings for your new feed importer
  - Basic settings:
    - Select the Content Type you would like to import into
    - Select how frequently you would like Feeds to ingest
  - Fetcher
    - HTTP Fetcher
  - Processor
    - Node processor
    - Select the Content Type you created
    - Mappings (create a mapping for each metadata field you created)
      - Source : jsonpath_parser:0
      - Target : Title
  - Parser
    - JSONPath Parser
    - Settings for JSONPath parser
      - Context: $.docs.*
- Construct a search you would like to ingest using the DPLA API
  - ex: http://api.dp.la/v1/items?dplaContributor=%22Minnesota%20Digital%20Library%22
- Start the import! (node/add/YOURCONTENTTYPE)
  - Give the import a title… whatever your heart desires.
  - Add a feed URL
  - Click on JSONPath Parser settings, and start adding all of the JSONPaths
  - Click save, and watch the import go.
- Check out your results
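
If you want to sanity-check a search before handing it to Feeds, here is a rough Python sketch of the two pieces that matter: building the feed URL and what the `$.docs.*` context selects. The endpoint and the `dplaContributor` parameter come from the example URL above; the function names and the `title` key in the sample data are my own assumptions for illustration, not something taken from the DPLA API docs.

```python
from urllib.parse import quote, urlencode

# Endpoint copied from the example search URL in the setup steps.
API_BASE = "http://api.dp.la/v1/items"


def build_search_url(contributor):
    """Build a feed URL that searches on an exact contributor name."""
    # Wrapping the value in double quotes mirrors the %22...%22 in the example.
    query = {"dplaContributor": '"' + contributor + '"'}
    # quote_via=quote encodes spaces as %20 rather than "+".
    return API_BASE + "?" + urlencode(query, quote_via=quote)


def apply_context(response):
    """Mimic the JSONPath context $.docs.*: every child of top-level 'docs'."""
    docs = response.get("docs", [])
    if isinstance(docs, dict):
        return list(docs.values())
    return list(docs)


if __name__ == "__main__":
    print(build_search_url("Minnesota Digital Library"))
    # Hypothetical response shape; 'title' is an assumed field name.
    sample = {"docs": [{"title": "First item"}, {"title": "Second item"}]}
    for item in apply_context(sample):
        print(item["title"])
```

Each item the context returns is what the per-field JSONPaths in the mappings get evaluated against, so one item here becomes one node in Drupal.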