[berkman][misc] Curated by the crowd

Posted on: September 24th, 2013

I’m at a Berkman lunchtime talk on crowdsourcing curation. Jeffrey Schnapp, Matthew Battles [twitter:matthewBattles], and Pablo Barria Urenda are leading the discussion. They’re from the Harvard metaLab.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Matthew Battles begins by inviting us all to visit the Harvard center for Renaissance studies in Florence, Italy. [Don’t toy with us, Matthew!] There’s a collection there, curated by Bernard Berenson, of 16,000 photos documenting art that can’t be located, which Berenson called “Homeless Paintings of the Italian Renaissance.” A few years ago, Mellon sponsored the digitization of this collection, to be made openly available. One young man, Chris Daley [sp?], has since found about 120 of the works. [This is blogged at the metaLab site.]

These 16,000 images are available at Harvard’s VIA image manager [I think]. VIA is showing its age. It doesn’t support annotation, etc. There are some cultural crowdsourcing projects already underway, e.g., Zooniverse’s Ancient Lives project for transcribing ancient manuscripts. metaLab is building a different platform: Curarium.com.

Matthew hands off to Jeffrey Schnapp. He says Curarium will allow a diverse set of communities (archivist, librarian, educator, the public, etc.) to animate digital collections by providing tools for doing a multiplicity of things with those collections. We’re good at making collections, he says, but not as good at making those collections matter. Curarium should help take advantage of the expertise of distributed communities.

What sort of things will Curarium allow us to do? (A beta should be up in about a month.) Add metadata, add meaning to items…but also work with collections as aggregates. VIA doesn’t show relations among items. Curarium wants to make collections visible and usable at the macro and micro levels, and to tell stories (“spotlights”).

Jeffrey hands off to Pablo, who walks us through the wireframes. Curarium will ingest records, and make them interoperable. They take in records in JSON format, and extract the metadata they want. (They save the originals.) They’re working on how to give an overview of the collection; “When you have 11,000 records, thumbnails don’t help.” So, you’ll see a description and visualizations of the cloud of topic tags and items. (The “Homeless” collection has 2,000 tags.)
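
To make the ingest step concrete, here is a minimal sketch of what "take in JSON, extract the metadata you want, keep the original" might look like. This is my own illustration, not Curarium's code: the field names (title, creator, tags) and the function names are assumptions, since the talk did not show the actual schema. The tag count at the end is the sort of aggregate a tag-cloud view could be drawn from.

```python
import json
from collections import Counter

# Hypothetical field names; the talk did not describe Curarium's actual schema.
WANTED_FIELDS = ("title", "creator", "tags")


def ingest(raw_json: str) -> dict:
    """Parse one JSON record, keep the original, and pull out selected metadata."""
    original = json.loads(raw_json)
    # Missing fields become None rather than raising an error.
    metadata = {field: original.get(field) for field in WANTED_FIELDS}
    return {"original": original, "metadata": metadata}


def tag_counts(records: list[dict]) -> Counter:
    """Count how often each tag appears across all ingested records."""
    counts = Counter()
    for record in records:
        counts.update(record["metadata"].get("tags") or [])
    return counts


if __name__ == "__main__":
    records = [
        ingest('{"title": "A", "tags": ["madonna", "panel"], "source_id": 1}'),
        ingest('{"title": "B", "tags": ["madonna"]}'),
    ]
    print(tag_counts(records).most_common())
```

Keeping the untouched original next to the extracted fields means the extraction rules can change later without having to re-fetch anything from the source collection.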

At the item level, you can annotate, create displays of selected content (“‘Spotlights’ are selections of records organized as thematized content”) in various formats (e.g., slideshow, more academic style, etc.). There will be a rich way of navigating and visualizing. There will be tools for the public, researchers, and teachers.

Q&A

Q: [me] How will you make the enhanced value available outside of Curarium? And, have you considered using Linked Data?

A: We’re looking into access. The data we have is coming from other places that have their own APIs, but we’re interested in this.

Q: You could take the Amazon route by having your own system use APIs, and then make those APIs open.

Q: How important is the community building? E.g., Zooniverse succeeds because people have incentives to participate.

A: Community-building is hugely important to us. We’ll be focusing on that over the next few months as we talk with people about what they want from this.

A: We want to expand the scope of conversation around cultural history. We’re just beginning. We’d love teachers in various areas — everything from art history to history of materials — to start experimenting with it as a teaching tool.

Q: The spotlight concept is powerful. Can it be used to tell the story of an individual object? E.g., suppose an object has been used in 200 different spotlights, and there might be a story in this fact.

A: Great question. Some of the richness of the prospect is perhaps addressed by expectations we have for managing spotlights in the context of classrooms or networked teaching.

Q: To what extent are you thinking differently than a standard visual library?

A: On the design side, what’s crucial about our approach is the provision for a wide variety of activities, within the platform itself: curate, annotate, tell a story, present it… It’s a CMS or blogging platform as well. The annotation process includes bringing in content from outside of the environment. It’s a porous platform.

Q: To what extent can users suggest changes to the data model? E.g., Europeana has a very rigid data model.

A: We’d like a significant user contribution to metadata. [Linked Data!]

Q: Are we headed for a bifurcation of knowledge? Dedicated experts and episodic amateurs. Will there be a curator of curation? Am I unduly pessimistic?

A: I don’t know. If we can develop a system, maybe with Linked Data, we can have a more self-organizing space that is somewhere in between harmony and chaos. E.g., Wikimedia Loves Monuments is a wonderful crowd curatorial project.

Q: Is there anything this won’t do? What’s out of scope?

A: We’re not providing tools for creating animated gifs. We don’t want to become a platform for high-level presentations. [metaLab’s Zeega project does that.] And there’s a spectrum of media we’ll leave alone (e.g., audio) because integrating them with other media is difficult.

Q: How about shared search, i.e., searching other collections?

A: Great idea. We haven’t pursued this yet.

Q: Custodianship is not the same as meta-curation. Chris Daly could become a meta-curator. Also, there’s a lot of great art curation at Pinterest. Maybe you should be doing this on top of Pinterest? Maybe build spotlight tools for Pinteresters?

A: Great idea. We already do some work along those lines. This project happens to emerge from contact with a particular collection, one that doesn’t have an API.

Q: The fact that people are re-uploading the same images to Pinterest is due to the lack of standards.

Q: Are you going to be working on the vocabulary, or let someone else worry about that?

A: So far, we’re avoiding those questions…although it’s already a problem with the tags in this collection.

[Looks really interesting. I’d love to see it integrate with the work the Harvard Library Interoperability Initiative is doing.]


Categories: libraries, misc dw
