There’s been some conversation lately about using Wikipedia in authority work. Jonathan Rochkind recently blogged about the potential of using Wikipedia Miner to add subject authority information to catalog records, reasoning that the context and linkages provided in a Wikipedia article could provide better topical relevance. Then somebody on CODE4LIB asked for help using author information in a catalog record to look up said author in Wikipedia on-the-fly. The various approaches suggested on the list have been interesting, although no optimal solution has emerged. Although I couldn’t necessarily code such an application myself, it’s good to know how a programmer could go about doing such a thing. What I did learn was that Wikipedia has a way of marking up names with author identifiers. The Template:Authority Control gives an example of how to do it.
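For the programmers in the audience, here is a rough sketch of what reading those identifiers back out of an article might look like. It assumes the identifiers are written directly into the article's wikitext as named parameters of the template, and that the template call contains no nested templates. The parameter names (VIAF, LCCN), the sample values, and the function name `parse_authority_control` are all made up for illustration; check the template's own documentation for the real parameter list.

```python
import re

# Matches a {{Authority control|...}} call in raw wikitext.
# Assumption: the call holds only simple name=value parameters, no nested templates.
TEMPLATE_RE = re.compile(
    r"\{\{\s*Authority[ _]control\s*\|([^{}]*)\}\}", re.IGNORECASE
)


def parse_authority_control(wikitext):
    """Return a dict of identifier name -> value, or {} if no template is found."""
    match = TEMPLATE_RE.search(wikitext)
    if match is None:
        return {}
    identifiers = {}
    for part in match.group(1).split("|"):
        name, sep, value = part.partition("=")
        # Skip positional or empty parameters; keep only name=value pairs.
        if sep and value.strip():
            identifiers[name.strip().upper()] = value.strip()
    return identifiers


# Hypothetical wikitext with placeholder identifier values.
sample = "Some article text.\n{{Authority control|VIAF=12345678|LCCN=n/79/000000}}\n"
ids = parse_authority_control(sample)
```

With identifiers in hand, a catalog could match on the identifier itself rather than on a name string, which sidesteps a lot of the disambiguation headaches.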
I haven’t done much authoring or editing at Wikipedia, so the existence of the “template” is news to me. I think it’s pretty nifty, so I just had to blog it. The template gets me thinking. Perhaps we’ll be able to leverage our faculty names Linked Data pilot into some sort of mash-up with Wikipedia, pushing our author identifiers into that space or pulling Wikipedia info into our work. Our group continues to make progress on getting all our current faculty represented in the National Authority File, with an eye to exposing our set of authority records as Linked Data. We haven’t figured out yet precisely what we’re going to do with the Linked Data once we make it available. “Build it and they will come” is nice, but we need a demonstrable benefit (i.e., a cool project) to show the value of the Library’s author services.
VIAF already provides external links to Wikipedia and WorldCat Identities with its display of an author name. Ralph LeVan explained how OCLC did it, in general terms, in the CODE4LIB conversation. As near as I understand it, they take a data dump from Wikipedia, do some text mining, run their disambiguation algorithms over it, then add the Wikipedia page if they get a match. I don’t know if this computational approach is a Linked Data type of thing or not. I need to continue working my way through chapters 5 and 6 of Heath & Bizer’s Linked Data book (LOD-LAM prep!). Nonetheless, it’s a good way of showing how connections can be built between an author identity tool and another data source to enrich the final product. I have a hazy vision of morphing the Open Library’s “one web page for every book ever published” into “one web page for every Caltech author.” More likely it will be “one web page tool for every Caltech author to incorporate into their personal web site,” given the extreme individualism and independence cherished within our institutional culture. But I digress. Yes. “One web page for every Caltech author” would at least give us the (metaphorical) space to build a killer app.