For the past few months I’ve been writing about our pilot Linked Open Data project to expose name identifiers for Caltech’s current faculty. So far we’ve been working on creating/obtaining the initial data set. I’m very happy to report that we’ve now got 372/412 faculty names in the LC/NAF and, by extension, the VIAF. We expect to complete the set within the next month or so, give or take our other production responsibilities. Meanwhile, I’ve been messing with the metadata so I can figure out what the heck to do next.
We have the data in full MARC21 authority records. From those I’ve made a set of MADS records (thanks, MarcEdit XSLT!). I also created a tab-delimited .txt file of the name headings and LCCNs.
According to the basic linked data publishing pattern, publishing linked data can be as simple as publishing a web page. We could put out the structured data under an open license in a non-proprietary format and call ourselves done. That would be 3-star linked open data. We’d like to do a bit better than that. To achieve 4-star Linked Open Data we need to do things like mint URIs and make the data available in forms that are more readily machine-processable.
This is where I get stuck.
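As a rough sketch of what "minting URIs" could look like for the tab-delimited file, the Python below turns each heading/LCCN row into two N-Triples statements: a label for a locally minted URI and a link to the matching id.loc.gov name authority URI. Several things here are assumptions rather than decisions we have made: the `https://example.org/faculty/` base is a placeholder, the file is assumed to have the heading in the first column and the LCCN in the second, the LCCN is normalized simply by removing whitespace, `rdfs:seeAlso` is just one possible linking predicate, and the sample row is invented.

```python
import csv
import io

# Hypothetical base for locally minted URIs; a real deployment would use
# a domain the library controls.
BASE = "https://example.org/faculty/"
# URI pattern used by the LC Linked Data Service for name authorities.
LC_NAMES = "http://id.loc.gov/authorities/names/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def normalize_lccn(lccn):
    """Strip all whitespace (assumed to be enough for these LCCNs)."""
    return "".join(lccn.split())


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(tsv_text):
    """Convert 'heading<TAB>LCCN' rows into N-Triples lines."""
    triples = []
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        heading = row[0].strip()
        lccn = normalize_lccn(row[1])
        subject = f"<{BASE}{lccn}>"
        triples.append(
            f'{subject} <{RDFS_LABEL}> "{escape_literal(heading)}" .')
        triples.append(
            f"{subject} <{RDFS_SEEALSO}> <{LC_NAMES}{lccn}> .")
    return "\n".join(triples)


if __name__ == "__main__":
    # Invented sample row, not a real faculty record.
    sample = "Doe, Jane, 1970-\tn 2012000000\n"
    print(rows_to_ntriples(sample))
```

The output is plain N-Triples, so it could be loaded into any RDF tool or triple store for further experimentation. It does not address the harder questions, such as what the minted URIs should resolve to.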
Fortunately, there will be a hands-on linked data workshop at the DLF Forum (#dlfforum) next week. I’m really looking forward to it. I’ve also volunteered to give an overview of linked data to Maryland tech services librarians in December, and getting our data out there will provide some necessary street cred.