Archive for September, 2007

Undergraduate Orientation

The library is pleased to share with you the news of a successful undergraduate orientation day, Sunday, September 23, 2007. Over 200 students were signed up with library accounts!

Library staff Tony Diaz, Kris Jolley, and Lindsay Cleary (pictured from left; Viet Nguyen not pictured) were busy on Sunday talking with new students, answering their questions, and giving out information about library services while creating the new library accounts.

Kris Jolley with library handouts for the students.

Thanks to Viet, Lindsay, Tony and Kris for making the day a fun and instructive event for our incoming students.

Cambridge Crystallographic Database: Aug 07 Update Installed

The August 2007 update for the Cambridge Structural Database System, with 8,500 new entries, has been added to the Caltech Library Services CSD installation.

Peter Murray-Rust talk on eThesis, webcast is available

Data-driven science and Digital Repositories

Monday, August 6th, 2007, 10:00am, NewMedia Classroom

by Peter Murray-Rust

Unilever Centre for Molecular Sciences Informatics, Department of Chemistry, University of Cambridge, UK

UPDATES: Video is available (9/5/2007)

For the best playback experience, viewers should be on a PC with Windows Media Player installed. More instructions to view the webcast.

For several years there has been excitement about the potential of the “data deluge” where much science will be practised not by doing experiments but by reusing the information we already have. Although in some subjects such as particle physics and parts of bioscience this is starting to happen, in many others such as chemistry and materials science there is little sign of an impending deluge. The data have been collected but they are not accessible.

We have shown that most raw data is never communicated outside the scientists’ laboratories and that it rapidly decays. In crystallography and analytical chemistry, between 80 and 99% of carefully collected data ends up on a CD-ROM which, in a few years’ time, will be unreadable. We shall present a vision in which this data can be collected and preserved. The challenges are technical and semantic but, above all, social: we have to change the mindset of scientists to “preserve and share”.

We have developed a system, SPECTRa, for capturing the raw data in chemistry departments, first into an embargo repository and then into the main Institutional Repository (IR). The software is Open Source and we are seeking collaborators.

The current publishing system is often a serious impediment to sharing data. It emphasises “full-text” over data and also puts legal and procedural constraints on data flow. To bypass this we have started to investigate PhD and Masters theses as a primary source of information, as these are solely under the control of academia and, in many cases, are the direct source of the data which leads to formal publications. Our system is devised to create rich metadata from theses, representing it in RDF and querying it with SPARQL.
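For readers unfamiliar with RDF and SPARQL, the idea of querying thesis metadata can be sketched in a few lines. This is not the system described in the talk: it is a minimal illustration in plain Python, assuming metadata stored as (subject, predicate, object) triples and a query made of triple patterns with `?`-prefixed variables. Every identifier below (`thesis:1`, `ex:reportsCompound`, `ex:hasSpectrum`, and so on) is hypothetical.

```python
# Hypothetical thesis metadata as (subject, predicate, object) triples.
# None of these identifiers come from the talk; they are illustrative only.
TRIPLES = [
    ("thesis:1", "dc:title", "Example thesis A"),
    ("thesis:1", "dc:creator", "A. Student"),
    ("thesis:1", "ex:reportsCompound", "compound:42"),
    ("thesis:2", "dc:title", "Example thesis B"),
    ("thesis:2", "dc:creator", "B. Student"),
    ("compound:42", "ex:hasSpectrum", "spectrum:7"),
]


def is_variable(term):
    """Terms starting with '?' are treated as variables, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new_binding = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_variable(p_term):
            # Bind the variable, or check it against its earlier value.
            if new_binding.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return new_binding


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Roughly equivalent in spirit to the SPARQL query:
#   SELECT ?title ?spectrum WHERE {
#     ?t dc:title ?title . ?t ex:reportsCompound ?c . ?c ex:hasSpectrum ?spectrum }
results = query(
    [
        ("?t", "dc:title", "?title"),
        ("?t", "ex:reportsCompound", "?c"),
        ("?c", "ex:hasSpectrum", "?spectrum"),
    ],
    TRIPLES,
)

for row in results:
    print(row["?title"], row["?spectrum"])
```

The shared variables `?t` and `?c` are what join the patterns: a thesis is linked to a compound, and the compound to its spectrum, so a question such as "which theses report compounds with recorded spectra" becomes a short pattern list. A real deployment would use an RDF store and a SPARQL engine rather than this loop.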

We hope to give a number of demonstrations.

THIS LECTURE WILL BE RECORDED AND WEBCAST.