dsalo
Posts by this author
November 11, 2009
This is a pushmi-pullyu post. I need some help with an environmental scan, so I'll get us started and the rest of you smart folks can amplify my knowledge.
I want to understand what's going on where with data curation specifically at the institutional level (no NOAA, no ICPSR, none of that)…
November 10, 2009
So here's an interesting problem I ran into today. You have metadata in an XML file. You want to make the file self-describingly self-correcting, so you want to embed its checksum inside it. The problem is, you can't add the checksum to the XML file without changing the file's checksum!
Is there an…
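One common way around the circularity described in this excerpt is to compute the checksum over the file with the checksum field emptied, and to empty it again before verifying. The sketch below is illustrative only and is not taken from the post: the `<checksum>` element name, the choice of SHA-256, and the function names are all assumptions, and it works on raw bytes rather than on canonicalized XML.

```python
import hashlib
import re

# Hypothetical element name; the post does not say where the checksum would live.
CHECKSUM_RE = re.compile(rb"(<checksum>)([0-9a-f]*)(</checksum>)")


def _blank(xml_bytes: bytes) -> bytes:
    """Return the document with the checksum element's content emptied."""
    return CHECKSUM_RE.sub(lambda m: m.group(1) + m.group(3), xml_bytes, count=1)


def embed_checksum(xml_bytes: bytes) -> bytes:
    """Hash the blanked document, then write the digest into the element."""
    digest = hashlib.sha256(_blank(xml_bytes)).hexdigest().encode("ascii")
    return CHECKSUM_RE.sub(
        lambda m: m.group(1) + digest + m.group(3), xml_bytes, count=1
    )


def verify_checksum(xml_bytes: bytes) -> bool:
    """Blank the element again and compare the fresh digest to the stored one."""
    match = CHECKSUM_RE.search(xml_bytes)
    if match is None:
        return False  # no checksum element to verify against
    expected = hashlib.sha256(_blank(xml_bytes)).hexdigest().encode("ascii")
    return match.group(2) == expected


doc = b"<record><title>Sample</title><checksum></checksum></record>"
signed = embed_checksum(doc)
print(verify_checksum(signed))
print(verify_checksum(signed.replace(b"Sample", b"Simple")))
```

Because the digest covers the bytes exactly as written, any re-serialization (whitespace, attribute order, encoding) would invalidate it; a production approach would likely hash a canonical form of the XML instead.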
November 10, 2009
Libraries do collaborative collection development, through consortia and increasingly via direct institution-to-institution arrangements. Reference and instruction are collaborative endeavors—look at any social-networking service with lots of librarians and you'll see on-the-spot crowdsourced…
November 9, 2009
Starting off the week with some juicy tidbits:
An extremely nerdy but (for nerds) fascinating examination of XML and its implications for data modeling. Do we have to reduce everything to a relational model? Really? Perhaps not… Notably, it seems to me, this article describes fairly nicely how…
November 5, 2009
I read the RIN report on life-sciences data with interest, a little cynicism, and much appreciation for the grounded and sensible approach I have come to expect from British reports. If you're interested in data services, you should read this report too.
A warning to avoid preconceptions: If you…
November 4, 2009
There is a certain kind of digital project that strikes terror and dismay into the hearts of digital preservationists everywhere. Not a one of us hasn't seen many exemplars. They make me, myself, feel sad and tired.
They're projects that, no matter their scholarly or design merit, are completely…
November 2, 2009
One phenomenon that will be—indeed, already is—utterly unavoidable in the data-curation space is the creation of standards. I once heard Andrew Pace say that standards are like toothbrushes: everybody thinks they're great, but nobody wants to use anybody else's.
Be that as it may, standards…
October 29, 2009
If you're not reading comments here, you're missing out. For reasons I don't entirely understand, some of the best in the business are seeing fit to comment here. They have more to teach than I do!
Chris Rusbridge (of, among other things, this thought-provoking meditation on digital preservation)…
October 28, 2009
I pointed out Mike Lesk's slideshow in my last tidbits post, finding it a good critical précis of the data problem. It's pleasantly aware of human problems, human problems many treatments of cyberinfrastructure (including, unfortunately, this otherwise useful call to action from Educause) wholly…
October 26, 2009
It's been a while since I did anything on my series about library ways of knowing. If you'd like to refresh your memory:
The classical librarian
The humble index
Classification
Today I'll finish my discussion of classification, and distinguish it from subject analysis, since that distinction…
October 23, 2009
My del.icio.us tag overfloweth…
A challenge to libraries from an information science professor: "I wish I could say that libraries were the obvious organization to take care of data… But… they have not been ambitious, they lack the subject area knowledge, they often lack the technical skills."…
October 20, 2009
I have intentionally steered Book of Trogool away from open access. I still believe in it; I still work for it. Toward the waning days of Caveat Lector, however, it became clear that I was shedding more heat than light on the subject, so I made a conscious decision not to repeat that mistake here.…
October 17, 2009
So far, I've lived all my short career in academic libraries on the new-service frontier. In so doing, I've looked around and learned a bit about how academic libraries, research libraries in particular, tend to manage new services. With apologies to all the botanists I am about to offend by…
October 16, 2009
I'm still buried in translating a presentation into Spanish for Monday and finishing another in English for Wednesday, but here's a small thought to tide folks over, a thought that came to me shortly before my presentation at Access.
At the data-curation workshops I've been to, it has been…
October 13, 2009
If you've been having trouble commenting, you're not alone—the comment form quit working for me a couple days ago.
I wrote in to Erin, and from where I'm sitting, the problem has been fixed. If you're not getting comment-form love, email me at dorothea.salo at gmail and I'll see what I can do.…
October 10, 2009
I will be speaking for UKSG's conference next April. They haven't given me a topic… but they want a talk title by the end of this month.
I have to write a paper alongside the talk, and I hate writing papers with every last fiber of my being, so if I have to do it, I want to make it count for…
October 7, 2009
Roy Tennant sent me an email about my Access presentation in which he asked what libraries should do about the laundry-list of data-curation challenges I presented. (If you're curious, you can go view the presentation yourself, courtesy of the wonderful A/V folk at Access. The less-than-an-hour-…
October 5, 2009
I am back from Access and feeling wonderful… and wonderfully exhausted.
Data and its care and feeding were the dominant themes at the conference. I strongly recommend reading the session summaries at Pete Zimmerman's blog. It's hard to pick star sessions out of so very many good ones, but the…
October 2, 2009
There's quite a bit going on at Access 2009 that's data-related in one way or another. Delegate Pete Zimmerman is taking excellent notes at his blog. The Twitter hashtag is #access2009pei.
I'm up right after lunch. There may or may not be a live video stream; if not, there will be canned video…
September 29, 2009
A comment Chris Rusbridge left on a previous post leads me to clarify the extent to which the subject matter of this blog draws on my own position in the institution where I work, and that institution's take on matters data-curational.
In brief: It doesn't. I don't talk about my place of work here…
September 29, 2009
One of the problems practically every nascent data-curation effort will have to deal with is what serials librarians call the backfile, though the rest of us use the blunter word backlog.
There's a lot of digital data (let's not even think about the analog for now) from old projects hanging around…
September 25, 2009
Sometimes it's worthwhile to let my "toblog" folder on del.icio.us marinate a bit. Posts I recently ran across on two different blogs illuminate the same point so well that they deserve their own post here!
Off the Map offers Huffman's Three Principles for Data Sharing, which are really principles…
September 24, 2009
I know I said I'd be neglecting the place for a bit… but I still feel bad about that!
Here's what I've been working on. I'm afraid this is sort of the Cliff's Notes version, but at least it looks pretty?
Grab a bucket! It's raining data!
If you're coming to Access 2009 next week, you'll see the…
September 19, 2009
In many of the data-curation talks and discussions I've attended, a distinction has been drawn between Big Science and small science, the latter sometimes being lumped with humanities research. I'm not sure this distinction completely holds up in practice—are the quantitative social sciences Big or…
September 17, 2009
I commented here earlier, not without frustration, about a pair of researchers who built and abandoned a disciplinary repository. I was particularly annoyed that they seemed to have done this purely for self-aggrandizement, apparently feeling no particular attachment to the resulting repository.…
September 16, 2009
The Book of Trogool turns another page...
Social scientists and medical researchers, pay attention to this: "Anonymized" data really isn't—and here's why not. If informaticists aren't starting to run similar analyses on their own "anonymized" data, they should be. This is a serious concern.
One…
September 14, 2009
When I was but grasshopper-knee tall, my father the anthropologist took me to his university's library to help him locate and photocopy articles in his area of study for his files. He had two or three file cabinets full of such copies. (He may still.)
I have similar file cabinets, two of them: my…
September 11, 2009
Many doctoral institutions now accept and archive (or are planning to accept and archive) theses and dissertations electronically. Virginia Tech pioneered this quite some time ago, and it has caught on slowly but steadily for reasons of cost, convenience, access, and necessity.
Necessity? Afraid so…
September 11, 2009
I wanted to call attention to this event at Harvard, which will be webcast live next Friday at 12:15 Central.
The difficulties in combining data and information from distributed sources, the multi-disciplinary nature of research and collaboration, and the need to move to present researchers with…
September 10, 2009
Many of my readers will already have seen the Nature special issue on data, data curation, and data sharing. If you haven't, go now and read; it's impossible to overestimate the importance of this issue turning up in such a widely-read venue.
I read the opening of "Data sharing: Empty archives"…