Librarians working in data management: How to avoid a data management nightmare

I thought I would quickly share a video that I created in collaboration with two of my excellent colleagues — Karen Hanson and Alisa Surkis. We are developing short data management modules for a clinical department at our institution that will cover everything from selecting the right data collection tool to file naming conventions. This first video was developed to serve as a teaser for the more focused modules.

The first module has already been well received within the department we are working with, and we hope that this will catch on with other departments as we move forward. As always, I’m happy to hear feedback or answer questions about how we developed the module or what we’re using it for in more detail. In the meantime, I hope you enjoy:


Why librarians can’t ignore data anymore

It’s here, and I have to say it came quicker than I expected: the first big stick for researchers, the PLoS data sharing policy. What this policy means for researchers is that if they refuse to share the data accompanying their publication, they can’t publish in PLoS. It also means that if they get published but then hide their data after the fact, they can have their publication retracted. This is an example of a really firm hand in an area where there hasn’t been one before. My first thought when I read this: what an amazing opportunity for librarians! It’s no secret that researchers have mixed feelings about the policy; some are angry and frustrated, while others see the light and understand that this has been a long time coming. What librarians can do is ease the pain a little and reduce the burden on these researchers by providing them with options that make this transition as easy as possible. While PLoS is the only publisher wielding this big stick for data sharing, I expect Nature, Science and others will follow suit before long. Not to mention the various federal policies from the NSF and NIH, and now finally Canada, with the Tri-Council looking to capitalize on Big Data.

So what can you do, even if you aren’t that familiar with data management, or data sharing policies?

Familiarize yourself with data repositories and their policies

As part of this data policy, PLoS strongly recommends that researchers deposit their data in a public repository so that the data can receive a DOI, accession number, or other unique identifier. Helping with this is a simple way to provide researchers with valuable information. There are many options out there, and researchers may know of only a few, if any. Learning what is available to researchers in terms of subject-specific or general data repositories (as well as any fees that may apply) can go a long way towards steering them in the right direction. Here are some options:

A recent blog post by John Kratz and Natsuko Nicholls on DataPub also provides valuable information about finding a suitable repository, and they do a very good job of outlining the differences between Databib and re3data.

Find out what your own institution offers

Our institution spoke with PLoS and found out that they also accept handles as a form of unique identifier. This means that if you have an institutional repository that supports handles, DOIs, or any other type of unique identifier, you may already have a solution for your researchers. Check with those responsible for your institutional repository to see if it can support researchers’ data. Questions to ask include: Does the metadata support datasets? What is the maximum file size you accept? Can you link multiple records in the repository together?

Let researchers know you are aware of the policy, and that the library is there to support them

At our institution, our first instinct was to see how many of our researchers have published in PLoS over the years. We found over 800 in total, but when we narrowed it down to first and last authors, the number was closer to 130. We made an active decision to reach out to these authors to let them know about the policy, and made an effort to find out what our institution has to offer, as well as other options they could pursue. The goal here was to let everyone know that the library was on top of it, and that if they needed support in this effort, we were going to be there.

Additionally, the library decided to send out a broadcast email to the entire institution to let them know about PLoS’s new policy, and that we were on top of it. We wanted to do this quickly to make sure everyone knew that the library was the place to go for these types of questions.
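
For anyone who wants to try the same kind of author count at their own institution, here is a minimal sketch in Python. It is only an illustration, not the process we actually used: the Paper record, its field names, and the sample names are all made up, and it assumes you already have a list of publications with authors in byline order and a way to tell which authors belong to your institution.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """One publication record. The field names are hypothetical."""
    title: str
    authors: list[str]        # author names in byline order
    institutional: set[str]   # authors affiliated with our institution


def all_institutional_authors(papers):
    """Every institutional author appearing anywhere in a byline."""
    found = set()
    for paper in papers:
        found.update(a for a in paper.authors if a in paper.institutional)
    return found


def lead_institutional_authors(papers):
    """Institutional authors in the first or last byline position."""
    found = set()
    for paper in papers:
        if not paper.authors:
            continue  # skip records with no author list
        for name in (paper.authors[0], paper.authors[-1]):
            if name in paper.institutional:
                found.add(name)
    return found


# Made-up sample data, purely for illustration.
papers = [
    Paper("Study A", ["Lee", "Garcia", "Patel"], {"Lee", "Garcia"}),
    Paper("Study B", ["Novak", "Lee", "Okafor"], {"Lee"}),
    Paper("Study C", ["Kim", "Garcia", "Wu"], {"Garcia"}),
    Paper("Study D", ["Adams", "Brown"], {"Brown"}),
]

everyone = all_institutional_authors(papers)
leads = lead_institutional_authors(papers)
print(len(everyone), "institutional authors in total")
print(len(leads), "are first or last authors:", sorted(leads))
```

Narrowing to first and last authors is a judgment call: they are the people most likely to be responsible for the data behind a paper, which makes for a much more manageable outreach list.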

Go out and talk to your researchers

If anything, the one thing that can’t hurt is to reach out to the areas you serve, whether as a subject librarian or a liaison librarian. We’ve just finished an exercise where we met and interviewed 30+ researchers with active grants at our institution (results to be published later this year) to learn more about issues surrounding how they manage, organize, store, preserve, reuse and share data. This exercise was invaluable, as it provided us with multiple scenarios where they could be supported by the library. Even starting the conversation around how they feel about the PLoS data sharing policy is a good idea. More of these policies are going to emerge, so it’s best to start now.

What if I don’t feel comfortable with the content yet?

That’s fine, but you could start by reading the plethora of literature out there on the topics of data management, sharing, storage, preservation, reuse – the list goes on and on. It’s also great to speak with other librarians who have been active in this area – I’m always open for a talk about research data! I’ve included some resources below that are a good start:

For librarians concerned about their role in the library, or looking for new opportunities to branch out and stake a claim in another area of the information profession – this is your chance. Talk to your researchers, learn to provide them with the support they need, and stay active in this area because research data management and sharing is only going to grow, and I know I am one librarian who does not want to be left behind. 

I’d love to hear in the comments about how your library is tackling this issue – if at all. I would also be keen to know the reasons why you won’t be pursuing this issue. Thanks for reading!

PLOS’s open data fever dream

I wanted to bring attention to this post on the fears surrounding PLOS’s new open data policy, from the blog of a neuroscience researcher. It addresses many of the scientific research community’s concerns about sharing data, and also highlights several ways that libraries can contribute. I encourage you to read through the comments section to learn more about additional (and innovative) ways researchers are working towards meeting this requirement. The PLOS policy is only the beginning, as many other requirements will begin to emerge in the near future, including government mandates.

The publisher of the largest scientific journal in the world, PLOS, recently announced that all data relevant to every paper must be accessible in a stable repository, with a DOI and everything. Some discussion of this is going on over at Drugmonkey, and this is a comment that got out of hand, so I posted it here instead.

What is the purpose of this policy? I don’t see how anyone could be fooled into thinking this could somehow help eliminate fraud. Fraud is about intent to deceive, and one can deceive with a selective dataset as easily as (or, actually, much more easily than) with Photoshop.

What else? Well, you could comb through the data of that pesky competitor or some other closely related work, looking for mistakes or things they missed that you could take advantage of. Frankly, I can’t imagine bothering. I mean, how could you not have…
