Open Access & Open Data: Projects that librarians should know about (and share with others!)

Last week I had the opportunity to attend a presentation by Heather Joseph, a representative of SPARC (Scholarly Publishing and Academic Resources Coalition), to hear about some of the great open access journal publishing initiatives taking place. A variety of publishing platforms have emerged lately, each offering its own way of promoting open access and supporting research sharing. I thought I would share with you some of the initiatives that Heather highlighted in her talk.

To extend the discussion into the realm of open access data, I also want to discuss a few of the data sharing initiatives I have found while working on my current projects. I believe that these data sharing resources represent an ideal future for research and data publication; they offer platforms where investigators can share data, collaborate on and modify data with other researchers, and even use software to transform their datasets into educational materials. To access each resource, click on the images to link to their respective webpages.

Open Access Publishers

Public Library of Science (PLOS)


PLOS is the most obvious entry on the list, but I feel like I would have heard about it from colleagues if I hadn’t included it. PLOS is an initiative that provides multiple platforms for completely open access scientific journals. They are strong advocates of sharing research and have nine core principles that promote sharing, community engagement and scientific excellence. PLOS hosts many excellent journals, such as PLOS ONE, which publishes across the full range of life and health sciences; community journals (PLOS Genetics, PLOS Computational Biology, PLOS Pathogens, and PLOS Neglected Tropical Diseases); and PLOS Medicine and PLOS Biology. PLOS Blogs and Currents also make for some excellent reading, focused mainly on the issues of research sharing and open access. I read them on a regular basis, as they cover many publication issues that librarians need to be aware of.

eLife

eLife   the funder researcher collaboration and forthcoming journal for the best in life science and biomedicine

eLife is one of the new actors in the realm of open access publishing, and prides itself on being:

a researcher-led digital publication for outstanding work, a platform to maximise the reach and influence of new findings and a showcase for new approaches for the presentation and assessment of research.

Working with the Howard Hughes Medical Institute, the Max Planck Society, and the Wellcome Trust, among 200 others, eLife is focusing its attention on early-career researchers. Its goal is to make researchers’ first foray into publishing a constructive exercise by providing a fair, transparent, and supportive author experience. eLife is also interested in promoting data sharing, but I don’t think this has been fully realized yet. I look forward to seeing what will come out of eLife as it continues to grow.

PeerJ


PeerJ offers a different model from eLife and PLOS in that it costs money to sign up, but for a small sum a researcher can be set up with a publication platform for life. $99 allows a researcher to publish one article per year for life; $199 allows a researcher to publish twice a year for life; and $299 allows the researcher to publish as many articles as they want per year. There is still a rigorous peer review process, and paying this amount does not guarantee that a paper will be accepted. It is also important to note that all authors of an article must be members of PeerJ to submit. PeerJ has a set list of criteria that need to be met and provides an extensive list of editors from various disciplines who review submissions. Furthermore, every PeerJ member is required to review at least one paper each year or participate in post-publication peer review.

A news article in Nature comments on PeerJ as one of the cheapest options for this type of publishing. I highly encourage everyone to read the news article as it provides some insight into the emerging nature of open access publishing platforms. PeerJ seems like a good idea, but we’ll have to see if it will generate enough of a following to remain sustainable over time.

Open Humanities Alliance


For my humanities friends out there, I had to include the Open Humanities Alliance in this list. The Alliance is a community-building project of the Open Humanities Press. It aims to overcome some of the common technical barriers to open access in the humanities by linking students and faculty with resources such as open source software, hosting and archiving. The Open Humanities Alliance is a way for like-minded people from inside or outside the academy to work together in opening humanities scholarship to the world.

The one Alliance-sponsored project that I want to talk about is the Open Access Journal Incubator ibiblio. This project is designed to provide researchers with a place to access a wide variety of research (music, art, literature, politics, etc.) as well as share their own. Contributors to ibiblio have to meet its set of criteria before they can share their research, but the requirements are clear and easy to follow. I had a lot of fun rooting around the site looking at the 900+ collections.

Data Sharing Projects

As a result of the discussions of research data sharing within the scientific community, projects such as HUBzero, Cytobank, and WebPAX have emerged to broach the subject through online communities that encourage the sharing of research data, foster research collaboration, and promote collective data analysis. I discuss a little bit about each one below.

Cytobank


Cytobank is a data sharing repository designed to manage, share, and analyze flow cytometry data from any researcher. Cytobank prides itself on being a platform for researchers, collaborators, lab and core facility managers, developers and statisticians, educators and trainers, and vendors.

What is great about Cytobank is that it allows researchers to manage their own data and host it on a cloud server; share experiment data and details quickly and easily through the web with other Cytobank users; foster interactive discussions around particular experiments; and turn their cytometry data into educational materials. I believe that we will be seeing more repositories like Cytobank as data sharing becomes more common among researchers. This type of repository shows the potential benefits of data sharing by providing researchers with a place where they can store and manage their research as well as collaborate with others to achieve new scientific discoveries.

HUBzero

HUBzero   Platform for Scientific Collaboration

HUBzero is an open source software platform for building powerful Web sites that support scientific discovery, learning, and collaboration. The scientific community has started to refer to web sites like this as “collaboratories” supporting “team science.” HUBzero differs from Cytobank in that it provides a content management system that is built to support scientific activities. Using this system, researchers can work together on projects, publish datasets and computational tools with Digital Object Identifiers (DOIs), and make these publications available for others to use as live, interactive digital resources. HUBzero’s datasets and tools run on cloud computing resources, campus clusters, and other national high-performance computing (HPC) facilities. You can take a look at some existing hubs here.

These hubs represent new and exciting innovations in data sharing. The sites are dynamic, with options to build animations with data; download data; take courses to understand various datasets; view publications associated with the data; watch online presentations about the data; and even create online simulations based on the data.

WebPAX

WebPAX.com   Share Your Medical Images

WebPAX is exciting because it focuses primarily on sharing medical imagery. Researchers can host and manage their medical images on the site and share them with colleagues for further analysis. Researchers create an account and have full control over who can view their images. They can then share their images with a select group of people or post them where all members can see them. In case you were wondering about privacy, all images are anonymized and encrypted using Secure Sockets Layer (SSL) encryption technologies to make sure that third parties are unable to access this sensitive information. Because so many physicians come into the library wanting to see images on a particular topic, I think WebPAX would be an excellent resource to point them to. Not only will it give them another option for viewing images, but it might even encourage them to share some of their own.

A Data Management and Data Sharing Bibliography for Librarians

It has been a while since I last posted. December was a pretty crazy month and I’ve been working on some excellent projects (more to come on the blog in a few weeks). In the meantime, a colleague of mine – the talented @fsayre – and I have been working hard to compile all of the literature on data management that we thought would be useful for librarians. Since we are both medical librarians, quite a few of the articles are health-focused, but the majority should be useful for any librarian.

The two of us are hoping to start a Mendeley group where more librarians can join and share their experiences and ideas about working with data management. We would love to have the input of more librarians, so please let us know via this blog or on Twitter if you would be interested in joining our Mendeley group.

As for this bibliography, while we’ve tried to make it as comprehensive as possible, we encourage readers to add material in case we’ve missed some resources. Also, if you’re interested in looking at some other resources, check out my posts on the Data Curation Lifecycle and data management resources for librarians. Happy reading!

**Update** The Mendeley Group is now up and running and you can request to join it here: http://www.mendeley.com/groups/2956801/data-management-for-librarians/. We encourage all of those who are interested to sign up, and you are not required to contribute if you do not want to. Otherwise, we hope that librarians will share resources as well as their experiences working with data.

1. Hames I. Report on the International Workshop on Contributorship and Scholarly Attribution. 2012 May. p. 1–29.

2. Allard S. DataONE: Facilitating eScience through Collaboration. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):4–17. Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss1/3/

3. Auckland M. Re-skilling for Research. RLUK Research Libraries UK. 2012. Available from: http://www.rluk.ac.uk/files/RLUK%20Re-skilling.pdf

4. Baker M. Gene data to hit milestone. Nature [Internet]. 2012 Jul 19 [cited 2012 Nov 1];487(7407):282–3. Available from: http://www.nature.com/news/gene-data-to-hit-milestone-1.11019

5. Bloom T. Dealing with data. PLOS Biologue [Internet]. 2012 [cited 2012 Nov 9]; Available from: http://blogs.plos.org/biologue/2012/07/13/dealing-with-data/

6. National Science Board. Digital Research Data Sharing and Management. National Science Foundation. Arlington, VA; 2011. Available from: http://www.nsf.gov/nsb/publications/2011/nsb1124.pdf

7. Borgman CL. Research Data: Who will share what, with whom, when, and why? China-North American Library Conference. Beijing; 2010. p. 21. Available from: http://works.bepress.com/borgman/238/

8. Bailey CW Jr. Research Data Curation Bibliography [Internet]. Houston: Charles W. Bailey, Jr.; 2012 [cited 2012 Nov 9]. Available from: http://digital-scholarship.org/rdcb/rdcb.htm

9. Christensen-Dalsgaard B. Ten recommendations for libraries to get started with research data management. Wirtschaftsforschung, Berlin; 2012 p. 3. Available from: http://www.libereurope.eu/sites/default/files/The%20research%20data%20group%202012%20v7%20final.pdf

10. Creamer A. Creating an Online Research Data Management Course: A Conversation with Data Librarians Robin Rice and Stuart Macdonald. Worcester, MA; 2011. Available from: http://esciencecommunity.umassmed.edu/2012/10/09/creating-an-online-research-data-management-course-a-conversation-with-data-librarians-robin-rice-and-stuart-macdonald/

11. Creamer A, Morales M, Crespo J, Kafel D, Martin E. An Assessment of Needed Competencies to Promote the Data Curation and Management Librarianship of Health Sciences and Science and Technology Librarians in New England. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):18–26. Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss1/4/

12. Creamer A, Morales M, Crespo J, Kafel D, Martin E. Data Curation and Management Competencies of New England Region Health Sciences and Science and Technology Librarians [Internet]. University of Massachusetts and New England Area Librarian e-Science Symposium 2011. Available from: http://escholarship.umassmed.edu/escience_symposium/2011/posters/8

13. Crosas M. The Dataverse Network. The Institute of Quantitative Social Science 2012. Available from: http://thedata.org/

14. D’Ignazio J, Qin J, Kitlas J. Using internship experience to evaluate a new program in eScience librarianship. Proceedings of the 2012 iConference (iConference ’12) [Internet]. New York, New York, USA: ACM Press; 2012;601–2. Available from: http://dl.acm.org/citation.cfm?doid=2132176.2132304

15. Dukes P. Maximising value of population health sciences data The role for Data Management Plans MRC data strategy. 2012;(November). Available from: http://blogs.lshtm.ac.uk/rdmss/files/2012/11/4-Dukes-MRC1.pdf

16. Van den Eynden V, Corti L, Bishop L, Horton L. Managing and Sharing Data: Best Practices for Researchers. UK Data Archive; 2011. Available from: http://data-archive.ac.uk/media/2894/managingsharing.pdf

17. Ferguson J. Description and Annotation of Biomedical Data Sets. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):51–6. Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss1/9/

18. Godlee F. Clinical trial data for all drugs in current use. BMJ [Internet]. 2012 Oct 29 [cited 2012 Nov 2];345:e7304. Available from: http://www.bmj.com/content/345/bmj.e7304

19. Gore SA. e-Science and data management resources on the Web. Medical Reference Services Quarterly [Internet]. 2011 Jan;30(2):167–77. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21534116

20. Hackett Y. A National Research Data Management Strategy for Canada: The Work of the National Data Archive Consultation Working Group. 2001. Available from: http://www.interpares.org/display_file.cfm?doc=ip1_dissemination_janr_hackett_iassist_quarterly_25_2001.pdf

21. Heidorn PB. The Emerging Role of Libraries in Data Curation and E-science. Journal of Library Administration [Internet]. Routledge; 2011 Oct [cited 2012 Nov 9];51(7-8):662–72. Available from: http://dx.doi.org/10.1080/01930826.2011.601269

22. Hey A, Tansley S, Tolle K. The fourth paradigm: data-intensive scientific discovery [Internet]. Microsoft Research; 2009 [cited 2012 Nov 9]. Available from: http://iw.fh-potsdam.de/fileadmin/FB5/Dokumente/forschung/tagungen/i-science/TonyHey_-__eScience_Potsdam__Mar2010____complete_.pdf

23. Hswe P, Holt A. Guide for Research Libraries: The NSF Data Sharing Policy [Internet]. Association of Research Libraries. 2011 [cited 2012 Oct 11]. Available from: http://www.arl.org/rtl/eresearch/escien/nsf/index.shtml

24. Inouye D, Scheiner S. Some Simple Guidelines for Effective Data Management. Bulletin of the Ecological Society of America. 2009;2:1–10. Available from: http://www.nceas.ucsb.edu/files/computing/EffectiveDataMgmt.pdf

25. Interview with Svetla Baykoucheva and James Mullins: What Do Libraries Have to Do with e-Science? ACS Division of Chemical Information (CINF). 2011;1–2. Available from: http://drum.lib.umd.edu/bitstream/1903/11843/1/Baykoucheva_Mullins_eScience.pdf

26. Jahnke L, Asher A, Keralis SDC. The Problem of Data. Washington, DC: Council on Library and Information Resources; 2012. Available from: http://www.clir.org/pubs/reports/pub154/pub154.pdf

27. Johnston L, Lafferty M, Petsan B. Training Researchers on Data Management: A Scalable, Cross-Disciplinary Approach. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 8];1(2). Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss2/2/

28. Kafel D, Morales M, Vander Hart R, Gore S, Creamer A, Crespo J, et al. Building an e-Science Portal for Librarians: A Model of Collaboration. Journal of eScience Librarianship [Internet]. 2012;1(1):41–5. Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss1/7/

29. LeFurgy B. Data-Intensive Librarians for Data-Intensive Research [Internet]. The Signal: Digital Preservation. 2012 [cited 2012 Nov 9]. Available from: http://blogs.loc.gov/digitalpreservation/2012/07/data-intensive-librarians-for-data-intensive-research/

30. Lamar Soutter Library, University of Massachusetts Medical School and the George C. Gordon Library, Worcester Polytechnic Institute. Frameworks for a Data Management Curriculum [Internet]. Worcester; 2011 p. 1–67. Available from: http://library.umassmed.edu/data_management_frameworks.pdf

31. Lesk M. Data curation: just in time, or just in case? International Association of Scientific and Technological University Libraries, 31st Annual Conference. West Lafayette, IN; 2010. Available from: http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1021&context=iatul2010

32. Mayernik MS. The Data Conservancy Instance: Infrastructure and Organizational Services for Research Data Curation [Internet]. D-Lib Magazine. 2012. Available from: http://www.dlib.org/dlib/september12/mayernik/09mayernik.html

33. University of Minnesota. Data Management 101 – Planning Checklist.

34. Most WC. Keeping Research Data Safe: Cost issues in digital preservation of research data. 2:5–6. Available from: http://www.beagrie.com/KRDS_Factsheet_0910.pdf

35. NISO. Linked Data for Libraries, Archives and Museums. Information Standards Quarterly. 2012;24(2/3). Available from: http://www.niso.org/apps/group_public/download.php/9422/isqv24no2-3.pdf

36. Pathak J, Wang J, Kashyap S, Basford M, Li R, Masys DR, et al. Mapping clinical phenotype data elements to standardized metadata repositories and controlled terminologies: the eMERGE Network experience. Journal of the American Medical Informatics Association: JAMIA [Internet]. [cited 2012 Oct 29];18(4):376–86. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3128396&tool=pmcentrez&rendertype=abstract

37. Piorun M, Kafel D, Leger-Hornby T, Najafi S, Martin E, Colombo P, et al. Teaching Research Data Management: An Undergraduate/Graduate Curriculum. Journal of eScience Librarianship [Internet]. 2012;1(1):46–50. Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss1/8/

38. Piwowar HA, Vision TJ, Whitlock MC. Data archiving is a good investment. Nature [Internet]. Nature Publishing Group; 2011 May 19 [cited 2012 Nov 9];473(7347):285. Available from: http://dx.doi.org/10.1038/473285a

39. Piwowar HA, Day RS, Fridsma DB. Sharing Detailed Research Data Is Associated with Increased Citation Rate. Ioannidis J, editor. PLoS ONE [Internet]. 2007 Mar 21 [cited 2012 Oct 25];2(3):e308. Available from: http://dx.plos.org/10.1371/journal.pone.0000308

40. Pryor G. Managing Research Data [Internet]. Facet Publishing; 2012 [cited 2012 Nov 9]. p. 224. Available from: http://www.amazon.com/Managing-Research-Data-Graham-Pryor/dp/1856047563

41. Rajaraman A, Ullman JD. Mining of Massive Datasets. Cambridge: Cambridge University Press; 2011; Available from: http://ebooks.cambridge.org/ref/id/CBO9781139058452

42. Reznik-Zellen R, Adamick J, McGinty S. Tiers of Research Data Support Services. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):27–35. Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss1/5/

43. Rosenthal DSH, Vargas DL. LOCKSS Boxes in the Cloud. 2012. Available from: http://www.lockss.org/locksswp/wp-content/uploads/2012/09/LC-final-2012.pdf

44. Rosenthal D, Rosenthal D, Miller E. The Economics of Long-Term Digital Storage. fsl.cs.sunysb.edu [Internet]. [cited 2012 Dec 2];1–8. Available from: http://www.fsl.cs.sunysb.edu/docs/unesco12/UNESCO2012-storage-econ.pdf

45. Salo D. Retooling Libraries for the Data Challenge [Internet]. Web Magazine for Information Professionals. 2010 [cited 2012 Nov 9]. Available from: http://www.ariadne.ac.uk/issue64/salo

46. NISO. Understanding Metadata. Bethesda, MD: NISO Press; 2004. Available from: http://www.niso.org/publications/press/UnderstandingMetadata.pdf

47. The Royal Society. Science as an open enterprise. London: The Royal Society; 2012. Available from: http://royalsociety.org/uploadedFiles/Royal_Society_Content/policy/projects/sape/2012-06-20-SAOE.pdf

48. Starr J, Willett P, Federer L, Horning C, Bergstrom M. A Collaborative Framework for Data Management Services: The Experience of the University of California. Journal of eScience Librarianship [Internet]. 2012 Oct 3 [cited 2012 Nov 10];1(2):109–14. Available from: http://escholarship.umassmed.edu/jeslib/vol1/iss2/7

49. Strasser C, Cook R, Michener W, Budden A. Primer on Data Management: What you always wanted to know [Internet]. 2012. p. 1–11. Available from: http://www.dataone.org/sites/all/documents/DataONE_BP_Primer_020212.pdf

50. Tenopir C, Birch B, Allard S. Academic Libraries and Research Data Services: Current Practices and Plans for the Future [Internet]. 2012. Available from: http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf

51. Thibodeau K. Certificate of Advanced Study in Digital Preservation. Proceedings of the 1st International Digital Preservation Interoperability Framework Symposium (INTL-DPIF ’10) [Internet]. New York, New York, USA: ACM Press; 2010;1–9. Available from: http://dl.acm.org/citation.cfm?doid=2039263.2039264

52. Trinidad SB, Fullerton SM, Bares JM, Jarvik GP, Larson EB, Burke W. Genomic research and wide data sharing: views of prospective participants. Genetics in Medicine: official journal of the American College of Medical Genetics [Internet]. 2010 Aug [cited 2012 Oct 29];12(8):486–95. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3045967&tool=pmcentrez&rendertype=abstract