Open Access & Open Data: Projects that librarians should know about (and share with others!)

Last week I had the opportunity to attend a presentation by Heather Joseph – a representative of SPARC (Scholarly Publishing and Academic Resources Coalition) – to hear about some of the great open access journal publishing initiatives taking place. A variety of publishing platforms have emerged recently, each offering its own way of promoting open access and supporting research sharing. I thought I would share with you some of the initiatives that Heather highlighted in her talk.

To extend the discussion into the realm of open access data, I also want to discuss a few of the data sharing initiatives I have found while working on my current projects. I believe that these data sharing resources represent an ideal future for research and data publication; they offer platforms where investigators can share data, collaborate with other researchers to modify data, and even use software to transform their datasets into educational materials. To access each resource, click on the images to link to their respective webpages.

Open Access Publishers

Public Library of Science (PLOS)


This is the most obvious entry on the list, but I feel like I would have heard about it from colleagues if I didn’t include it. PLOS provides multiple platforms for scientific journals that are completely open access. They are strong advocates of sharing research and have 9 core principles that promote sharing, community engagement and scientific excellence. PLOS hosts many excellent journals such as PLOS ONE, which publishes across the full range of life and health sciences; community journals (PLOS Genetics, PLOS Computational Biology, PLOS Pathogens, and PLOS Neglected Tropical Diseases); and PLOS Medicine and PLOS Biology. PLOS Blogs and Currents also make for some excellent reading, focused mainly on the issues of research sharing and open access. I read both on a regular basis, as they cover many publication issues that librarians need to be aware of.


eLife: the funder-researcher collaboration and forthcoming journal for the best in life science and biomedicine

eLife is one of the new actors in the realm of open access publishing, and prides itself on being:

a researcher-led digital publication for outstanding work, a platform to maximise the reach and influence of new findings and a showcase for new approaches for the presentation and assessment of research.

Working with the Howard Hughes Medical Institute, the Max Planck Society, and the Wellcome Trust among 200 others, eLife is focusing its attention on early-career researchers. Their goal is to make a researcher’s first foray into publishing a constructive exercise by providing a fair, transparent, and supportive author experience. eLife is also interested in promoting data sharing, but I don’t think this has been fully realized yet. I look forward to seeing what will come out of eLife as it continues to grow.



PeerJ offers a different model from eLife and PLOS in that it costs money to sign up, but for a small sum a researcher can be set up with a publication platform for life. $99 allows a researcher to publish one article per year for life; $199 allows a researcher to publish twice a year for life; and $299 allows a researcher to publish as many articles as they want per year. There is still a rigorous peer review process, and paying the fee does not guarantee that a paper will be accepted. It is also important to note that all authors of an article must be members of PeerJ to submit. PeerJ has a set list of criteria that submissions need to meet and provides an extensive list of editors from various disciplines who review them. Furthermore, every PeerJ member is required to review at least one paper each year or participate in post-publication peer review.

A news article in Nature comments on PeerJ as one of the cheapest options for this type of publishing. I highly encourage everyone to read the news article as it provides some insight into the emerging nature of open access publishing platforms. PeerJ seems like a good idea, but we’ll have to see if it will generate enough of a following to remain sustainable over time.

Open Humanities Alliance


For my humanities friends out there, I had to include the Open Humanities Alliance in this list. The Alliance is a community-building project of the Open Humanities Press. It aims to overcome some of the common technical barriers to open access in the humanities by linking students and faculty with resources such as open source software, hosting and archiving. The Open Humanities Alliance is a way for like-minded people from inside or outside the academy to work together in opening humanities scholarship to the world.

The one Alliance-sponsored project that I want to talk about is the Open Access Journal Incubator ibiblio. This project is designed to give researchers a place to access a wide variety of research (music, art, literature, politics, etc.) as well as share their own. Contributors to ibiblio have to meet a set of criteria before they can share their research, but the requirements are clear and easy to follow. I had a lot of fun rooting around the site looking at the 900+ collections.

Data Sharing Projects

As a result of the discussions of research data sharing within the scientific community, projects such as HUBzero, Cytobank, and WebPax have emerged to address the subject through online communities that encourage the sharing of research data, foster research collaboration, and promote collective data analysis. I say a little about each one below.



Cytobank is a data sharing repository designed to manage, share, and analyze flow cytometry data from any researcher. Cytobank prides itself on being a platform for researchers, collaborators, lab and core facility managers, developers and statisticians, educators and trainers, and vendors.

What is great about Cytobank is that it allows researchers to manage their own data and host it on a cloud server; share experiment data and details quickly and easily through the web with other Cytobank users; foster interactive discussions around particular experiments; and turn their cytometry data into educational materials. I believe that we will be seeing more repositories like Cytobank as data sharing becomes more common among researchers. This type of repository shows the potential benefits of data sharing by providing researchers with a place where they can store and manage their research as well as collaborate with others to achieve new scientific discoveries.


HUBzero: Platform for Scientific Collaboration

HUBzero is an open source software platform for building powerful Web sites that support scientific discovery, learning, and collaboration. The scientific community has started to refer to web sites like this as “collaboratories” supporting “team science.” HUBzero differs from Cytobank in that it provides a content management system that is built to support scientific activities. Using this system, researchers can work together on projects, publish datasets and computational tools with Digital Object Identifiers (DOIs), and make these publications available for others to use as live, interactive digital resources. HUBzero’s datasets and tools run on cloud computing resources, campus clusters, and other national high-performance computing (HPC) facilities. You can take a look at some existing hubs here.

These hubs represent new and exciting innovations in data sharing. The sites are dynamic, with options to build animations with data; download data; take courses to understand various datasets; view publications associated with the data; watch online presentations about the data; and even create online simulations based on the data.

WebPax: Share Your Medical Images

WebPax is exciting because it focuses primarily on sharing medical imagery. Researchers can host and manage their medical images on the site and share them with colleagues for further analysis. Researchers create an account and have full control over who can view their images. They can then share their images with a select group of people or post them where all members can see them. In case you were wondering about privacy, all images are anonymized and encrypted using Secure Sockets Layer (SSL) encryption technologies to make sure that third parties are unable to access this sensitive information. Because so many physicians come into the library wanting to see images on a particular topic, I think WebPax would be an excellent resource to point them to. Not only will it give them another option for viewing images, but it might even encourage them to share some of their own.

A Data Management and Data Sharing Bibliography for Librarians

It has been a while since I last posted. December was a pretty crazy month and I’ve been working on some excellent projects (more to come on the blog in a few weeks). In the meantime, a colleague of mine – the talented @fsayre – and I have been working hard to compile all of the literature on data management that we thought would be useful for librarians. Since we are both medical librarians, there are quite a few articles that are health-focused, but the majority should be useful for any librarian. 

The two of us are hoping to start a Mendeley group where more librarians can join and share their experiences and ideas about working with data management. We would love to have the input of more librarians, so please let us know via this blog or on Twitter if you would be interested in joining our Mendeley group.

As for this bibliography, while we’ve tried to make it as comprehensive as possible, we encourage readers to suggest additional material in case we’ve missed some resources. Also, if you’re interested in looking at some other resources, check out my posts on the Data Curation Lifecycle and data management resources for librarians. Happy reading!

**Update** The Mendeley Group is now up and running and you can request to join it here: We encourage all of those who are interested to sign up, and you are not required to contribute if you do not want to. Otherwise, we hope that librarians will share resources as well as their experiences working with data.

1. Hames I. Report on the International Workshop on Contributorship and Scholarly Attribution. 2012 May. p. 1–29.

2. Allard S. DataONE: Facilitating eScience through Collaboration. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):4–17. Available from:

3. Auckland M. Re-skilling for Research. RLUK Research Libraries UK. 2012. Available from:

4. Baker M. Gene data to hit milestone. Nature [Internet]. 2012 Jul 19 [cited 2012 Nov 1];487(7407):282–3. Available from:

5. Bloom T. Dealing with data. PLOS Biologue [Internet]. 2012 [cited 2012 Nov 9]; Available from:

6. National Science Board. Digital Research Data Sharing and Management. National Science Foundation. Arlington, VA; 2011. Available from:

7. Borgman CL. Research Data: Who will share what, with whom, when, and why? China-North American Library Conference. Beijing; 2010. p. 21. Available from:

8. Bailey CW Jr. Research Data Curation Bibliography [Internet]. Houston: Charles W. Bailey, Jr.; 2012 [cited 2012 Nov 9]. Available from:

9. Christensen-Dalsgaard B. Ten recommendations for libraries to get started with research data management. Wirtschaftsforschung, Berlin; 2012 p. 3. Available from:

10. Creamer A. Creating an Online Research Data Management Course: A Conversation with Data Librarians Robin Rice and Stuart Macdonald. Worcester, MA; 2011. Available from:

11. Creamer A, Morales M, Crespo J, Kafel D, Martin E. An Assessment of Needed Competencies to Promote the Data Curation and Management Librarianship of Health Sciences and Science and Technology Librarians in New England. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):18–26. Available from:

12. Creamer A, Morales M, Crespo J, Kafel D, Martin E. Data Curation and Management Competencies of New England Region Health Sciences and Science and Technology Librarians [Internet]. University of Massachusetts and New England Area Librarian e-Science Symposium 2011. Available from:

13. Crosas M. The Dataverse Network. The Institute of Quantitative Social Science 2012. Available from:

14. D’Ignazio J, Qin J, Kitlas J. Using internship experience to evaluate a new program in eScience librarianship. Proceedings of the 2012 iConference on – iConference ’12 [Internet]. New York, New York, USA: ACM Press; 2012;601–2. Available from:

15. Dukes P. Maximising value of population health sciences data The role for Data Management Plans MRC data strategy. 2012;(November). Available from:

16. Van den Eynden V, Corti L, Bishop L, Horton L. Managing and Sharing Data: Best Practices for Researchers. UK Data Archive; 2011. Available from:

17. Ferguson J. Description and Annotation of Biomedical Data Sets. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):51–6. Available from:

18. Godlee F. Clinical trial data for all drugs in current use. BMJ [Internet]. 2012 Oct 29 [cited 2012 Nov 2];345(oct29 2):e7304–e7304. Available from:

19. Gore SA. e-Science and data management resources on the Web. Medical Reference Services Quarterly [Internet]. 2011 Jan;30(2):167–77. Available from:

20. Hackett Y. A National Research Data Management Strategy for Canada: The Work of the National Data Archive Consultation Working Group. 2001. Available from:

21. Heidorn PB. The Emerging Role of Libraries in Data Curation and E-science. Journal of Library Administration [Internet]. Routledge; 2011 Oct [cited 2012 Nov 9];51(7-8):662–72. Available from:

22. Hey A, Tansley S, Tolle K. The fourth paradigm: data-intensive scientific discovery [Internet]. Microsoft Research; 2009 [cited 2012 Nov 9]. Available from:

23. Hswe P, Holt A. Guide for Research Libraries: The NSF Data Sharing Policy [Internet]. Association of Research Libraries. 2011 [cited 2012 Oct 11]. Available from:

24. Inouye D, Scheiner S. Some Simple Guidelines for Effective Data Management. Bulletin of the Ecological Society of America. 2009;2:1–10. Available from:

25. Interview with Svetla Baykoucheva and James Mullins: What Do Libraries Have to Do with e-Science? ACS Division of Chemical Information (CINF). 2011;1–2. Available from:

26. Jahnke L, Asher A, Keralis SDC. The Problem of Data. Washington, DC: Council on Library and Information Resources; 2012. Available from:

27. Johnston L, Lafferty M, Petsan B. Training Researchers on Data Management: A Scalable, Cross-Disciplinary Approach. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 8];1(2). Available from:

28. Kafel D, Morales M, Vander Hart R, Gore S, Creamer A, Crespo J, et al. Building an e-Science Portal for Librarians: A Model of Collaboration. Journal of eScience Librarianship [Internet]. 2012;1(1):41–5. Available from:

29. LeFurgy B. Data-Intensive Librarians for Data-Intensive Research [Internet]. The Signal: Digital Preservation. 2012 [cited 2012 Nov 9]. Available from:

30. Lamar Soutter Library, University of Massachusetts Medical School and the George C. Gordon Library, Worcester Polytechnic Institute. Frameworks for a Data Management Curriculum [Internet]. Worcester; 2011 p. 1–67. Available from:

31. Lesk M. Data curation: just in time, or just in case? International Association of Scientific and Technological University Libraries, 31st Annual Conference. West Lafayette, IN; 2010. Available from:

32. Mayernik MS. The Data Conservancy Instance: Infrastructure and Organizational Services for Research Data Curation [Internet]. D-Lib Magazine. 2012. Available from:

33. University of Minnesota. Data Management 101 – Planning Checklist.

34. Most WC. Keeping Research Data Safe: Cost issues in digital preservation of research data. 2:5–6. Available from:

35. NISO. Linked Data for Libraries, Archives and Museums. Information Standards Quarterly. 2012;24(2/3). Available from:

36. Pathak J, Wang J, Kashyap S, Basford M, Li R, Masys DR, et al. Mapping clinical phenotype data elements to standardized metadata repositories and controlled terminologies: the eMERGE Network experience. Journal of the American Medical Informatics Association : JAMIA [Internet]. [cited 2012 Oct 29];18(4):376–86. Available from:

37. Piorun M, Kafel D, Leger-Hornby T, Najafi S, Martin E, Colombo P, et al. Teaching Research Data Management: An Undergraduate/Graduate Curriculum. Journal of eScience Librarianship [Internet]. 2012;1(1):46–50. Available from:

38. Piwowar HA, Vision TJ, Whitlock MC. Data archiving is a good investment. Nature [Internet]. 2011 May 19 [cited 2012 Nov 9];473(7347):285. Available from:

39. Piwowar HA, Day RS, Fridsma DB. Sharing Detailed Research Data Is Associated with Increased Citation Rate. Ioannidis J, editor. PLoS ONE [Internet]. 2007 Mar 21 [cited 2012 Oct 25];2(3):e308. Available from:

40. Pryor G. Managing Research Data [Internet]. Facet Publishing; 2012 [cited 2012 Nov 9]. p. 224. Available from:

41. Rajaraman A, Ullman JD. Mining of Massive Datasets. Cambridge: Cambridge University Press; 2011; Available from:

42. Reznik-Zellen R, Adamick J, McGinty S. Tiers of Research Data Support Services. Journal of eScience Librarianship [Internet]. 2012 [cited 2012 Nov 10];1(1):27–35. Available from:

43. Rosenthal DSH, Vargas DL. LOCKSS Boxes in the Cloud. 2012. Available from:

44. Rosenthal D, Rosenthal D, Miller E. The Economics of Long-Term Digital Storage. [Internet]. [cited 2012 Dec 2];1–8. Available from:

45. Salo D. Retooling Libraries for the Data Challenge [Internet]. Web Magazine for Information Professionals. 2010 [cited 2012 Nov 9]. Available from:

46. National Information Standards Organization (NISO). Understanding Metadata. Bethesda, MD: NISO Press; 2004. Available from:

47. The Royal Society. Science as an open enterprise. London: The Royal Society; 2012. Available from:

48. Starr J, Willett P, Federer L, Horning C, Bergstrom M. A Collaborative Framework for Data Management Services: The Experience of the University of California. Journal of eScience Librarianship [Internet]. 2012 Oct 3 [cited 2012 Nov 10];1(2):109–14. Available from:

49. Strasser C, Cook R, Michener W, Budden A. Primer on Data Management: What you always wanted to know [Internet]. 2012. p. 1–11. Available from:

50. Tenopir C, Birch B, Allard S. Academic Libraries and Research Data Services: Current Practices and Plans for the Future [Internet]. 2012. Available from:

51. Thibodeau K. Certificate of Advanced Study in Digital Preservation. Proceedings of the 1st International Digital Preservation Interoperability Framework Symposium on – INTL-DPIF ’10 [Internet]. New York, New York, USA: ACM Press; 2010;1–9. Available from:

52. Trinidad SB, Fullerton SM, Bares JM, Jarvik GP, Larson EB, Burke W. Genomic research and wide data sharing: views of prospective participants. Genetics in medicine : official journal of the American College of Medical Genetics [Internet]. 2010 Aug [cited 2012 Oct 29];12(8):486–95. Available from: