Last week DataCite – the international DOI registration agency for research data – released a new tool that lets users create metadata through a quick and easy HTML form. What's great about this tool is that it doesn't require any software installation whatsoever, and it supports the most recent version of DataCite's metadata schema: version 3.
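For readers curious what such a record actually looks like, here is a rough sketch, in Python, of the minimal XML a form like this would produce. The DOI, creator, and title values below are invented for illustration, and a real record would be validated against the official schema:

```python
import xml.etree.ElementTree as ET

# Namespace for DataCite Metadata Schema v3; real records are validated
# against the official XSD published by DataCite.
NS = "http://datacite.org/schema/kernel-3"
ET.register_namespace("", NS)

def q(tag):
    """Qualify a tag name with the DataCite v3 namespace."""
    return f"{{{NS}}}{tag}"

def datacite_record(doi, creator, title, publisher, year):
    """Build a minimal record containing the five mandatory properties."""
    root = ET.Element(q("resource"))
    identifier = ET.SubElement(root, q("identifier"), identifierType="DOI")
    identifier.text = doi
    creators = ET.SubElement(root, q("creators"))
    creator_el = ET.SubElement(creators, q("creator"))
    ET.SubElement(creator_el, q("creatorName")).text = creator
    titles = ET.SubElement(root, q("titles"))
    ET.SubElement(titles, q("title")).text = title
    ET.SubElement(root, q("publisher")).text = publisher
    ET.SubElement(root, q("publicationYear")).text = str(year)
    return ET.tostring(root, encoding="unicode")

# Example values are made up for illustration.
record = datacite_record("10.1234/example", "Doe, Jane",
                         "Example Survey Dataset", "Example University", 2013)
```

The five properties shown (identifier, creator, title, publisher, publication year) are the mandatory ones in schema version 3; the schema also defines many optional and recommended properties beyond them.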
Recently I’ve been working on a survey of studies that focus on how libraries are reaching out to their institutions’ faculty and researchers about how they produce, share and store their data. Where I’m currently working, we are trying to implement the same type of research, but wanted to see what other libraries have done before launching into a project. I was optimistic that some of the research I turned up might even give me the answers to our questions:
What types of data are biomedical researchers in a variety of disciplines creating?
Where do they stand in terms of sharing data?
How are they currently storing their data?
While I was pleased to find a number of articles that were excellent and exactly the type of research I was looking for (see the end of the post), I was ultimately disappointed in the content that I found. Let me explain the good first, however, before I start with the bad.
The methodology used in many of the articles I found was comprehensive, highly detailed, and provided me with a wealth of information about how I could go about answering my data-related questions amongst my institution’s researchers and faculty. For example, many of the research studies described and provided (in detail) the interview questions that they used (Bardyn et al.; Westra), focus group strategies (Adamick et al.; Jones et al.), and bibliographic analyses (Williams et al.; Xia et al.) – this was excellent material that I could reuse to structure my own institution’s approach to developing data-related services.
Where everything came apart for me was in several of the authors’ approach to the results section of their research. Very few articles (excluding Lage et al.; Scaramozzino et al.; Walters; Westra; Xia et al.) provided full results from their interviews or focus groups, and quantitative data was scarce. The reason I chose to survey existing research in the first place was to find answers to my questions, and when I turn to research in my field, I expect to read concrete findings that will inform my own research.
For example, if I am reading an article whose methodology states that the authors surveyed their school of medicine’s researchers about their data-related habits, I am hoping to find data on the types and sizes of data their institution creates. This would be especially helpful if my institution serves similar biomedical disciplines, and could ideally spare a number of different libraries across the globe a lot of duplicated work. Why wasn’t all of the data included in the article? Is there an underlying understanding that if I actually want to see full results I need to contact the author(s) directly to get them? This has to change.
The lack of results reporting also concerns me because I have no evidence that these studies were actually completed. Sure, you can say that the research study interviewed X number of people, and that based on their responses you started a data management service. But what does that tell other people in our field about the behaviour and work practices of researchers and faculty? Why omit the most interesting and useful data from the article?
Fortunately, I was able to find some excellent information in a select number of articles: Walters and Westra both gave a full indication of the types and sizes of their data and the departments from which it came. Furthermore, their descriptions of their interviews were comprehensive, and strong quantitative data about the responses was collected and presented in each paper. This is what I have come to expect from strong library-related research. We need to start thinking about presenting our data more clearly, and presenting all of it to our fellow information professionals.
Let it be known that I am not trying to condemn a large portion of library research because it does not provide the comprehensive level of data and results one comes to expect from quality research. Instead, I am hoping to encourage us all (myself included) to be more thorough in our data collection and results reporting, and think about who our research can be useful for. Is the purpose of publishing research just to publish? Or is it to help others advance the profession and implement products and services that have been proven to be effective? We are a profession that prides itself on our encouragement and passion for information sharing; by following this mantra in our research more effectively I believe we have the capacity to produce outstanding research that will be of direct benefit to librarians in their work, and ultimately to the institutions that we serve. Thanks for reading – I’m happy to discuss this further in the comments if anyone is interested.
Adamick, Jessica, MJ Canavan, Steven McGinty, Rebecca Reznik-Zellen, Maxine Schmidt, and Robert Stevens. 2011. Building as We Climb: The Data Working Group at the University of Massachusetts Amherst. University of Massachusetts and New England Area Librarian e-Science Symposium. http://escholarship.umassmed.edu/escience_symposium/2011/posters/3.
Bardyn, Tania P., Taryn Resnick, and Susan K. Camina. 2012. “Translational Researchers’ Perceptions of Data Management Practices and Data Curation Needs: Findings from a Focus Group in an Academic Health Sciences Library.” Journal of Web Librarianship 6 (4) (October): 274–287. http://www.tandfonline.com/doi/abs/10.1080/19322909.2012.730375.
Carlson, Jacob, Michael Fosmire, C.C. Miller, and Megan Sapp Nelson. 2011. “Determining Data Information Literacy Needs: A Study of Students and Research Faculty.” Portal: Libraries and the Academy 11 (2): 629–657.
Delserone, Leslie M. 2008. “At the Watershed: Preparing for Research Data Management and Stewardship at the University of Minnesota Libraries.” In Library Trends, 57:202–210. Urbana-Champaign, Illinois: Johns Hopkins University Press and the Graduate School of Library and Information Science. https://www.ideals.illinois.edu/handle/2142/10670.
Harrison, Andrew, and Sam Searle. 2010. “Not Drowning, Ingesting: Dealing with the Research Data Deluge at an Institutional Level.” In VALA2010 Proceedings. http://www.vala.org.au/vala2010/papers2010/VALA2010_43_Harrison_Final.pdf.
Hruby, Gregory William, James McKiernan, Suzanne Bakken, and Chunhua Weng. 2013. “A Centralized Research Data Repository Enhances Retrospective Outcomes Research Capacity: a Case Report.” Journal of the American Medical Informatics Association: JAMIA (January 15): 1–5. doi:10.1136/amiajnl-2012-001302. http://www.ncbi.nlm.nih.gov/pubmed/23322812.
Johnson, Layne M., John T. Butler, and Lisa R. Johnston. 2012. “Developing E-Science and Research Services and Support at the University of Minnesota Health Sciences Libraries.” Journal of Library Administration 52 (8) (November): 754–769. http://dx.doi.org/10.1080/01930826.2012.751291.
Jones, Sarah, Seamus Ross, and Raivo Ruusalepp. 2009. “Data Audit Framework Methodology”. Glasgow. http://www.data-audit.eu/DAF_Methodology.pdf.
Lage, Kathryn, Barbara Losoff, and Jack Maness. 2011. “Receptivity to Library Involvement in Scientific Data Curation: A Case Study at the University of Colorado Boulder.” Portal: Libraries and the Academy 11 (4): 915–937. http://muse.jhu.edu/journals/portal_libraries_and_the_academy/v011/11.4.lage.html.
Newton, Mark P, C C Miller, and Marianne Stowell Bracke. 2011. “Librarian Roles in Institutional Repository Data Set Collecting: Outcomes of a Research Library Task Force.” Collection Management 36 (1): 53–67.
Peters, Christie, and Anita Riley Dryden. 2011. “Assessing the Academic Library’s Role in Campus-Wide Research Data Management: A First Step at the University of Houston.” Science & Technology Libraries 30 (4) (September): 387–403. http://dx.doi.org/10.1080/0194262X.2011.626340.
Piwowar, Heather A. 2011. “Who Shares? Who Doesn’t? Factors Associated with Openly Archiving Raw Research Data.” PloS One 6 (7) (January): e18657. doi:10.1371/journal.pone.0018657. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3135593&tool=pmcentrez&rendertype=abstract.
Raboin, Regina, Rebecca C. Reznik-Zellen, and Dorothea Salo. 2012. “Forging New Service Paths: Institutional Approaches to Providing Research Data Management Services.” Journal of eScience Librarianship 1 (3). http://escholarship.umassmed.edu/jeslib/vol1/iss3/2/.
Reznik-Zellen, Rebecca, Jessica Adamick, and Stephen McGinty. 2012. “Tiers of Research Data Support Services.” Journal of eScience Librarianship 1 (1): 27–35. doi:10.7191/jeslib.2012.1002. http://escholarship.umassmed.edu/jeslib/vol1/iss1/5/.
Scaramozzino, Jeanine Marie, Marisa L. Ramirez, and Karen J. McGaughey. 2012. “A Study of Faculty Data Curation Behaviors and Attitudes at a Teaching-Centered University.” College & Research Libraries 73 (4) (July 1): 349–365. http://crl.acrl.org/content/73/4/349.abstract.
Soehner, Catherine, Catherine Steeves, and Jennifer Ward. 2010. “E-Science and Data Support Services” (August). http://www.arl.org/storage/documents/publications/escience-report-2010.pdf.
Trinidad, Susan Brown, Stephanie M Fullerton, Julie M Bares, Gail P Jarvik, Eric B Larson, and Wylie Burke. 2010. “Genomic Research and Wide Data Sharing: Views of Prospective Participants.” Genetics in Medicine: Official Journal of the American College of Medical Genetics 12 (8) (August): 486–95. doi:10.1097/GIM.0b013e3181e38f9e.
Walters, Tyler O. 2009. “Data Curation Program Development in U.S. Universities: The Georgia Institute of Technology Example.” International Journal of Digital Curation 4 (3): 83–92. http://www.ijdc.net/index.php/ijdc/article/viewFile/136/153.
Westra, Brian. 2010. “Data Services for the Sciences: A Needs Assessment.” Ariadne (64). http://www.ariadne.ac.uk/issue64/westra.
Williams, Sarah C. 2013. “Using a Bibliographic Study to Identify Faculty Candidates for Data Services.” Science & Technology Libraries (May 9): 1–8. http://dx.doi.org/10.1080/0194262X.2013.774622.
Xia, Jingfeng, and Ying Liu. 2013. “Usage Patterns of Open Genomic Data.” College & Research Libraries 74 (2) (March 1): 195–207. http://crl.acrl.org/content/74/2/195.abstract.
This isn’t something I’ve done before, but one of my colleagues, Diana Almader-Douglas, has spent the last 6+ months updating some excellent resources on culture and health literacy at the National Library of Medicine. Diana is incredibly knowledgeable about these issues, and asked if I would be willing to let her write a short post on my blog. You can read the post in its entirety below; it is full of useful information about this issue – especially for health sciences librarians. I will make one disclaimer: this post is focused on issues in the US, but I think the issues surrounding culture and health literacy presented here are applicable to Canada as well. Enjoy!
Through a National Library of Medicine Associate Fellowship Project, I evaluated and enhanced the National Network of Libraries of Medicine’s (NN/LM) Health Literacy resource by adding content and resources related to culture in the context of health literacy.
By providing information about the relationship between culture and health literacy, this highly utilized resource can reach a wider audience by encouraging librarians and information professionals to disseminate culturally relevant health information.
Through this project, I aimed to raise awareness about vulnerable and special populations while highlighting the connection to health disparities and health literacy.
Culture is just one component of health literacy, but it is a critical element of that complex topic. Culture shapes communication, beliefs, and the comprehension of health information. By enhancing the NN/LM Health Literacy Web page with content about health literacy in a cultural context, users of the page will be better able to meet the health information needs of the vulnerable and diverse population groups they serve.
For more information about culture and health literacy, visit:
Benjamin RM. Improving Health by Improving Health Literacy. Public Health Rep. 2010 Nov-Dec; 125(6):784-785. Available from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2966655/pdf/phr125000784.pdf
United States Department of Health & Human Services. Health Resources and Services Administration (HRSA). Culture, Language and Health Literacy. Available from: http://www.hrsa.gov/culturalcompetence/index.html
United States Department of Health & Human Services. National Library of Medicine, Specialized Information Services, Outreach Activities & Resources. Multi-cultural Resources for Health Information. Available from: http://sis.nlm.nih.gov/outreach/multicultural.html
Thanks for reading. I hope health sciences librarians will find this information useful. Just to add a bit of Canadian content, I have included some Canadian health literacy resources below – many of which could use the cultural focus that Diana has implemented for the NN/LM:
Canadian Public Health Association Health Literacy Portal: http://www.cpha.ca/en/portals/h-l.aspx
Canadian Council on Learning. Health Literacy in Canada: A Healthy Understanding: http://www.ccl-cca.ca/ccl/Reports/HealthLiteracy.html
Health Literacy Council of Canada: http://healthliteracy.ca/
Public Health Agency of Canada: http://www.phac-aspc.gc.ca/cd-mc/hl-ls/index-eng.php
Podcast on Health Literacy and Cultural Competence. Centre for Literacy: http://www.centreforliteracy.qc.ca/news/podcast-health-literacy-and-cultural-competence
I thought I would take this opportunity to weigh in on the deal between Library and Archives Canada and Canadiana, which calls for the transfer and digitization of the largest collection of Canadian archival records in history. I want to make it clear that in the grand scheme of things I think this project is a very good thing for archives in Canada, and is long overdue. What worries me is that the details surrounding this deal are largely unclear, and I think it is important for us as Canadian archivists and librarians to ask specific questions about this deal to ensure that this heritage collection is safe, and will ultimately be freely available to all Canadians who want to view it.
Canadiana has already tried to quell some of the hysteria surrounding the deal with their recently published FAQ, but to be honest, many of my questions are still left unanswered. I even asked Canadiana on Twitter the other day to clarify the issues surrounding the ‘Premium’ payment that would be required if I wanted access to the search and discovery features they will be developing, but I have yet to hear a reply. I think this line from the FAQ deserves a more detailed explanation:
Until the completion of the project, this searchable, full-text data will be one of the premium services.
Does this mean that once the project is completed everyone will have free access to these features? If this is only one of the premium features, what else will we be missing out on if we don’t pay? These are just some of the questions I have about the deal, but more importantly, I think it is crucial that we start asking those involved (CRKN, CARL, LAC, Canadiana) how they plan to manage, describe and preserve this enormous amount of information and make sure that it will be available to Canadians for years to come. A lot of these questions have been discussed in Bibliocracy’s blog posts on the issue, but I would like to reiterate and request that the library and archives community start asking Canadiana and LAC their own questions to hopefully spur on more details about the project. To start it off, I have outlined below the questions that I would like to have answered:
How will this information be stored, and subsequently transferred back to LAC once the full digitization process is complete?
Information architecture is obviously a crucial component of this project, as the collection will need to be stored someplace where it can be accessed by all. But I think it is even more important that we receive an answer about how all of this content will be transferred back to LAC. There are many methods and avenues this project can take in terms of placing the material in a repository or content management system, and I think that both parties owe it to us to explain how this work will be completed. Will Canadiana use something like CLOCKSS to ensure that this material is preserved and made freely available forever? Or will this be the responsibility of LAC once the project is done? I would like to know that the migration of the digital documents back to LAC will be straightforward once this is over. Which brings me to my next question:
What measures will be taken regarding the digital preservation of the finalized, newly described content?
I’m hoping that the responsibility of managing Canada’s largest archival collection will spur Canadiana to take measures to ensure the preservation not only of the physical content, but of the newly digitized content as well. I would like to know where they plan on storing all of this information – will copies be held in a dark archive to ensure long-term preservation? Will they follow the Open Archival Information System (OAIS) reference model? Will they use the Trusted Digital Repository model? It would be nice to see something akin to a Trustworthy Repositories Audit and Certification (TRAC) process, so that Canadian information professionals feel confident that the proper steps are being taken to preserve this digital content.
What type of metadata schemas will be used?
This one is pretty self-explanatory: seeing as this is a Canadian initiative, one would assume that Canada’s Rules for Archival Description (RAD) will be used. And given how prominent linked data has become of late, does Canadiana have plans to use RDF to encourage and support linked data within this collection? Because one of the main goals of this project is to make this content more discoverable and searchable, I think it would be helpful for us to understand how all of this transcription and metadata tagging will take place.
What do you really mean when you say that all of the content will be open access?
When I hear the term open access used to describe information content, I always get excited. If this effort is truly going to make all of this digitized archival material open access, then that is fantastic. However, the details of how open access is being described in the context of this deal have me scratching my head. For a definition of open access, I like to use SPARC’s, which describes it (in a nutshell) as material that has:
immediate, free availability on the public internet, permitting any users to read, download, copy, distribute, print, search or link to the full text of collections, crawl them for indexing, pass them as data to software or use them for any other lawful purpose
There have been a lot of discussions around Canadiana’s statement that they will be making the digital content available for free via a Creative Commons license. What I don’t understand is that in order to access certain features of this content, you will have to pay a premium fee. That doesn’t sound very open access to me, but a simple clarification would help. Which leads me to:
Can you please elaborate on the fees that are involved with premium access, and how this will work with the 10% of digital material released per year for 10 years?
This question has been on my mind since I heard about this deal (as I described above). What I would like to know is how this premium fee will work: What will it cost? What features are involved? Will the premium features become freely available as each 10% of the digitization process is completed?
I understand that creating high-quality descriptive metadata for digitization costs money. I don’t have as much of a problem with that; what worries me is that these details have not been provided to us. By not answering this one glaring question, Canadiana has made me nervous that I, or my institution, will have to pay for content over the long term. How do I know that these charges won’t continue once the project is finished?
What experts are going to be consulted for this project?
I know that CRKN and CARL have both supplied money for this project, but it would be very comforting to know that highly skilled, expert personnel will be working on this project. As a librarian and archivist, I want this effort to succeed at the highest level. In order to feel confident that this will be the case, I think it would be wise to inform the library and archival community in Canada as to who will be advising this effort. I always like specifics, and knowing that the best people are working on this effort will go a long way towards easing my mind.
In the end, all I’m asking for is a little bit of transparency. This project will affect a huge number of information professionals, researchers, and members of the general public. It shows a lot of promise, and should be a cause for excitement amongst the Canadian information community. However, until Canadiana or LAC provide specifics about this deal, I will be holding back my excitement. The lack of explanation and the vagueness surrounding this project should concern everyone. Ultimately, I don’t think an open and transparent explanation of a project that affects so many Canadians is too much to ask for.
I encourage other Canadian archivists and librarians to ask their own questions about this deal through blogs, social media, or email in hopes that it will generate enough demand that Canadiana and LAC will have to respond. I am only a small voice in this, and it would be great to see others get involved. Using #heritagedeal on Twitter could help synthesize all of this information in one place.
Thanks for reading.
I realize I haven’t written a post in over a month, and I feel horribly guilty about it. The one good thing about not having the time to write blog posts frequently is that I now have a stockpile of ideas, and plenty of material to write more frequent posts.
What I would like to address in today’s post are some of the ongoing efforts that journals, government agencies, and open source communities have made to address the need to publish data, in all of its messy and intricate formats. As in my previous posts, I will describe each of the efforts that I find promising in terms of their ability to tackle this massive, complicated task. In case readers are unfamiliar with the concept of a data publication, I define it based on a hybrid of viewpoints from papers by Borgman, Lynch, Reilly et al., Smith, and White:
A data publication takes data that has been used for research and expands on the ‘why, when and how’ of its collection and processing, leaving an account of the analysis and conclusions to a conventional article. A data publication should include metadata describing the data in detail such as who created the data, the description of the type of data, the versioning of the data, and most importantly where the data can be accessed (if it can be accessed at all). The main purpose of a data publication is to provide adequate information about the data so that it can be reused by another researcher in the future, as well as provide a way to attribute data to its respective creator. Knowing who creates data provides an added layer of transparency, as researchers will have to be held accountable for how they collect and present their data. Ideally, a data publication would be linked with its associated journal article to provide more information about the research.
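To make that definition concrete, here is a minimal sketch of the elements such a publication might record; the field names and values are my own illustrative shorthand, not any particular schema:

```python
import json

# Illustrative record covering the elements named in the definition above:
# who created the data, what it is, the 'why, when and how' of collection,
# its version, where it can be accessed, and the linked article.
data_publication = {
    "creator": "Doe, Jane",
    "description": "Survey responses on researchers' data-sharing habits",
    "data_type": "tabular (CSV)",
    "collected": {"why": "needs assessment", "when": "2012-2013",
                  "how": "online questionnaire"},
    "version": "1.0",
    "access_url": "http://repository.example.org/dataset/42",
    "associated_article": "doi:10.1234/example-article",
}

record_json = json.dumps(data_publication, indent=2)
```

However the record is serialized in practice, the point is the same: enough description that another researcher could reuse the data, plus a stable link back to its creator and its article.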
Nature Publishing Group – Scientific Data
Scientific Data is the first of its kind: an open access, online-only publication specifically designed to describe scientific data sets. Because describing scientific data can be complicated and exhausting, this publication does an excellent job of addressing all of the questions that need to be asked of researchers before they even think of submitting their data. Scientific Data just released its criteria for publication today, and the questions they ask are exactly what is needed to ensure that a data publication can be reused thanks to appropriate description.
Then comes the next great component – the metadata. Scientific Data uses a ‘Data Descriptor’ model that requires narrative content about a data set, including the traditional descriptors librarians are familiar with such as Title, Abstract and Methodology. What is excellent about the Data Descriptor model is that it also requires structured content about the data. This structured content uses the ‘Investigation’, ‘Study’ and ‘Assay’ (ISA) open source metadata format to describe aspects of the data in detail. These major categories are designed to be ‘generic and extensible’, and serve to address all scientific data types and technologies. You can check ISA out HERE.
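As a purely illustrative sketch of how those three ISA categories nest, consider the following; the field names here are simplified stand-ins of my own, not the actual ISA serialization:

```python
# Simplified sketch of the Investigation > Study > Assay hierarchy.
# An investigation groups one or more studies; each study groups the
# assays (measurements) performed on its samples.
assay = {"measurement_type": "gene expression", "technology": "microarray"}

study = {"title": "Example study of tissue samples",
         "assays": [assay]}

investigation = {"identifier": "INV-001",
                 "description": "Container for related studies",
                 "studies": [study]}
```

The 'generic and extensible' claim is visible in the shape: the hierarchy stays the same across disciplines, while the fields inside each level can be extended with domain-specific terminology.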
Overall I think that Scientific Data is the beginning of a new trend in publishing where major journals will begin to publish data publications more frequently on top of traditional research articles. This publication is the first step towards making research data available, reusable and transparent within the scientific research community.
F1000Research – Making Data Inclusion a Requirement
F1000Research is an excellent new open science journal that has caught my attention for its foray into systematic reviews and meta-analyses, and for its recent ‘grace period’ encouraging researchers to submit their negative results for publication. I think this is a publication that medical librarians should be aware of, and potentially encourage researchers to submit to should they be looking for a more frugal option. What really impresses me about F1000Research, though, is its commitment to ensuring that data associated with research articles is made readily available.
Currently, F1000Research reviews data that is submitted in conjunction with an article, and then offers to deposit the data on the author’s behalf in an appropriate data repository. The journal is open to placing data in any repository, but it works mainly with figshare – a popular platform for sharing data. Together, figshare and F1000Research have created a ‘data widget’ that allows figshare to link data files with their associated article in F1000Research – which is excellent! A recent blog post gives the widget the attention it deserves: http://blog.f1000research.com/2013/05/23/new-f1000research-figshare-portal-and-widget-design/. F1000Research is also apparently working on a similar project with Dryad. I think that moving forward we will see more efforts from journals like F1000Research to seamlessly connect their publications with associated data. This is a crucial component of publishing data, as the journal article provides the context for how the data was used.
Dryad – Integrated Journals
Dryad is a data repository and service that offers journals the option of submission integration with its system. The service is completely free and is designed to simplify the process of submitting data and to ensure bidirectional links between the article and the data. Currently, Dryad provides an option for data to be opened up to peer review, but I would like to see that become more of a requirement going forward. Here is a link to Dryad’s journal integration page: http://datadryad.org/pages/journalIntegration
There are a number of journals currently participating in this effort, and a complete list of them can be seen HERE. Carly Strasser also did a great job of outlining other journals that require data sharing in a post on the excellent blog Data Pub. I think Dryad is a perfect example of the other side of traditional publishing. We need data repositories like Dryad and figshare to continue supporting data publication and storage, as they represent the half of the picture that will allow articles and data to be connected.
The Dataverse Network
The Dataverse Network is a data repository designed for sharing, citing and archiving research data. Developed by the Data Science team at Harvard’s Institute for Quantitative Social Science, Dataverse is open to researchers in all scientific fields. As a service, Dataverse organizes its data sets into studies; each study contains cataloguing information along with the data, and provides a persistent way to cite the data that has been deposited.
Dataverse also uses Zelig (an R statistical package) to provide statistical modeling of the data that is submitted. Finally, institutions can install the Dataverse software into their own data repositories. I see the ability to download Dataverse for institutional purposes as an excellent prospective strategy; as more academic institutions begin to add data storage capabilities to their institutional repositories, Dataverse will provide some much-needed assistance in this arena.
GitHub: Git for Data Publishing
Although I would not call myself an expert on the GitHub world, I will say that I recognize a fruitful initiative to publish data when I see one. A recent blog post by James Smith talks about how the tools of open source could potentially revolutionize open data publishing. The post is great and you can read it here: http://theodi.org/blog/gitdatapublishing. James’ idea is to upload data to GitHub repositories and use a Data Package to attach metadata that sufficiently describes the data. Ultimately, the goal of using GitHub for data publication would be to enable sharing and reuse of data within a supportive and collaborative community. While some of this can get complicated, working through the links from his post really gives you a sense of how an open source community is coming together to address the need to publish data.
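To give a sense of what this looks like in practice, here is a sketch of a minimal Data Package descriptor of the kind James describes; the dataset name, file path, and fields are invented for illustration:

```python
import json

# A minimal datapackage.json descriptor: the metadata file sits alongside
# the data files in the same (e.g. GitHub) repository, so the description
# travels with the data wherever the repository is cloned or forked.
descriptor = {
    "name": "example-survey-data",
    "title": "Example survey of researcher data practices",
    "licenses": [{"id": "odc-pddl"}],
    "resources": [
        {
            "path": "data/responses.csv",
            "schema": {
                "fields": [
                    {"name": "respondent_id", "type": "integer"},
                    {"name": "department", "type": "string"},
                    {"name": "shares_data", "type": "boolean"},
                ]
            },
        }
    ],
}

datapackage_json = json.dumps(descriptor, indent=2)
```

Because the descriptor is just a text file, every change to the metadata (or the data itself) is versioned, diffed, and attributable through ordinary Git history – which is precisely the appeal of the approach.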
Biositemaps
Biositemaps is a working group within the NIH whose tools are designed to support:
(i) locating, (ii) querying, (iii) composing or combining, and (iv) mining biomedical resources
‘Biomedical resources’, in this case, can be defined as anything from data sets to software packages to computer models. What is most interesting about Biositemaps is that it provides an Information Model that outlines a set of metadata that can be used to describe data. Using the Information Model as a base for data description, it then adds the Biomedical Resource Ontology (BRO): a controlled terminology for ‘resource_type’, ‘area of research’, and ‘activity’ that helps provide more information about how data is used, and how it can be described in detail using biomedical terminology. I will admit this resource is still pretty raw, but I think it has a lot of potential moving forward. The basic idea behind Biositemaps is that a researcher fills in a lengthy auto-complete form describing themselves, their data, and the methodology used to create the data. Once the form is complete, it produces an RDF file that is uploaded to a registry where it can be linked to from anywhere. If you are a medical librarian and you have researchers interested in publishing data, I encourage you to take a look at this resource.
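As a purely hypothetical illustration of the kind of RDF such a form might produce, consider the sketch below; the `bro` namespace URI and property spellings are placeholders of my own, not the actual BRO vocabulary:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder namespace -- NOT the real BRO vocabulary URI.
BRO = "http://example.org/bro#"

# An RDF/XML description of one hypothetical biomedical resource,
# tagged with the three kinds of terms BRO controls.
root = ET.Element(f"{{{RDF}}}RDF")
resource = ET.SubElement(
    root, f"{{{RDF}}}Description",
    {f"{{{RDF}}}about": "http://example.org/resource/dataset-42"})
ET.SubElement(resource, f"{{{BRO}}}resource_type").text = "Data Set"
ET.SubElement(resource, f"{{{BRO}}}area_of_research").text = "Genomics"
ET.SubElement(resource, f"{{{BRO}}}activity").text = "Sequencing"

rdf_xml = ET.tostring(root, encoding="unicode")
```

The point of emitting RDF rather than a plain form dump is exactly the linking the post describes: once the record is in a registry, any other system can point at the resource URI and reuse the description.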
SHARE Program – Association of Research Libraries (ARL), Association of American Universities (AAU), and Association of Public and Land-grant Universities (APLU)
This effort just came out last week: the ARL, AAU and APLU are joining together to create a shared vision of universities collaborating with the Federal government and others to host institutional repositories across their memberships to provide access to public-access research – including data. While it is not entirely clear how this will be achieved – especially in the realm of data – I think this is the type of collaboration that will provide a well-researched, evidence-based solution moving forward. I hope that SHARE continues to expand beyond the response to the OSTP memo, as I think Canadian academic institutions could benefit greatly from this effort. Here is a link to the development draft for SHARE: http://www.arl.org/storage/documents/publications/share-proposal-07june13.pdf
For Medical Librarians
My goal in presenting these data publication efforts is to get medical librarians to think more about the options that are available for data publication. Journals, government agencies and open source communities are all trying to address the issues surrounding data publication, and I think it is our duty as medical librarians to familiarize ourselves with journal policies around data sharing; data publication initiatives like DataCite, Dryad, and figshare; and new government efforts like Biositemaps that are becoming more heavily used every day, and will be relevant for our liaison and research areas of practice moving forward. I have tried to provide a lot of links within this post, but I’ve included some more reading below that may be useful. I’d also like to mention that this is by no means an exhaustive list, but rather some of the interesting efforts I’ve seen throughout my work with data. Please feel free to add as you wish in the comments section.
1. Borgman CL, Wallis JC, Enyedy N. Little science confronts the data deluge: habitat ecology, embedded sensor networks, and digital libraries. International Journal on Digital Libraries [Internet]. 2007;7:17–30. Available from: http://escholarship.org/uc/item/6fs4559s#
2. Lynch C. The shape of the scientific article in the developing cyberinfrastructure. CT Watch Quarterly [Internet]. 2007;3(3):5–10. Available from: http://www.ctwatch.org/quarterly/articles/2007/08/the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/
3. Piwowar H, Chapman W. A review of journal policies for sharing research data. Nature Precedings [Internet]. 2008. Available from: http://www.academia.edu/904922/A_review_of_journal_policies_for_sharing_research_data
4. Reilly S, Schallier W, Schrimpf S, Smit E, Wilkinson M. Report on Integration of Data and Publications [Internet]. 2011: p. 1–7. Available from: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/10/ODE-ReportOnIntegrationOfDataAndPublications-exesummary.pdf
5. Smith VS. Data publication: towards a database of everything. BMC research notes [Internet]. 2009 Jan [cited 2013 Mar 3];2:113. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2702265&tool=pmcentrez&rendertype=abstract
6. Whyte A. IDCC13 Data Publication: generating trust around data sharing. Digital Curation Centre [Internet]. 2013 Jan 23; Available from: http://www.dcc.ac.uk/blog/idcc13-data-publication-generating-trust-around-data-sharing
I have spent the last 2+ years of my young medical library career pondering this question. I have benefitted from interacting with medical librarians on Twitter through the fantastic #medlibs hashtag – I use it for finding new information about the field, interacting with colleagues, and sharing great information I find with those I know will find it useful. When it comes to Canadian library content, however, I have no easy way of sharing this information. The same can be said for when I want to find useful information for Canadian medical librarians – I have no easy way to look for it on Twitter. That is not to say traditional methods of finding useful information on websites are a bad thing, but I personally think Canadian medical librarians would really benefit from a hashtag that would synthesize all of this great content and news. I’m going to spend the rest of this post trying to point to some great content I’ve found from Canadian medical librarians on Twitter, and hopefully prove my own point as to why a #canmedlibs hashtag would be useful. Here it goes.
Canadian Medical Librarian Tweeters
Below is a perfect example of a tweet that would benefit from a Twitter hashtag for Canadian medical librarians, and Dean Giustini has tried to incorporate the #canmedlibs hashtag to make it more searchable. This is a great tweet about the research papers published in the Canadian Health Libraries Association journal – what #canmedlibs wouldn’t want to know about that?
— Dean Giustini (@giustini) June 29, 2012
Here is a tweet from Natalie Clairoux, a medical librarian from the University of Montreal. Here she is posting some wonderful information about registration for the upcoming Canadian Health Libraries Association conference in May. This tweet would be incredibly useful for #canmedlibs, but Natalie has to use the #medlibs hashtag, where a Canadian might have a hard time finding the information amongst the rest of the American-focused tweets. Natalie also posts excellent information related to bioinformatics, data management and medical information that would be very useful for #canmedlibs.
— Natalie Clairoux (@natalieclairoux) February 14, 2013
Another tweet from Mary-Doug Wright that introduces a new health innovation portal – this is a perfect opportunity to share this information with other #canmedlibs.
A tweet from Carol Cooke that provides a link to her health sciences library subject guide. This is another chance to provide #canmedlibs with insight into how other libraries are building their guides and providing services.
Below is another example of a Canadian medical librarian – Karen Neves – tweeting about Dalhousie University’s work with patron-driven acquisitions. This provides more useful information about what other Canadian institutions are doing with their library services.
Doug Salzwedel is another great #canmedlibs tweeter who works at Cochrane and always provides great information with a Canadian focus. He also posts and retweets Cochrane-related information which I find useful:
Canadian Association for Health Services and Policy Research (CAHSPR) – 2013 Conference (Vancouver): cahspr.ca/en/conference/…
— Doug Salzwedel (@DougSalzwedel) March 11, 2013
Sarah McGill provides excellent tweets about systematic reviews and local Ottawa library-related events. I always enjoy her Twitter feed and I think a lot of other #canmedlibs would too.
— Sarah McGill (@SarahCMcGill) February 27, 2013
Franklin Sayre is another colleague and relatively new medical librarian in Canada who tweets a lot of useful information about medical librarianship:
Assessing availability of scientific journals, databases, and health library services in Canadian Health Ministries buff.ly/10nCA8d
— Franklin Sayre (@fsayre) March 21, 2013
Canadian Health Library Associations and Libraries
The other obvious group that provides useful information about Canadian medical libraries are all of the wonderful medical libraries and professional associations across Canada; if they were using a common hashtag like #canmedlibs, it would provide a one-stop shop for information. The Canadian Health Libraries Association (CHLA/ABSC) is the most obvious Twitter feed that would do well to adopt a #canmedlibs hashtag, as they offer some of the premier information in the field:
Protesting Libraries and Archives Canada Cutbacks and Policies: CHLA/ABSC has added its voice to protest the r… bit.ly/114G5zG
— CHLA/ABSC (@chlaabsc) April 12, 2013
The Health Library Association of British Columbia also provides some great tweets that have a more local Canadian focus:
— HLABC (@HLABC) June 1, 2011
The University of Toronto Gerstein Health Sciences Library has a great feed that occasionally offers student-experience pieces about time spent within the library:
Planning for the future at UTL. Our own Bonnie Horne writes about her experiences with library space and student… fb.me/1WpzuKCQ7
— Gerstein Library UTL (@GersteinLibrary) March 22, 2013
The University of Alberta John W. Scott Health Sciences Library has a great Twitter account that introduces new library databases, discusses ongoing health research at the U of A, and provides retweets with a Canadian focus:
NEW: Paediatric Economic Database Evaluation (PEDE) – Registry of econ eval citations & state utility weights- bit.ly/ZUXxae
— J. W. Scott Library (@jwslibrary) March 15, 2013
There are many more examples I could include, but for the sake of brevity I will stop there; I hope these examples are enough to prove my point that Canadian medical librarians would benefit from a #canmedlibs hashtag.
Why is this important?
I think it is important to have an official #canmedlibs hashtag because it took me almost TWO FULL HOURS to find all of this great library-related information with a Canadian focus. If I had the hashtag, it would have taken me less than a minute. That should be reason enough for us all to start using #canmedlibs.
Another reason is that we as Canadian medical librarians are so dispersed across the country (and, in my case, across the continent) that the use of a hashtag could really bring us together more easily and start a new collaborative culture. I know this already exists on the #medlibs chat, so why shouldn’t we have it too? I already talk to #canmedlibs regularly on Twitter, but it would be great to get more people in on the conversation.
Finally, it is important because I love sharing information, and I think other librarians do too. If I have found some useful piece of information that I know will be of interest to #canmedlibs, I want to make sure they are going to see it. Using a hashtag would at least help this process along. The same idea applies the other way around: I’m always looking for medical library material with a Canadian focus, but it is exceedingly hard to find. In the most selfish way possible, #canmedlibs would really help me find the information I need.
Currently, only Dean Giustini and I have used the #canmedlibs tag on our tweets – but I’m hoping that this blog post might encourage other Canadian medical librarians to do the same. I know there are lots of us out there, because many of us are listed on the HLWIKI International website. Sharing is caring after all! I would love to hear from any #canmedlibs who might think this is a good (or bad) idea. Feel free to weigh in!
****I’m sorry if I missed any fantastic Canadian medical librarian tweeters; if you use #canmedlibs next time you tweet, I’ll be able to find you more easily :)****
Alternative metrics (altmetrics) – better known as new ways to measure research impact – raise a lot of questions amongst the scientific community. What do these metrics actually mean? And more importantly, what do they actually measure? It’s hard to measure the impact of a research article based on how many times it has been tweeted or posted to Facebook: how does that prove that the person posting it actually read the article? Or used it within their own research?
Personally, I love the idea of altmetrics, but I don’t think it has quite reached the point where we can compare it to the impact factor or the h-index (although these are ultimately flawed as well). Heather Piwowar does an excellent job of describing altmetrics in her article in Nature, and it aligns well with my own ideas of what altmetrics try to achieve:
“Altmetrics give a fuller picture of how research products have influenced conversation, thought and behaviour.”
I like to think of the “fuller picture” of altmetrics as the evolving story of a journal article. Altmetrics doesn’t necessarily tell us how influential or prominent a journal article has been, but it tells us how the article has been used, shared and communicated over time via social media, the web and the scholarly community. I think the emergence of several prominent altmetric platforms will eventually lead to a more effective way to evaluate scholarly impact in the form of a hybrid system. In fact, an article written yesterday by Pat Loria from LSE blogs states that “as more systems incorporate altmetrics into their platforms, institutions will benefit from creating an impact management system to interpret these metrics, pulling in information from research managers, ICT and systems staff, and those creating the research impact”. His post is definitely worth a read and would be a great follow-up to the content I will present here. He even compares several of the altmetrics platforms that I will outline in this post.
For this post, I thought it would be a good idea to introduce some of the most prominent altmetric platforms within the scholarly publication ecosystem. Below I will describe each altmetric platform and explain how it communicates the impact and metrics of scholarly research to hopefully provide a better understanding of how this type of measurement works.
ImpactStory aligns well with my idea of altmetrics because its goal is to tell the story of how research and scholarly publications are shared and discussed. ImpactStory tracks metrics across a variety of commonly used services such as Delicious, Scopus, Mendeley, PubMed and even SlideShare (among many others). You can import your Google Scholar profile, or even your Dryad records. Once you have imported the service you want to measure, ImpactStory tells you how many times an article has been saved by scholars, how many times it has been cited by scholars, how many people have discussed it in public (via Twitter, Facebook, etc.) and how many times it has been cited by the public (e.g. a Wikipedia article or a blog post).
Anyone who has research material in any of the platforms that ImpactStory supports can view their metrics very easily by creating their own collection. Researchers can also embed a widget into their websites that will attach ImpactStory metrics to their citations, indicating whether an article is highly discussed or cited by scholars and the public. I think ImpactStory is an excellent model for altmetrics because it combines traditional metrics with new, social metrics suitable for discovering web impact.
Perhaps the best known of the altmetrics tools, Altmetric provides three main products that provide embeddable content about particular journal articles. The most prominent product from Altmetric is their Explorer program; this program is comprehensive in that it provides information about how many times an article has been viewed and its ranking within the journal it came from. Explorer also provides a list of social components: how many times an article has been picked up on a news feed, how often it has been tweeted, who has discussed it on Google+, and several other social media platforms. Using Explorer, a researcher can even see the demographics of who has seen their article. This is an excellent feature, as it gives people an idea of who is looking at the material. As a librarian, I would be interested to know who is looking at my research: librarians? doctors? the scientific research community?
Altmetric also provides services for publishers where they can embed Altmetric badges that will provide additional information about their articles. Publishers can customize their pages that present the metrics so that their branding can be included.
Finally, Altmetric has a bookmarklet that will provide altmetrics about an article you’re reading. I personally use this feature for fun, because it is interesting to learn a little bit more about how an article has been used. The only problem is that Altmetric does not have the data for every single journal publication. This means that a large portion of the time I’m clicking on the bookmarklet for an article I’m reading and there is no data available. This is the case especially with library literature – this could be an incentive to try and get the LISA and LISTA databases on board. Either way, if you’re interested you can add the bookmarklet HERE.
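The data behind the bookmarklet is also reachable programmatically: Altmetric exposes a public DOI lookup endpoint. Here is a minimal sketch of querying it from Python; the DOI below is a placeholder, and a missing record simply comes back as an HTTP 404 (exactly the “no data available” case described above).

```python
import json
import urllib.error
import urllib.request

def altmetric_url(doi):
    """Build the Altmetric v1 DOI lookup URL for a given DOI."""
    return "https://api.altmetric.com/v1/doi/" + doi

def fetch_altmetrics(doi):
    """Fetch the altmetrics record for a DOI.

    Returns the parsed JSON record, or None when Altmetric has no data
    for the article (the API answers with a 404 in that case).
    """
    try:
        with urllib.request.urlopen(altmetric_url(doi)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError:
        return None

# Example (placeholder DOI -- substitute a real one):
# record = fetch_altmetrics("10.1000/example")
```

Basic DOI lookups like this don’t require an API key, which makes it easy to experiment with a handful of articles from your own institution.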
Plum Analytics is the third power player in the altmetrics arena. The goal of Plum Analytics is “to give researchers and funders a data advantage when it comes to conveying a more comprehensive and timely impact of their output”. Plum collects altmetrics and categorizes their metrics into five different groups: usage, captures, mentions, social media, and citations.
For usage, Plum looks at downloads, views, book holdings, ILLs, and document delivery. This is where the library component comes in. If altmetric platforms like Plum are tracking ILL and document delivery requests for research literature, librarians should be aware of this and look to contribute to the effort.
The second category, captures, provides information about the favorites, bookmarks, saves, readers, groups, and watchers of an article.
Mentions cover the blog posts, news stories, Wikipedia articles, comments, and reviews of research articles.
Social media refers to the tweets, shares, +1s and likes of a research article, and finally, citations in Plum Analytics currently cover PubMed, Scopus and patent citations. You can look at their information page to see how they define all of their terminology.
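The five categories above can be summarized in one small data structure. The category labels are Plum’s, and the example signals under each are the ones named in this post; the dict itself is just my restatement for quick reference.

```python
# Plum Analytics' five metric categories, with the example signals
# mentioned above grouped under each. The grouping restates the post;
# only the structure is mine.
plum_categories = {
    "usage": ["downloads", "views", "book holdings", "ILL", "document delivery"],
    "captures": ["favorites", "bookmarks", "saves", "readers", "groups", "watchers"],
    "mentions": ["blog posts", "news stories", "Wikipedia articles", "comments", "reviews"],
    "social_media": ["tweets", "shares", "+1s", "likes"],
    "citations": ["PubMed", "Scopus", "patent citations"],
}

# Print a one-line summary per category.
for category, signals in plum_categories.items():
    print(f"{category}: {', '.join(signals)}")
```

Laid out this way, it is easy to see that usage is the category where library systems (ILL, document delivery) feed directly into the metrics.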
Peer Evaluation is a different sort of altmetric platform in that it is designed as an open peer-review service where researchers can curate their own peer review process for scholarly publications. The goal of Peer Evaluation is for researchers to make their work visible within their community, and to be able to track the impact and reuse of what they share. Researchers can submit their articles, data, working papers, books, etc. to Peer Evaluation and have other researchers review their work. Furthermore, because this is a community effort, the researcher can in turn review other people’s work as well. Peer Evaluation provides qualitative and quantitative metrics that help the researcher understand the impact of their work, and then share that feedback with others in their community. This idea is unique within the altmetrics realm, and there has been a considerable amount of participation from the scientific community.
Research Scorecard is a company devoted to “characterizing and quantifying scientific expertise to facilitate scientific collaborations”. Focusing primarily on the biotechnology and pharmaceutical domains, Research Scorecard builds reports and databases for researchers and academic institutions to evaluate the products that they use and how they are used, the people that they collaborate with, the metrics about a specific scientist or researcher, and the funding history of an individual or organization. Research Scorecard is slightly more commercialized than the other platforms that I’ve mentioned here, but I still think it provides valuable information about products, services and researchers within the scientific community.
Librarians! How can we participate?
Librarians should be thinking about how we can best incorporate altmetrics into our own work lives. Librarians working in research environments will need to keep up with altmetrics to evaluate the impact of literature needed for their collection, and to direct researchers to high impact journals for publishing. The shift towards open access publishing will also make altmetrics a valuable tool for librarians to evaluate the impact and quality of these publications. As an academic librarian, I would love to see tools like Altmetric Explorer embedded into a university’s discovery search system or institutional repository.
I think that as altmetrics start to develop a more comprehensive picture of scholarly impact, we will begin to see wider adoption from the scientific community. As Loria states in his blog post, the combination of several platforms in what he calls an Impact Management System (IMS) will be the turning point for altmetrics. If an IMS service can combine all of these research outputs and impacts into one system, it can facilitate the dissemination of a more complete set of research metrics including everything from community and academic impacts to social communication indicators.
Loria makes the point that: “Librarians can help, with their data management skills and aptitude for storytelling.” I have no doubt in my mind that librarians can help, but it is up to us to reach out to these altmetric communities early on so that we can contribute in any way we can. I think it is at least our duty to educate ourselves on the benefits of altmetrics and their potential significance for informing the patrons that we serve.
Other Altmetric Platforms
1. Loria P. The new metrics cannot be ignored – we need to implement centralised impact management systems to understand what these numbers mean [Internet]. London School of Economics and Political Science Blog. 2013. Available from: http://blogs.lse.ac.uk/impactofsocialsciences/2013/03/05/the-new-metrics-cannot-be-ignored/
2. Piwowar H. Altmetrics: Value of all research products [Internet]. Nature. 2013 Jan;493(7431):159. Available from: http://www.nature.com/nature/journal/v493/n7431/full/493159a.html