Concerning the deal between LAC and Canadiana: We ask for transparency

I thought I would take this opportunity to weigh in on the deal between Library and Archives Canada and Canadiana, which calls for the transfer and digitization of the largest collection of Canadian archival records in history. I want to make it clear that, in the grand scheme of things, I think this project is a very good thing for archives in Canada, and is long overdue. What worries me is that the details surrounding the deal are largely unclear, and I think it is important for us, as Canadian archivists and librarians, to ask specific questions to ensure that this heritage collection is safe, and will ultimately be freely available to all Canadians who want to view it.

Canadiana has already tried to quell some of the hysteria surrounding the deal with their recently published FAQ, but, if I’m honest, many of my questions are still largely unanswered. I even asked Canadiana on Twitter the other day to clarify the issues surrounding the ‘Premium’ payment that would be required if I wanted access to the search and discovery features they will be developing, but I have yet to hear a reply. I think this line from the FAQ deserves a more detailed explanation:

Until the completion of the project, this searchable, full-text data will be one of the premium services.

Does this mean that once the project is completed everyone will have free access to these features? If this is only one of the premium features, what else will we be missing out on if we don’t pay? These are just some of the questions I have about the deal, but more importantly, I think it is crucial that we start asking those involved (CRKN, CARL, LAC, Canadiana) how they plan to manage, describe and preserve this enormous amount of information and make sure that it will be available to Canadians for years to come. A lot of these questions have been discussed in Bibliocracy’s blog posts on the issue, but I would like to reiterate and request that the library and archives community start asking Canadiana and LAC their own questions to hopefully spur on more details about the project. To start it off, I have outlined below the questions that I would like to have answered:

How will this information be stored, and consequently transferred back to LAC once the full digitization process is complete?

Information architecture is obviously a crucial component of this project, as the collection will need to be stored someplace where it can be accessed by all. Even more important, though, is receiving an answer about how all of this content will be transferred back to LAC. There are many methods and avenues this project can take in terms of placing the material in a repository or content management system, and I think both parties owe it to us to explain how this work will be completed. Will Canadiana use something like CLOCKSS to ensure that this material is preserved and made freely available forever? Or will this be the responsibility of LAC once the project is done? I would like to know that the migration of the digital documents back to LAC will be straightforward once this is over. Which brings me to my next question:

What measures will be taken regarding the digital preservation of the finalized, newly described content?

I’m hoping that the responsibility of managing Canada’s largest archival collection will spur Canadiana to take measures to ensure the preservation not only of the physical content, but of the newly digitized content as well. I would like to know where they plan on storing all of this information – will copies be held in a dark archive to ensure its long-term preservation? Will they follow the Open Archival Information System (OAIS) reference model? Will they use the Trusted Digital Repository model? It would be nice to see something akin to a Trustworthy Repository Audit and Certification (TRAC) so that Canadian information professionals can feel confident that the proper steps are being taken to preserve this digital content.

What type of metadata schemas will be used?

This one is pretty self-explanatory, but seeing as this is a Canadian initiative, one would have to assume that Canada’s Rules for Archival Description (RAD) standard will be used. And seeing as linked data has become so prominent as of late, does Canadiana have plans to use RDF to encourage and support linked data within this collection? Because one of the main goals of this project is to make this content more discoverable and searchable, I think it would be helpful for us to understand how all of this transcription and metadata tagging will take place.
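To make the linked data idea concrete, here is a minimal sketch of how a single digitized archival item could be exposed as RDF using Dublin Core properties. To be clear, this is entirely my own illustration – the URIs, field choices, and values below are hypothetical, not anything Canadiana or LAC has announced – but it shows the general shape of the records that an RDF-based approach would produce.

```python
# A minimal, illustrative RDF (Turtle) record for a digitized archival item.
# A real implementation would map RAD description fields onto an agreed
# vocabulary; Dublin Core is used here only as a familiar example.
PREFIXES = """@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
"""

def archival_item_to_turtle(uri, title, creator, date, fonds):
    """Render one item as a Turtle snippet using Dublin Core properties."""
    return (
        f"<{uri}>\n"
        f'    dc:title "{title}" ;\n'
        f'    dc:creator "{creator}" ;\n'
        f'    dcterms:created "{date}" ;\n'
        f'    dcterms:isPartOf "{fonds}" .\n'
    )

record = PREFIXES + archival_item_to_turtle(
    "http://example.org/items/1867-001",   # hypothetical item URI
    "Confederation correspondence",
    "Macdonald, John A.",
    "1867",
    "Hypothetical fonds MG26-A",
)
print(record)
```

Records in this shape are what would let the collection participate in the wider linked data web: any other dataset could point at the item’s URI.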

What do you really mean when you say that all of the content will be open access?

When I hear the term open access used to describe information content, I always get excited. If this effort is truly going to make all of this digitized archival material open access, then that is fantastic. With this deal, however, the details of how open access is being described have me scratching my head. For a definition of open access, I like to use SPARC’s, which (in a nutshell) describes material that has:

immediate, free availability on the public internet, permitting any users to read, download, copy, distribute, print, search or link to the full text of collections, crawl them for indexing, pass them as data to software or use them for any other lawful purpose

There have been a lot of discussions around Canadiana’s statement that they will be making the digital content available for free via a Creative Commons license. What I don’t understand is that in order to access certain features of this content, you will have to pay a premium fee. That doesn’t sound very open access to me, but a simple clarification would help. Which leads me to:

Can you please elaborate on the fees that are involved with premium access, and how this will work with the 10% of digital material released per year for 10 years?

This question has been on my mind since I heard about this deal (as I described above). What I would like to know is how this premium fee will work: What will it cost? What features are involved? Will the premium features become freely available as each 10% of the digitization process is completed?

I understand that creating high-quality descriptive metadata for digitization requires money. I don’t have as much of a problem with that; what worries me is that these details have not been provided to us. By not answering this one glaring question, Canadiana has made me nervous that I, or my institution, will have to pay for content over the long term. How do I know that these charges won’t continue once the project is finished?

What experts are going to be consulted for this project?

I know that CRKN and CARL have both supplied money for this project, but it would be very comforting to know that highly skilled, expert personnel will be working on this project. As a librarian and archivist, I want this effort to succeed at the highest level. In order to feel confident that this will be the case, I think it would be wise to inform the library and archival community in Canada as to who will be advising this effort. I always like specifics, and knowing that the best people are working on this effort will go a long way towards easing my mind.

In the end, all I’m asking for is a little bit of transparency. This project will have an effect on a huge number of information professionals, researchers, and members of the general public. I think the project shows a lot of promise, and should be a cause for excitement amongst the Canadian information community. However, until Canadiana or LAC provide specifics about this deal, I will be holding back my excitement. The lack of explanation and the vagueness of this project should be a cause for concern for everyone. Ultimately, I don’t think an open and transparent explanation of a project that affects so many Canadians is too much to ask for.

I encourage other Canadian archivists and librarians to ask their own questions about this deal through blogs, social media, or email in hopes that it will generate enough demand that Canadiana and LAC will have to respond. I am only a small voice in this, and it would be great to see others get involved. Using #heritagedeal on Twitter could help synthesize all of this information in one place.

Thanks for reading.

Data Publishing: Who is meeting this need?

I realize I haven’t written a post in over a month, and I feel horribly guilty about it. The one good thing about not having the time to write blog posts frequently is that I now have a stockpile of ideas, and plenty of material to write more frequent posts.

What I would like to address in today’s post are some of the ongoing efforts that journals, government agencies, and open source communities have made to address the need to publish data, in all of its messy and intricate formats. As in my previous posts, I will describe each of the efforts that I find promising in terms of their ability to tackle this massive, complicated task. In case readers are unfamiliar with the concept of a data publication, I define it based on a hybrid of viewpoints from papers by Borgman, Lynch, Reilly et al., Smith, and Whyte:

A data publication takes data that has been used for research and expands on the ‘why, when and how’ of its collection and processing, leaving an account of the analysis and conclusions to a conventional article. A data publication should include metadata describing the data in detail, such as who created the data, a description of the type of data, the versioning of the data, and, most importantly, where the data can be accessed (if it can be accessed at all). The main purpose of a data publication is to provide adequate information about the data so that it can be reused by another researcher in the future, as well as to provide a way to attribute data to its respective creator. Knowing who creates data provides an added layer of transparency, as researchers will have to be held accountable for how they collect and present their data. Ideally, a data publication would be linked with its associated journal article to provide more information about the research.
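As a rough illustration of that definition, a minimal data publication record might carry fields like the following. The field names and values here are my own, chosen to mirror the elements described above (creator, description, versioning, access, attribution); they are not drawn from any formal standard.

```python
import json

# An illustrative (not standardized) data publication record covering the
# elements described above: creator, description, versioning, and access.
data_publication = {
    "title": "Sensor readings from a hypothetical habitat study",
    "creator": "Example Researcher",
    "description": "Why, when and how the data were collected and processed.",
    "version": "1.0.0",
    "access_url": "https://example.org/datasets/42",  # where the data live
    "linked_article_doi": "10.1000/example.doi",      # hypothetical DOI
    "license": "CC-BY-4.0",
}

print(json.dumps(data_publication, indent=2))
```

The point is not the particular serialization, but that every element a future reuser needs – who made it, what it is, which version, where to get it – travels with the data.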

With all that being said, let’s take a look at some of the efforts that currently exist in the data publishing realm.

Nature Publishing Group – Scientific Data

Scientific Data is the first of its kind: an open access, online-only publication specifically designed to describe scientific data sets. Because describing scientific data can be a complicated and exhaustive process, this publication does an excellent job of addressing all of the questions that need to be asked of researchers before they even think of submitting their data. Scientific Data just came out with their criteria for publication today, and the questions they ask are exactly what is needed to ensure that a data publication can be reused through appropriate description.

Then comes the next great component – the metadata. Scientific Data uses a ‘Data Descriptor’ model that requires narrative content about a data set, including the more traditional descriptors librarians are familiar with, such as Title, Abstract and Methodology. What is excellent about the Data Descriptor model is that it also requires structured content about the data. This structured content uses the ‘Investigation’, ‘Study’ and ‘Assay’ (ISA) open source metadata framework to describe aspects of the data in detail. These major categories are apparently designed to be ‘generic and extensible’, and serve to address all scientific data types and technologies. You can check ISA out HERE.
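The nesting of those three ISA categories can be pictured roughly like this. This is a sketch of the hierarchy only – the field names are illustrative, not the exact ISA specification, and real ISA serializations such as ISA-Tab carry many more fields:

```python
# A rough sketch of the Investigation / Study / Assay hierarchy:
# an Investigation contains Studies, and each Study contains Assays.
# Field names and values here are illustrative only.
investigation = {
    "identifier": "I-0001",
    "title": "A hypothetical investigation",
    "studies": [
        {
            "identifier": "S-0001",
            "description": "One experimental study within the investigation",
            "assays": [
                {
                    "identifier": "A-0001",
                    "measurement_type": "gene expression",  # example value
                    "technology_type": "RNA-Seq",           # example value
                }
            ],
        }
    ],
}

# Walking down the hierarchy: each Assay hangs off a Study,
# and each Study off the Investigation.
assay = investigation["studies"][0]["assays"][0]
print(assay["identifier"])  # → A-0001
```

The ‘generic and extensible’ claim makes sense in this light: the hierarchy stays the same across disciplines, while the measurement and technology types vary.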

Overall I think that Scientific Data is the beginning of a new trend in publishing where major journals will begin to publish data publications more frequently on top of traditional research articles. This publication is the first step towards making research data available, reusable and transparent within the scientific research community.

F1000Research – Making Data Inclusion a Requirement

F1000Research is an excellent new open science journal that has caught my attention for its foray into systematic reviews and meta-analyses, and for its recent ‘grace period’ encouraging researchers to submit their negative results for publication. I think this is a publication that medical librarians should be aware of, and potentially encourage researchers to submit to should they be looking for a more frugal option. What really impresses me about F1000Research, though, is its commitment to ensuring that data associated with research articles are made readily available.

Currently, F1000Research reviews data that is submitted in conjunction with an article, and then offers to deposit the data on the author’s behalf in an appropriate data repository. The journal is open to placing data in any repository, but they work mainly with figshare – a popular platform for sharing data. Together, figshare and F1000Research have created a ‘data widget’ that allows figshare to link data files with their associated article in F1000Research – which is excellent! A recent blog post gives the widget the attention it deserves (http://blog.f1000research.com/2013/05/23/new-f1000research-figshare-portal-and-widget-design/). F1000Research is also apparently working on a similar project with Dryad. I think that moving forward we will see more efforts from journals like F1000Research to seamlessly connect their publications with associated data. This is a crucial component of publishing data, as the journal article provides the context for how the data was used.

Dryad – Integrated Journals

Dryad is a data repository and service that offers journals the option of submission integration with their system. The service is completely free and is designed to simplify the process of submitting data and ensure bidirectional links between the article and the data. Currently, Dryad provides an option for data to be opened up to peer review, but I would like to see that become more of a requirement going forward. Here is a link to Dryad’s journal integration page: http://datadryad.org/pages/journalIntegration

There are a number of journals currently participating in this effort, and a complete list of them can be seen HERE. Carly Strasser also did a great job of outlining other journals that require data sharing in her post on the excellent blog Data Pub. I think Dryad is a perfect example of the other side of traditional publishing. We need data repositories like Dryad and figshare to continue supporting data publication and storage, as they represent the half of the picture that allows articles and data to be connected.

The Dataverse Network

The Dataverse Network is a data repository designed for sharing, citing and archiving research data. Developed by Harvard and the Data Science team at the Institute for Quantitative Social Science, Dataverse is open to researchers in all scientific fields. As a service, Dataverse organizes its data sets into studies; each study contains cataloguing information along with the data, and provides a persistent way to cite the data that has been deposited.

Dataverse also uses Zelig (an R statistical package) to provide statistical modeling of the data that is submitted. Finally, institutions can install the Dataverse software themselves to power their own institutional data repositories. I see this as an excellent prospective strategy; as more academic institutions begin to add data storage capabilities to their institutional repositories, Dataverse will provide some much needed assistance in this arena.

GitHub: Git for Data Publishing

Although I would not call myself an expert in the GitHub world, I will say that I recognize a fruitful initiative to publish data when I see one. A recent blog post by James Smith discusses how the tools of open source could potentially revolutionize open data publishing; the post is great and you can read it here: http://theodi.org/blog/gitdatapublishing. James’ idea is to upload data to GitHub repositories and use a Data Package to attach metadata that will sufficiently describe the data. Ultimately, the goal of using GitHub for data publication is to enable sharing and reuse of data within a supportive and collaborative community. While some of this can get complicated, working through the links from his post really gives you a sense of how an open source community is coming together to address the need to publish data.
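In practice, the Data Package approach boils down to committing a small `datapackage.json` file alongside the data files in the repository. Here is a minimal sketch following the general shape of the format; the dataset, resource, and field names are hypothetical, chosen only for illustration:

```python
import json

# A minimal, illustrative datapackage.json describing one CSV file in a
# GitHub repository. The resource and field names here are hypothetical.
datapackage = {
    "name": "example-dataset",
    "title": "An example dataset published via GitHub",
    "resources": [
        {
            "name": "observations",
            "path": "data/observations.csv",  # path within the repository
            "schema": {
                "fields": [
                    {"name": "date", "type": "date"},
                    {"name": "site", "type": "string"},
                    {"name": "count", "type": "integer"},
                ]
            },
        }
    ],
}

# Committing this file to the repository root is what makes the data
# self-describing and discoverable for Data Package tooling.
print(json.dumps(datapackage, indent=2))
```

Because the metadata lives in the repository alongside the data, every Git operation – forking, diffing, pull requests – applies to the description as well as the data itself.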

Biositemaps

Biositemaps is a working group within the NIH whose approach is designed to support:

(i) locating, (ii) querying, (iii) composing or combining, and (iv) mining biomedical resources

‘Biomedical resources’ in this case can be anything from data sets to software packages to computer models. What is most interesting about Biositemaps is that it provides an Information Model that outlines a set of metadata that can be used to describe data. On top of the Information Model as a base for data description, it uses a Biomedical Resource Ontology (BRO); BRO is a controlled terminology for ‘resource_type’, ‘area of research’, and ‘activity’ that helps provide more information about how data is used and how it can be described in detail using biomedical terminology. I will admit this resource is still pretty raw, but I think it has a lot of potential to be an excellent resource moving forward. The basic idea behind Biositemaps is that a researcher fills in a lengthy auto-complete form describing themselves, their data, and the methodology used to create the data. Once the form is complete, it produces an RDF file that is uploaded to a registry where it can be linked to, and from, anywhere. If you are a medical librarian and you have researchers interested in publishing data, I encourage you to take a look at this resource.

SHARE Program – Association of Research Libraries (ARL), Association of American Universities (AAU), the Association of Public and Land-grant Universities (APLU)

This effort just came out last week: the ARL, AAU and APLU are joining together to create a shared vision of universities collaborating with the Federal government and others to host institutional repositories across their memberships to provide access to public access research – including data. While it is not entirely clear how this will be achieved – especially in the realm of data – I think this is the type of collaboration that will provide a well researched, evidence-based solution moving forward. I hope that SHARE continues to expand beyond the response to the OSTP memo, as I think Canadian academic institutions could benefit greatly from this effort. Here is a link to the development draft for SHARE: http://www.arl.org/storage/documents/publications/share-proposal-07june13.pdf

For Medical Librarians

My goal in presenting these data publication efforts is to get medical librarians to think more about the options that are available for data publication. Journals, government agencies and open source communities are all trying to address the issues surrounding data publication, and I think it is our duty as medical librarians to familiarize ourselves with journal policies around data sharing; with data publication initiatives like DataCite, Dryad, and figshare; and with new government efforts like Biositemaps, all of which are becoming more heavily used every day and will be relevant to our liaison and research areas of practice moving forward. I have tried to provide a lot of links within this post, but I’ve included some more reading below that may be useful. I’d also like to mention that this is by no means an exhaustive list, but rather some of the interesting efforts I’ve seen throughout my work with data. Please feel free to add as you wish in the comments section.

Readings/References

1. Borgman CL, Wallis JC, Enyedy N. Little science confronts the data deluge: habitat ecology, embedded sensor networks, and digital libraries. International Journal of Digital Libraries [Internet]. 2007;7:17–30. Available from: http://escholarship.org/uc/item/6fs4559s#  

2. Lynch C. The shape of the scientific article in the developing cyberinfrastructure. CT Watch Quarterly [Internet]. 2007;3(3):5–10. Available from: http://www.ctwatch.org/quarterly/articles/2007/08/the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/  

3. Piwowar H, Chapman W. A review of journal policies for sharing research data. Nature Precedings [Internet]. 2008. Available from: http://www.academia.edu/904922/A_review_of_journal_policies_for_sharing_research_data

4. Reilly S, Schallier W, Schrimpf S, Smit E, Wilkinson M. Report on Integration of Data and Publications [Internet]. 2011: p. 1–7. Available from: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/10/ODE-ReportOnIntegrationOfDataAndPublications-exesummary.pdf  

5. Smith VS. Data publication: towards a database of everything. BMC research notes [Internet]. 2009 Jan [cited 2013 Mar 3];2:113. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2702265&tool=pmcentrez&rendertype=abstract  

6. Whyte A. IDCC13 Data Publication: generating trust around data sharing. Digital Curation Centre [Internet]. 2013 Jan 23; Available from: http://www.dcc.ac.uk/blog/idcc13-data-publication-generating-trust-around-data-sharing

Altmetrics and Evaluating Scholarly Impact: What’s out there and how can we participate?

Alternative metrics (altmetrics) – better known as new ways to measure research impact – raise a lot of questions amongst the scientific community. What do these metrics actually mean? And more importantly, what do they actually measure? It’s hard to gauge the impact of a research article based on how many times it has been tweeted or posted to Facebook: how does that prove that the person posting it actually read the article, or used it within their own research?

Personally, I love the idea of altmetrics, but I don’t think it has quite reached the point where we can compare it to the impact factor or the h-index (although these are ultimately flawed as well). Heather Piwowar does an excellent job of describing altmetrics in her article in Nature, and it aligns well with my own ideas of what altmetrics try to achieve:

“Altmetrics give a fuller picture of how research products have influenced conversation, thought and behaviour.”

I like to think of the “fuller picture” of altmetrics as the evolving story of a journal article. Altmetrics don’t necessarily tell us how influential or prominent a journal article has been, but they tell us how it has been used, shared and communicated over time via social media, the web and the scholarly community. I think the emergence of several prominent altmetric platforms will eventually lead to a more effective way to evaluate scholarly impact in the form of a hybrid system. In fact, an article written yesterday by Pat Loria on the LSE blogs states that “as more systems incorporate altmetrics into their platforms, institutions will benefit from creating an impact management system to interpret these metrics, pulling in information from research managers, ICT and systems staff, and those creating the research impact”. His post is definitely worth a read and would be a great follow-up to the content I will present here. He even compares several of the altmetrics platforms that I will outline in this post.

For this post, I thought it would be a good idea to introduce some of the most prominent altmetric platforms within the scholarly publication ecosystem. Below I will describe each altmetric platform and explain how it communicates the impact and metrics of scholarly research to hopefully provide a better understanding of how this type of measurement works.

ImpactStory

ImpactStory aligns well with my idea of altmetrics because its goal is to tell the story of how research and scholarly publications are shared and discussed. ImpactStory tracks metrics across a variety of commonly used services such as Delicious, Scopus, Mendeley, PubMed and even SlideShare (among many others). You can import your Google Scholar profile, or even your Dryad records. Once you have imported the material you want to measure, ImpactStory tells you how many times an article has been saved by scholars, how many times it has been cited by scholars, how many people have discussed it in public (via Twitter, Facebook, etc.) and how many times it has been cited by the public (e.g. in a Wikipedia article or blog post).

Anyone who has research material in any of the platforms that ImpactStory supports can view their metrics very easily by creating their own collection. Researchers can also embed a widget into their websites that will attach ImpactStory metrics to their citations, indicating whether an article is highly discussed or cited by scholars and the public. I think ImpactStory is an excellent model for altmetrics because it combines traditional metrics with new, social metrics suitable for discovering web impact.

Altmetric

Perhaps the best known of the altmetrics tools, Altmetric offers three main products that provide embeddable content about particular journal articles. The most prominent is their Explorer program, which is comprehensive in that it provides information about how many times an article has been viewed, along with rankings from the journal it appeared in. Explorer also provides a list of social components: how many times an article has been picked up by a news feed, how often it has been tweeted, and who has discussed it on Google+ and several other social media platforms. Using Explorer, a researcher can even see the demographics of who has seen their article. This is an excellent feature, as it gives people an idea of who is looking at the material. As a librarian, I would be interested to know who is looking at my research: librarians? Doctors? The scientific research community?

Altmetric also provides services for publishers where they can embed Altmetric badges that will provide additional information about their articles. Publishers can customize their pages that present the metrics so that their branding can be included.

Finally, Altmetric has a bookmarklet that will provide altmetrics about an article you’re reading. I personally use this feature for fun because it is interesting to learn a little bit more about how an article has been used. The only problem is that Altmetric does not have data for every single journal publication, which means that a large portion of the time I click the bookmarklet there is no data available for the article I’m reading. This is especially the case with library literature – which could be an incentive to try to get the LISA and LISTA databases on board. Either way, if you’re interested you can add the bookmarklet HERE.

Plum Analytics

Plum Analytics is the third power player in the altmetrics arena. Its stated goal is “to give researchers and funders a data advantage when it comes to conveying a more comprehensive and timely impact of their output”. Plum collects altmetrics and categorizes them into five different groups: usage, captures, mentions, social media, and citations.

For usage, Plum looks at downloads, views, book holdings, ILL, and document delivery. This is where the library component comes in. If altmetric platforms like Plum are tracking ILLs and document delivery requests for research literature, librarians should be aware of this and look to contribute to the effort.

The second category, captures, provides information about the favorites, bookmarks, saves, readers, groups, and watchers of an article.

Mentions cover the blog posts, news stories, Wikipedia articles, comments, and reviews of research articles.

Social media refers to the tweets, shares, +1’s and likes based on a research article, and finally citations in Plum Analytics currently cover PubMed, Scopus and Patent citations. You can look at their information page to see how they define all of their terminology.

Peer Evaluation

Peer Evaluation is a different sort of altmetric platform in that it is designed as an open peer review service where researchers can curate their own peer review process for scholarly publications. The goal of Peer Evaluation is for researchers to make their work visible within their community, and to be able to track the impact and reuse of what they share. Researchers can submit their articles, data, working papers, books, etc. to Peer Evaluation and have other researchers review their work. Furthermore, because this is a community effort, the researcher can in turn review other people’s work as well. Peer Evaluation provides qualitative and quantitative metrics that help researchers understand the impact of their work, and then share that feedback with others in their community. This idea is very unique within the altmetrics realm, and there has been a considerable amount of participation from the scientific community.

Research Scorecard

Research Scorecard is a company devoted to “characterizing and quantifying scientific expertise to facilitate scientific collaborations”. Focusing primarily on the biotechnology and pharmaceutical domains, Research Scorecard builds reports and databases for researchers and academic institutions to evaluate the products that they use and how they are used, the people that they collaborate with, the metrics about a specific scientist or researcher, and the funding history of an individual or organization. Research Scorecard is slightly more commercialized than the other platforms that I’ve mentioned here, but I still think it provides valuable information about products, services and researchers within the scientific community.

Librarians! How can we participate?

Librarians should be thinking about how we can best incorporate altmetrics into our own work lives. Librarians working in research environments will need to keep up with altmetrics to evaluate the impact of literature needed for their collection, and to direct researchers to high impact journals for publishing. The shift towards open access publishing will also make altmetrics a valuable tool for librarians to evaluate the impact and quality of these publications. As an academic librarian, I would love to see tools like Altmetric Explorer embedded into a university’s discovery search system or institutional repository.

I think that as altmetrics start to develop a more comprehensive picture of scholarly impact, we will begin to see wider adoption from the scientific community. As Loria states in his blog post, the combination of several platforms in what he calls an Impact Management System (IMS) will be the turning point for altmetrics. If an IMS service can combine all of these research outputs and impacts into one system, it can facilitate the dissemination of a more complete set of research metrics including everything from community and academic impacts to social communication indicators.

Loria makes the point that: “Librarians can help, with their data management skills and aptitude for storytelling.” I have no doubt in my mind that librarians can help, but it is up to us to reach out to these altmetric communities early on so that we can contribute in any way we can. I think it is at least our duty to educate ourselves on the benefits of altmetrics and their potential significance for informing the patrons that we serve.

Other Altmetric Platforms

PaperCritic

ScienceCard

Symplectic

VIVO

References

1. Loria P. The new metrics cannot be ignored – we need to implement centralised impact management systems to understand what these numbers mean [Internet]. London School of Economics and Political Science Blog. 2013. Available from: http://blogs.lse.ac.uk/impactofsocialsciences/2013/03/05/the-new-metrics-cannot-be-ignored/

2. Piwowar H. Altmetrics: Value of all research products [Internet]. Nature. 2013 Jan;493(159). Available from: http://www.nature.com/nature/journal/v493/n7431/full/493159a.html

Drupal Ladder: A great learning tool for librarians

Recently I attended a workshop at the NIH Library called Drupal4Gov on learning how to use Drupal. The workshop wasn’t designed for librarians, but I found it useful and thought I would pass along the information. And even though this was a government workshop, the things I learned are applicable to any environment – especially a library-related one.

The great thing about Drupal is that once you get past the difficulty of installing it, it is very easy to use and there is a wealth of support on the web and within the Drupal community itself. So keep reading if you’re interested in learning a new skill, or are thinking about using Drupal as a content management system in your library. 
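To give a sense of what that installation hurdle looks like, here is a rough command-line sketch using Drush (Drupal’s command-line tool, installed separately). The directory name, site name, and database credentials below are placeholders I made up for illustration – substitute your own, and note that many hosts also offer one-click installers that skip all of this.

```shell
# Download Drupal core into a fresh directory (requires Drush)
drush dl drupal --drupal-project-rename=mylibrarysite
cd mylibrarysite

# Run the installer non-interactively with the "standard" profile.
# The database URL and site name here are placeholder values.
drush site-install standard \
  --db-url=mysql://dbuser:dbpass@localhost/mylibrarydb \
  --site-name="My Library Site" -y
```

After this finishes, the site is served from your web server’s document root and Drush prints the admin login credentials – from there, everything else happens in the browser.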

What is Drupal?

I thought it would be fruitful to describe Drupal before I get into the tools that I used to learn the software. Drupal is simply (from the website):

…an open source content management platform powering millions of websites and applications. It’s built, used, and supported by an active and diverse community of people around the world.

Basically, Drupal is an easy way to develop websites and other applications for your business or institution. From a library perspective, Drupal can run your library website, support your OPAC, and link out to your subscribed databases. Think of Drupal as something like the WordPress platform, but with many more features presented in a more intuitive way.

What is Drupal Ladder?

Drupal Ladder is a website that contains (or links to) lessons and materials to help people learn about and contribute to Drupal. The site was created by the Boston Initiative to help Drupal user groups develop and share learning materials. The lessons are designed for everyone from the most novice user to the experienced software developer.

There are a variety of ladders to choose from, but the best one for learning how to use Drupal and how to apply some of its great features is the Drupal4Gov ladder:

Drupal Ladders

Once you’ve selected the ladder you want to learn, you’ll be taken to a page where you can see all the steps you can learn, from installing Drupal to contributing your own project. I thought this was an excellent tool for learning something new because the directions are very clear and each step builds on the previous one, so you are never left feeling lost.

Drupal4Gov - Drupal Ladder

What’s great about this program is that the Drupal Ladder gives you the option of installing Drupal on your own server (if you have one), or using a local environment called Dev Desktop that stands in for a server and gives you all the same Drupal functionality. For librarians specifically, the first 5 rungs on the ladder above are an excellent way to become familiar with the software and try a few of the more advanced functions.

Another cool tool is simplytest.me, which lets you spin up a temporary Drupal site for 30 minutes to an hour and play around with it. This is a helpful way for people to see how different websites and applications are developed and used. I could spend hours just fiddling around with website themes and installing cool modules.

I chose to write about this topic today because I see more and more libraries struggling to figure out how they can quickly and easily build new websites or platforms for their patrons. With the influx of new librarianship roles like embedded librarians and informationists, I figured knowing how to quickly build a website would be useful – this is what Drupal is designed for. Because Drupal is open source and has such a strong community supporting it, I kept thinking to myself during the workshop: Why can’t librarians be a part of this community too? I think that Drupal is an excellent skill to have as it provides libraries with a lot of options to move forward if they are looking for a new content management system. The ease of use and intuitive nature of Drupal also make it easier to train other staff how to use it. If you have the time, I encourage any librarian reading this to give the Drupal Ladder a try. The more time you put into learning it and exploring what Drupal can do, the easier it is to use. 

**I am not affiliated with Drupal in any way, the views expressed here are my own.**

Open Access & Open Data: Projects that librarians should know about (and share with others!)

Last week I had the opportunity to attend a presentation by Heather Joseph – a representative of SPARC (Scholarly Publishing and Academic Resources Coalition) – to hear about some of the great open access journal publishing initiatives taking place. A variety of publishing platforms have emerged as of late, each offering its own unique way of promoting open access and supporting research sharing. I thought I would share with you some of the initiatives that Heather highlighted in her talk.

To extend the discussion into the realm of open access data, I also want to discuss a few of the data sharing initiatives I have found while working on my current projects. I believe that these data sharing resources represent an ideal future for research and data publication; they offer platforms where investigators can share and modify data with other researchers, collaborate, and even use software to transform their datasets into education materials. To access each resource, click on the images to link to their respective webpages.

Open Access Publishers

Public Library of Science (PLOS)

PLOS

PLOS is the most obvious entry on this list, but I feel like I would have heard about it from colleagues if I hadn’t included it. PLOS provides multiple platforms for scientific journals that are completely open access. They are strong advocates of sharing research and have 9 core principles that promote sharing, community engagement and scientific excellence. PLOS hosts many excellent journals, such as PLOS ONE, which publishes across the full range of life and health sciences; community journals (PLOS Genetics, PLOS Computational Biology, PLOS Pathogens and PLOS Neglected Tropical Diseases); and PLOS Medicine and PLOS Biology. PLOS Blogs and Currents also make for some excellent reading, focused mainly on the issues of research sharing and open access. I read them on a regular basis, as they provide excellent information on open access and cover many publication issues that librarians need to be aware of.

eLIFE

eLife – the funder-researcher collaboration and forthcoming journal for the best in life science and biomedicine

eLIFE is one of the new actors in the realm of open access publishing, and prides itself on being:

a researcher-led digital publication for outstanding work, a platform to maximise the reach and influence of new findings and a showcase for new approaches for the presentation and assessment of research.

Working with the Howard Hughes Medical Institute, the Max Planck Society, and the Wellcome Trust, among 200 others, eLIFE is focusing its attention on early-career researchers. Their goal is to make researchers’ first foray into publishing a constructive exercise by providing a fair, transparent, and supportive author experience. eLIFE is also interested in promoting data sharing, but I don’t think that side has been fully realized yet. I look forward to seeing what will come out of eLIFE as it continues to grow.

PeerJ

PeerJ

PeerJ offers a different model from eLife and PLOS in that it costs money to sign up, but for a small sum a researcher can be set up with a publication platform for life. $99 allows a researcher to publish one article per year for life; $199 allows two articles per year; and $299 lets a researcher publish as many articles as they want per year. There is still a rigorous peer review process, and paying does not guarantee that a paper will be accepted. It is also important to note that all authors of an article must be members of PeerJ to submit. PeerJ has a set list of criteria that need to be met and provides an extensive list of editors from various disciplines who review submissions. Furthermore, every PeerJ member is required to review at least one paper each year or participate in post-publication peer review.

A news article in Nature comments on PeerJ as one of the cheapest options for this type of publishing. I highly encourage everyone to read the news article as it provides some insight into the emerging nature of open access publishing platforms. PeerJ seems like a good idea, but we’ll have to see if it will generate enough of a following to remain sustainable over time.

Open Humanities Alliance

Open Humanities Alliance

For my humanities friends out there, I had to include the Open Humanities Alliance in this list. The Alliance is a community-building project of the Open Humanities Press. It aims to overcome some of the common technical barriers to open access in the humanities by linking students and faculty with resources such as open source software, hosting and archiving. The Open Humanities Alliance is a way for like-minded people from inside or outside the academy to work together in opening humanities scholarship to the world.

The one Alliance-sponsored project that I want to talk about is the Open Access Journal Incubator ibiblio. This project is designed to provide researchers with a place to access a wide variety of research (music, art, literature, politics, etc.) as well as share their own. Contributors to ibiblio have to meet its set of criteria before they can share their research, but the requirements are clear and easy to follow. I had a lot of fun rooting around the site looking at the 900+ collections.

Data Sharing Projects

As a result of the discussions of research data sharing within the scientific community, projects such as HUBzero, Cytobank, and WebPAX have emerged to broach the subject through online communities that encourage the sharing of research data, foster research collaboration, and promote collective data analysis. I discuss each one briefly below.

Cytobank

Cytobank

Cytobank is a data sharing repository designed to manage, share, and analyze flow cytometry data from any researcher. Cytobank prides itself on being a platform for researchers, collaborators, lab and core facility managers, developers and statisticians, educators and trainers, and vendors.

What is great about Cytobank is that it allows researchers to manage their own data and host it on a cloud server; share experiment data and details quickly and easily through the web to other Cytobank users; foster interactive discussions around particular experiments; and allow researchers to turn their cytometry data into education materials. I believe that we will be seeing more repositories like Cytobank as data sharing becomes more common among researchers. This type of repository represents the potential benefits of data sharing by providing researchers with a place where they can store and manage their research as well as collaborate with others to achieve new scientific discovery.

HUBzero

HUBzero – Platform for Scientific Collaboration

HUBzero is an open source software platform for building powerful websites that support scientific discovery, learning, and collaboration. The scientific community has started to refer to websites like this as “collaboratories” supporting “team science.” HUBzero differs from Cytobank in that it provides a content management system that is built to support scientific activities. Using this system, researchers can work together on projects, publish datasets and computational tools with Digital Object Identifiers (DOIs), and make these publications available for others to use as live, interactive digital resources. HUBzero’s datasets and tools run on cloud computing resources, campus clusters, and other national high-performance computing (HPC) facilities. You can take a look at some existing hubs here.

These hubs represent new and exciting innovations in data sharing. These sites are dynamic, with options to build animations with data; download data; take courses to understand various datasets; view publications associated with the data; observe online presentations about the data; and even create online simulations based on the data.

WebPAX

WebPAX.com – Share Your Medical Images

WebPAX is exciting because it focuses primarily on sharing medical imagery. Researchers can host and manage their medical images on the site and share them with colleagues for further analysis. Researchers create an account and have full control over who can view their images. They can then share their images with a select group of people or post them where all members can see them. In case you were wondering about privacy, all images are anonymized and encrypted using Secure Sockets Layer (SSL) technology so that third parties are unable to access this sensitive information. Because so many physicians come into the library wanting to see images on a particular topic, I think WebPAX would be an excellent resource to point them to. Not only will it give them another option for viewing images, but it might even encourage them to share some of their own.