About the project

The rationale for our project is simple: biological data is piling up much faster than anyone can make sense of it (Stephens 2015). Molecular biologists at all levels are required to develop, modify and use data analysis methods implemented in software. However, the necessary training for these researchers is lacking. As a result, vast amounts of data sit on hard drives around the world waiting for someone to analyse it, while inexperienced researchers carry out analyses that are often unreliable and irreproducible.

Data from some bioinformatics research facilities indicate that almost 80% of techniques are applied to fewer than 20% of the projects; in other words, there are no “one-size-fits-all” analyses (Chang 2015). Training researchers to use specific bioinformatics methods is therefore not the way forward, as most of the analyses they perform require bespoke solutions. Much more beneficial is training in the basic skills of data and software “carpentry”, so that researchers can develop flexible and adaptable solutions to their individual questions (Smith 2014).

Our project aims to facilitate the development of such skills and to train the next generation of researchers in robust and reliable data analysis through a series of workshops.

The workshops

Over the course of the two-year project, we will run 10 workshops covering topics ranging from the basics of the UNIX shell and R to high-performance and cloud computing. The first several workshops will be aimed at novices and will introduce the UNIX shell, R, Python and the concepts of analysis automation and data reproducibility. The first three workshops will take place in July and September 2017 and January 2018; dates for the remaining workshops will be announced later.

The format of the materials and the nature of the delivery will be based on the successful Software Carpentry blended-learning model, where students learn by developing skills through hands-on live coding and peer programming sessions, led by an instructor and supported by a small team of helpers.

The material and learning objectives of our workshops will cover the essential skills and best practices in scientific computing for bioinformatics, and can be embedded into undergraduate and/or postgraduate training programmes.

Who is who

The project is funded by the BBSRC STARS programme.

The main organisers of the workshops are Dr Mary J. O’Connell, Dr Martin Callaghan (both at the University of Leeds) and Dr Jarek Bryk (at the University of Huddersfield). Drs Callaghan and Bryk are Software Carpentry Instructors. The workshops will also be run by Dr Alastair Droop and Dr Bede Constantinides from the University of Leeds.

In addition, Dr Chris Creevey (at Aberystwyth University) and Dr Liz Duncan (at the University of Leeds) will oversee curriculum design and the selection of participants.

You can contact us directly with questions about the project at @jarekbryk, @evol_molly or @nextgenbiol.