CECAM Workshop at FZJ: Macromolecular simulation software
Biomolecular simulation continues to grow in popularity and scope of application. It is no longer the preserve of a few specialist groups but is widespread: part of the ‘toolkit’ used by researchers in a wide variety of fields, often closely integrated with other experimental research techniques. As the use of biomolecular simulation grows, a corresponding boom in (decentralised) software development is taking place. Much of this takes place within multidisciplinary research groups and is highly focussed on the specific needs of their own communities. Development is often done by researchers with little or no formal training in software engineering or programming. This is not entirely a negative: it encourages practical solutions to real-life problems and ‘thinking outside the box’. However, it is clearly not the ideal route to producing and maintaining high-quality, flexible, sustainable, and usable software that others can adopt and modify. Giving researchers the right tools and training to develop such software will reduce the amount of ‘wheel-reinvention’ and consequently improve research productivity.
A number of recent developments make it easier to see a route to achieving this goal. Firstly, there is the growth of the Open Source, Open Development paradigm, and the establishment of a number of widely recognised platforms that support it, such as SourceForge, GitHub, Bitbucket, and others. Secondly, there has been a move towards object-oriented coding styles that facilitate reusability and extension. In the biomolecular simulation field in particular, we see an increasing focus on high-level scripting languages like Python, which combine power and flexibility with a relatively shallow learning curve and promote a paradigm of code sharing, re-use, and extension. There is a growing number of software development projects related to biomolecular simulation that are based around Python toolkits, for example:
OpenMM (https://simtk.org/home/openmm),
MMTK (http://dirac.cnrs-orleans.fr/MMTK/),
MDAnalysis (https://code.google.com/p/mdanalysis/),
MDTraj (http://mdtraj.org/latest/),
SIRE (http://siremol.org/Sire/Home.html),
Bookshelf (http://sbcb.bioch.ox.ac.uk/bookshelf/)
as well as more general-purpose Python-based tools that have clear applicability to biosimulation – e.g. RADICAL-Cybertools (http://radical-cybertools.github.com).
This CECAM Macromolecular Simulation Software workshop will give representatives of these projects, and of many others that are as yet less well known, an opportunity to interact with end users, with the aim of tackling the most challenging problems in this domain. Chief among these is how best to exploit the capabilities and opportunities presented by future generations of massive, sometimes distributed, heterogeneous computational resources. Somewhat in the spirit of a “Hackathon”, the aim will be to challenge the code developers to maximise the interoperability of their projects through application to real-life simulation problems. Through this process they will identify gaps in provision and opportunities for optimisation, integration, and collaboration, and provide a showcase for the rich diversity of activity in this area.
Workshop website: Macromolecular Simulation Software