Thanks to recent software and hardware advances, computational structural bioinformatics can now produce longer simulations. However, as output size increases, so does the computational effort needed to analyze it. Clustering is becoming one of the most popular tools for analyzing conformational ensembles. Unfortunately, using it without real knowledge of how the algorithms work or how they are parameterized (and what these parameters mean) can produce poor results that are difficult to detect in later stages.

pyProCT is a Python cluster analysis toolkit that implements the Hypothesis-driven Clustering Exploration (HCE) methodology. With HCE, users can take advantage of their domain knowledge, as well as their purposes and expectations, to define a hypothesis. pyProCT will then try different clustering algorithms and parameterizations in order to find the most suitable clustering (the one that best fits the hypothesis). This way users can focus on what they want and pyProCT decides how to get it, producing notably better results.

Once a clustering has been produced, users often lack tools to check its quality. pyProCT comes with a browser-based GUI that can be used to assess the quality of the results.

pyProCT has been uploaded to PyPI. This means that pyProCT and its dependencies can be installed on your computer using 'pip' by issuing:
pip install pyProCT
pip install pyProCT-GUI

The source code of pyProCT and its GUI can be found in their GitHub repositories:



pyRMSD is an open source standalone Python package focused on the efficient RMSD calculation of biomolecule conformations. It is especially well suited to computationally expensive collective operations such as pairwise RMSD matrices.

pyRMSD is based on three pillars:

- The first is the CondensedMatrix object, which offers both a small memory footprint and good performance. This matrix object is written entirely in C (so it is a Python C extension) and offers access about 6x faster than its pure Python equivalent. In addition, when used inside your existing algorithms, it can provide a speedup at no extra cost. For instance, we have tested it in a neighbour cardinality function (see 'BenchmarkNeighborOperations.py'), getting an incredible 100x speedup for free!

- The second is the RMSDCalculator object, which implements three well-known superposition algorithms (Kabsch's, QTRFIT and QCP). Some of them have OpenMP and CUDA versions, which lets you fully exploit your machine's resources (with a 5-11x speedup compared with the serial version).

- The third and last is the MatrixHandler. Together with the built-in PDB reader, it implements the whole workflow of calculating the RMSD matrix.
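As a rough picture of the first pillar, the sketch below shows the general idea behind condensed storage of a symmetric distance matrix: only the upper triangle is kept in a flat array, and an index formula maps a pair (i, j) to its position. This is plain Python written for illustration, not pyRMSD's C implementation; the class name `CondensedSketch` and its methods are hypothetical and do not reflect the real CondensedMatrix API.

```python
class CondensedSketch:
    """Toy condensed storage for a symmetric n x n matrix with a zero diagonal.

    Hypothetical illustration only; not the pyRMSD CondensedMatrix class.
    """

    def __init__(self, n, values):
        # Only the n*(n-1)/2 values above the diagonal are stored.
        expected = n * (n - 1) // 2
        if len(values) != expected:
            raise ValueError("expected %d values, got %d" % (expected, len(values)))
        self.n = n
        self.values = list(values)

    def _index(self, i, j):
        # Position of element (i, j), with i < j, in the row-major upper triangle.
        return i * self.n - i * (i + 1) // 2 + (j - i - 1)

    def get(self, i, j):
        if i == j:
            return 0.0  # distance of a conformation to itself
        if i > j:
            i, j = j, i  # symmetry: (i, j) and (j, i) share one slot
        return self.values[self._index(i, j)]

    def neighbours_of(self, i, cutoff):
        # Elements whose distance to element i is within the cutoff.
        return [j for j in range(self.n) if j != i and self.get(i, j) <= cutoff]


# Upper triangle of a 4 x 4 matrix, row by row:
# (0,1) (0,2) (0,3) (1,2) (1,3) (2,3)
matrix = CondensedSketch(4, [1.0, 2.0, 3.0, 1.5, 2.5, 0.5])
print(matrix.get(2, 1), matrix.neighbours_of(2, 1.5))
```

For n conformations this layout stores n(n-1)/2 values instead of n*n, which is where the smaller memory footprint comes from.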

pyRMSD is hosted on GitHub, and you can download the latest version from there using git. If you feel less adventurous, there is a zipped package with the latest stable version, downloadable here.

Remember that any comment or contribution will be welcome!


pyRMSD currently has 5 basic operations:

1 - Pairwise RMSD calculation.
2 - One vs. following (in a sequence of conformers).
3 - One vs. all the other conformations (in a sequence of conformers).
4 - Pairwise RMSD matrix.
5 - Iterative superposition of a sequence.
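To make the difference between operations 2, 3 and 4 concrete, here is a minimal sketch of which pairs of conformers each one compares. It is an assumption-laden illustration: the function names are hypothetical (not the pyRMSD API), and the RMSD is computed without any superposition step to keep the example short.

```python
import math


def rmsd_no_fit(a, b):
    """RMSD between two conformations (lists of (x, y, z)), WITHOUT superposition."""
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))


def one_vs_following(confs, i):
    # Conformer i against every conformer that comes after it in the sequence.
    return [rmsd_no_fit(confs[i], confs[j]) for j in range(i + 1, len(confs))]


def one_vs_others(confs, i):
    # Conformer i against every other conformer, before or after it.
    return [rmsd_no_fit(confs[i], confs[j]) for j in range(len(confs)) if j != i]


def pairwise_matrix(confs):
    # Upper triangle of the RMSD matrix, row by row:
    # a "one vs. following" pass for each conformer in turn.
    values = []
    for i in range(len(confs) - 1):
        values.extend(one_vs_following(confs, i))
    return values


# Three single-atom "conformers", chosen only to keep the numbers simple.
confs = [[(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)], [(0.0, 0.0, 1.0)]]
print(one_vs_following(confs, 1))
print(one_vs_others(confs, 1))
print(pairwise_matrix(confs))
```

Seen this way, the pairwise matrix is a repeated "one vs. following" operation, which is why only n(n-1)/2 values need to be computed for n conformers.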

All methods can use the same coordinates for fitting and RMSD calculation, or a different set of coordinates for fitting (superposing) and calculating RMSD.
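The distinction between fitting coordinates and calculation coordinates can be sketched with a toy example. To stay dependency-free it works in 2D, where the optimal rotation has a closed form; pyRMSD itself works on 3D coordinates with the Kabsch, QTRFIT or QCP algorithms. All function names below are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fit_transform(ref_fit, mob_fit):
    """Angle and centroids that best superpose mob_fit onto ref_fit (2D only)."""
    rc, mc = centroid(ref_fit), centroid(mob_fit)
    s_cos = s_sin = 0.0
    for (rx, ry), (mx, my) in zip(ref_fit, mob_fit):
        rx, ry, mx, my = rx - rc[0], ry - rc[1], mx - mc[0], my - mc[1]
        s_cos += mx * rx + my * ry
        s_sin += mx * ry - my * rx
    return math.atan2(s_sin, s_cos), rc, mc


def apply_transform(points, angle, rc, mc):
    # Center on the mobile centroid, rotate, then move to the reference centroid.
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in points:
        x, y = x - mc[0], y - mc[1]
        out.append((c * x - s * y + rc[0], s * x + c * y + rc[1]))
    return out


def rmsd(a, b):
    total = sum((ax - bx) ** 2 + (ay - by) ** 2 for (ax, ay), (bx, by) in zip(a, b))
    return math.sqrt(total / len(a))


def fit_then_rmsd(ref_fit, mob_fit, ref_calc=None, mob_calc=None):
    # With no calculation set, the fitting set is used for both steps.
    if ref_calc is None:
        ref_calc, mob_calc = ref_fit, mob_fit
    angle, rc, mc = fit_transform(ref_fit, mob_fit)
    return rmsd(ref_calc, apply_transform(mob_calc, angle, rc, mc))


# The mobile fitting set is the reference rotated by 90 degrees and shifted.
ref_fit = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
mob_fit = [(5.0, 5.0), (5.0, 6.0), (4.0, 5.0)]
# A separate calculation set (for example, a ligand atom) that did move.
ref_calc = [(2.0, 0.0)]
mob_calc = [(4.0, 7.0)]
print(fit_then_rmsd(ref_fit, mob_fit))
print(fit_then_rmsd(ref_fit, mob_fit, ref_calc, mob_calc))
```

In this example the fitting set superposes essentially perfectly, so the second value reflects only the displacement of the calculation set relative to the fitted frame.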

In addition, methods 1, 2 and 3 can be used to modify the input coordinates (the input coordinates will be superposed). The iterative superposition method always behaves this way, as it would be pointless otherwise.

pyRMSD has been uploaded to PyPI. It can be installed on your computer using 'pip' by issuing:

pip install pyRMSD

This installation method does not install the CUDA calculators.

Citing pyRMSD

pyRMSD: a Python package for efficient pairwise RMSD matrix calculation and handling