The RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. The core algorithms and data structures are written in C++. Wrappers are provided to use the toolkit from Python, Java, or C#. Additionally, the RDKit distribution includes a PostgreSQL-based cartridge that allows molecules to be stored in a relational database and retrieved via substructure and similarity searches.

Please see the RDKit Documentation for more information on installation, usage, cookbooks, and lots more.


Similarity Maps Example

To demonstrate one use of the RDKit, the example below shows how to create a similarity map.

The RDKit provides a broad range of standard cheminformatics functionality for working with molecules in two and three dimensions, as well as a number of unique features. One of these is the ability to generate similarity maps, which visualize the atomic contributions to the similarity between a molecule and a reference molecule. The methodology is described in Riniker, S. & Landrum, G. A. Similarity maps - a visualization strategy for molecular fingerprints and machine-learning methods. J Cheminf (2013) and is available in the rdkit.Chem.Draw.SimilarityMaps module.

In the following example, we show how to do this using Python.

Start by creating two molecules:

>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles('COc1cccc2cc(C(=O)NCCCCN3CCN(c4cccc5nccnc54)CC3)oc21')
>>> refmol = Chem.MolFromSmiles('CCCN(CCCCN1CCN(c2ccccc2OC)CC1)Cc1ccc2ccccc2c1')
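
Chem.MolFromSmiles returns None rather than raising an exception when a SMILES string cannot be parsed, so it can be worth checking both molecules before going further. The following is a minimal sketch of such a check. The helper name parse_smiles is hypothetical and not part of the RDKit.

```python
from rdkit import Chem


def parse_smiles(smiles):
    """Parse a SMILES string, raising instead of returning None on failure.

    Hypothetical convenience wrapper; not an RDKit function.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return mol


# The two molecules used throughout this example.
mol = parse_smiles('COc1cccc2cc(C(=O)NCCCCN3CCN(c4cccc5nccnc54)CC3)oc21')
refmol = parse_smiles('CCCN(CCCCN1CCN(c2ccccc2OC)CC1)Cc1ccc2ccccc2c1')

# Heavy-atom counts give a quick sanity check that parsing worked.
print(mol.GetNumAtoms(), refmol.GetNumAtoms())
```

Failing early here avoids a less obvious error later, when a None molecule is passed to a fingerprint function.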

The SimilarityMaps module supports three kinds of fingerprints: atom pairs, topological torsions and Morgan fingerprints.

>>> from rdkit.Chem import Draw
>>> from rdkit.Chem.Draw import SimilarityMaps
>>> fp = SimilarityMaps.GetAPFingerprint(mol, fpType='normal')
>>> fp = SimilarityMaps.GetTTFingerprint(mol, fpType='normal')
>>> fp = SimilarityMaps.GetMorganFingerprint(mol, fpType='bv')

Atom-pair and topological-torsion fingerprints can be of type normal (default), hashed, or bit vector (bv). Morgan fingerprints can be of type bit vector (bv, default) or count vector (count).
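
To make the difference between a bit vector and a count vector concrete, here is a small plain-Python sketch that does not use the RDKit at all. It represents a bit vector as a set of feature ids and a count vector as a Counter, and computes a Dice similarity over each. The feature ids are made up for illustration, the function names are hypothetical, and returning 0.0 for two empty fingerprints is a choice made for this sketch.

```python
from collections import Counter


def dice_bits(a, b):
    """Dice similarity of two bit vectors given as sets of 'on' bit ids."""
    if not a and not b:
        return 0.0  # convention chosen for this sketch
    return 2.0 * len(a & b) / (len(a) + len(b))


def dice_counts(a, b):
    """Dice similarity of two count vectors given as Counter objects."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0  # convention chosen for this sketch
    # A feature is shared as many times as it occurs in both vectors.
    shared = sum(min(n, b[k]) for k, n in a.items())
    return 2.0 * shared / total


# Made-up feature ids standing in for hashed atom environments.
features_a = [1, 1, 2, 3]
features_b = [1, 2, 2, 4]

# A bit vector records only whether a feature occurs.
bits_a, bits_b = set(features_a), set(features_b)
# A count vector also records how often it occurs.
counts_a, counts_b = Counter(features_a), Counter(features_b)

print(dice_bits(bits_a, bits_b))        # 2*2 / (3+3)
print(dice_counts(counts_a, counts_b))  # 2*2 / (4+4)
```

The two representations give different similarities for the same features because the count vector penalizes mismatched multiplicities that the bit vector ignores.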

The function that generates a similarity map for a pair of molecules requires a fingerprint function and, optionally, a similarity metric. The metric defaults to Dice similarity. Using all the default arguments of the Morgan fingerprint function, the similarity map can be generated like this:

>>> fig, maxweight = SimilarityMaps.GetSimilarityMapForFingerprint(refmol, mol, SimilarityMaps.GetMorganFingerprint)

Producing this image:
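
The per-atom contributions behind such a map can also be computed directly, without drawing anything. The sketch below uses SimilarityMaps.GetAtomicWeightsForFingerprint with a non-default fingerprint (count-based Morgan, radius 1) and the Tanimoto metric from rdkit.DataStructs. The wrapper name count_morgan_r1 is hypothetical, and the parameter choices are only an illustration, not a recommendation.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem.Draw import SimilarityMaps

mol = Chem.MolFromSmiles('COc1cccc2cc(C(=O)NCCCCN3CCN(c4cccc5nccnc54)CC3)oc21')
refmol = Chem.MolFromSmiles('CCCN(CCCCN1CCN(c2ccccc2OC)CC1)Cc1ccc2ccccc2c1')


def count_morgan_r1(m, atom_id):
    """Count-based Morgan fingerprint of radius 1 (hypothetical wrapper).

    The fingerprint function is called with the molecule and an atom index;
    an index of -1 means that no atom is left out.
    """
    return SimilarityMaps.GetMorganFingerprint(
        m, atomId=atom_id, radius=1, fpType='count')


# One weight per atom of mol, measured against refmol.
weights = SimilarityMaps.GetAtomicWeightsForFingerprint(
    refmol, mol, count_morgan_r1, metric=DataStructs.TanimotoSimilarity)

# List atoms from most positive to most negative contribution.
ranked = sorted(enumerate(weights), key=lambda pair: pair[1], reverse=True)
for atom_idx, weight in ranked[:5]:
    symbol = mol.GetAtomWithIdx(atom_idx).GetSymbol()
    print(f"atom {atom_idx} ({symbol}): {weight:+.3f}")
```

Working with the raw weights is convenient when the contributions are needed for further analysis rather than for a picture.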