A Functional Group Count Vortex Script

I recently wrote a review of Reaction Workflows, a web-based tool that allows users to build workflows from nodes that provide inputs and outputs or perform actions, including reaction-, scaffold-, and transform-based enumeration. Everything is done within a web browser interface using drag and drop. Whilst you can draw input structures, one of the real strengths is the ability to import pre-categorised reagent files, e.g. acid chlorides or secondary amines. Workflows comes with a set of pre-categorised reagents, but I’m sure most users will want to include their own proprietary reagents or catalogues of commercial reagents.

This script is intended to help with the categorisation; it uses SMARTS strings to define the queries. If you are not familiar with SMARTS then the Daylight Theory pages are a good starting place. I also find the SMARTSviewer at the University of Hamburg really helpful. There is also a Pascal script, Checkmol, that does something similar.

SMARTS is a language that allows you to specify substructures using rules that are straightforward extensions of SMILES. For example, to search a database for phenol-containing structures, one would use the SMARTS string [OH]c1ccccc1, which should be familiar to those acquainted with SMILES.
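To make the phenol example concrete, here is a minimal sketch of a presence check for that SMARTS string. It assumes the open-source RDKit toolkit purely for illustration; the Vortex script itself runs inside Vortex and may well use a different toolkit. The function name `contains` is hypothetical.

```python
# Assumption: RDKit is used here only for illustration.
try:
    from rdkit import Chem
except ImportError:  # lets the file import without RDKit; contains() will raise
    Chem = None

PHENOL = "[OH]c1ccccc1"  # the SMARTS string quoted in the text above


def contains(smiles, smarts=PHENOL):
    """Return True if the molecule (given as SMILES) contains the SMARTS pattern."""
    if Chem is None:
        raise RuntimeError("RDKit is not installed")
    mol = Chem.MolFromSmiles(smiles)
    pattern = Chem.MolFromSmarts(smarts)
    if mol is None or pattern is None:
        raise ValueError("unparsable SMILES or SMARTS")
    return mol.HasSubstructMatch(pattern)
```

With RDKit installed, `contains("Cc1ccccc1O")` (o-cresol) is true, while `contains("COc1ccccc1")` (anisole) is false because `[OH]` requires a hydrogen on the oxygen.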

The script is a variation of the high-performance substructure search scripts described previously. However, instead of simply flagging the presence (or absence) of a SMARTS query, it reports the number of times each SMARTS query is matched within a molecule. The script uses all available cores, so it can run multiple queries in parallel and handle very large datasets. It currently contains around 70 different SMARTS queries covering both functional groups and atom counts, and I’d be happy to add any suggestions.
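The idea of counting matches rather than flagging them, and of spreading the work over several cores, can be sketched as follows. This is not the Vortex script: it assumes RDKit as a stand-in toolkit and Python's standard process pool, and the three-entry `QUERIES` table and the function names are hypothetical placeholders for the script's own list of around 70 queries.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import lru_cache

# Assumption: RDKit stands in for the toolkit used by the real script.
try:
    from rdkit import Chem
except ImportError:  # keeps the module importable; counting will raise instead
    Chem = None

# Hypothetical query table (column name -> SMARTS), not the script's own list.
QUERIES = {
    "phenol": "[OH]c1ccccc1",
    "acid_chloride": "[CX3](=O)Cl",
    "chlorine_atoms": "[Cl]",
}


@lru_cache(maxsize=None)
def _pattern(smarts):
    """Parse each SMARTS once per process and reuse the parsed query."""
    return Chem.MolFromSmarts(smarts)


def count_groups(smiles):
    """Return {query name: number of unique matches} for one molecule."""
    if Chem is None:
        raise RuntimeError("RDKit is not installed")
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % (smiles,))
    # GetSubstructMatches returns one tuple of atom indices per unique match.
    return {name: len(mol.GetSubstructMatches(_pattern(smarts)))
            for name, smarts in QUERIES.items()}


def count_table(smiles_list, workers=None):
    """Count every query for every molecule; one dict per input row, in order."""
    if workers == 1:
        return [count_groups(s) for s in smiles_list]
    # workers=None lets the pool choose a size based on the available cores.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_groups, smiles_list, chunksize=64))
```

For hydroquinone (`Oc1ccc(O)cc1`) the `phenol` column is 2 rather than a simple yes, which is the difference from the earlier presence/absence scripts. On platforms that start worker processes by spawning, call `count_table` from under an `if __name__ == "__main__":` guard.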

The result is shown in the screenshot below

The Vortex Script

The script can be downloaded from here

Last Updated 21 June 2017
