Library Enumeration Guide

Written by Emily Ripka
Updated over 2 years ago

Overview

One fundamental task in medicinal chemistry is the design of libraries, particularly in the area of parallel-enabled medicinal chemistry. Unfortunately, drawing a library one molecule at a time becomes untenable given the incredible size and diversity of building block collections today. Thus, either a chemoinformatics expert must be contacted to help with writing reaction enumeration and filtering code, or an external tool friendly to the non-coder must be used.

Manifold's library design function is PostEra's version of this tool, which allows the medicinal chemist to accomplish all of these necessary tasks by themselves. The tool enables the chemist to employ the powerful logic of code in a visual, intuitive manner without the need to learn a new programming language beyond that of organic chemistry.

How it Works

The tool is built around a concept familiar to all synthetic and medicinal chemists: the retrosynthesis tree. One simply imagines the route by which any given target molecule in a library is made, then supplies the building blocks. Once the library is "run," all of the library targets will be constructed from the building blocks according to the given route. Additionally, the chemist has the power to add filters that accomplish tasks such as filtering out "ugly substructures."

For a general demonstration, see the video below where the common task of creating an amide formation library is performed. Note that since Manifold's tool is built upon the tree concept with every new branch allowing filters, almost any multi-step library can be constructed.

A demo using Manifold's Library Enumeration tool to create, filter, and export a multi-step library from a series of Molecule Sets.
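The tree idea above can be sketched in a few lines of Python. This is only an illustration of the concept, not Manifold's implementation: the `Node` class, `enumerate_node`, and `toy_amidation` are hypothetical names, and the "reaction" simply builds a text label instead of applying a real reaction template.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Callable, List, Optional


@dataclass
class Node:
    """One node of a toy synthesis tree (hypothetical structure, not Manifold's)."""
    name: str
    building_blocks: List[str] = field(default_factory=list)  # used by leaf nodes
    children: List["Node"] = field(default_factory=list)      # used by reaction nodes
    react: Optional[Callable[..., str]] = None                 # combines one molecule per child


def enumerate_node(node: Node) -> List[str]:
    """Return every molecule available at this node."""
    if not node.children:
        # Leaf node: the molecules are the supplied building blocks.
        return list(node.building_blocks)
    # Reaction node: enumerate each child, then react every combination.
    child_sets = [enumerate_node(child) for child in node.children]
    return [node.react(*combo) for combo in product(*child_sets)]


def toy_amidation(amine: str, acid: str) -> str:
    # Placeholder label only; a real enumeration applies a reaction template.
    return f"amide({amine}+{acid})"


amines = Node("amines", building_blocks=["aniline", "benzylamine"])
acids = Node("acids", building_blocks=["acetic acid", "benzoic acid", "formic acid"])
library = Node("amidation", children=[amines, acids], react=toy_amidation)

products = enumerate_node(library)
print(len(products))  # 6 (2 amines x 3 acids)
```

Because a reaction node can itself be a child of another reaction node, the same recursion covers multi-step libraries.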

Crafting a Tree

In Figure 1, we further detail the Manifold features that allow crafting of the reaction tree. With the help of the nest (Figure 1A) and expand (Figure 1D) buttons, a tree of arbitrary size can be crafted and customized using 30 robust reactions, hand-curated and maintained by PostEra’s chemists. Details on a given reaction are neatly nested within the reaction selection of the tree, as shown in Figure 2, for example.

Figure 1. Overview of the library enumeration main tree view, with the important aspects to building out a synthesis tree annotated. To nest the tree within another reaction, use the nest button (A) at the top of the tree. To choose a reaction from the list of predefined reactions use the reaction selection dropdown (C). To expand out a node to have other reactions feed into it, use the expand button (D) at the bottom of a branch of the tree. All notifications on the left (B) need to be addressed before you can start an enumeration.

Although we think this is a powerful set of reactions to build libraries off of, we do foresee the desire to define custom reaction templates to plug into the enumeration tree. This feature is currently under development and will be coming to Manifold soon!

The Figure 1 example shows that clicking the "amidation" reaction gives two "leaf nodes," one for each reactant, where the building blocks of interest should be entered. Simply click on the pencil above the PostEra logo on one of the leaf nodes to start entering building blocks by drawing, importing a file of entries, or adding a molecule set from your account.

Figure 2. Reveal details for each reaction by clicking on the information button (A) next to its name. A detailed description and illustration of the reaction and its building blocks (B), pertinent references (C), and example reactions with citations (D) have been hand-curated by PostEra’s chemists to improve understanding of the reaction inputs and output.

The tree can be altered both before and after an enumeration, with guardrails in place to notify you of losing a node’s molecules. This offers both step-wise and 1-shot creation of libraries, depending on the user’s preferred mode of operation or needs.

Adding Building Blocks

As suggested in the notification message of Figure 1B, in order for the reaction to run, each leaf node must contain at least one building block that fulfills the necessary criteria. In the current example, there must be at least one amine and one carboxylic acid present.
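As a rough sketch of the check described above, the snippet below reports which leaf nodes still lack a suitable building block. Everything here is an assumption made for illustration: the building blocks are hand-annotated with the functional groups they carry, whereas the actual tool detects these from the structures, and the function names are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical, hand-annotated building blocks: name -> functional groups present.
BuildingBlocks = Dict[str, Set[str]]


def leaf_is_ready(blocks: BuildingBlocks, required_group: str) -> bool:
    """A leaf is ready when at least one building block carries the required group."""
    return any(required_group in groups for groups in blocks.values())


def unmet_requirements(leaves: Dict[str, BuildingBlocks]) -> List[str]:
    """Return the required groups whose leaf has no suitable building block."""
    return [group for group, blocks in leaves.items() if not leaf_is_ready(blocks, group)]


# Each leaf is keyed by the functional group it requires.
leaves = {
    "amine": {"benzylamine": {"amine"}, "toluene": set()},
    "carboxylic acid": {"ethanol": {"alcohol"}},
}

problems = unmet_requirements(leaves)
print(problems)  # ['carboxylic acid']
```

In this toy case the amine leaf is satisfied by benzylamine, while the carboxylic acid leaf would still need to be addressed before the enumeration could start.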

Conveniently, building blocks can be added one by one using a drawing tool, imported as a text list or SDF file, or added as an entire molecule set saved to your account (Figure 3). Creating a list of building blocks is, in essence, creating a molecule set specifically designed to be run in the given library, so it can be explored like a traditional Manifold molecule set as well.

Figure 3. View of the building-block editing slide-out menu alongside the library enumeration tree. Users can add molecules to the selected node (C) by four different methods (A): drawing structures individually, entering SMILES strings as text, uploading an SDF or text file of molecules, or adding a molecule set from their account (the currently displayed option). A truncated view of the node’s molecules can then be viewed in this menu, enabling the user to remove some undesired molecules (B). All molecules, if desired, can be visualized in the “set view” as illustrated in Figure 4, Panel III.

Building block sets can then be explored and edited in several modes, annotated in Figure 4, in addition to the view presented in the reactant edit menu of Figure 3B. The library tree is simply an interface to display how nodes of molecules feed into one another, but the meat of the molecule sets can be accessed through interacting with a given node. Figure 4-I concisely annotates each component of a building block node, for example. From the node’s card, you can visualize details of a single molecule at a time (Figure 4-II), or you can dive deep into the set and view, sort, and filter all molecules as a whole (Figure 4-III).

Figure 4. The various views of building block sets for a library. Panel I is the starting point: a reactant node in the tree. This card offers a summary of the building blocks to be used for a given reactant type (I-F), and serves as an interface to explore and edit the set further. You can navigate through the first 100 or so entries in this view (I-E), return to the editing menu of Figure 3 (I-B), view details for a given building block (I-D “View details”, illustrated in Panel II), or view all building blocks in the same view as a Manifold Molecule Set (I-C, illustrated in Panel III). The details view (Panel II) displays purchase information (II-A), properties (II-B), and enables navigation within the window (II-C). The set view (Panel III), by contrast, allows full exploration of the building blocks, bulk deletion through multiple-molecule selection (III-A) and deletion (III-B), and a quick visualization of availability (III-C).

Applying Boolean Logic Filters

One feature that makes the Manifold library enumeration tool particularly useful is the ability to use filters at every step of the tree. These filters are called "boolean logic" filters, because they provide a straightforward, visual way to encode rules and restrictions using the rules of boolean algebra, which would usually require the use of a programming language.

Using these filters, one can define chemical moieties that should either be required or excluded in the building blocks of the library. Furthermore, they can also be used to encode logic such as physicochemical filters on product molecules of the library.

The visual "funnel" symbols indicate where each filter is being applied as the molecules propagate up the tree (Figure 5A). A user can then add filters one at a time in the view, or they can add an entire set of filters saved to their account (Figure 5C).

Figure 5. View of the boolean-filtering slide-out menu alongside the library enumeration tree. Filters can be applied anywhere the filter button (A) is displayed in the tree, allowing filtering of product molecules or building blocks. Filters can be added one by one (B) or a user can add in one of their saved filter sets from their account (C). Filters can be edited before or after the enumeration has completed, or be removed entirely (D), and the corresponding node and any products generated from it will be updated according to the filters. The set of filters can even be saved as a custom set from this view (F).

A simple use of the filters in this case would be to exclude all Bioactivation and Reactive Metabolite alerts, and require a cLogP between 0 and 4 for all product molecules (Figure 5E) yielded from the amide coupling reaction.

After application of the filters, the products which don’t fit within the filter constraints will be flagged (Figure 6A), and can be hidden from view in the set view of all product molecules.

Figure 6. Viewing the molecule details of a filtered product molecule. The alert flag is shown above the image, indicating that this molecule has been filtered out by the user-applied filters from Figure 5.
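For readers curious what the example filter above (exclude an alert set, require a cLogP between 0 and 4) looks like when written as code, here is a minimal sketch. It is not how Manifold implements its filters. The `Product` class, the helper names, and the alert names are hypothetical placeholders, the cLogP values are assumed to be precomputed, and treating the range as inclusive is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class Product:
    name: str
    clogp: float             # assumed to be precomputed elsewhere
    alerts: FrozenSet[str]   # assumed to be precomputed alert names


Predicate = Callable[[Product], bool]


def exclude_alerts(alert_set: FrozenSet[str]) -> Predicate:
    """NOT (molecule triggers any alert in the set)."""
    return lambda p: not (p.alerts & alert_set)


def require_clogp(low: float, high: float) -> Predicate:
    """low <= cLogP <= high (inclusive bounds are an assumption)."""
    return lambda p: low <= p.clogp <= high


def all_of(*predicates: Predicate) -> Predicate:
    """Boolean AND over any number of filters."""
    return lambda p: all(pred(p) for pred in predicates)


ALERTS = frozenset({"alert_a", "alert_b"})  # placeholder names, not a real alert set
passes = all_of(exclude_alerts(ALERTS), require_clogp(0.0, 4.0))

products = [
    Product("P1", 2.1, frozenset()),
    Product("P2", 5.3, frozenset()),
    Product("P3", 1.0, frozenset({"alert_a"})),
]

# Molecules are flagged rather than deleted, mirroring the behavior described above.
filtered_out = {p.name: not passes(p) for p in products}
print(filtered_out)  # {'P1': False, 'P2': True, 'P3': True}
```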

Starting and Observing an Enumeration in Progress

Once an enumeration is started, the molecules will start reacting and propagating up through the tree according to the rules the user has encoded. We currently support libraries of up to 100k product molecules, where at least one product molecule will be created for each combination of building blocks. If no product is formed due to the absence of the required functional group, then the product corresponding to those building blocks will be displayed as “No Product” (Figure 7), to indicate to the user that a bad building block has been supplied.

Figure 7. Example of a “No Product” product molecule. This reaction requires an amine and a carboxylic acid to form a product, and although there is an amine (B), there is not a carboxylic acid in the right building block (C), thus no product is formed (A).
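Before starting a large enumeration, it can help to estimate its size. The sketch below multiplies the number of building blocks in each leaf node to count the combinations and compares the result with the 100k limit quoted above. The function names are hypothetical, and how Manifold itself counts products against the limit is an assumption here. Since each combination yields at least one product, the count is a lower bound on the library size.

```python
from math import prod
from typing import Sequence

MAX_PRODUCTS = 100_000  # library size limit quoted in this guide


def combination_count(leaf_sizes: Sequence[int]) -> int:
    """Number of building-block combinations for a tree with the given leaf sizes."""
    return prod(leaf_sizes)


def within_limit(leaf_sizes: Sequence[int], limit: int = MAX_PRODUCTS) -> bool:
    """True when the combination count does not exceed the limit."""
    return combination_count(leaf_sizes) <= limit


print(combination_count([300, 250]))  # 75000
print(within_limit([300, 250]))       # True
print(within_limit([300, 250, 2]))    # False (150000 combinations)
```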

As the enumeration proceeds, a progress estimate is displayed to the user (Figure 8A), and the enumeration can be canceled at any time, instead of waiting for all products to be generated (Figure 8B). You do not, however, need to wait for the enumeration to complete to observe the product molecules; they can be visualized as usual as the results are returned through any of the aforementioned modes of viewing molecule nodes.

Figure 8. Observing a library enumeration in progress, annotating the important features in the tree-view of the library. A notification indicating that the library is being enumerated is in blue in the bottom left (A), with a progress estimate to indicate remaining time. The user can cancel the enumeration at any point (B) if they would like to make changes. The number of product molecules is indicated on the product node card (C), and the user can scroll through the product molecules during the enumeration with the overlain navigation arrows (D).

Viewing + Exporting Results

Manifold allows for easy export of product molecules to either CSV or SDF format. The products will be annotated with the building blocks needed to form the respective product (Figure 9, orange columns). Through the web application, we currently only support single-page downloads from the set view, where each page has up to 102 molecules. For bulk export of an entire library, you can contact the Manifold team at manifold@postera.ai.

Further annotations are also present for each molecule, as depicted in Figure 9 (purple columns). The “potential selectivity issue” flag indicates that the functional moieties of the building blocks were not unique, so the reaction template could not discriminate between the possible products and all of them were generated. The “filtered out” flag is true for product molecules which don’t conform to the set of boolean filters applied. Lastly, the “has problematic group” flag indicates that the product molecule has a substructure which is deemed problematic for this reaction, so the returned product will likely not form.

Figure 9. Schematic representation of information included in export files pertinent to library enumeration, including building blocks used to form the product (orange columns), and substructure and filter dependent flags (purple columns).
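Once exported, the flag columns make it simple to post-process a library outside of Manifold. The sketch below keeps only the rows where none of the three flags is set. The column names, the "True"/"False" encoding, and the flag values assigned to each row are all assumptions made up for illustration, so check them against the header of your own export.

```python
import csv
import io

# Hypothetical export excerpt; real column names and flag encoding may differ.
EXPORT = """smiles,building_block_1,building_block_2,potential_selectivity_issue,filtered_out,has_problematic_group
CC(=O)NCc1ccccc1,NCc1ccccc1,CC(=O)O,False,False,False
O=C(NCc1ccccc1)c1ccccc1,NCc1ccccc1,O=C(O)c1ccccc1,False,True,False
CC(=O)Nc1ccccc1,Nc1ccccc1,CC(=O)O,False,False,True
"""

FLAG_COLUMNS = ("potential_selectivity_issue", "filtered_out", "has_problematic_group")


def clean_rows(csv_text: str) -> list:
    """Return the rows for which every flag column reads 'False'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if all(row[col] == "False" for col in FLAG_COLUMNS)]


kept = clean_rows(EXPORT)
print([row["smiles"] for row in kept])  # ['CC(=O)NCc1ccccc1']
```

For a file on disk, the same function works on the text returned by reading the exported CSV.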

Appendix

A. Molecule Sets in Library Enumeration

Molecule sets allow users to aggregate molecules across Manifold searches and internal inventories and can serve as plug-and-play modules to quickly build out libraries. Molecule sets can be created from any Manifold search, or from your account dashboard, as shown in Figure A1.

Figure A1. User’s account-bound molecule sets, viewed from the account dashboard view. Navigate to Molecule Sets from the dashboard menu (A), where all sets can be viewed and accessed, or a new set can be created (C). As illustrated in the zoom-in on the left, an individual card in this view reveals the number of molecules in the set, and allows the user to share the set, copy it to a new set, or delete it entirely.

A Manifold library is just a combination of molecule sets that produces new molecule sets. Building block sets and product molecules in an enumeration tree can therefore be explored in the same way as any other molecule set.

In the tree view of a library, you can access all product molecules as a molecule set by clicking on the view all button in the top-right of its card (Figure A2A).

Figure A2. A completed library enumeration, displaying 3.3k product molecules. Clicking on the top right fullscreen button of the product node (A), the user can visualize all product molecules in full.

The view of all of the molecules presented to you (Figure A3) is just as you would view a molecule set from your account dashboard, equipped with pagination to view all results, as well as sort, filter, and export functionalities.

Figure A3. Full view of the library product molecules as a set. Like the usual molecule set view, the user can export the product molecules (A), page through all of the products (B), or return to the previous tree view (C). Clicking on an individual product’s details menu (D) will display more information as shown in Figure A4.

You can explore more details for an individual product molecule (Figure A4) by clicking on the "View Details" option for a given molecule (Figure A3D), which reveals a familiar view from exploring building block sets earlier (Figure 4).

Figure A4. Details pop-up for a single product molecule. The info tab (A) displays purchase information and basic properties, whereas the library enumeration tab (B) displays the building blocks used to form the product molecule.

Sorting functionality is exposed in the set view as well, be that cLogP, MW, or a synthetic accessibility score (Figure A5).

Figure A5. Sort menu (A) for molecule set view. The user can sort by original order, cLogP, molecular weight, or synthetic accessibility score (B), in ascending or descending order (C).

On top of the boolean filtering exposed to the user for a library, in set view, further filtering (Figures A6 + A7) is available by synthetic accessibility score, purchase information, match type, as well as the library-specific fields annotated in Figure 9.

Figure A6. Filter menu (A) for molecule set view. Filters are available to filter based on properties and synthetic accessibility score (B), vendors and databases (C), lead time and price (D).

Figure A7. Continuation of the filter menu (A) for the molecule set view. Filters are also available for catalog type (only building blocks, only screening molecules, or only purchasable) (B), types of matches (C), and library enumeration specific fields (D).

B. Boolean Filtering and Filter Sets

Boolean filters applied to a library node will flag every molecule of that node that does not conform to the filter constraints, along with any product molecules formed from it. Further, a summary of total filtered-out molecules is displayed at the top of a node's card (Figure B1C).

Figure B1. Illustration of post-enumeration application of filters. The product node has been filtered to exclude molecules which trigger the Bioactivation and Reactive Metabolite alert set (B). The filter button at the product node (A) indicates that one filter has been applied. The newly filtered product count has the number of excluded molecules in red (C), with a notification icon for each molecule which has been excluded.
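The propagation rule described above, where a flagged building block also flags every product formed from it, can be sketched as follows. The function and the molecule names are hypothetical, and this is an illustration of the rule rather than Manifold's code.

```python
from typing import Dict, List, Set, Tuple


def propagate_flags(flagged_blocks: Set[str], products: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Return the products that were formed from at least one flagged building block."""
    return [name for name, parents in products.items() if flagged_blocks & set(parents)]


# Product name -> building blocks used to form it (placeholder names).
products = {
    "P1": ("amine_1", "acid_1"),
    "P2": ("amine_1", "acid_2"),
    "P3": ("amine_2", "acid_1"),
}

flagged = propagate_flags({"acid_1"}, products)
print(flagged)  # ['P1', 'P3']
print(f"{len(products) - len(flagged)} kept, {len(flagged)} filtered out")  # 1 kept, 2 filtered out
```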

As indicated in Figure 5F, filters can be saved as a set to your account. These saved filter sets (Figure B2) allow users to group filters so that they need not repeatedly re-enter the same routine filter constraints. This also allows for uniform filtering across Manifold searches, and the ability to share filter sets with others for ease of collaboration.

Figure B2. User’s account-bound filter sets, viewed from the account dashboard view. Navigate to Filter Sets from the dashboard menu (A), where all sets can be viewed and accessed, or a new set can be created (C). As illustrated in the zoom-in on the left, an individual card in this view reveals the number of filters in the set, and allows the user to share the set, copy it to a new set, or delete it entirely.
