

Base Config object for use with Molecule Graph Construction.


Allowable atom types for nodes in the graph.

alias of Literal['C', 'H', 'O', 'N', 'F', 'P', 'S', 'Cl', 'Br', 'I', 'B']

class graphein.molecule.config.MoleculeGraphConfig(*, verbose: bool = False, add_hs: bool = False, edge_construction_functions: typing.List[typing.Union[typing.Callable, str]] = [<function add_fully_connected_edges>, <function add_k_nn_edges>, <function add_distance_threshold>, <function add_atom_bonds>], node_metadata_functions: typing.List[typing.Union[typing.Callable, str]] = [<function atom_type_one_hot>], edge_metadata_functions: typing.List[typing.Union[typing.Callable, str]] = None, graph_metadata_functions: typing.List[typing.Callable] = None)[source]#

Config Object for Molecule Structure Graph Construction.

  • verbose (bool) – Specifies verbosity of graph creation process.

  • add_hs (bool) – Specifies whether hydrogens should be added to the graph.

  • edge_construction_functions (List[Callable]) – List of functions that take an nx.Graph and return an nx.Graph with desired edges added. Prepared edge constructions can be found in graphein.molecule.edges.

  • node_metadata_functions (List[Callable], optional) – List of functions that take an nx.Graph

  • edge_metadata_functions (List[Callable], optional) – List of functions that take an

  • graph_metadata_functions (List[Callable], optional) – List of functions that take an nx.Graph and return an nx.Graph with added graph-level features and metadata.
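The contract described for edge_construction_functions (each function takes a graph and returns it with edges added) can be illustrated with a small sketch. It uses a plain dictionary as a stand-in for nx.Graph, and the names add_chain_edges, add_closing_edge and apply_edge_functions are hypothetical, not part of Graphein. How Graphein itself applies the list is not documented here, so applying the functions in list order is an assumption.

```python
from typing import Any, Callable, Dict, List

# Stand-in for nx.Graph: a node list plus a set of undirected edges.
Graph = Dict[str, Any]


def add_chain_edges(g: Graph) -> Graph:
    """Hypothetical edge function: connect consecutive nodes."""
    nodes = g["nodes"]
    for a, b in zip(nodes, nodes[1:]):
        g["edges"].add((a, b))
    return g


def add_closing_edge(g: Graph) -> Graph:
    """Hypothetical edge function: connect the first and last node."""
    nodes = g["nodes"]
    if len(nodes) > 2:
        g["edges"].add((nodes[0], nodes[-1]))
    return g


def apply_edge_functions(g: Graph, funcs: List[Callable[[Graph], Graph]]) -> Graph:
    """Apply each graph -> graph function in list order (assumed behaviour)."""
    for func in funcs:
        g = func(g)
    return g


graph = {"nodes": ["C:0", "C:1", "O:2"], "edges": set()}
graph = apply_edge_functions(graph, [add_chain_edges, add_closing_edge])
```

Because every function returns the graph it received, the same list can be reordered or extended without changing the calling code.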


Functions for working with Small Molecule Graphs.

graphein.molecule.graphs.add_nodes_to_graph(G: networkx.classes.graph.Graph, verbose: bool = False) networkx.classes.graph.Graph[source]#

Add nodes into molecule graph.

  • G (nx.Graph) – nx.Graph with metadata to populate with nodes.

  • verbose (bool) – Controls verbosity of this step.


nx.Graph with nodes added.

graphein.molecule.graphs.construct_graph(config: Optional[graphein.molecule.config.MoleculeGraphConfig] = None, sdf_path: Optional[str] = None, smiles: Optional[str] = None, mol2_path: Optional[str] = None, pdb_path: Optional[str] = None, edge_construction_funcs: Optional[str] = None, edge_annotation_funcs: Optional[List[Callable]] = None, node_annotation_funcs: Optional[List[Callable]] = None, graph_annotation_funcs: Optional[List[Callable]] = None) networkx.classes.graph.Graph[source]#

Constructs a molecule structure graph from an sdf_path, mol2_path, pdb_path or smiles.

Users can provide a MoleculeGraphConfig object to specify construction parameters.

However, config parameters can be overridden by passing arguments directly to the function.

  • config (graphein.molecule.config.MoleculeGraphConfig, optional) – MoleculeGraphConfig object. If None, defaults to config in graphein.molecule.config.

  • sdf_path (str, optional) – Path to sdf_file to build graph from. Default is None.

  • smiles (str, optional) – smiles string to build graph from. Default is None.

  • mol2_path (str, optional) – Path to mol2_file to build graph from. Default is None.

  • pdb_path (str, optional) – Path to pdb_file to build graph from. Default is None.

  • edge_construction_funcs (List[Callable], optional) – List of edge construction functions. Default is None.

  • edge_annotation_funcs (List[Callable], optional) – List of edge annotation functions. Default is None.

  • node_annotation_funcs (List[Callable], optional) – List of node annotation functions. Default is None.

  • graph_annotation_funcs (List[Callable], optional) – List of graph annotation functions. Default is None.


Molecule Structure Graph
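The statement that config parameters can be overridden by direct arguments can be pictured as follows. This is a minimal sketch of the override idea only: SketchConfig and resolve_config are hypothetical names, SketchConfig mirrors just two of the documented fields, and treating None as "not supplied" is an assumption rather than a description of how construct_graph is implemented.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for MoleculeGraphConfig with two of its fields."""

    verbose: bool = False
    add_hs: bool = False


def resolve_config(config: Optional[SketchConfig] = None, **overrides: Any) -> SketchConfig:
    """Start from the given (or default) config, then apply non-None overrides."""
    base = config if config is not None else SketchConfig()
    supplied = {key: value for key, value in overrides.items() if value is not None}
    return replace(base, **supplied)


# verbose=None is ignored, so the value from the config object is kept.
cfg = resolve_config(SketchConfig(verbose=True), add_hs=True, verbose=None)
```

Here cfg keeps verbose=True from the config object and picks up add_hs=True from the direct argument.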



graphein.molecule.graphs.initialise_graph_with_metadata(name: str, rdmol: rdkit.Mol, coords: np.ndarray) nx.Graph[source]#

Initializes the nx Graph object with initial metadata.

  • name (str) – Name of the molecule. Either the smiles or filename depending on how the graph was created.

  • rdmol (rdkit.Mol) – RDKit Mol object of the molecule structure.


Returns initial molecule structure graph with metadata.

Functions for computing biochemical edges of graphs.

graphein.molecule.edges.distance.add_distance_threshold(G: networkx.classes.graph.Graph, threshold: float = 5.0)[source]#

Adds edges to any nodes within a given distance of each other.

  • G (nx.Graph) – Molecule structure graph to add distance edges to.

  • threshold (float) – Distance in angstroms, below which two nodes are connected.


Graph with distance-based edges added.

graphein.molecule.edges.distance.add_fully_connected_edges(G: networkx.classes.graph.Graph)[source]#

Adds fully connected edges to nodes.


G (nx.Graph) – Molecule structure graph to add distance edges to.
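One reading of "fully connected" is that every unordered pair of distinct nodes receives an edge. The sketch below shows that reading with the standard library only. The node identifiers are made up, and whether Graphein also adds self-loops is not stated in this reference.

```python
from itertools import combinations

# Hypothetical node identifiers for a four-atom molecule.
nodes = ["C:0", "C:1", "O:2", "H:3"]

# Every unordered pair of distinct nodes, i.e. n * (n - 1) / 2 edges.
edges = list(combinations(nodes, 2))
```

For n nodes this yields n(n-1)/2 edges, so the edge count grows quadratically with molecule size.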

graphein.molecule.edges.distance.add_k_nn_edges(G: networkx.classes.graph.Graph, k: int = 1, mode: str = 'connectivity', metric: str = 'minkowski', p: int = 2, include_self: Union[bool, str] = False)[source]#

Adds edges to nodes based on K nearest neighbours.

  • G (nx.Graph) – Molecule structure graph to add distance edges to.

  • k (int) – Number of neighbors for each sample.

  • mode (str) – Type of returned matrix: "connectivity" will return the connectivity matrix with ones and zeros, and "distance" will return the distances between neighbors according to the given metric.

  • metric (str) – The distance metric used to calculate the k-Neighbors for each sample point. The DistanceMetric class gives a list of available metrics. The default distance is "euclidean" ("minkowski" metric with the p param equal to 2).

  • p (int) – Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used. Default is 2 (euclidean).

  • include_self (Union[bool, str]) – Whether or not to mark each sample as the first nearest neighbor to itself. If "auto", then True is used for mode="connectivity" and False for mode="distance". Default is False.


Graph with knn-based edges added.
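To make the parameters concrete, here is a standard-library sketch of k-nearest-neighbour edge selection with a Minkowski metric. The function knn_edges is hypothetical and does not call Graphein. It excludes each point from its own neighbour list, which corresponds to include_self=False, and returns undirected index pairs rather than a connectivity matrix.

```python
from typing import List, Sequence, Set, Tuple

Point = Sequence[float]


def knn_edges(coords: List[Point], k: int = 1, p: float = 2) -> Set[Tuple[int, int]]:
    """Return undirected edges linking each point to its k nearest neighbours."""

    def minkowski(a: Point, b: Point) -> float:
        # p = 1 gives Manhattan distance, p = 2 gives Euclidean distance.
        return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

    edges: Set[Tuple[int, int]] = set()
    for i, ci in enumerate(coords):
        # Candidate neighbours exclude the point itself.
        others = [j for j in range(len(coords)) if j != i]
        others.sort(key=lambda j: minkowski(ci, coords[j]))
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


# Four hypothetical atoms placed along the x axis.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
edges = knn_edges(coords, k=1)
```

Note that the k-NN relation is not symmetric: atom 3 selects atom 2 as its nearest neighbour even though atom 2 selects atom 1, so the undirected edge set can give a node more than k edges.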

graphein.molecule.edges.distance.compute_distmat(coords: numpy.ndarray) numpy.ndarray[source]#

Compute pairwise euclidean distances between every atom.

Design choice: the coordinates are passed in directly to enable easier testing on dummy data.


coords (np.ndarray) – Array of atom coordinates for the molecule structure.


np.ndarray of euclidean distance matrix.
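As an illustration of what a pairwise Euclidean distance matrix contains, the sketch below builds one from a few made-up coordinates using only the standard library. The helper distance_matrix is hypothetical and returns nested lists rather than the np.ndarray that compute_distmat returns.

```python
import math
from typing import List, Sequence


def distance_matrix(coords: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise Euclidean distances as a symmetric nested list."""
    return [[math.dist(a, b) for b in coords] for a in coords]


# Three hypothetical atom positions in angstroms.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
distmat = distance_matrix(coords)
```

Entry [i][j] holds the distance between atoms i and j, so the matrix is symmetric with zeros on the diagonal.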

graphein.molecule.edges.distance.get_interacting_atoms(angstroms: float, distmat: numpy.ndarray) numpy.ndarray[source]#

Find the atoms that are within a particular radius of one another.

  • angstroms (float) – Radius in angstroms.

  • distmat (np.ndarray) – Distance matrix.


Array of interacting atoms
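The radius query can be sketched as a filter over a distance matrix. The helper pairs_within below is hypothetical. It returns each pair once with i < j and treats the radius as inclusive, both of which are assumptions, since the exact layout of the array returned by get_interacting_atoms is not specified in this reference.

```python
from typing import List, Sequence, Tuple


def pairs_within(radius: float, distmat: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) with i < j whose distance is at most radius."""
    n = len(distmat)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if distmat[i][j] <= radius
    ]


# Hypothetical three-atom distance matrix in angstroms.
distmat = [
    [0.0, 1.5, 6.0],
    [1.5, 0.0, 4.8],
    [6.0, 4.8, 0.0],
]
close = pairs_within(5.0, distmat)
```

With a 5.0 angstrom radius, atoms 0 and 2 (6.0 angstroms apart) are not paired, while the other two pairs are.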

Functions for computing atomic structure of molecules.

graphein.molecule.edges.atomic.add_atom_bonds(G: networkx.classes.graph.Graph) networkx.classes.graph.Graph[source]#

Adds atomic bonds to a molecular graph.


G (nx.Graph) – Molecular graph to add atomic bond edges to.


Molecular graph with atomic bonds added.

Functions for featurising Small Molecule Graphs.

graphein.molecule.features.nodes.atom_type.atom_type_one_hot(n, d: Dict[str, Any], return_array: bool = True, allowable_set: Optional[List[str]] = None) numpy.ndarray[source]#

Adds a one-hot encoding of atom types as a node attribute.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data.

  • return_array (bool) – If True, returns a numpy np.ndarray of one-hot encoding, otherwise returns a pd.Series. Default is True.

  • allowable_set (Optional[List[str]]) – Specifies vocabulary of atom types. Default is None (which uses graphein.molecule.atoms.BASE_ATOMS).


One-hot encoding of atom types.

Return type

Union[pd.Series, np.ndarray]
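A one-hot encoding over an atom vocabulary can be sketched without Graphein as shown below. The BASE_ATOMS list is copied from the value documented for graphein.molecule.atoms.BASE_ATOMS, while the helper one_hot is hypothetical. Raising an error for an element outside the vocabulary is an assumption; this reference does not say how atom_type_one_hot handles that case.

```python
from typing import List, Optional

# Copied from graphein.molecule.atoms.BASE_ATOMS as listed in this reference.
BASE_ATOMS = ["C", "H", "O", "N", "F", "P", "S", "Cl", "Br", "I", "B"]


def one_hot(element: str, allowable_set: Optional[List[str]] = None) -> List[int]:
    """Return a 0/1 vector with a single 1 at the element's vocabulary index."""
    vocab = BASE_ATOMS if allowable_set is None else allowable_set
    if element not in vocab:
        # Assumed behaviour for this sketch only.
        raise ValueError(f"{element!r} is not in the allowable set")
    return [int(element == symbol) for symbol in vocab]


encoding = one_hot("O")
```

Oxygen is the third entry of the vocabulary, so the 1 appears at index 2 of an 11-element vector.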

graphein.molecule.features.nodes.atom_type.atomic_mass(n: str, d: Dict[str, Any]) float[source]#

Adds mass of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Mass of the atom.

graphein.molecule.features.nodes.atom_type.chiral_tag(n: str, d: Dict[str, Any]) rdkit.Chem.rdchem.ChiralType[source]#

Adds indicator of atom chirality to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Indicator of atom chirality.

Return type

rdkit.Chem.rdchem.ChiralType

graphein.molecule.features.nodes.atom_type.degree(n: str, d: Dict[str, Any]) int[source]#

Adds the degree of the node to the node data.

N.B. this is the degree as defined by RDKit rather than the ‘true’ degree of the node in the graph. For the latter, use

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Degree of the atom.

graphein.molecule.features.nodes.atom_type.explicit_valence(n: str, d: Dict[str, Any]) int[source]#

Adds explicit valence of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Explicit valence of the atom.

graphein.molecule.features.nodes.atom_type.formal_charge(n: str, d: Dict[str, Any]) int[source]#

Adds the formal charge of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Formal charge of the atom.

graphein.molecule.features.nodes.atom_type.hybridization(n: str, d: Dict[str, Any]) rdkit.Chem.rdchem.HybridizationType[source]#

Adds the hybridization of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Hybridization of the atom.

graphein.molecule.features.nodes.atom_type.implicit_valence(n: str, d: Dict[str, Any]) int[source]#

Adds implicit valence of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Implicit valence of the atom.

graphein.molecule.features.nodes.atom_type.is_aromatic(n: str, d: Dict[str, Any]) bool[source]#

Adds indicator of aromaticity of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Indicator of aromaticity of the atom.

graphein.molecule.features.nodes.atom_type.is_isotope(n: str, d: Dict[str, Any]) int[source]#

Adds indicator of whether or not the atom is an isotope to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Indicator of whether or not the atom is an isotope.

graphein.molecule.features.nodes.atom_type.is_ring(n: str, d: Dict[str, Any]) bool[source]#

Adds indicator of ring membership of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Indicator of ring membership of the atom.

graphein.molecule.features.nodes.atom_type.is_ring_size(n: str, d: Dict[str, Any], ring_size: int) bool[source]#

Adds indicator of ring membership of size ring_size of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data.

  • ring_size (int) – The size of the ring to look for.


Indicator of ring membership of size ring_size of the atom.
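Unlike the other node feature functions, is_ring_size takes a third argument. If a caller expects callables of the form f(n, d), which is an assumption based on the signature shared by the other functions in this module, the extra argument can be bound ahead of time with functools.partial. The sketch uses a hypothetical stand-in, ring_size_flag, and a made-up "ring_sizes" key in the node data, so it runs without RDKit.

```python
from functools import partial
from typing import Any, Dict


def ring_size_flag(n: str, d: Dict[str, Any], ring_size: int) -> bool:
    """Hypothetical stand-in with the same (n, d, ring_size) shape as is_ring_size."""
    return ring_size in d.get("ring_sizes", ())


# Bind the extra argument so the callable accepts just (n, d).
in_six_ring = partial(ring_size_flag, ring_size=6)

# Hypothetical node data for a carbon in a six-membered ring.
node_data = {"ring_sizes": (6,)}
flag = in_six_ring("C:0", node_data)
```

The same pattern applies to any function in this reference that takes parameters beyond the shared positional ones, such as k for add_k_nn_edges.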

graphein.molecule.features.nodes.atom_type.num_explicit_h(n: str, d: Dict[str, Any]) int[source]#

Adds the number of explicit Hydrogens of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Number of explicit Hydrogens of the atom.

graphein.molecule.features.nodes.atom_type.num_implicit_h(n: str, d: Dict[str, Any]) int[source]#

Adds the number of implicit Hydrogens of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Number of implicit Hydrogens of the atom.

graphein.molecule.features.nodes.atom_type.num_radical_electrons(n: str, d: Dict[str, Any]) int[source]#

Adds the number of radical electrons of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Number of radical electrons of the atom.

graphein.molecule.features.nodes.atom_type.total_degree(n: str, d: Dict[str, Any]) int[source]#

Adds the total degree of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data.


Total degree of the atom.

graphein.molecule.features.nodes.atom_type.total_num_h(n: str, d: Dict[str, Any]) int[source]#

Adds the total number of Hydrogens of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data


Total number of Hydrogens of the atom.

graphein.molecule.features.nodes.atom_type.total_valence(n: str, d: Dict[str, Any]) int[source]#

Adds the total valence of the atom to the node data.

  • n (str) – Node name, this is unused and only included for compatibility with the other functions.

  • d (Dict[str, Any]) – Node data.


Total valence of the atom.

Functions for computing atomic features for molecules.

graphein.molecule.features.edges.bonds.add_bond_type(u: str, v: str, d: Dict[str, Any]) rdkit.Chem.rdchem.BondType[source]#

Adds bond type as an edge feature to the graph.

  • u (str) – First node in the edge.

  • v (str) – Second node in the edge.

  • d (Dict[str, Any]) – Dictionary of edge metadata.


Returns the bond type.

graphein.molecule.features.edges.bonds.bond_is_aromatic(u: str, v: str, d: Dict[str, Any]) bool[source]#

Adds indicator of aromaticity of a bond to the graph as an edge feature.

  • u (str) – First node in the edge.

  • v (str) – Second node in the edge.

  • d (Dict[str, Any]) – Dictionary of edge metadata.


Returns indicator of aromaticity of bond.

graphein.molecule.features.edges.bonds.bond_is_conjugated(u: str, v: str, d: Dict[str, Any]) bool[source]#

Adds indicator of conjugated bond to the graph as an edge feature.

  • u (str) – First node in the edge.

  • v (str) – Second node in the edge.

  • d (Dict[str, Any]) – Dictionary of edge metadata.


Returns indicator of conjugated bond.

graphein.molecule.features.edges.bonds.bond_is_in_ring(u: str, v: str, d: Dict[str, Any]) bool[source]#

Adds indicator of ring membership to the graph as an edge feature.

  • u (str) – First node in the edge.

  • v (str) – Second node in the edge.

  • d (Dict[str, Any]) – Dictionary of edge metadata.


Returns indicator of ring membership of bond.

graphein.molecule.features.edges.bonds.bond_is_in_ring_size(u: str, v: str, d: Dict[str, Any], ring_size: int) int[source]#

Adds indicator of ring membership of size ring_size to the graph as an edge feature.

  • u (str) – First node in the edge.

  • v (str) – Second node in the edge.

  • d (Dict[str, Any]) – Dictionary of edge metadata.

  • ring_size (int) – Size of the ring to look for.


Returns indicator of ring membership of size ring_size for the bond.

graphein.molecule.features.edges.bonds.bond_stereo(u: str, v: str, d: Dict[str, Any]) rdkit.Chem.rdchem.BondStereo[source]#

Adds bond stereo configuration as an edge feature to the graph.

  • u (str) – First node in the edge.

  • v (str) – Second node in the edge.

  • d (Dict[str, Any]) – Dictionary of edge metadata.


Returns the bond stereo.

Functions for featurising Small Molecule Graphs.

graphein.molecule.features.graph.molecule.mol_descriptors(g: networkx.classes.graph.Graph, descriptor_list: Optional[List[str]] = None, return_array: bool = False, return_series: bool = False) Union[numpy.ndarray, pandas.core.series.Series, Dict[str, Union[float, int]]][source]#

Adds global molecular descriptors to the graph.

  • g (nx.Graph) – The graph to add the descriptors to.

  • descriptor_list (Optional[List[str]]) – The list of descriptors to add. If None, all descriptors are added.

  • return_array (bool) – If True, the descriptors are returned as a np.ndarray.

  • return_series (bool) – If True, the descriptors are returned as a pd.Series.


The descriptors as a dictionary (default), np.ndarray or pd.Series.

Union[np.ndarray, pd.Series, Dict[str, Union[float, int]]]


Functions for featurising Small Molecule Graphs.

Plotting functions for molecules wrap the methods defined on protein graphs and provide sane defaults.

graphein.molecule.visualisation.plot_molecular_graph(G: nx.Graph, angle: int = 30, plot_title: Optional[str] = None, figsize: Tuple[int, int] = (10, 7), node_alpha: float = 0.7, node_size_min: float = 20.0, node_size_multiplier: float = 1, label_node_ids: bool = True, node_colour_map: = <matplotlib.colors.ListedColormap object>, edge_color_map: = <matplotlib.colors.ListedColormap object>, colour_nodes_by: str = 'element', colour_edges_by: str = 'kind', edge_alpha: float = 0.5, plot_style: str = 'ggplot', out_path: Optional[str] = None, out_format: str = '.png') Axes3D[source]#

Plots molecular graph in Axes3D.

  • G (nx.Graph) – Molecular graph to plot.

  • angle (int) – View angle. Defaults to 30.

  • plot_title (str, optional) – Title of plot. Defaults to None.

  • figsize (Tuple[int, int]) – Size of figure, defaults to (10, 7).

  • node_alpha (float) – Controls node transparency, defaults to 0.7.

  • node_size_min (float) – Specifies node minimum size, defaults to 20.

  • node_size_multiplier (float) – Scales node size by a constant. Node sizes reflect degree. Defaults to 1.

  • label_node_ids (bool) – bool indicating whether or not to plot node_id labels. Defaults to True.

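  • node_colour_map – Colour map to use for nodes. Defaults to the matplotlib.colors.ListedColormap object shown in the signature.

  • edge_color_map – Colour map to use for edges. Defaults to the matplotlib.colors.ListedColormap object shown in the signature.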

  • colour_nodes_by (str) – Specifies how to colour nodes. "degree", "seq_position" or a node feature.

  • colour_edges_by (str) – Specifies how to colour edges. Currently only "kind" is supported.

  • edge_alpha (float) – Controls edge transparency. Defaults to 0.5.

  • plot_style (str) – matplotlib style sheet to use. Defaults to "ggplot".

  • out_path (str, optional) – If not none, writes plot to this location. Defaults to None (does not save).

  • out_format (str) – File format to use for the plot. Defaults to ".png".


matplotlib Axes3D object.

graphein.molecule.visualisation.plotly_molecular_graph(g: nx.Graph, plot_title: Optional[str] = None, figsize: Tuple[int, int] = (620, 650), node_alpha: float = 0.7, node_size_min: float = 20, node_size_multiplier: float = 1.0, label_node_ids: bool = True, node_color_map: = <matplotlib.colors.ListedColormap object>, edge_color_map: = <matplotlib.colors.ListedColormap object>, colour_nodes_by: str = 'element', colour_edges_by: str = 'kind') go.Figure[source]#

Plots molecular graph using plotly.

  • g (nx.Graph) – Molecular graph to plot.

  • plot_title (str, optional) – Title of plot, defaults to None.

  • figsize (Tuple[int, int]) – Size of figure, defaults to (620, 650).

  • node_alpha (float) – Controls node transparency, defaults to 0.7.

  • node_size_min (float) – Specifies node minimum size. Defaults to 20.0.

  • node_size_multiplier (float) – Scales node size by a constant. Node sizes reflect degree. Defaults to 1.0.

  • label_node_ids (bool) – bool indicating whether or not to plot node_id labels. Defaults to True.

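  • node_color_map – Colour map to use for nodes. Defaults to the matplotlib.colors.ListedColormap object shown in the signature.

  • edge_color_map – Colour map to use for edges. Defaults to the matplotlib.colors.ListedColormap object shown in the signature.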

  • colour_nodes_by (str) – Specifies how to colour nodes. "degree", or a node feature. Defaults to "element".

  • colour_edges_by (str) – Specifies how to colour edges. Currently only "kind" is supported.


Plotly Graph Objects plot

Author: Eric J. Ma, Arian Jamasb

Purpose: This is a set of utility variables and functions related to small molecules that can be used across the Graphein project.

These include various collections of standard atom types used in molecule-focussed ML.

graphein.molecule.atoms.ALLOWED_BOND_TYPES: List[rdkit.Chem.rdchem.BondType] = [rdkit.Chem.rdchem.BondType.SINGLE, rdkit.Chem.rdchem.BondType.DOUBLE, rdkit.Chem.rdchem.BondType.TRIPLE, rdkit.Chem.rdchem.BondType.AROMATIC]#

Vocabulary of allowed bond types.

graphein.molecule.atoms.ALLOWED_BOND_TYPE_TO_CHANNEL: Dict[rdkit.Chem.rdchem.BondType, int] = {rdkit.Chem.rdchem.BondType.SINGLE: 0, rdkit.Chem.rdchem.BondType.DOUBLE: 1, rdkit.Chem.rdchem.BondType.TRIPLE: 2, rdkit.Chem.rdchem.BondType.AROMATIC: 3}#

Mapping of bond types to integer values.

graphein.molecule.atoms.ALLOWED_DEGREES: List[int] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]#

Vocabulary of allowed atom degrees.

graphein.molecule.atoms.ALLOWED_HYBRIDIZATIONS: List[rdkit.Chem.rdchem.HybridizationType] = [rdkit.Chem.rdchem.HybridizationType.SP, rdkit.Chem.rdchem.HybridizationType.SP2, rdkit.Chem.rdchem.HybridizationType.SP3, rdkit.Chem.rdchem.HybridizationType.SP3D, rdkit.Chem.rdchem.HybridizationType.SP3D2]#

Vocabulary of allowed hybridizations.

graphein.molecule.atoms.ALLOWED_NUM_H: List[int] = [0, 1, 2, 3, 4]#

Vocabulary of allowed number of Hydrogens.

graphein.molecule.atoms.ALLOWED_VALENCES: List[int] = [0, 1, 2, 3, 4, 5, 6]#

Vocabulary of allowed atom valences.

graphein.molecule.atoms.ALL_BOND_TYPES: List[rdkit.Chem.rdchem.BondType] = [rdkit.Chem.rdchem.BondType.AROMATIC, rdkit.Chem.rdchem.BondType.DATIVE, rdkit.Chem.rdchem.BondType.DATIVEL, rdkit.Chem.rdchem.BondType.DATIVER, rdkit.Chem.rdchem.BondType.DOUBLE, rdkit.Chem.rdchem.BondType.FIVEANDAHALF, rdkit.Chem.rdchem.BondType.FOURANDAHALF, rdkit.Chem.rdchem.BondType.HEXTUPLE, rdkit.Chem.rdchem.BondType.HYDROGEN, rdkit.Chem.rdchem.BondType.IONIC, rdkit.Chem.rdchem.BondType.ONEANDAHALF, rdkit.Chem.rdchem.BondType.OTHER, rdkit.Chem.rdchem.BondType.QUADRUPLE, rdkit.Chem.rdchem.BondType.QUINTUPLE, rdkit.Chem.rdchem.BondType.SINGLE, rdkit.Chem.rdchem.BondType.THREEANDAHALF, rdkit.Chem.rdchem.BondType.THREECENTER, rdkit.Chem.rdchem.BondType.TRIPLE, rdkit.Chem.rdchem.BondType.TWOANDAHALF, rdkit.Chem.rdchem.BondType.UNSPECIFIED, rdkit.Chem.rdchem.BondType.ZERO]#

Vocabulary of all RDKit BondTypes.

graphein.molecule.atoms.ALL_BOND_TYPES_TO_CHANNEL: Dict[rdkit.Chem.rdchem.BondType, int] = {rdkit.Chem.rdchem.BondType.UNSPECIFIED: 19, rdkit.Chem.rdchem.BondType.SINGLE: 14, rdkit.Chem.rdchem.BondType.DOUBLE: 4, rdkit.Chem.rdchem.BondType.TRIPLE: 17, rdkit.Chem.rdchem.BondType.QUADRUPLE: 12, rdkit.Chem.rdchem.BondType.QUINTUPLE: 13, rdkit.Chem.rdchem.BondType.HEXTUPLE: 7, rdkit.Chem.rdchem.BondType.ONEANDAHALF: 10, rdkit.Chem.rdchem.BondType.TWOANDAHALF: 18, rdkit.Chem.rdchem.BondType.THREEANDAHALF: 15, rdkit.Chem.rdchem.BondType.FOURANDAHALF: 6, rdkit.Chem.rdchem.BondType.FIVEANDAHALF: 5, rdkit.Chem.rdchem.BondType.AROMATIC: 0, rdkit.Chem.rdchem.BondType.IONIC: 9, rdkit.Chem.rdchem.BondType.HYDROGEN: 8, rdkit.Chem.rdchem.BondType.THREECENTER: 16, rdkit.Chem.rdchem.BondType.DATIVE: 1, rdkit.Chem.rdchem.BondType.DATIVEL: 2, rdkit.Chem.rdchem.BondType.DATIVER: 3, rdkit.Chem.rdchem.BondType.OTHER: 11, rdkit.Chem.rdchem.BondType.ZERO: 20}#

Vocabulary of all RDKit BondTypes mapped to integer values.
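In the values listed above, each bond type's integer in ALL_BOND_TYPES_TO_CHANNEL coincides with its position in ALL_BOND_TYPES. The sketch below reproduces that relationship with plain strings standing in for the rdkit.Chem.rdchem.BondType members, so it runs without RDKit. The names ALL_BOND_TYPE_NAMES, name_to_channel and one_hot_bond are hypothetical, and building the mapping with enumerate is an observation about the listed values rather than a claim about how Graphein constructs it.

```python
from typing import List

# Bond type names in the order listed for ALL_BOND_TYPES in this reference.
ALL_BOND_TYPE_NAMES = [
    "AROMATIC", "DATIVE", "DATIVEL", "DATIVER", "DOUBLE", "FIVEANDAHALF",
    "FOURANDAHALF", "HEXTUPLE", "HYDROGEN", "IONIC", "ONEANDAHALF", "OTHER",
    "QUADRUPLE", "QUINTUPLE", "SINGLE", "THREEANDAHALF", "THREECENTER",
    "TRIPLE", "TWOANDAHALF", "UNSPECIFIED", "ZERO",
]

# Each name maps to its position in the vocabulary.
name_to_channel = {name: i for i, name in enumerate(ALL_BOND_TYPE_NAMES)}


def one_hot_bond(name: str) -> List[int]:
    """One-hot vector over the bond type channels."""
    vec = [0] * len(name_to_channel)
    vec[name_to_channel[name]] = 1
    return vec


single = one_hot_bond("SINGLE")
```

A channel index of this kind is what lets a categorical bond type be turned into a fixed-length edge feature vector.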

graphein.molecule.atoms.ALL_STEREO_TO_CHANNEL: Dict[rdkit.Chem.rdchem.BondStereo, int] = {rdkit.Chem.rdchem.BondStereo.STEREONONE: 3, rdkit.Chem.rdchem.BondStereo.STEREOANY: 0, rdkit.Chem.rdchem.BondStereo.STEREOZ: 5, rdkit.Chem.rdchem.BondStereo.STEREOE: 2, rdkit.Chem.rdchem.BondStereo.STEREOCIS: 1, rdkit.Chem.rdchem.BondStereo.STEREOTRANS: 4}#

Vocabulary of all RDKit bond stereo types mapped to integer values.

graphein.molecule.atoms.ALL_STEREO_TYPES: List[rdkit.Chem.rdchem.BondStereo] = [rdkit.Chem.rdchem.BondStereo.STEREOANY, rdkit.Chem.rdchem.BondStereo.STEREOCIS, rdkit.Chem.rdchem.BondStereo.STEREOE, rdkit.Chem.rdchem.BondStereo.STEREONONE, rdkit.Chem.rdchem.BondStereo.STEREOTRANS, rdkit.Chem.rdchem.BondStereo.STEREOZ]#

Vocabulary of all RDKit bond stereo types.

graphein.molecule.atoms.BASE_ATOMS: List[str] = ['C', 'H', 'O', 'N', 'F', 'P', 'S', 'Cl', 'Br', 'I', 'B']#

Vocabulary of 11 standard atom types.

graphein.molecule.atoms.CHIRAL_TYPE: List[rdkit.Chem.rdchem.ChiralType] = [rdkit.Chem.rdchem.ChiralType.CHI_OTHER, rdkit.Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CCW, rdkit.Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CW, rdkit.Chem.rdchem.ChiralType.CHI_UNSPECIFIED]#

Vocabulary of all RDKit chiral types.

graphein.molecule.atoms.CHIRAL_TYPE_TO_CHANNEL: Dict[rdkit.Chem.rdchem.ChiralType, int] = {rdkit.Chem.rdchem.ChiralType.CHI_UNSPECIFIED: 3, rdkit.Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CW: 2, rdkit.Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CCW: 1, rdkit.Chem.rdchem.ChiralType.CHI_OTHER: 0}#

Vocabulary of all RDKit chiral types mapped to integer values.

graphein.molecule.atoms.EXTENDED_ATOMS = ['C', 'N', 'O', 'S', 'F', 'Si', 'P', 'Cl', 'Br', 'Mg', 'Na', 'Ca', 'Fe', 'As', 'Al', 'I', 'B', 'V', 'K', 'Tl', 'Yb', 'Sb', 'Sn', 'Ag', 'Pd', 'Co', 'Se', 'Ti', 'Zn', 'H', 'Li', 'Ge', 'Cu', 'Au', 'Ni', 'Cd', 'In', 'Mn', 'Zr', 'Cr', 'Pt', 'Hg', 'Pb', 'Unknown']#

Vocabulary of additional atom types.

graphein.molecule.atoms.RDKIT_MOL_DESCRIPTORS: List[str] = ['MaxEStateIndex', 'MinEStateIndex', 'MaxAbsEStateIndex', 'MinAbsEStateIndex', 'qed', 'MolWt', 'HeavyAtomMolWt', 'ExactMolWt', 'NumValenceElectrons', 'NumRadicalElectrons', 'MaxPartialCharge', 'MinPartialCharge', 'MaxAbsPartialCharge', 'MinAbsPartialCharge', 'FpDensityMorgan1', 'FpDensityMorgan2', 'FpDensityMorgan3', 'BCUT2D_MWHI', 'BCUT2D_MWLOW', 'BCUT2D_CHGHI', 'BCUT2D_CHGLO', 'BCUT2D_LOGPHI', 'BCUT2D_LOGPLOW', 'BCUT2D_MRHI', 'BCUT2D_MRLOW', 'BalabanJ', 'BertzCT', 'Chi0', 'Chi0n', 'Chi0v', 'Chi1', 'Chi1n', 'Chi1v', 'Chi2n', 'Chi2v', 'Chi3n', 'Chi3v', 'Chi4n', 'Chi4v', 'HallKierAlpha', 'Ipc', 'Kappa1', 'Kappa2', 'Kappa3', 'LabuteASA', 'PEOE_VSA1', 'PEOE_VSA10', 'PEOE_VSA11', 'PEOE_VSA12', 'PEOE_VSA13', 'PEOE_VSA14', 'PEOE_VSA2', 'PEOE_VSA3', 'PEOE_VSA4', 'PEOE_VSA5', 'PEOE_VSA6', 'PEOE_VSA7', 'PEOE_VSA8', 'PEOE_VSA9', 'SMR_VSA1', 'SMR_VSA10', 'SMR_VSA2', 'SMR_VSA3', 'SMR_VSA4', 'SMR_VSA5', 'SMR_VSA6', 'SMR_VSA7', 'SMR_VSA8', 'SMR_VSA9', 'SlogP_VSA1', 'SlogP_VSA10', 'SlogP_VSA11', 'SlogP_VSA12', 'SlogP_VSA2', 'SlogP_VSA3', 'SlogP_VSA4', 'SlogP_VSA5', 'SlogP_VSA6', 'SlogP_VSA7', 'SlogP_VSA8', 'SlogP_VSA9', 'TPSA', 'EState_VSA1', 'EState_VSA10', 'EState_VSA11', 'EState_VSA2', 'EState_VSA3', 'EState_VSA4', 'EState_VSA5', 'EState_VSA6', 'EState_VSA7', 'EState_VSA8', 'EState_VSA9', 'VSA_EState1', 'VSA_EState10', 'VSA_EState2', 'VSA_EState3', 'VSA_EState4', 'VSA_EState5', 'VSA_EState6', 'VSA_EState7', 'VSA_EState8', 'VSA_EState9', 'FractionCSP3', 'HeavyAtomCount', 'NHOHCount', 'NOCount', 'NumAliphaticCarbocycles', 'NumAliphaticHeterocycles', 'NumAliphaticRings', 'NumAromaticCarbocycles', 'NumAromaticHeterocycles', 'NumAromaticRings', 'NumHAcceptors', 'NumHDonors', 'NumHeteroatoms', 'NumRotatableBonds', 'NumSaturatedCarbocycles', 'NumSaturatedHeterocycles', 'NumSaturatedRings', 'RingCount', 'MolLogP', 'MolMR', 'fr_Al_COO', 'fr_Al_OH', 'fr_Al_OH_noTert', 'fr_ArN', 'fr_Ar_COO', 'fr_Ar_N', 'fr_Ar_NH', 'fr_Ar_OH', 
'fr_COO', 'fr_COO2', 'fr_C_O', 'fr_C_O_noCOO', 'fr_C_S', 'fr_HOCCN', 'fr_Imine', 'fr_NH0', 'fr_NH1', 'fr_NH2', 'fr_N_O', 'fr_Ndealkylation1', 'fr_Ndealkylation2', 'fr_Nhpyrrole', 'fr_SH', 'fr_aldehyde', 'fr_alkyl_carbamate', 'fr_alkyl_halide', 'fr_allylic_oxid', 'fr_amide', 'fr_amidine', 'fr_aniline', 'fr_aryl_methyl', 'fr_azide', 'fr_azo', 'fr_barbitur', 'fr_benzene', 'fr_benzodiazepine', 'fr_bicyclic', 'fr_diazo', 'fr_dihydropyridine', 'fr_epoxide', 'fr_ester', 'fr_ether', 'fr_furan', 'fr_guanido', 'fr_halogen', 'fr_hdrzine', 'fr_hdrzone', 'fr_imidazole', 'fr_imide', 'fr_isocyan', 'fr_isothiocyan', 'fr_ketone', 'fr_ketone_Topliss', 'fr_lactam', 'fr_lactone', 'fr_methoxy', 'fr_morpholine', 'fr_nitrile', 'fr_nitro', 'fr_nitro_arom', 'fr_nitro_arom_nonortho', 'fr_nitroso', 'fr_oxazole', 'fr_oxime', 'fr_para_hydroxylation', 'fr_phenol', 'fr_phenol_noOrthoHbond', 'fr_phos_acid', 'fr_phos_ester', 'fr_piperdine', 'fr_piperzine', 'fr_priamide', 'fr_prisulfonamd', 'fr_pyridine', 'fr_quatN', 'fr_sulfide', 'fr_sulfonamd', 'fr_sulfone', 'fr_term_acetylene', 'fr_tetrazole', 'fr_thiazole', 'fr_thiocyan', 'fr_thiophene', 'fr_unbrch_alkane', 'fr_urea']#

Vocabulary of easy-to-compute RDKit molecule descriptors.