Usage
Graphein provides both a programmatic API, via the Python library, and a command-line interface.
Command Line Interface
Graphein has a simple command-line interface for getting started and converting PDB files into graphs. It reads a ProteinGraphConfig object from config.yaml, constructs a graph for each of the given PDB files, and saves the graphs to the output directory in gpickle format.
graphein -c config.yaml -p path/to/pdbs -o path/to/output
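Once the command has run, the saved graphs can be read back into Python. The sketch below is an illustration, not part of Graphein: load_graphs is a hypothetical helper, and it assumes the output files carry a .gpickle extension and are plain, uncompressed Python pickles of NetworkX graphs (which is what the gpickle format has historically been). NetworkX must be installed for real output to unpickle.

```python
import pickle
from pathlib import Path


def load_graphs(output_dir):
    """Load every *.gpickle file in output_dir, keyed by file stem.

    Assumption: files end in ".gpickle" and are uncompressed pickles.
    Only unpickle files that come from a source you trust.
    """
    graphs = {}
    for path in sorted(Path(output_dir).glob("*.gpickle")):
        with path.open("rb") as handle:
            graphs[path.stem] = pickle.load(handle)
    return graphs
```

For example, load_graphs("path/to/output") would return a dict mapping each file stem to its unpickled graph, if the assumptions above hold for your output directory.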
YAML Config
A .yaml config file can be used to specify any of the config objects. To specify functions, use the !func: tag. To specify one of the config objects defined in Graphein, use the format !<config_name> (e.g. !ProteinGraphConfig).
# protein_graph_config.yml
!ProteinGraphConfig
granularity: "CA"
keep_hets: False
insertions: False
verbose: False
node_metadata_functions:
  - !func:graphein.protein.features.nodes.amino_acid.meiler_embedding
  - !func:graphein.protein.features.nodes.amino_acid.expasy_protein_scale
edge_construction_functions:
  - !func:graphein.protein.edges.distance.add_peptide_bonds
  - !func:graphein.protein.edges.distance.add_distance_threshold
    long_interaction_threshold: 5
    threshold: 10.
dssp_config: !DSSPConfig
from pathlib import Path
from graphein.utils.config import parse_config

PATH = Path(".")  # assumed: the directory containing the config file
yml_config = parse_config(PATH / "protein_graph_config.yml")
Reading the example .yaml file above with the parse_config function is equivalent to specifying a Python dict of arguments and passing it to ProteinGraphConfig.
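from functools import partial

from graphein.protein.config import DSSPConfig, ProteinGraphConfig
from graphein.protein.edges.distance import (
    add_distance_threshold,
    add_peptide_bonds,
)
from graphein.protein.features.nodes.amino_acid import (
    expasy_protein_scale,
    meiler_embedding,
)

protein_graph_config = {
    "granularity": "CA",
    "keep_hets": False,
    "insertions": False,
    "verbose": False,
    "node_metadata_functions": [meiler_embedding, expasy_protein_scale],
    "edge_construction_functions": [
        add_peptide_bonds,
        # Parameters listed under a !func: entry become pre-bound keyword arguments.
        partial(
            add_distance_threshold,
            long_interaction_threshold=5,
            threshold=10.0,
        ),
    ],
    "dssp_config": DSSPConfig(),
}
config = ProteinGraphConfig(**protein_graph_config)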
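The correspondence between a !func: entry and a Python callable can be sketched as follows. This is a minimal illustration of the general idea (import a function by its dotted path, then bind any listed parameters with functools.partial). It is not Graphein's actual loader, and resolve_function is a hypothetical name. Standard-library functions stand in for the Graphein ones so that the example runs without Graphein installed.

```python
import importlib
from functools import partial


def resolve_function(dotted_path, **kwargs):
    """Import the callable named by dotted_path; bind kwargs if any are given."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    func = getattr(importlib.import_module(module_name), attr_name)
    # A bare entry maps to the function itself; an entry with
    # parameters maps to a partial with those parameters pre-bound.
    return partial(func, **kwargs) if kwargs else func


# Stand-ins for "!func:<dotted.path>" entries, with and without parameters.
sqrt = resolve_function("math.sqrt")
parse_binary = resolve_function("builtins.int", base=2)
print(sqrt(16.0))            # 4.0
print(parse_binary("101"))   # 5
```

Under this reading, the add_distance_threshold entry in the YAML file, with its two parameters, plays the same role as the partial(...) call in the dict above.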