Constructing Subgraphs in Graphein

Graphein provides utilities for extracting subgraphs from a protein graph. These utilities are composable, so selections can be chained to isolate very specific subsets.

We start by constructing a graph with several different edge types. This is the base graph on which all of the selections below are made.

[1]:
# Install Graphein if necessary
# !pip install graphein
# Install DSSP if necessary
# !sudo apt-get install dssp (better for colab) OR !conda install -c salilab dssp
[2]:
import plotly.io as pio
pio.renderers.default
[2]:
'sphinx_gallery'
[3]:
from graphein.protein.config import ProteinGraphConfig
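# Import the edge functions explicitly rather than via a star import
from graphein.protein.edges.distance import (
    add_aromatic_interactions,
    add_aromatic_sulphur_interactions,
    add_cation_pi_interactions,
    add_disulfide_interactions,
    add_hydrogen_bond_interactions,
    add_hydrophobic_interactions,
    add_ionic_interactions,
    add_peptide_bonds,
)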
from graphein.protein.graphs import construct_graph

edge_fns = [
    add_aromatic_interactions,
    add_hydrophobic_interactions,
    add_aromatic_sulphur_interactions,
    add_cation_pi_interactions,
    add_disulfide_interactions,
    add_hydrogen_bond_interactions,
    add_ionic_interactions,
    add_peptide_bonds
    ]
config = ProteinGraphConfig(edge_construction_functions=edge_fns)

g = construct_graph(config=config, pdb_code="4hhb")
DEBUG:graphein.protein.graphs:Deprotonating protein. This removes H atoms from the pdb_df dataframe
DEBUG:graphein.protein.graphs:Detected 574 total nodes
DEBUG:graphein.protein.features.nodes.amino_acid:Reading meiler embeddings from: /Users/arianjamasb/github/graphein/graphein/protein/features/nodes/meiler_embeddings.csv
INFO:graphein.protein.edges.distance:Found: 84 aromatic-aromatic interactions
INFO:graphein.protein.edges.distance:Found 1284 hydrophobic interactions.
INFO:graphein.protein.edges.distance:Found 6 disulfide interactions.
INFO:graphein.protein.edges.distance:Found 208 hbond interactions.
INFO:graphein.protein.edges.distance:Found 12 hbond interactions.
INFO:graphein.protein.edges.distance:Found 4566 ionic interactions.
574
[4]:
from graphein.protein.visualisation import plotly_protein_structure_graph
plotly_protein_structure_graph(g, node_size_min=4, node_size_multiplier=2)

Subsetting with a list of nodes

The simplest method of constructing a subgraph is when we already have a defined list of nodes that we wish to extract. The naming convention for nodes is:

CHAIN:RESIDUE_NAME:POSITION

e.g. A:ALA:110
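
As a minimal sketch of how this convention can be taken apart, the helper below splits a node ID into its three fields. `parse_node_id` and `NodeId` are hypothetical names used for illustration and are not part of Graphein. The sketch assumes exactly three colon-separated fields, as in the example above.

```python
from typing import NamedTuple


class NodeId(NamedTuple):
    """Typed view of a 'CHAIN:RESIDUE_NAME:POSITION' node ID (illustrative only)."""

    chain: str
    residue_name: str
    position: int


def parse_node_id(node_id: str) -> NodeId:
    """Split a 'CHAIN:RESIDUE_NAME:POSITION' string into typed fields."""
    # Assumes exactly three fields; anything else raises ValueError on unpacking.
    chain, residue_name, position = node_id.split(":")
    return NodeId(chain, residue_name, int(position))


parsed = parse_node_id("A:ALA:110")
print(parsed.chain, parsed.residue_name, parsed.position)  # A ALA 110
```

Converting the position to an integer makes numeric comparisons (for example, residue ranges) straightforward.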

We can use the extract_subgraph_from_node_list() function to achieve this.

extract_subgraph_from_node_list(
    g,
    node_list: Optional[List[str]],
    filter_dataframe: bool = True,
    inverse: bool = False,
    return_node_list: bool = False
)
  • Selections can be inverted with the inverse parameter

  • The filter_dataframe parameter controls whether the pdb_df dataframe associated with the graph (accessed via g.graph["pdb_df"]) is filtered down to the selected nodes

  • If we only want the list of nodes identified by the selection, rather than the subgraph itself, we set the return_node_list parameter.

This is the core subsetting function. The other subsetting functions described below are based on different methods for computing a list of nodes to subset the graph to. If you wish to implement a subsetting method not described here, you simply need to compute a list of node_ids and provide them to this function.
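
A custom subsetting method of this kind might be sketched as follows. Both helpers are hypothetical and not part of Graphein: `nodes_in_residue_range` computes a node list from the node-ID convention described above, and `invert_selection` illustrates what an inverted selection is assumed to mean (the complement of the chosen nodes). A small toy list stands in for the node IDs of a real graph.

```python
from typing import Iterable, List


def nodes_in_residue_range(
    node_ids: Iterable[str], chain: str, start: int, end: int
) -> List[str]:
    """Return node IDs on `chain` whose position lies in [start, end]."""
    selected = []
    for node_id in node_ids:
        # Assumes the three-field 'CHAIN:RESIDUE_NAME:POSITION' convention.
        node_chain, _residue_name, position = node_id.split(":")
        if node_chain == chain and start <= int(position) <= end:
            selected.append(node_id)
    return selected


def invert_selection(all_nodes: Iterable[str], selected: Iterable[str]) -> List[str]:
    """Return the nodes NOT in `selected`, preserving the original order."""
    chosen = set(selected)
    return [n for n in all_nodes if n not in chosen]


# Toy stand-in for list(g.nodes()); a real graph would supply these IDs.
toy_nodes = ["B:LEU:81", "B:LYS:82", "B:GLY:83", "C:VAL:1", "C:LEU:2"]

custom_list = nodes_in_residue_range(toy_nodes, chain="B", start=82, end=146)
print(custom_list)                               # ['B:LYS:82', 'B:GLY:83']
print(invert_selection(toy_nodes, custom_list))  # ['B:LEU:81', 'C:VAL:1', 'C:LEU:2']

# With a Graphein graph, the list would then be handed to the core function:
# s_g = extract_subgraph_from_node_list(g, custom_list)
```

Keeping the selection logic separate from the extraction step means any rule that yields a list of node IDs can reuse the same core function.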

[5]:
from graphein.protein.subgraphs import extract_subgraph_from_node_list

NODE_LIST = ['B:LYS:82', 'B:GLY:83', 'B:THR:84', 'B:PHE:85', 'B:ALA:86', 'B:THR:87', 'B:LEU:88', 'B:SER:89', 'B:GLU:90', 'B:LEU:91', 'B:HIS:92', 'B:CYS:93', 'B:ASP:94', 'B:LYS:95', 'B:LEU:96', 'B:HIS:97', 'B:VAL:98', 'B:ASP:99', 'B:PRO:100', 'B:GLU:101', 'B:ASN:102', 'B:PHE:103', 'B:ARG:104', 'B:LEU:105', 'B:LEU:106', 'B:GLY:107', 'B:ASN:108', 'B:VAL:109', 'B:LEU:110', 'B:VAL:111', 'B:CYS:112', 'B:VAL:113', 'B:LEU:114', 'B:ALA:115', 'B:HIS:116', 'B:HIS:117', 'B:PHE:118', 'B:GLY:119', 'B:LYS:120', 'B:GLU:121', 'B:PHE:122', 'B:THR:123', 'B:PRO:124', 'B:PRO:125', 'B:VAL:126', 'B:GLN:127', 'B:ALA:128', 'B:ALA:129', 'B:TYR:130', 'B:GLN:131', 'B:LYS:132', 'B:VAL:133', 'B:VAL:134', 'B:ALA:135', 'B:GLY:136', 'B:VAL:137', 'B:ALA:138', 'B:ASN:139', 'B:ALA:140', 'B:LEU:141', 'B:ALA:142', 'B:HIS:143', 'B:LYS:144', 'B:TYR:145', 'B:HIS:146', 'C:VAL:1', 'C:LEU:2', 'C:SER:3', 'C:PRO:4', 'C:ALA:5', 'C:ASP:6', 'C:LYS:7', 'C:THR:8', 'C:ASN:9', 'C:VAL:10', 'C:LYS:11', 'C:ALA:12', 'C:ALA:13', 'C:TRP:14', 'C:GLY:15', 'C:LYS:16', 'C:VAL:17', 'C:GLY:18', 'C:ALA:19', 'C:HIS:20', 'C:ALA:21', 'C:GLY:22', 'C:GLU:23', 'C:TYR:24', 'C:GLY:25', 'C:ALA:26', 'C:GLU:27', 'C:ALA:28', 'C:LEU:29', 'C:GLU:30', 'C:ARG:31', 'C:MET:32', 'C:PHE:33', 'C:LEU:34', 'C:SER:35', 'C:PHE:36', 'C:PRO:37', 'C:THR:38', 'C:THR:39', 'C:LYS:40', 'C:THR:41', 'C:TYR:42', 'C:PHE:43', 'C:PRO:44', 'C:HIS:45', 'C:PHE:46', 'C:ASP:47', 'C:LEU:48', 'C:SER:49', 'C:HIS:50', 'C:GLY:51', 'C:SER:52', 'C:ALA:53', 'C:GLN:54', 'C:VAL:55', 'C:LYS:56', 'C:GLY:57', 'C:HIS:58', 'C:GLY:59', 'C:LYS:60', 'C:LYS:61', 'C:VAL:62', 'C:ALA:63', 'C:ASP:64', 'C:ALA:65', 'C:LEU:66', 'C:THR:67', 'C:ASN:68', 'C:ALA:69', 'C:VAL:70', 'C:ALA:71']

s_g = extract_subgraph_from_node_list(
    g,
    NODE_LIST
    )

# Test our extraction worked
for n in s_g.nodes():
    assert n in NODE_LIST

for n in NODE_LIST:
    assert n in g.nodes()

# Visualise the subgraph
plotly_protein_structure_graph(s_g, node_size_min=4, node_size_multiplier=2)
DEBUG:graphein.protein.subgraphs:Creating subgraph from nodes: ['B:LYS:82', 'B:GLY:83', 'B:THR:84', 'B:PHE:85', 'B:ALA:86', 'B:THR:87', 'B:LEU:88', 'B:SER:89', 'B:GLU:90', 'B:LEU:91', 'B:HIS:92', 'B:CYS:93', 'B:ASP:94', 'B:LYS:95', 'B:LEU:96', 'B:HIS:97', 'B:VAL:98', 'B:ASP:99', 'B:PRO:100', 'B:GLU:101', 'B:ASN:102', 'B:PHE:103', 'B:ARG:104', 'B:LEU:105', 'B:LEU:106', 'B:GLY:107', 'B:ASN:108', 'B:VAL:109', 'B:LEU:110', 'B:VAL:111', 'B:CYS:112', 'B:VAL:113', 'B:LEU:114', 'B:ALA:115', 'B:HIS:116', 'B:HIS:117', 'B:PHE:118', 'B:GLY:119', 'B:LYS:120', 'B:GLU:121', 'B:PHE:122', 'B:THR:123', 'B:PRO:124', 'B:PRO:125', 'B:VAL:126', 'B:GLN:127', 'B:ALA:128', 'B:ALA:129', 'B:TYR:130', 'B:GLN:131', 'B:LYS:132', 'B:VAL:133', 'B:VAL:134', 'B:ALA:135', 'B:GLY:136', 'B:VAL:137', 'B:ALA:138', 'B:ASN:139', 'B:ALA:140', 'B:LEU:141', 'B:ALA:142', 'B:HIS:143', 'B:LYS:144', 'B:TYR:145', 'B:HIS:146', 'C:VAL:1', 'C:LEU:2', 'C:SER:3', 'C:PRO:4', 'C:ALA:5', 'C:ASP:6', 'C:LYS:7', 'C:THR:8', 'C:ASN:9', 'C:VAL:10', 'C:LYS:11', 'C:ALA:12', 'C:ALA:13', 'C:TRP:14', 'C:GLY:15', 'C:LYS:16', 'C:VAL:17', 'C:GLY:18', 'C:ALA:19', 'C:HIS:20', 'C:ALA:21', 'C:GLY:22', 'C:GLU:23', 'C:TYR:24', 'C:GLY:25', 'C:ALA:26', 'C:GLU:27', 'C:ALA:28', 'C:LEU:29', 'C:GLU:30', 'C:ARG:31', 'C:MET:32', 'C:PHE:33', 'C:LEU:34', 'C:SER:35', 'C:PHE:36', 'C:PRO:37', 'C:THR:38', 'C:THR:39', 'C:LYS:40', 'C:THR:41', 'C:TYR:42', 'C:PHE:43', 'C:PRO:44', 'C:HIS:45', 'C:PHE:46', 'C:ASP:47', 'C:LEU:48', 'C:SER:49', 'C:HIS:50', 'C:GLY:51', 'C:SER:52', 'C:ALA:53', 'C:GLN:54', 'C:VAL:55', 'C:LYS:56', 'C:GLY:57', 'C:HIS:58', 'C:GLY:59', 'C:LYS:60', 'C:LYS:61', 'C:VAL:62', 'C:ALA:63', 'C:ASP:64', 'C:ALA:65', 'C:LEU:66', 'C:THR:67', 'C:ASN:68', 'C:ALA:69', 'C:VAL:70', 'C:ALA:71'].
[6]:
# The associated dataframe is filtered to only include the remaining nodes by default.
# If this is not desired, set filter_dataframe=False
s_g.graph["pdb_df"]
[6]:
record_name atom_number blank_1 atom_name alt_loc residue_name blank_2 chain_id residue_number insertion ... y_coord z_coord occupancy b_factor blank_4 segment_id element_symbol charge line_idx node_id
222 ATOM 1689 CA LYS B 82 ... -20.862 8.452 1.0 24.25 C NaN 2572 B:LYS:82
223 ATOM 1698 CA GLY B 83 ... -23.724 10.746 1.0 41.64 C NaN 2581 B:GLY:83
224 ATOM 1702 CA THR B 84 ... -22.242 11.744 1.0 25.47 C NaN 2585 B:THR:84
225 ATOM 1709 CA PHE B 85 ... -18.963 12.749 1.0 21.59 C NaN 2592 B:PHE:85
226 ATOM 1720 CA ALA B 86 ... -20.242 13.948 1.0 23.14 C NaN 2603 B:ALA:86
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
353 ATOM 2694 CA THR C 67 ... 16.119 9.983 1.0 15.27 C NaN 3577 C:THR:67
354 ATOM 2701 CA ASN C 68 ... 18.613 11.088 1.0 21.49 C NaN 3584 C:ASN:68
355 ATOM 2709 CA ALA C 69 ... 17.929 8.006 1.0 15.27 C NaN 3592 C:ALA:69
356 ATOM 2714 CA VAL C 70 ... 18.432 5.673 1.0 21.72 C NaN 3597 C:VAL:70
357 ATOM 2721 CA ALA C 71 ... 21.922 7.494 1.0 22.92 C NaN 3604 C:ALA:71

136 rows × 22 columns

[7]:
# Inverting the selection.
s_g = extract_subgraph_from_node_list(
    g,
    NODE_LIST,
    inverse=True
    )
plotly_protein_structure_graph(s_g, node_size_min=4, node_size_multiplier=2)
DEBUG:graphein.protein.subgraphs:Creating subgraph from nodes: ['A:VAL:1', 'A:LEU:2', 'A:SER:3', 'A:PRO:4', 'A:ALA:5', 'A:ASP:6', 'A:LYS:7', 'A:THR:8', 'A:ASN:9', 'A:VAL:10', 'A:LYS:11', 'A:ALA:12', 'A:ALA:13', 'A:TRP:14', 'A:GLY:15', 'A:LYS:16', 'A:VAL:17', 'A:GLY:18', 'A:ALA:19', 'A:HIS:20', 'A:ALA:21', 'A:GLY:22', 'A:GLU:23', 'A:TYR:24', 'A:GLY:25', 'A:ALA:26', 'A:GLU:27', 'A:ALA:28', 'A:LEU:29', 'A:GLU:30', 'A:ARG:31', 'A:MET:32', 'A:PHE:33', 'A:LEU:34', 'A:SER:35', 'A:PHE:36', 'A:PRO:37', 'A:THR:38', 'A:THR:39', 'A:LYS:40', 'A:THR:41', 'A:TYR:42', 'A:PHE:43', 'A:PRO:44', 'A:HIS:45', 'A:PHE:46', 'A:ASP:47', 'A:LEU:48', 'A:SER:49', 'A:HIS:50', 'A:GLY:51', 'A:SER:52', 'A:ALA:53', 'A:GLN:54', 'A:VAL:55', 'A:LYS:56', 'A:GLY:57', 'A:HIS:58', 'A:GLY:59', 'A:LYS:60', 'A:LYS:61', 'A:VAL:62', 'A:ALA:63', 'A:ASP:64', 'A:ALA:65', 'A:LEU:66', 'A:THR:67', 'A:ASN:68', 'A:ALA:69', 'A:VAL:70', 'A:ALA:71', 'A:HIS:72', 'A:VAL:73', 'A:ASP:74', 'A:ASP:75', 'A:MET:76', 'A:PRO:77', 'A:ASN:78', 'A:ALA:79', 'A:LEU:80', 'A:SER:81', 'A:ALA:82', 'A:LEU:83', 'A:SER:84', 'A:ASP:85', 'A:LEU:86', 'A:HIS:87', 'A:ALA:88', 'A:HIS:89', 'A:LYS:90', 'A:LEU:91', 'A:ARG:92', 'A:VAL:93', 'A:ASP:94', 'A:PRO:95', 'A:VAL:96', 'A:ASN:97', 'A:PHE:98', 'A:LYS:99', 'A:LEU:100', 'A:LEU:101', 'A:SER:102', 'A:HIS:103', 'A:CYS:104', 'A:LEU:105', 'A:LEU:106', 'A:VAL:107', 'A:THR:108', 'A:LEU:109', 'A:ALA:110', 'A:ALA:111', 'A:HIS:112', 'A:LEU:113', 'A:PRO:114', 'A:ALA:115', 'A:GLU:116', 'A:PHE:117', 'A:THR:118', 'A:PRO:119', 'A:ALA:120', 'A:VAL:121', 'A:HIS:122', 'A:ALA:123', 'A:SER:124', 'A:LEU:125', 'A:ASP:126', 'A:LYS:127', 'A:PHE:128', 'A:LEU:129', 'A:ALA:130', 'A:SER:131', 'A:VAL:132', 'A:SER:133', 'A:THR:134', 'A:VAL:135', 'A:LEU:136', 'A:THR:137', 'A:SER:138', 'A:LYS:139', 'A:TYR:140', 'A:ARG:141', 'B:VAL:1', 'B:HIS:2', 'B:LEU:3', 'B:THR:4', 'B:PRO:5', 'B:GLU:6', 'B:GLU:7', 'B:LYS:8', 'B:SER:9', 'B:ALA:10', 'B:VAL:11', 'B:THR:12', 'B:ALA:13', 'B:LEU:14', 'B:TRP:15', 'B:GLY:16', 'B:LYS:17', 'B:VAL:18', 
'B:ASN:19', 'B:VAL:20', 'B:ASP:21', 'B:GLU:22', 'B:VAL:23', 'B:GLY:24', 'B:GLY:25', 'B:GLU:26', 'B:ALA:27', 'B:LEU:28', 'B:GLY:29', 'B:ARG:30', 'B:LEU:31', 'B:LEU:32', 'B:VAL:33', 'B:VAL:34', 'B:TYR:35', 'B:PRO:36', 'B:TRP:37', 'B:THR:38', 'B:GLN:39', 'B:ARG:40', 'B:PHE:41', 'B:PHE:42', 'B:GLU:43', 'B:SER:44', 'B:PHE:45', 'B:GLY:46', 'B:ASP:47', 'B:LEU:48', 'B:SER:49', 'B:THR:50', 'B:PRO:51', 'B:ASP:52', 'B:ALA:53', 'B:VAL:54', 'B:MET:55', 'B:GLY:56', 'B:ASN:57', 'B:PRO:58', 'B:LYS:59', 'B:VAL:60', 'B:LYS:61', 'B:ALA:62', 'B:HIS:63', 'B:GLY:64', 'B:LYS:65', 'B:LYS:66', 'B:VAL:67', 'B:LEU:68', 'B:GLY:69', 'B:ALA:70', 'B:PHE:71', 'B:SER:72', 'B:ASP:73', 'B:GLY:74', 'B:LEU:75', 'B:ALA:76', 'B:HIS:77', 'B:LEU:78', 'B:ASP:79', 'B:ASN:80', 'B:LEU:81', 'C:HIS:72', 'C:VAL:73', 'C:ASP:74', 'C:ASP:75', 'C:MET:76', 'C:PRO:77', 'C:ASN:78', 'C:ALA:79', 'C:LEU:80', 'C:SER:81', 'C:ALA:82', 'C:LEU:83', 'C:SER:84', 'C:ASP:85', 'C:LEU:86', 'C:HIS:87', 'C:ALA:88', 'C:HIS:89', 'C:LYS:90', 'C:LEU:91', 'C:ARG:92', 'C:VAL:93', 'C:ASP:94', 'C:PRO:95', 'C:VAL:96', 'C:ASN:97', 'C:PHE:98', 'C:LYS:99', 'C:LEU:100', 'C:LEU:101', 'C:SER:102', 'C:HIS:103', 'C:CYS:104', 'C:LEU:105', 'C:LEU:106', 'C:VAL:107', 'C:THR:108', 'C:LEU:109', 'C:ALA:110', 'C:ALA:111', 'C:HIS:112', 'C:LEU:113', 'C:PRO:114', 'C:ALA:115', 'C:GLU:116', 'C:PHE:117', 'C:THR:118', 'C:PRO:119', 'C:ALA:120', 'C:VAL:121', 'C:HIS:122', 'C:ALA:123', 'C:SER:124', 'C:LEU:125', 'C:ASP:126', 'C:LYS:127', 'C:PHE:128', 'C:LEU:129', 'C:ALA:130', 'C:SER:131', 'C:VAL:132', 'C:SER:133', 'C:THR:134', 'C:VAL:135', 'C:LEU:136', 'C:THR:137', 'C:SER:138', 'C:LYS:139', 'C:TYR:140', 'C:ARG:141', 'D:VAL:1', 'D:HIS:2', 'D:LEU:3', 'D:THR:4', 'D:PRO:5', 'D:GLU:6', 'D:GLU:7', 'D:LYS:8', 'D:SER:9', 'D:ALA:10', 'D:VAL:11', 'D:THR:12', 'D:ALA:13', 'D:LEU:14', 'D:TRP:15', 'D:GLY:16', 'D:LYS:17', 'D:VAL:18', 'D:ASN:19', 'D:VAL:20', 'D:ASP:21', 'D:GLU:22', 'D:VAL:23', 'D:GLY:24', 'D:GLY:25', 'D:GLU:26', 'D:ALA:27', 'D:LEU:28', 'D:GLY:29', 'D:ARG:30', 
'D:LEU:31', 'D:LEU:32', 'D:VAL:33', 'D:VAL:34', 'D:TYR:35', 'D:PRO:36', 'D:TRP:37', 'D:THR:38', 'D:GLN:39', 'D:ARG:40', 'D:PHE:41', 'D:PHE:42', 'D:GLU:43', 'D:SER:44', 'D:PHE:45', 'D:GLY:46', 'D:ASP:47', 'D:LEU:48', 'D:SER:49', 'D:THR:50', 'D:PRO:51', 'D:ASP:52', 'D:ALA:53', 'D:VAL:54', 'D:MET:55', 'D:GLY:56', 'D:ASN:57', 'D:PRO:58', 'D:LYS:59', 'D:VAL:60', 'D:LYS:61', 'D:ALA:62', 'D:HIS:63', 'D:GLY:64', 'D:LYS:65', 'D:LYS:66', 'D:VAL:67', 'D:LEU:68', 'D:GLY:69', 'D:ALA:70', 'D:PHE:71', 'D:SER:72', 'D:ASP:73', 'D:GLY:74', 'D:LEU:75', 'D:ALA:76', 'D:HIS:77', 'D:LEU:78', 'D:ASP:79', 'D:ASN:80', 'D:LEU:81', 'D:LYS:82', 'D:GLY:83', 'D:THR:84', 'D:PHE:85', 'D:ALA:86', 'D:THR:87', 'D:LEU:88', 'D:SER:89', 'D:GLU:90', 'D:LEU:91', 'D:HIS:92', 'D:CYS:93', 'D:ASP:94', 'D:LYS:95', 'D:LEU:96', 'D:HIS:97', 'D:VAL:98', 'D:ASP:99', 'D:PRO:100', 'D:GLU:101', 'D:ASN:102', 'D:PHE:103', 'D:ARG:104', 'D:LEU:105', 'D:LEU:106', 'D:GLY:107', 'D:ASN:108', 'D:VAL:109', 'D:LEU:110', 'D:VAL:111', 'D:CYS:112', 'D:VAL:113', 'D:LEU:114', 'D:ALA:115', 'D:HIS:116', 'D:HIS:117', 'D:PHE:118', 'D:GLY:119', 'D:LYS:120', 'D:GLU:121', 'D:PHE:122', 'D:THR:123', 'D:PRO:124', 'D:PRO:125', 'D:VAL:126', 'D:GLN:127', 'D:ALA:128', 'D:ALA:129', 'D:TYR:130', 'D:GLN:131', 'D:LYS:132', 'D:VAL:133', 'D:VAL:134', 'D:ALA:135', 'D:GLY:136', 'D:VAL:137', 'D:ALA:138', 'D:ASN:139', 'D:ALA:140', 'D:LEU:141', 'D:ALA:142', 'D:HIS:143', 'D:LYS:144', 'D:TYR:145', 'D:HIS:146'].

Spatial Subgraphing

We can construct spatial subgraphs by specifying a central point and a radius. All nodes within that radius (Euclidean distance) of the point are selected. As before, the selection can be inverted.

Here we select all nodes within 20 Å of the origin:

N.B. Different proteins may use different coordinate spaces, so a centre point chosen for one structure may not be meaningful for another.
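
The point-radius rule itself can be sketched in plain Python. This is an illustration of the idea, not Graphein's implementation: `nodes_within_radius` is a hypothetical helper, the coordinates are made up, and treating the boundary as exclusive is an arbitrary choice for the sketch.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def nodes_within_radius(
    coords: Dict[str, Point], centre_point: Point, radius: float
) -> List[str]:
    """Return IDs of nodes whose Euclidean distance to `centre_point` is < radius."""
    return [
        node_id
        for node_id, xyz in coords.items()
        if math.dist(xyz, centre_point) < radius
    ]


# Made-up coordinates for illustration only (not taken from 4hhb).
toy_coords = {
    "A:VAL:1": (3.0, 4.0, 0.0),     # distance 5 from the origin
    "A:LEU:2": (0.0, 0.0, 19.5),    # distance 19.5
    "A:SER:3": (20.0, 20.0, 20.0),  # distance of roughly 34.6
}

inside = nodes_within_radius(toy_coords, centre_point=(0.0, 0.0, 0.0), radius=20)
print(inside)  # ['A:VAL:1', 'A:LEU:2']
```

The resulting list of node IDs is the kind of input that the core node-list function described earlier expects.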

[8]:
from graphein.protein.subgraphs import extract_subgraph_from_point

s_g = extract_subgraph_from_point(g, centre_point=(0, 0, 0), radius=20)

plotly_protein_structure_graph(s_g, node_size_min=4, node_size_multiplier=2)
DEBUG:graphein.protein.subgraphs:Found 177 nodes in the spatial point-radius subgraph.
DEBUG:graphein.protein.subgraphs:Creating subgraph from nodes: ['D:LEU:105', 'C:ARG:31', 'D:VAL:134', 'A:MET:32', 'B:TYR:145', 'C:HIS:87', 'C:THR:137', 'A:VAL:132', 'B:LEU:105', 'A:VAL:93', 'C:SER:35', 'A:PHE:98', 'C:THR:41', 'A:PRO:95', 'B:VAL:98', 'A:LEU:136', 'B:ASN:139', 'B:HIS:143', 'C:LEU:125', 'A:CYS:104', 'B:LEU:110', 'D:HIS:143', 'D:TYR:35', 'C:LEU:34', 'A:LEU:34', 'D:TRP:37', 'B:LEU:141', 'C:LEU:136', 'C:VAL:93', 'D:GLU:101', 'B:TYR:35', 'B:LEU:106', 'A:LEU:100', 'C:LEU:106', 'D:VAL:133', 'D:VAL:111', 'C:ARG:141', 'A:VAL:1', 'D:GLY:136', 'A:VAL:107', 'B:VAL:34', 'A:LYS:127', 'D:ALA:140', 'A:SER:102', 'C:THR:38', 'C:VAL:135', 'A:LEU:106', 'D:ALA:142', 'A:ASN:97', 'C:VAL:1', 'B:ALA:142', 'A:ALA:130', 'C:LEU:29', 'B:PRO:36', 'C:VAL:107', 'A:VAL:96', 'B:THR:38', 'B:VAL:133', 'D:LEU:106', 'A:ARG:141', 'A:ALA:88', 'C:PHE:128', 'C:ASP:94', 'D:LEU:31', 'D:TYR:145', 'C:THR:39', 'C:MET:32', 'B:TRP:37', 'A:ASP:126', 'A:TYR:42', 'D:VAL:34', 'A:LEU:125', 'A:PHE:33', 'C:THR:134', 'B:LYS:132', 'D:PHE:103', 'B:VAL:134', 'A:LYS:40', 'B:ALA:138', 'C:ALA:28', 'D:ASP:99', 'D:VAL:98', 'C:ALA:130', 'C:LYS:139', 'A:PHE:36', 'A:LYS:139', 'A:ASP:94', 'C:SER:133', 'C:ARG:92', 'B:ASP:99', 'D:LEU:110', 'A:THR:41', 'B:ASN:102', 'C:LEU:101', 'C:ASP:126', 'A:LEU:101', 'D:ASN:102', 'C:PHE:98', 'A:SER:138', 'A:TYR:140', 'A:LEU:105', 'C:PRO:37', 'B:GLY:107', 'A:HIS:87', 'D:THR:38', 'A:LEU:129', 'D:CYS:112', 'A:HIS:103', 'D:LEU:32', 'C:LYS:99', 'A:THR:38', 'A:SER:133', 'D:ARG:104', 'A:ARG:31', 'C:VAL:96', 'C:VAL:132', 'C:LEU:91', 'D:PRO:36', 'C:LEU:105', 'C:SER:102', 'A:THR:137', 'D:ASN:139', 'D:PRO:100', 'A:PRO:37', 'C:LYS:40', 'B:GLY:136', 'A:THR:39', 'B:GLU:101', 'A:ALA:123', 'C:PHE:33', 'D:LYS:132', 'D:ALA:138', 'B:ALA:135', 'A:SER:131', 'D:LEU:141', 'D:VAL:109', 'C:TYR:42', 'B:CYS:112', 'B:PHE:103', 'C:LEU:100', 'D:VAL:137', 'D:GLN:131', 'B:ALA:140', 'A:PHE:128', 'A:THR:134', 'B:VAL:111', 'C:PRO:95', 'B:ARG:104', 'C:ASN:97', 'B:PRO:100', 'C:TYR:140', 'A:LEU:91', 'C:ALA:88', 
'A:LEU:29', 'B:VAL:137', 'C:SER:138', 'C:PHE:36', 'C:ALA:123', 'D:ALA:135', 'B:LEU:31', 'A:SER:35', 'A:ARG:92', 'B:VAL:109', 'D:GLY:107', 'D:ASN:108', 'D:ALA:128', 'B:ASN:108', 'C:SER:131', 'C:LYS:127', 'A:VAL:135', 'C:CYS:104', 'B:ALA:128', 'C:LEU:129', 'C:HIS:103', 'A:LYS:99', 'B:GLN:131', 'B:LEU:32'].
[9]:
# Again, we can invert this selection
s_g = extract_subgraph_from_point(g, centre_point=(0, 0, 0), radius=20, inverse=True)
plotly_protein_structure_graph(s_g, node_size_min=4, node_size_multiplier=2)
DEBUG:graphein.protein.subgraphs:Found 177 nodes in the spatial point-radius subgraph.
DEBUG:graphein.protein.subgraphs:Creating subgraph from nodes: ['A:LEU:2', 'A:SER:3', 'A:PRO:4', 'A:ALA:5', 'A:ASP:6', 'A:LYS:7', 'A:THR:8', 'A:ASN:9', 'A:VAL:10', 'A:LYS:11', 'A:ALA:12', 'A:ALA:13', 'A:TRP:14', 'A:GLY:15', 'A:LYS:16', 'A:VAL:17', 'A:GLY:18', 'A:ALA:19', 'A:HIS:20', 'A:ALA:21', 'A:GLY:22', 'A:GLU:23', 'A:TYR:24', 'A:GLY:25', 'A:ALA:26', 'A:GLU:27', 'A:ALA:28', 'A:GLU:30', 'A:PHE:43', 'A:PRO:44', 'A:HIS:45', 'A:PHE:46', 'A:ASP:47', 'A:LEU:48', 'A:SER:49', 'A:HIS:50', 'A:GLY:51', 'A:SER:52', 'A:ALA:53', 'A:GLN:54', 'A:VAL:55', 'A:LYS:56', 'A:GLY:57', 'A:HIS:58', 'A:GLY:59', 'A:LYS:60', 'A:LYS:61', 'A:VAL:62', 'A:ALA:63', 'A:ASP:64', 'A:ALA:65', 'A:LEU:66', 'A:THR:67', 'A:ASN:68', 'A:ALA:69', 'A:VAL:70', 'A:ALA:71', 'A:HIS:72', 'A:VAL:73', 'A:ASP:74', 'A:ASP:75', 'A:MET:76', 'A:PRO:77', 'A:ASN:78', 'A:ALA:79', 'A:LEU:80', 'A:SER:81', 'A:ALA:82', 'A:LEU:83', 'A:SER:84', 'A:ASP:85', 'A:LEU:86', 'A:HIS:89', 'A:LYS:90', 'A:THR:108', 'A:LEU:109', 'A:ALA:110', 'A:ALA:111', 'A:HIS:112', 'A:LEU:113', 'A:PRO:114', 'A:ALA:115', 'A:GLU:116', 'A:PHE:117', 'A:THR:118', 'A:PRO:119', 'A:ALA:120', 'A:VAL:121', 'A:HIS:122', 'A:SER:124', 'B:VAL:1', 'B:HIS:2', 'B:LEU:3', 'B:THR:4', 'B:PRO:5', 'B:GLU:6', 'B:GLU:7', 'B:LYS:8', 'B:SER:9', 'B:ALA:10', 'B:VAL:11', 'B:THR:12', 'B:ALA:13', 'B:LEU:14', 'B:TRP:15', 'B:GLY:16', 'B:LYS:17', 'B:VAL:18', 'B:ASN:19', 'B:VAL:20', 'B:ASP:21', 'B:GLU:22', 'B:VAL:23', 'B:GLY:24', 'B:GLY:25', 'B:GLU:26', 'B:ALA:27', 'B:LEU:28', 'B:GLY:29', 'B:ARG:30', 'B:VAL:33', 'B:GLN:39', 'B:ARG:40', 'B:PHE:41', 'B:PHE:42', 'B:GLU:43', 'B:SER:44', 'B:PHE:45', 'B:GLY:46', 'B:ASP:47', 'B:LEU:48', 'B:SER:49', 'B:THR:50', 'B:PRO:51', 'B:ASP:52', 'B:ALA:53', 'B:VAL:54', 'B:MET:55', 'B:GLY:56', 'B:ASN:57', 'B:PRO:58', 'B:LYS:59', 'B:VAL:60', 'B:LYS:61', 'B:ALA:62', 'B:HIS:63', 'B:GLY:64', 'B:LYS:65', 'B:LYS:66', 'B:VAL:67', 'B:LEU:68', 'B:GLY:69', 'B:ALA:70', 'B:PHE:71', 'B:SER:72', 'B:ASP:73', 'B:GLY:74', 'B:LEU:75', 'B:ALA:76', 'B:HIS:77', 'B:LEU:78', 
'B:ASP:79', 'B:ASN:80', 'B:LEU:81', 'B:LYS:82', 'B:GLY:83', 'B:THR:84', 'B:PHE:85', 'B:ALA:86', 'B:THR:87', 'B:LEU:88', 'B:SER:89', 'B:GLU:90', 'B:LEU:91', 'B:HIS:92', 'B:CYS:93', 'B:ASP:94', 'B:LYS:95', 'B:LEU:96', 'B:HIS:97', 'B:VAL:113', 'B:LEU:114', 'B:ALA:115', 'B:HIS:116', 'B:HIS:117', 'B:PHE:118', 'B:GLY:119', 'B:LYS:120', 'B:GLU:121', 'B:PHE:122', 'B:THR:123', 'B:PRO:124', 'B:PRO:125', 'B:VAL:126', 'B:GLN:127', 'B:ALA:129', 'B:TYR:130', 'B:LYS:144', 'B:HIS:146', 'C:LEU:2', 'C:SER:3', 'C:PRO:4', 'C:ALA:5', 'C:ASP:6', 'C:LYS:7', 'C:THR:8', 'C:ASN:9', 'C:VAL:10', 'C:LYS:11', 'C:ALA:12', 'C:ALA:13', 'C:TRP:14', 'C:GLY:15', 'C:LYS:16', 'C:VAL:17', 'C:GLY:18', 'C:ALA:19', 'C:HIS:20', 'C:ALA:21', 'C:GLY:22', 'C:GLU:23', 'C:TYR:24', 'C:GLY:25', 'C:ALA:26', 'C:GLU:27', 'C:GLU:30', 'C:PHE:43', 'C:PRO:44', 'C:HIS:45', 'C:PHE:46', 'C:ASP:47', 'C:LEU:48', 'C:SER:49', 'C:HIS:50', 'C:GLY:51', 'C:SER:52', 'C:ALA:53', 'C:GLN:54', 'C:VAL:55', 'C:LYS:56', 'C:GLY:57', 'C:HIS:58', 'C:GLY:59', 'C:LYS:60', 'C:LYS:61', 'C:VAL:62', 'C:ALA:63', 'C:ASP:64', 'C:ALA:65', 'C:LEU:66', 'C:THR:67', 'C:ASN:68', 'C:ALA:69', 'C:VAL:70', 'C:ALA:71', 'C:HIS:72', 'C:VAL:73', 'C:ASP:74', 'C:ASP:75', 'C:MET:76', 'C:PRO:77', 'C:ASN:78', 'C:ALA:79', 'C:LEU:80', 'C:SER:81', 'C:ALA:82', 'C:LEU:83', 'C:SER:84', 'C:ASP:85', 'C:LEU:86', 'C:HIS:89', 'C:LYS:90', 'C:THR:108', 'C:LEU:109', 'C:ALA:110', 'C:ALA:111', 'C:HIS:112', 'C:LEU:113', 'C:PRO:114', 'C:ALA:115', 'C:GLU:116', 'C:PHE:117', 'C:THR:118', 'C:PRO:119', 'C:ALA:120', 'C:VAL:121', 'C:HIS:122', 'C:SER:124', 'D:VAL:1', 'D:HIS:2', 'D:LEU:3', 'D:THR:4', 'D:PRO:5', 'D:GLU:6', 'D:GLU:7', 'D:LYS:8', 'D:SER:9', 'D:ALA:10', 'D:VAL:11', 'D:THR:12', 'D:ALA:13', 'D:LEU:14', 'D:TRP:15', 'D:GLY:16', 'D:LYS:17', 'D:VAL:18', 'D:ASN:19', 'D:VAL:20', 'D:ASP:21', 'D:GLU:22', 'D:VAL:23', 'D:GLY:24', 'D:GLY:25', 'D:GLU:26', 'D:ALA:27', 'D:LEU:28', 'D:GLY:29', 'D:ARG:30', 'D:VAL:33', 'D:GLN:39', 'D:ARG:40', 'D:PHE:41', 'D:PHE:42', 'D:GLU:43', 'D:SER:44', 'D:PHE:45', 
'D:GLY:46', 'D:ASP:47', 'D:LEU:48', 'D:SER:49', 'D:THR:50', 'D:PRO:51', 'D:ASP:52', 'D:ALA:53', 'D:VAL:54', 'D:MET:55', 'D:GLY:56', 'D:ASN:57', 'D:PRO:58', 'D:LYS:59', 'D:VAL:60', 'D:LYS:61', 'D:ALA:62', 'D:HIS:63', 'D:GLY:64', 'D:LYS:65', 'D:LYS:66', 'D:VAL:67', 'D:LEU:68', 'D:GLY:69', 'D:ALA:70', 'D:PHE:71', 'D:SER:72', 'D:ASP:73', 'D:GLY:74', 'D:LEU:75', 'D:ALA:76', 'D:HIS:77', 'D:LEU:78', 'D:ASP:79', 'D:ASN:80', 'D:LEU:81', 'D:LYS:82', 'D:GLY:83', 'D:THR:84', 'D:PHE:85', 'D:ALA:86', 'D:THR:87', 'D:LEU:88', 'D:SER:89', 'D:GLU:90', 'D:LEU:91', 'D:HIS:92', 'D:CYS:93', 'D:ASP:94', 'D:LYS:95', 'D:LEU:96', 'D:HIS:97', 'D:VAL:113', 'D:LEU:114', 'D:ALA:115', 'D:HIS:116', 'D:HIS:117', 'D:PHE:118', 'D:GLY:119', 'D:LYS:120', 'D:GLU:121', 'D:PHE:122', 'D:THR:123', 'D:PRO:124', 'D:PRO:125', 'D:VAL:126', 'D:GLN:127', 'D:ALA:129', 'D:TYR:130', 'D:LYS:144', 'D:HIS:146'].

Subgraphing based on Residue Types

[10]:
from graphein.protein.subgraphs import extract_subgraph_from_residue_types
residue_types = ["SER", "ALA", "GLY"]

s_g = extract_subgraph_from_residue_types(g, residue_types)
plotly_protein_structure_graph(s_g, colour_nodes_by="residue_name")
DEBUG:graphein.protein.subgraphs:Found 144 nodes in the residue type subgraph.
DEBUG:graphein.protein.subgraphs:Creating subgraph from nodes: ['B:ALA:10', 'A:ALA:79', 'B:ALA:53', 'C:ALA:115', 'C:GLY:22', 'C:SER:35', 'B:ALA:86', 'D:GLY:74', 'C:ALA:12', 'A:SER:49', 'C:GLY:51', 'C:ALA:69', 'B:SER:49', 'D:GLY:119', 'A:ALA:13', 'A:ALA:65', 'A:SER:52', 'D:ALA:53', 'C:ALA:82', 'D:GLY:136', 'B:ALA:115', 'D:GLY:25', 'A:ALA:120', 'A:ALA:69', 'D:ALA:140', 'A:SER:102', 'C:ALA:110', 'B:GLY:16', 'D:ALA:86', 'A:ALA:115', 'C:ALA:21', 'D:ALA:142', 'B:ALA:129', 'A:GLY:59', 'B:ALA:13', 'B:ALA:142', 'A:ALA:130', 'B:ALA:76', 'D:SER:9', 'A:ALA:12', 'B:GLY:25', 'C:ALA:5', 'C:ALA:71', 'A:ALA:88', 'B:ALA:70', 'A:GLY:51', 'B:GLY:74', 'C:ALA:53', 'D:GLY:16', 'C:SER:52', 'A:SER:124', 'B:GLY:83', 'B:GLY:69', 'D:SER:44', 'B:ALA:138', 'D:ALA:70', 'C:ALA:26', 'C:ALA:28', 'C:GLY:59', 'D:ALA:115', 'C:ALA:130', 'D:ALA:129', 'B:GLY:56', 'A:SER:81', 'C:SER:49', 'B:GLY:46', 'C:SER:133', 'D:ALA:62', 'D:GLY:29', 'A:GLY:25', 'A:ALA:110', 'A:ALA:19', 'D:ALA:13', 'C:GLY:15', 'A:GLY:57', 'A:SER:138', 'D:ALA:76', 'B:GLY:107', 'C:ALA:79', 'A:ALA:82', 'D:GLY:24', 'D:SER:49', 'D:SER:89', 'A:SER:133', 'C:SER:81', 'C:GLY:57', 'C:ALA:65', 'C:SER:84', 'A:GLY:15', 'A:GLY:18', 'B:GLY:29', 'C:SER:102', 'C:ALA:120', 'B:ALA:27', 'D:GLY:64', 'C:GLY:18', 'B:GLY:136', 'A:SER:3', 'A:ALA:123', 'B:GLY:24', 'A:ALA:26', 'C:ALA:63', 'D:ALA:138', 'D:GLY:56', 'A:GLY:22', 'A:ALA:71', 'D:GLY:83', 'C:ALA:19', 'B:ALA:135', 'A:SER:131', 'C:ALA:111', 'C:GLY:25', 'D:SER:72', 'D:ALA:10', 'A:ALA:28', 'C:SER:3', 'A:ALA:5', 'B:SER:9', 'A:SER:84', 'B:ALA:140', 'A:ALA:111', 'B:SER:89', 'C:ALA:13', 'A:ALA:53', 'D:ALA:27', 'B:GLY:119', 'C:ALA:88', 'B:SER:44', 'C:SER:138', 'B:ALA:62', 'C:ALA:123', 'A:ALA:63', 'D:ALA:135', 'A:SER:35', 'C:SER:124', 'D:GLY:107', 'D:ALA:128', 'C:SER:131', 'D:GLY:46', 'D:GLY:69', 'B:ALA:128', 'B:GLY:64', 'A:ALA:21', 'B:SER:72'].
[11]:
# Invert the selection
s_g = extract_subgraph_from_residue_types(g, residue_types, inverse=True)
plotly_protein_structure_graph(s_g, colour_nodes_by="residue_name", node_size_min=4, node_size_multiplier=2)
DEBUG:graphein.protein.subgraphs:Found 144 nodes in the residue type subgraph.
DEBUG:graphein.protein.subgraphs:Creating subgraph from nodes: ['A:VAL:1', 'A:LEU:2', 'A:PRO:4', 'A:ASP:6', 'A:LYS:7', 'A:THR:8', 'A:ASN:9', 'A:VAL:10', 'A:LYS:11', 'A:TRP:14', 'A:LYS:16', 'A:VAL:17', 'A:HIS:20', 'A:GLU:23', 'A:TYR:24', 'A:GLU:27', 'A:LEU:29', 'A:GLU:30', 'A:ARG:31', 'A:MET:32', 'A:PHE:33', 'A:LEU:34', 'A:PHE:36', 'A:PRO:37', 'A:THR:38', 'A:THR:39', 'A:LYS:40', 'A:THR:41', 'A:TYR:42', 'A:PHE:43', 'A:PRO:44', 'A:HIS:45', 'A:PHE:46', 'A:ASP:47', 'A:LEU:48', 'A:HIS:50', 'A:GLN:54', 'A:VAL:55', 'A:LYS:56', 'A:HIS:58', 'A:LYS:60', 'A:LYS:61', 'A:VAL:62', 'A:ASP:64', 'A:LEU:66', 'A:THR:67', 'A:ASN:68', 'A:VAL:70', 'A:HIS:72', 'A:VAL:73', 'A:ASP:74', 'A:ASP:75', 'A:MET:76', 'A:PRO:77', 'A:ASN:78', 'A:LEU:80', 'A:LEU:83', 'A:ASP:85', 'A:LEU:86', 'A:HIS:87', 'A:HIS:89', 'A:LYS:90', 'A:LEU:91', 'A:ARG:92', 'A:VAL:93', 'A:ASP:94', 'A:PRO:95', 'A:VAL:96', 'A:ASN:97', 'A:PHE:98', 'A:LYS:99', 'A:LEU:100', 'A:LEU:101', 'A:HIS:103', 'A:CYS:104', 'A:LEU:105', 'A:LEU:106', 'A:VAL:107', 'A:THR:108', 'A:LEU:109', 'A:HIS:112', 'A:LEU:113', 'A:PRO:114', 'A:GLU:116', 'A:PHE:117', 'A:THR:118', 'A:PRO:119', 'A:VAL:121', 'A:HIS:122', 'A:LEU:125', 'A:ASP:126', 'A:LYS:127', 'A:PHE:128', 'A:LEU:129', 'A:VAL:132', 'A:THR:134', 'A:VAL:135', 'A:LEU:136', 'A:THR:137', 'A:LYS:139', 'A:TYR:140', 'A:ARG:141', 'B:VAL:1', 'B:HIS:2', 'B:LEU:3', 'B:THR:4', 'B:PRO:5', 'B:GLU:6', 'B:GLU:7', 'B:LYS:8', 'B:VAL:11', 'B:THR:12', 'B:LEU:14', 'B:TRP:15', 'B:LYS:17', 'B:VAL:18', 'B:ASN:19', 'B:VAL:20', 'B:ASP:21', 'B:GLU:22', 'B:VAL:23', 'B:GLU:26', 'B:LEU:28', 'B:ARG:30', 'B:LEU:31', 'B:LEU:32', 'B:VAL:33', 'B:VAL:34', 'B:TYR:35', 'B:PRO:36', 'B:TRP:37', 'B:THR:38', 'B:GLN:39', 'B:ARG:40', 'B:PHE:41', 'B:PHE:42', 'B:GLU:43', 'B:PHE:45', 'B:ASP:47', 'B:LEU:48', 'B:THR:50', 'B:PRO:51', 'B:ASP:52', 'B:VAL:54', 'B:MET:55', 'B:ASN:57', 'B:PRO:58', 'B:LYS:59', 'B:VAL:60', 'B:LYS:61', 'B:HIS:63', 'B:LYS:65', 'B:LYS:66', 'B:VAL:67', 'B:LEU:68', 'B:PHE:71', 'B:ASP:73', 'B:LEU:75', 'B:HIS:77', 'B:LEU:78', 
'B:ASP:79', 'B:ASN:80', 'B:LEU:81', 'B:LYS:82', 'B:THR:84', 'B:PHE:85', 'B:THR:87', 'B:LEU:88', 'B:GLU:90', 'B:LEU:91', 'B:HIS:92', 'B:CYS:93', 'B:ASP:94', 'B:LYS:95', 'B:LEU:96', 'B:HIS:97', 'B:VAL:98', 'B:ASP:99', 'B:PRO:100', 'B:GLU:101', 'B:ASN:102', 'B:PHE:103', 'B:ARG:104', 'B:LEU:105', 'B:LEU:106', 'B:ASN:108', 'B:VAL:109', 'B:LEU:110', 'B:VAL:111', 'B:CYS:112', 'B:VAL:113', 'B:LEU:114', 'B:HIS:116', 'B:HIS:117', 'B:PHE:118', 'B:LYS:120', 'B:GLU:121', 'B:PHE:122', 'B:THR:123', 'B:PRO:124', 'B:PRO:125', 'B:VAL:126', 'B:GLN:127', 'B:TYR:130', 'B:GLN:131', 'B:LYS:132', 'B:VAL:133', 'B:VAL:134', 'B:VAL:137', 'B:ASN:139', 'B:LEU:141', 'B:HIS:143', 'B:LYS:144', 'B:TYR:145', 'B:HIS:146', 'C:VAL:1', 'C:LEU:2', 'C:PRO:4', 'C:ASP:6', 'C:LYS:7', 'C:THR:8', 'C:ASN:9', 'C:VAL:10', 'C:LYS:11', 'C:TRP:14', 'C:LYS:16', 'C:VAL:17', 'C:HIS:20', 'C:GLU:23', 'C:TYR:24', 'C:GLU:27', 'C:LEU:29', 'C:GLU:30', 'C:ARG:31', 'C:MET:32', 'C:PHE:33', 'C:LEU:34', 'C:PHE:36', 'C:PRO:37', 'C:THR:38', 'C:THR:39', 'C:LYS:40', 'C:THR:41', 'C:TYR:42', 'C:PHE:43', 'C:PRO:44', 'C:HIS:45', 'C:PHE:46', 'C:ASP:47', 'C:LEU:48', 'C:HIS:50', 'C:GLN:54', 'C:VAL:55', 'C:LYS:56', 'C:HIS:58', 'C:LYS:60', 'C:LYS:61', 'C:VAL:62', 'C:ASP:64', 'C:LEU:66', 'C:THR:67', 'C:ASN:68', 'C:VAL:70', 'C:HIS:72', 'C:VAL:73', 'C:ASP:74', 'C:ASP:75', 'C:MET:76', 'C:PRO:77', 'C:ASN:78', 'C:LEU:80', 'C:LEU:83', 'C:ASP:85', 'C:LEU:86', 'C:HIS:87', 'C:HIS:89', 'C:LYS:90', 'C:LEU:91', 'C:ARG:92', 'C:VAL:93', 'C:ASP:94', 'C:PRO:95', 'C:VAL:96', 'C:ASN:97', 'C:PHE:98', 'C:LYS:99', 'C:LEU:100', 'C:LEU:101', 'C:HIS:103', 'C:CYS:104', 'C:LEU:105', 'C:LEU:106', 'C:VAL:107', 'C:THR:108', 'C:LEU:109', 'C:HIS:112', 'C:LEU:113', 'C:PRO:114', 'C:GLU:116', 'C:PHE:117', 'C:THR:118', 'C:PRO:119', 'C:VAL:121', 'C:HIS:122', 'C:LEU:125', 'C:ASP:126', 'C:LYS:127', 'C:PHE:128', 'C:LEU:129', 'C:VAL:132', 'C:THR:134', 'C:VAL:135', 'C:LEU:136', 'C:THR:137', 'C:LYS:139', 'C:TYR:140', 'C:ARG:141', 'D:VAL:1', 'D:HIS:2', 'D:LEU:3', 'D:THR:4', 'D:PRO:5', 
'D:GLU:6', 'D:GLU:7', 'D:LYS:8', 'D:VAL:11', 'D:THR:12', 'D:LEU:14', 'D:TRP:15', 'D:LYS:17', 'D:VAL:18', 'D:ASN:19', 'D:VAL:20', 'D:ASP:21', 'D:GLU:22', 'D:VAL:23', 'D:GLU:26', 'D:LEU:28', 'D:ARG:30', 'D:LEU:31', 'D:LEU:32', 'D:VAL:33', 'D:VAL:34', 'D:TYR:35', 'D:PRO:36', 'D:TRP:37', 'D:THR:38', 'D:GLN:39', 'D:ARG:40', 'D:PHE:41', 'D:PHE:42', 'D:GLU:43', 'D:PHE:45', 'D:ASP:47', 'D:LEU:48', 'D:THR:50', 'D:PRO:51', 'D:ASP:52', 'D:VAL:54', 'D:MET:55', 'D:ASN:57', 'D:PRO:58', 'D:LYS:59', 'D:VAL:60', 'D:LYS:61', 'D:HIS:63', 'D:LYS:65', 'D:LYS:66', 'D:VAL:67', 'D:LEU:68', 'D:PHE:71', 'D:ASP:73', 'D:LEU:75', 'D:HIS:77', 'D:LEU:78', 'D:ASP:79', 'D:ASN:80', 'D:LEU:81', 'D:LYS:82', 'D:THR:84', 'D:PHE:85', 'D:THR:87', 'D:LEU:88', 'D:GLU:90', 'D:LEU:91', 'D:HIS:92', 'D:CYS:93', 'D:ASP:94', 'D:LYS:95', 'D:LEU:96', 'D:HIS:97', 'D:VAL:98', 'D:ASP:99', 'D:PRO:100', 'D:GLU:101', 'D:ASN:102', 'D:PHE:103', 'D:ARG:104', 'D:LEU:105', 'D:LEU:106', 'D:ASN:108', 'D:VAL:109', 'D:LEU:110', 'D:VAL:111', 'D:CYS:112', 'D:VAL:113', 'D:LEU:114', 'D:HIS:116', 'D:HIS:117', 'D:PHE:118', 'D:LYS:120', 'D:GLU:121', 'D:PHE:122', 'D:THR:123', 'D:PRO:124', 'D:PRO:125', 'D:VAL:126', 'D:GLN:127', 'D:TYR:130', 'D:GLN:131', 'D:LYS:132', 'D:VAL:133', 'D:VAL:134', 'D:VAL:137', 'D:ASN:139', 'D:LEU:141', 'D:HIS:143', 'D:LYS:144', 'D:TYR:145', 'D:HIS:146'].