graphein.rna#
Graphs#
Functions for working with RNA Secondary Structure Graphs.
- graphein.rna.graphs.construct_rna_graph(dotbracket: Optional[str], sequence: Optional[str], edge_construction_funcs: List[Callable], edge_annotation_funcs: Optional[List[Callable]] = None, node_annotation_funcs: Optional[List[Callable]] = None, graph_annotation_funcs: Optional[List[Callable]] = None) → networkx.classes.graph.Graph [source]#
Constructs an RNA secondary structure graph from dotbracket notation.
- Parameters
dotbracket (str, optional) – Dotbracket notation representation of secondary structure
sequence (str, optional) – Corresponding sequence of RNA bases
edge_construction_funcs (List[Callable], optional) – List of edge construction functions. Defaults to None.
edge_annotation_funcs (List[Callable], optional) – List of edge metadata annotation functions. Defaults to None.
node_annotation_funcs (List[Callable], optional) – List of node metadata annotation functions. Defaults to None.
graph_annotation_funcs (List[Callable], optional) – List of graph metadata annotation functions. Defaults to None.
- Returns
nx.Graph of RNA secondary structure
- Return type
nx.Graph
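The edge functions documented under Edges each take a graph and return a graph, so a list of them can be applied in order. The following is a minimal sketch of that pattern only, not the library's implementation: it uses a plain dict as a hypothetical stand-in for nx.Graph, and the names make_graph, add_backbone, build and the node attribute keys are all illustrative assumptions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for nx.Graph: a dict holding node records and edges.
Graph = Dict[str, list]


def make_graph(dotbracket: str, sequence: str) -> Graph:
    """Create one node per position, carrying its base and dotbracket symbol."""
    nodes = [
        {"position": i, "nucleotide": base, "dotbracket_symbol": symbol}
        for i, (base, symbol) in enumerate(zip(sequence, dotbracket))
    ]
    return {"nodes": nodes, "edges": []}


def add_backbone(g: Graph) -> Graph:
    """Illustrative edge function: link each position to the next one."""
    n = len(g["nodes"])
    g["edges"].extend((i, i + 1, "phosphodiester_bond") for i in range(n - 1))
    return g


def build(
    dotbracket: str,
    sequence: str,
    edge_funcs: List[Callable[[Graph], Graph]],
) -> Graph:
    """Build the node set, then apply each edge function in list order."""
    g = make_graph(dotbracket, sequence)
    for func in edge_funcs:
        g = func(g)
    return g


graph = build("(..)", "GAAC", [add_backbone])
```

Because every function has the same graph-in, graph-out shape, further edge or annotation steps can be added by appending to the list.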
- graphein.rna.graphs.validate_dotbracket(db: str)[source]#
Validates a dotbracket string, ensuring that it contains only supported symbols.
See: SUPPORTED_DOTBRACKET_NOTATION
- Parameters
db (str) – Dotbracket notation string
- Raises
ValueError – Raises ValueError if dotbracket notation contains unsupported symbols
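As a rough illustration of this check, the sketch below tests a string against the symbol list given under Constants. It is a standalone approximation, not the library's code; check_dotbracket is a hypothetical name and the error message is an assumption.

```python
# Values copied from SUPPORTED_DOTBRACKET_NOTATION in the Constants section.
SUPPORTED_DOTBRACKET_NOTATION = ["(", ".", ")", "[", "{", "<", "]", "}", ">"]


def check_dotbracket(db: str) -> None:
    """Raise ValueError if db contains a symbol outside the supported set."""
    unsupported = sorted(set(db) - set(SUPPORTED_DOTBRACKET_NOTATION))
    if unsupported:
        raise ValueError(f"Unsupported dotbracket symbols: {unsupported}")


check_dotbracket("((..[[..))]]")  # passes silently
```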
- graphein.rna.graphs.validate_lengths(db: str, seq: str) → None [source]#
Check lengths of dotbracket and sequence match.
- Parameters
db (str) – Dotbracket notation string
seq (str) – RNA sequence
- Raises
ValueError – Raises ValueError if lengths of dotbracket and sequence do not match.
- graphein.rna.graphs.validate_rna_sequence(s: str) → None [source]#
Validate RNA sequence. This ensures that it only contains supported bases.
Supported bases are: "A", "U", "G", "C", "I"
Supported bases can be accessed in RNA_BASES
- Parameters
s (str) – Sequence to validate
- Raises
ValueError – Raises ValueError if the sequence contains an unsupported base character
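The sequence and length checks can be approximated in a few lines. This is a sketch under stated assumptions rather than the library's implementation: check_sequence and check_lengths are hypothetical names, the messages are invented, and the comparison is case-sensitive, so lowercase bases are rejected here.

```python
# Values copied from RNA_BASES in the Constants section.
RNA_BASES = ["A", "U", "G", "C", "I"]


def check_sequence(seq: str) -> None:
    """Raise ValueError if seq contains a character that is not in RNA_BASES."""
    unsupported = sorted(set(seq) - set(RNA_BASES))
    if unsupported:
        raise ValueError(f"Unsupported bases: {unsupported}")


def check_lengths(db: str, seq: str) -> None:
    """Raise ValueError if the dotbracket string and sequence differ in length."""
    if len(db) != len(seq):
        raise ValueError(
            f"Length mismatch: {len(db)} dotbracket symbols, {len(seq)} bases"
        )


check_sequence("GGAUCC")           # passes silently
check_lengths("((..))", "GGAUCC")  # passes silently
```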
Edges#
Functions to compute edges for an RNA secondary structure graph.
- graphein.rna.edges.add_all_dotbracket_edges(G: networkx.classes.graph.Graph) → networkx.classes.graph.Graph [source]#
Adds phosphodiester bonds between adjacent nucleotides and base_pairing interactions to an RNA secondary structure graph.
- Parameters
G (nx.Graph) – RNA Graph to add edges to.
- Returns
RNA graph with phosphodiester_bond and base_pairing edges added.
- Return type
nx.Graph
- graphein.rna.edges.add_base_pairing_interactions(G: networkx.classes.graph.Graph) → networkx.classes.graph.Graph [source]#
Adds base pairing interactions between nucleotides to an RNA secondary structure graph.
- Parameters
G (nx.Graph) – RNA Graph to add edges to.
- Raises
ValueError – if dotbracket contains an unsupported character.
- Returns
RNA graph with base_pairing edges added.
- Return type
nx.Graph
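Matching parentheses in dotbracket notation is commonly done with a stack: push the index of each "(" and pop it when the matching ")" arrives. The sketch below shows that general technique and returns index pairs instead of modifying a graph. It is not taken from the library; base_pairs is a hypothetical name, and skipping pseudoknot symbols is an assumption of this example.

```python
from typing import List, Tuple

# Symbols this sketch skips: unpaired positions and pseudoknot brackets.
IGNORED = set(".[]{}<>")


def base_pairs(dotbracket: str) -> List[Tuple[int, int]]:
    """Return (opening_index, closing_index) for each matched '(' and ')'."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, symbol in enumerate(dotbracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"Unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif symbol not in IGNORED:
            raise ValueError(f"Unsupported symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"Unmatched '(' at position {stack[-1]}")
    return pairs


print(base_pairs("((..))"))  # [(1, 4), (0, 5)]
```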
- graphein.rna.edges.add_phosphodiester_bonds(G: networkx.classes.graph.Graph) → networkx.classes.graph.Graph [source]#
Adds phosphodiester bonds between adjacent nucleotides to an RNA secondary structure graph.
- Parameters
G (nx.Graph) – RNA Graph to add edges to.
- Returns
RNA graph with phosphodiester_bond edges added.
- Return type
nx.Graph
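Bonds between adjacent nucleotides amount to linking each position to the one after it. A minimal sketch, assuming nodes are identified by 0-based position (an assumption of this example, as is the name backbone_edges):

```python
from typing import List, Tuple


def backbone_edges(n_bases: int) -> List[Tuple[int, int]]:
    """Return (i, i + 1) for every pair of adjacent positions."""
    return [(i, i + 1) for i in range(n_bases - 1)]


print(backbone_edges(4))  # [(0, 1), (1, 2), (2, 3)]
```

A sequence of n bases yields n - 1 such edges, and none at all when n is 0 or 1.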
- graphein.rna.edges.add_pseudoknots(G: networkx.classes.graph.Graph) → networkx.classes.graph.Graph [source]#
Adds pseudoknot edges between nucleotides to an RNA secondary structure graph.
- Parameters
G (nx.Graph) – RNA Graph to add edges to.
- Returns
RNA graph with pseudoknot edges added.
- Return type
nx.Graph
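Pseudoknot brackets can be matched the same way as parentheses, with one stack per bracket type so that different types may cross. The sketch below is an illustration, not the library's implementation: pseudoknot_pairs is a hypothetical name, and pairing the opening and closing symbol lists by index ("[" with "]", "{" with "}", "<" with ">") is an assumption based on their listed order.

```python
from typing import Dict, List, Tuple

# Values copied from the Constants section.
PSEUDOKNOT_OPENING_SYMBOLS = ["[", "{", "<"]
PSEUDOKNOT_CLOSING_SYMBOLS = ["]", "}", ">"]

# Assumption: the two lists correspond index by index.
CLOSE_TO_OPEN = dict(zip(PSEUDOKNOT_CLOSING_SYMBOLS, PSEUDOKNOT_OPENING_SYMBOLS))


def pseudoknot_pairs(dotbracket: str) -> List[Tuple[int, int]]:
    """Return (opening_index, closing_index) for each matched pseudoknot bracket."""
    stacks: Dict[str, List[int]] = {s: [] for s in PSEUDOKNOT_OPENING_SYMBOLS}
    pairs: List[Tuple[int, int]] = []
    for i, symbol in enumerate(dotbracket):
        if symbol in stacks:
            stacks[symbol].append(i)
        elif symbol in CLOSE_TO_OPEN:
            stack = stacks[CLOSE_TO_OPEN[symbol]]
            if not stack:
                raise ValueError(f"Unmatched {symbol!r} at position {i}")
            pairs.append((stack.pop(), i))
        # Any other symbol, such as "(", ")" or ".", is skipped in this sketch.
    for opener, stack in stacks.items():
        if stack:
            raise ValueError(f"Unmatched {opener!r} at position {stack[-1]}")
    return pairs


print(pseudoknot_pairs("((..[[..))]]"))  # [(5, 10), (4, 11)]
```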
Visualisation#
Visualisation utilities for RNA Secondary Structure Graphs.
- graphein.rna.visualisation.plot_rna_graph(g: networkx.classes.graph.Graph, layout: networkx.drawing.layout = circular_layout, label_base_type: bool = True, label_base_position: bool = False, label_dotbracket_symbol: bool = False, **kwargs)[source]#
Plots an RNA Secondary Structure Graph. Colours edges by kind.
- Parameters
g (nx.Graph) – NetworkX graph of the RNA secondary structure.
layout (nx.layout) – Layout algorithm to use. Default is circular_layout.
label_base_type (bool) – Whether to label the base type of each base.
label_base_position (bool) – Whether to label the base position of each base.
label_dotbracket_symbol (bool) – Whether to label the dotbracket symbol of each base.
Constants#
Constants for working with RNA Secondary Structure Graphs.
- graphein.rna.constants.CANONICAL_BASE_PAIRINGS: Dict[str, List[str]] = {'A': ['U'], 'C': ['G'], 'G': ['C'], 'U': ['A']}#
Maps standard RNA bases to their canonical base pairings.
- graphein.rna.constants.PSEUDOKNOT_CLOSING_SYMBOLS: List[str] = [']', '}', '>']#
List of symbols denoting a pseudoknot closing.
- graphein.rna.constants.PSEUDOKNOT_OPENING_SYMBOLS: List[str] = ['[', '{', '<']#
List of symbols denoting a pseudoknot opening.
- graphein.rna.constants.RNA_BASES: List[str] = ['A', 'U', 'G', 'C', 'I']#
List of allowable RNA Bases.
- graphein.rna.constants.RNA_BASE_COLORS: Dict[str, str] = {'A': 'r', 'C': 'y', 'G': 'g', 'I': 'm', 'U': 'b'}#
Maps RNA bases (RNA_BASES) to a colour for visualisations.
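One way such a mapping might be used is to turn a sequence into a per-node colour list for a plotting call. The sketch below copies the mapping from this page; base_colours is a hypothetical helper, and the single-letter values appear to be matplotlib-style colour codes, which is an assumption.

```python
from typing import List

# Values copied from RNA_BASE_COLORS above.
RNA_BASE_COLORS = {"A": "r", "C": "y", "G": "g", "I": "m", "U": "b"}


def base_colours(sequence: str) -> List[str]:
    """Return one colour code per base, in sequence order."""
    return [RNA_BASE_COLORS[base] for base in sequence]


print(base_colours("GAUC"))  # ['g', 'r', 'b', 'y']
```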
- graphein.rna.constants.SIMPLE_DOTBRACKET_NOTATION: List[str] = ['(', '.', ')']#
List of characters in simplest dotbracket notation.
- graphein.rna.constants.SS_BOND_TYPES: List[str] = ['phosphodiester_bond', 'base_pairing', 'pseudoknot']#
List of valid secondary structure bond types.
- graphein.rna.constants.SUPPORTED_DOTBRACKET_NOTATION = ['(', '.', ')', '[', '{', '<', ']', '}', '>']#
List of all valid dotbracket symbols. Amalgamation of SIMPLE_DOTBRACKET_NOTATION and SUPPORTED_PSEUDOKNOT_NOTATION.
- graphein.rna.constants.SUPPORTED_PSEUDOKNOT_NOTATION: List[str] = ['[', '{', '<', ']', '}', '>']#
List of characters denoting pseudoknots in dotbracket notation. Amalgam of PSEUDOKNOT_OPENING_SYMBOLS and PSEUDOKNOT_CLOSING_SYMBOLS.
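The documented values of the notation lists are consistent with plain list concatenation, which the sketch below reproduces. Whether the library builds them this way is an assumption; only the resulting values are taken from this page.

```python
# Values copied from this page.
SIMPLE_DOTBRACKET_NOTATION = ["(", ".", ")"]
PSEUDOKNOT_OPENING_SYMBOLS = ["[", "{", "<"]
PSEUDOKNOT_CLOSING_SYMBOLS = ["]", "}", ">"]

# Opening symbols followed by closing symbols.
SUPPORTED_PSEUDOKNOT_NOTATION = PSEUDOKNOT_OPENING_SYMBOLS + PSEUDOKNOT_CLOSING_SYMBOLS

# Simple notation followed by the pseudoknot notation.
SUPPORTED_DOTBRACKET_NOTATION = SIMPLE_DOTBRACKET_NOTATION + SUPPORTED_PSEUDOKNOT_NOTATION
```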
- graphein.rna.constants.VALID_BASE_PAIRINGS = {'A': ['U', 'I'], 'C': ['G', 'I'], 'G': ['C', 'U'], 'I': ['A', 'C', 'U'], 'U': ['A', 'G', 'I']}#
Mapping of RNA bases (RNA_BASES) to their allowable pairings. Amalgam of CANONICAL_BASE_PAIRINGS and WOBBLE_BASE_PAIRINGS.
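A pairing lookup against this mapping is a simple membership test. The sketch below copies the two documented dictionaries; is_valid_pair and non_canonical are hypothetical names. The value of WOBBLE_BASE_PAIRINGS is not listed on this page, so the example only derives which pairings VALID_BASE_PAIRINGS adds beyond the canonical ones and does not claim to reproduce that constant.

```python
from typing import Dict, List

# Values copied from this page.
CANONICAL_BASE_PAIRINGS: Dict[str, List[str]] = {
    "A": ["U"], "C": ["G"], "G": ["C"], "U": ["A"],
}
VALID_BASE_PAIRINGS: Dict[str, List[str]] = {
    "A": ["U", "I"],
    "C": ["G", "I"],
    "G": ["C", "U"],
    "I": ["A", "C", "U"],
    "U": ["A", "G", "I"],
}


def is_valid_pair(first: str, second: str) -> bool:
    """Return True if `second` is an allowable partner for `first`."""
    return second in VALID_BASE_PAIRINGS.get(first, [])


# Pairings present in VALID_BASE_PAIRINGS but absent from the canonical mapping.
non_canonical = {
    base: [p for p in partners if p not in CANONICAL_BASE_PAIRINGS.get(base, [])]
    for base, partners in VALID_BASE_PAIRINGS.items()
}

print(is_valid_pair("G", "U"))  # True
print(is_valid_pair("A", "G"))  # False
```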