hpotk package
HPO toolkit is a library for working with the Human Phenotype Ontology (HPO) and the HPO annotation data.
- class hpotk.TermId[source]
Bases:
object
TermId is an identifier of an ontology concept.
TermId consists of a prefix and id that are separated by a delimiter:
>>> term_id = TermId.from_curie('HP:0001250')
>>> assert term_id.prefix == 'HP'
>>> assert term_id.id == '0001250'
The TermId has a natural ordering which compares two IDs first based on prefix and then value. Both comparisons are lexicographic.
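The ordering rule can be illustrated with a small self-contained sketch. `ToyTermId` below is a hypothetical stand-in, not hpotk's implementation; it only assumes that the comparison behaves like comparing the `(prefix, id)` pair as strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ToyTermId:
    """Hypothetical stand-in for TermId; the field order drives the comparison."""
    prefix: str
    id: str

    @property
    def value(self) -> str:
        # Join the two parts with a colon, as TermId.value does.
        return f"{self.prefix}:{self.id}"


ids = [
    ToyTermId("MONDO", "0000001"),
    ToyTermId("HP", "0001250"),
    ToyTermId("HP", "0000118"),
]
# Sorted first by prefix, then by id; both comparisons are on strings.
ordered = [term_id.value for term_id in sorted(ids)]
print(ordered)
```

Because the id is compared as a string and not as a number, `ToyTermId("HP", "10")` sorts before `ToyTermId("HP", "9")`.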
- static from_curie(curie: str)[source]
Create a TermId from a compact URI (CURIE).
The prefix and id of a TermId must be separated either by a colon : or an underscore _.
>>> term_id = TermId.from_curie('HP:0001250')
>>> term_id.value
'HP:0001250'
The parsing will forget the original delimiter. The value always joins the prefix and id with a colon :.
>>> ncit = TermId.from_curie('NCIT_C3117')
>>> ncit.value
'NCIT:C3117'
The colon : has higher priority than the underscore _, and it will be used as the delimiter if both are present.
>>> snomed = TermId.from_curie('SNOMEDCT_US:128613002')
>>> snomed.prefix
'SNOMEDCT_US'
>>> snomed.id
'128613002'
- Parameters:
curie – a CURIE str to be parsed.
- Returns:
the created TermId.
- Raises:
ValueError – if the value is mis-formatted.
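The delimiter rules above can be sketched in plain Python. `split_curie` is a hypothetical helper, not part of hpotk. Splitting on the first occurrence of the delimiter and rejecting an empty prefix or id are assumptions of this sketch.

```python
from typing import Tuple


def split_curie(curie: str) -> Tuple[str, str]:
    """Split a CURIE into (prefix, id); ':' takes priority over '_'."""
    for delimiter in (":", "_"):
        prefix, found, local_id = curie.partition(delimiter)
        if found:
            # Assumption: an empty prefix or id counts as mis-formatted.
            if not prefix or not local_id:
                raise ValueError(f"Mis-formatted CURIE: {curie!r}")
            return prefix, local_id
    raise ValueError(f"No ':' or '_' delimiter in CURIE: {curie!r}")


print(split_curie("HP:0001250"))
print(split_curie("NCIT_C3117"))
print(split_curie("SNOMEDCT_US:128613002"))
```

Trying the colon first is what keeps the underscore inside the `SNOMEDCT_US` prefix intact.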
- abstract property prefix: str
Get prefix of the ontology concept.
>>> term_id = TermId.from_curie('HP:1234567')
>>> term_id.prefix
'HP'
- class hpotk.MinimalTerm[source]
Bases:
Identified, Named
MinimalTerm is a data object with the minimal useful information about an ontology concept.
Each term has:
identifier - a TermId of the term (thanks to inheriting from Identified)
name - a human-friendly name of the term (thanks to inheriting from Named)
alt_term_ids - a sequence of alternate identifiers (IDs of obsolete terms that should be replaced by this term)
is_obsolete - the obsoletion status
Most of the time, you should get terms from an hpotk.ontology.MinimalOntology. However, MinimalTerm can also be created from scratch using create_minimal_term() if you must do that for whatever reason.
- static create_minimal_term(term_id: TermId | str, name: str, alt_term_ids: Iterable[TermId | str], is_obsolete: bool)[source]
Create MinimalTerm from the components.
>>> seizure = MinimalTerm.create_minimal_term(term_id='HP:0001250', name='Seizure',
...                                           alt_term_ids=('HP:0002279', 'HP:0002391'),
...                                           is_obsolete=False)
- Parameters:
term_id – a TermId or a CURIE str (e.g. ‘HP:0001250’).
name – term name (e.g. Seizure).
alt_term_ids – an iterable with term IDs that represent the alternative IDs of the term.
is_obsolete – True if the MinimalTerm has been obsoleted, or False otherwise.
- Returns:
the created term.
- class hpotk.Term[source]
Bases:
MinimalTerm
A comprehensive representation of an ontology concept.
Term has all attributes of the MinimalTerm plus the following:
definition - an optional definition of the term, including a comprehensive description and cross-references
comment - an optional comment
synonyms - an optional sequence of term synonyms
cross-references - an optional sequence of cross-references
Most of the time, you should be getting terms from hpotk.ontology.Ontology. However, if you absolutely must craft a term or two by hand, use the create_term() function.
- static create_term(identifier: TermId | str, name: str, alt_term_ids: Iterable[TermId | str], is_obsolete: bool, definition: Definition | str | None, comment: str | None, synonyms: Iterable[Synonym] | None, xrefs: Iterable[TermId] | None)[source]
Create a Term from the components.
- Parameters:
identifier – a TermId or a CURIE (e.g. ‘HP:0001250’).
name – term name (e.g. Seizure).
alt_term_ids – an iterable with term IDs that represent the alternative IDs of the term.
is_obsolete – True if the term has been obsoleted, or False otherwise.
definition – an optional str with a definition of the term or a Definition with the full info.
comment – an optional comment of the term.
synonyms – an optional iterable with all synonyms of the term.
xrefs – an optional iterable with all the cross-references.
- Returns:
the created term.
- abstract property definition: Definition | None
Get the definition of the ontology concept.
- abstract property synonyms: Sequence[Synonym] | None
Get a sequence of all synonyms (including obsolete) of the ontology concept or None if the concept has no synonyms.
- current_synonyms() Iterable[Synonym] [source]
Get an iterable with current synonyms of the ontology concept.
The iterable is empty if the concept has no current synonyms.
- class hpotk.OntologyGraph[source]
Bases:
Generic[NODE]
A simple graph with one node type and one edge type.
The graph is generic over a node type which must extend TermId. The graph must not be empty; it must consist of at least one node.
Note
OntologyGraph provides iterators for traversals instead of sets, lists, etc. See Iterators vs. collections to learn why.
- abstract property root: NODE
Get the root node of the ontology graph.
- abstract get_children(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE] [source]
Get an iterator with the children of the source node.
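- Parameters:
source – the source node, given as a CURIE str, a graph node, or an Identified item.
include_source – whether to include the source node in the results (False by default).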
- Raises:
ValueError – if source is not present in the graph.
- abstract get_descendants(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE] [source]
Get an iterator with the descendants of the source node.
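- Parameters:
source – the source node, given as a CURIE str, a graph node, or an Identified item.
include_source – whether to include the source node in the results (False by default).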
- Raises:
ValueError – if source is not present in the graph.
- abstract get_parents(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE] [source]
Get an iterator with the parents of the source node.
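- Parameters:
source – the source node, given as a CURIE str, a graph node, or an Identified item.
include_source – whether to include the source node in the results (False by default).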
- Raises:
ValueError – if source is not present in the graph.
- abstract get_ancestors(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE] [source]
Get an iterator with the ancestors of the source node.
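- Parameters:
source – the source node, given as a CURIE str, a graph node, or an Identified item.
include_source – whether to include the source node in the results (False by default).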
- Raises:
ValueError – if source is not present in the graph.
- is_leaf(node: str | NODE | Identified) bool [source]
Test if the node is a leaf - a node with no children.
- Returns:
True if the node is a leaf node or False otherwise.
- Raises:
ValueError – if node is not present in the graph.
- is_parent_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool [source]
Return True if the subject sub is a parent of the object obj.
- Parameters:
sub – a graph node.
obj – other graph node.
- Returns:
True if the sub is a parent of the obj.
- Raises:
ValueError – if obj is not present in the graph.
- is_ancestor_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool [source]
Return True if the subject sub is an ancestor of the object obj.
- Parameters:
sub – a graph node.
obj – other graph node.
- Returns:
True if the sub is an ancestor of the obj.
- Raises:
ValueError – if obj is not present in the graph.
- is_child_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool [source]
Return True if the sub is a child of the obj.
- Parameters:
sub – a graph node.
obj – other graph node.
- Returns:
True if the sub is a child of the obj.
- Raises:
ValueError – if obj is not present in the graph.
- is_descendant_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool [source]
Return True if the sub is a descendant of the obj.
- Parameters:
sub – a graph node.
obj – other graph node.
- Returns:
True if the sub is a descendant of the obj.
- Raises:
ValueError – if obj is not present in the graph.
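The traversal and predicate methods above can be illustrated on a toy graph. `ToyGraph` is a hypothetical stand-in over plain strings, not hpotk's implementation. It assumes that the predicates can be expressed through the traversal iterators and that an unknown node raises ValueError at call time.

```python
from typing import Dict, Iterator, List


class ToyGraph:
    """Hypothetical stand-in for OntologyGraph; nodes are plain strings."""

    def __init__(self, root: str, child_to_parents: Dict[str, List[str]]):
        self.root = root
        self._parents: Dict[str, List[str]] = {root: [], **child_to_parents}
        self._children: Dict[str, List[str]] = {n: [] for n in self._parents}
        for child, parents in child_to_parents.items():
            for parent in parents:
                self._children[parent].append(child)

    def _walk(self, edges, source, include_source, recursive) -> Iterator[str]:
        # Lazily yield the nodes reachable from `source` along `edges`.
        if include_source:
            yield source
        seen, stack = set(), list(edges[source])
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            yield node
            if recursive:
                stack.extend(edges[node])

    def _iter(self, edges, source, include_source, recursive) -> Iterator[str]:
        # Validate eagerly so that the ValueError is raised at call time.
        if source not in self._parents:
            raise ValueError(f"{source} is not present in the graph")
        return self._walk(edges, source, include_source, recursive)

    def get_children(self, source, include_source=False) -> Iterator[str]:
        return self._iter(self._children, source, include_source, False)

    def get_descendants(self, source, include_source=False) -> Iterator[str]:
        return self._iter(self._children, source, include_source, True)

    def get_parents(self, source, include_source=False) -> Iterator[str]:
        return self._iter(self._parents, source, include_source, False)

    def get_ancestors(self, source, include_source=False) -> Iterator[str]:
        return self._iter(self._parents, source, include_source, True)

    def is_leaf(self, node) -> bool:
        return next(self.get_children(node), None) is None

    def is_parent_of(self, sub, obj) -> bool:
        return sub in self.get_parents(obj)

    def is_ancestor_of(self, sub, obj) -> bool:
        return sub in self.get_ancestors(obj)

    def is_child_of(self, sub, obj) -> bool:
        return sub in self.get_children(obj)

    def is_descendant_of(self, sub, obj) -> bool:
        return sub in self.get_descendants(obj)


# 'a1' has two parents, so the toy graph is a DAG and not a tree.
graph = ToyGraph("root", {"a": ["root"], "b": ["root"], "a1": ["a", "b"]})
descendants = sorted(graph.get_descendants("root"))
ancestors = sorted(graph.get_ancestors("a1", include_source=True))
print(descendants, ancestors)
```

Since the traversals return iterators, a caller that needs a collection wraps the result explicitly, for example with `sorted(...)` or `set(...)`.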
- class hpotk.GraphAware[source]
Bases:
Generic[NODE]
A mixin class for entities that have an OntologyGraph.
- abstract property graph: OntologyGraph[NODE]
Get the ontology graph.
- class hpotk.MinimalOntology[source]
Bases:
Generic[ID, MINIMAL_TERM], GraphAware[ID], Versioned
MinimalOntology is a data structure for representing the ontology terms and the ontology hierarchy.
The typical way to load the ontology is by parsing an Obographs JSON file using hpotk.util.store.OntologyStore; see the Load ontology section for more info.
Here we will load a toy HPO shipped with the documentation:
>>> import os
>>> import hpotk
>>> fpath_hpo = os.path.join('docs', 'data', 'hp.toy.json')
>>> hpo = hpotk.load_minimal_ontology(fpath_hpo)
The ontology includes the following:
ontology hierarchy as hpotk.graph.OntologyGraph
ontology terms as hpotk.model.MinimalTerm
the metadata, such as the ontology version
The ontology acts as a Python container of term IDs, so we can check if a term is in the ontology:
>>> seizure_curie = 'HP:0001250'
>>> seizure_curie in hpo
True
This works for term IDs too:
>>> seizure_id = hpotk.TermId.from_curie(seizure_curie)
>>> seizure_id in hpo
True
The ontology has length - the number of primary terms:
>>> len(hpo)
393
Note
The toy HPO has only 393 terms. The real-life HPO has many more terms.
The terms of MinimalOntology are instances of hpotk.model.MinimalTerm.
- abstract property term_ids: Iterator[ID]
Get an iterator over term IDs of the primary AND obsolete ontology terms.
- abstract property terms: Iterator[MINIMAL_TERM]
Get an iterator over current terms (not obsolete terms).
- abstract get_term(term_id: str | TermId | Identified) MINIMAL_TERM | None [source]
Get the current term for a term_id.
>>> seizure = hpo.get_term('HP:0001250')
>>> seizure.name
'Seizure'
- Parameters:
term_id – a CURIE str (e.g. ‘HP:1234567’), a hpotk.model.TermId, or an hpotk.model.Identified entity that represents a current or an obsolete term.
- Returns:
the current term or None if the ontology does not contain the term ID.
- get_term_name(term_id: str | TermId | Identified) str | None [source]
Get the name of the term with a term_id.
>>> seizure_name = hpo.get_term_name('HP:0001250')
>>> seizure_name
'Seizure'
- Parameters:
term_id – a CURIE str (e.g. ‘HP:1234567’), a hpotk.model.TermId, or an hpotk.model.Identified entity that represents a current or an obsolete term.
- Returns:
name of the term if the term is in ontology or None otherwise.
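The container behavior and the lookups described above can be sketched with a toy class. `ToyTerm` and `ToyOntology` are hypothetical stand-ins that handle CURIE strings only. That an obsolete (alternate) ID also passes the `in` check is an assumption of this sketch; the source only states that get_term() returns the current term for a current or an obsolete ID.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class ToyTerm(NamedTuple):
    """Hypothetical stand-in for MinimalTerm with CURIE strings as IDs."""
    identifier: str
    name: str
    alt_term_ids: Tuple[str, ...] = ()


class ToyOntology:
    """Hypothetical stand-in for MinimalOntology (no graph, no version)."""

    def __init__(self, terms: Iterable[ToyTerm]):
        self._primary: Dict[str, ToyTerm] = {}
        self._by_any_id: Dict[str, ToyTerm] = {}
        for term in terms:
            self._primary[term.identifier] = term
            self._by_any_id[term.identifier] = term
            # Obsolete IDs resolve to the current term that replaces them.
            for alt_id in term.alt_term_ids:
                self._by_any_id[alt_id] = term

    def __contains__(self, term_id: str) -> bool:
        # Assumption: obsolete IDs count as contained, too.
        return term_id in self._by_any_id

    def __len__(self) -> int:
        # Only primary terms are counted.
        return len(self._primary)

    def get_term(self, term_id: str) -> Optional[ToyTerm]:
        return self._by_any_id.get(term_id)

    def get_term_name(self, term_id: str) -> Optional[str]:
        term = self.get_term(term_id)
        return None if term is None else term.name


toy = ToyOntology([
    ToyTerm("HP:0001250", "Seizure", ("HP:0002279", "HP:0002391")),
    ToyTerm("HP:0000118", "Phenotypic abnormality"),
])
print(len(toy), toy.get_term_name("HP:0002279"))
```

Keeping a separate lookup table for alternate IDs lets `len()` report primary terms only while obsolete IDs still resolve to their replacement.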
- class hpotk.Ontology[source]
Bases:
MinimalOntology[ID, TERM]
An ontology with all information available for terms.
The terms of Ontology are instances of hpotk.model.Term.
- class hpotk.OntologyStore(store_dir: str, ontology_release_service: OntologyReleaseService, remote_ontology_service: RemoteOntologyService)[source]
Bases:
object
OntologyStore stores versions of the supported ontologies.
- load_minimal_ontology(ontology_type: OntologyType, release: str | None = None, **kwargs) MinimalOntology [source]
Load a release of a given ontology_type as a minimal ontology.
- Parameters:
ontology_type – the desired ontology type, see OntologyType for a list of supported ontologies.
release – a str with the ontology release tag or None if the latest ontology should be fetched.
kwargs – key-value arguments passed to the low-level loader function (currently load_minimal_ontology()).
- Returns:
a minimal ontology.
- load_ontology(ontology_type: OntologyType, release: str | None = None, **kwargs) Ontology [source]
Load a release of a given ontology_type as an ontology.
- Parameters:
ontology_type – the desired ontology type, see OntologyType for a list of supported ontologies.
release – a str with the ontology release tag or None if the latest ontology should be fetched.
kwargs – key-value arguments passed to the low-level loader function (currently load_ontology()).
- Returns:
an ontology.
- Raises:
ValueError – if the release corresponds to a non-existing ontology release.
- property store_dir: str
Get a str with a platform specific absolute path to the data directory.
The data directory points to $HOME/.hpo-toolkit on UNIX and $HOME/hpo-toolkit on Windows. The folder is created if it does not exist.
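The platform rule above can be sketched as follows. `default_store_dir` is a hypothetical helper, not hpotk's code. Detecting Windows via `platform.system()` is an assumption, and the sketch leaves out creating the folder.

```python
import platform
from pathlib import Path
from typing import Optional


def default_store_dir(system: Optional[str] = None,
                      home: Optional[Path] = None) -> Path:
    """Return the assumed default data directory for the given platform."""
    system = system or platform.system()
    home = home or Path.home()
    # Hidden dot-folder on UNIX-like systems, plain folder on Windows.
    folder = "hpo-toolkit" if system == "Windows" else ".hpo-toolkit"
    return home / folder


unix_dir = default_store_dir("Linux", Path("/home/user"))
windows_dir = default_store_dir("Windows", Path("/home/user"))
print(unix_dir.name, windows_dir.name)
```

Passing `system` and `home` explicitly keeps the helper testable; calling it without arguments uses the current platform and home directory.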
- load_minimal_hpo(release: str | None = None) MinimalOntology [source]
A convenience method for loading a specific HPO release.
- Parameters:
release – an optional str with the desired HPO release (if None, the latest HPO will be provided).
- Returns:
a hpotk.MinimalOntology with the HPO data.
- Raises:
ValueError – if the release corresponds to a non-existing HPO release.
- load_hpo(release: str | None = None) Ontology [source]
A convenience method for loading a specific HPO release.
- Parameters:
release – an optional str with the desired HPO release (if None, the latest HPO will be provided).
- Returns:
a hpotk.Ontology with the HPO data.
- Raises:
ValueError – if the release corresponds to a non-existing HPO release.
- clear(ontology_type: OntologyType | None = None)[source]
Clear all ontology resources or resources of selected ontology_type.
- Parameters:
ontology_type – the ontology to be cleared or None if resources of all ontologies should be cleared.
- resolve_store_path(ontology_type: OntologyType, release: str | None = None) str [source]
Resolve the path of the ontology resource (e.g. HPO hp.json file) within the ontology store.
Note, the path points to the location of the ontology resource in the local filesystem. The path may point to a non-existing file, if the load function has not been run yet.
Example
>>> import hpotk
>>> store = hpotk.configure_ontology_store()
>>> store.resolve_store_path(hpotk.store.OntologyType.HPO, release='v2023-10-09')
'/home/user/.hpo-toolkit/HP/hp.v2023-10-09.json'
- Parameters:
ontology_type – the desired ontology type, see OntologyType for a list of supported ontologies.
release – an optional str with the desired ontology release (if None, the latest ontology will be provided).
- Returns:
a str with path to the ontology resource.
- class hpotk.OntologyType(value)[source]
Bases:
Enum
Enum with the ontologies supported by the OntologyStore.
- HPO = 'HPO'
Human Phenotype Ontology.
- MAxO = 'MAxO'
Medical Action Ontology.
- MONDO = 'MONDO'
Mondo Disease Ontology.
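Since OntologyType is a standard Enum, its members can be looked up by value and iterated. The sketch below uses a hypothetical mirror class, `ToyOntologyType`, with the three members listed above, so that it runs without hpotk installed.

```python
from enum import Enum


class ToyOntologyType(Enum):
    """Hypothetical mirror of OntologyType with the documented members."""
    HPO = "HPO"
    MAxO = "MAxO"
    MONDO = "MONDO"


# Look a member up by its value, e.g. when parsing a configuration string.
selected = ToyOntologyType("MAxO")
# Iterate over all supported ontologies in definition order.
names = [ontology_type.name for ontology_type in ToyOntologyType]
print(selected, names)
```

Looking up a value that is not a member, such as `ToyOntologyType("GO")`, raises ValueError.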
- hpotk.configure_ontology_store(store_dir: str | None = None, ontology_release_service: OntologyReleaseService = <hpotk.store._github.GitHubOntologyReleaseService object>, remote_ontology_service: RemoteOntologyService = <hpotk.store._github.GitHubRemoteOntologyService object>) OntologyStore [source]
Configure and create the default ontology store.
- Parameters:
store_dir – a str pointing to an existing directory for caching the ontology files or None if the platform-specific default folder should be used.
ontology_release_service – an OntologyReleaseService for fetching the ontology releases.
remote_ontology_service – a RemoteOntologyService responsible for fetching the ontology data from a remote location if we do not have the data locally.
- Returns:
an OntologyStore.
- Raises:
ValueError – if something goes wrong.
- hpotk.load_minimal_ontology(file: IO | str, term_factory: ObographsTermFactory[MinimalTerm] = <hpotk.ontology.load.obographs._factory.MinimalTermFactory object>, graph_factory: GraphFactory = <hpotk.graph._factory.CsrIndexedGraphFactory object>, prefixes_of_interest: Set[str] = {'HP'}) MinimalOntology [source]
- hpotk.load_ontology(file: IO | str, term_factory: ObographsTermFactory[Term] = <hpotk.ontology.load.obographs._factory.TermFactory object>, graph_factory: GraphFactory = <hpotk.graph._factory.CsrIndexedGraphFactory object>, prefixes_of_interest: Set[str] = {'HP'}) Ontology [source]