hpotk package

HPO toolkit is a library for working with the Human Phenotype Ontology and the HPO annotation data.

class hpotk.TermId[source]

Bases: object

TermId is an identifier of an ontology concept.

TermId consists of a prefix and id that are separated by a delimiter:

>>> term_id = TermId.from_curie('HP:0001250')
>>> assert term_id.prefix == 'HP'
>>> assert term_id.id == '0001250'

The TermId has a natural ordering which compares two IDs first based on prefix and then value. Both comparisons are lexicographic.
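The ordering rule can be illustrated with a small stand-in class. This is only a sketch, not hpotk's implementation: `SketchTermId` is a hypothetical name, and the code assumes nothing beyond what is stated above (prefix is compared first, then the id, both as plain strings).

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SketchTermId:
    """Hypothetical stand-in for TermId; not part of hpotk."""
    # Field order matters: the generated comparison uses (prefix, id) tuples.
    prefix: str
    id: str


ids = [
    SketchTermId('MONDO', '0005027'),
    SketchTermId('HP', '0001250'),
    SketchTermId('HP', '0000118'),
]
ordered = sorted(ids)

# Lexicographic comparison is not numeric: '10' sorts before '9'.
lexicographic = SketchTermId('HP', '10') < SketchTermId('HP', '9')
```

HPO identifiers are zero-padded to a fixed width, so for them the lexicographic and numeric orders agree.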

static from_curie(curie: str)[source]

Create a TermId from a compact URI (CURIE).

The prefix and id of a TermId must be separated either by a colon : or an underscore _.

>>> term_id = TermId.from_curie('HP:0001250')
>>> term_id.value
'HP:0001250'

The parsing will forget the original delimiter. The value always joins the prefix and id with :.

>>> ncit = TermId.from_curie('NCIT_C3117')
>>> ncit.value
'NCIT:C3117'

The : has higher priority than _, and it will be used as delimiter.

>>> snomed = TermId.from_curie('SNOMEDCT_US:128613002')
>>> snomed.prefix
'SNOMEDCT_US'
>>> snomed.id
'128613002'

Parameters:

curie – a CURIE str to be parsed.

Returns:

the created TermId.

Raises:

ValueError – if the curie is misformatted.
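The delimiter rules described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the library's code: `split_curie` and `normalized_value` are hypothetical helpers, and splitting at the first occurrence of the delimiter is an assumption that the documentation does not confirm.

```python
from typing import Tuple


def split_curie(curie: str) -> Tuple[str, str]:
    """Hypothetical helper: split a CURIE into (prefix, id)."""
    # Try the colon first; fall back to the underscore only if no colon exists.
    for delimiter in (':', '_'):
        # Assumption: the first occurrence of the delimiter is the split point.
        idx = curie.find(delimiter)
        if idx >= 0:
            return curie[:idx], curie[idx + 1:]
    raise ValueError(f'No `:` or `_` delimiter found in {curie!r}')


def normalized_value(curie: str) -> str:
    """Join prefix and id with a colon, forgetting the original delimiter."""
    prefix, id_ = split_curie(curie)
    return f'{prefix}:{id_}'
```

With this sketch, `split_curie('SNOMEDCT_US:128613002')` keeps the underscore inside the prefix because the colon is tried first.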

abstract property prefix: str

Get prefix of the ontology concept.

>>> term_id = TermId.from_curie('HP:1234567')
>>> term_id.prefix
'HP'

abstract property id: str

Get id of the ontology concept.

>>> term_id = TermId.from_curie('HP:1234567')
>>> term_id.id
'1234567'

property value: str

Get concept value consisting of self.prefix and self.id.

>>> term_id = TermId.from_curie('HP:1234567')
>>> term_id.value
'HP:1234567'

class hpotk.MinimalTerm[source]

Bases: Identified, Named

MinimalTerm is a data object with the minimal useful information about an ontology concept.

Each term has:

  • identifier - a TermId of the term (thanks to inheriting from Identified)

  • name - a human-friendly name of the term (thanks to inheriting from Named)

  • alt_term_ids - a sequence of alternate identifiers (IDs of obsolete terms that should be replaced by this term)

  • is_obsolete - the obsoletion status

Most of the time, you should get terms from an hpotk.ontology.MinimalOntology. However, a MinimalTerm can also be created from scratch using create_minimal_term() if you must do that for whatever reason.

static create_minimal_term(term_id: TermId | str, name: str, alt_term_ids: Iterable[TermId | str], is_obsolete: bool)[source]

Create MinimalTerm from the components.

>>> seizure = MinimalTerm.create_minimal_term(term_id='HP:0001250', name='Seizure',
...                                           alt_term_ids=('HP:0002279', 'HP:0002391'),
...                                           is_obsolete=False)
Parameters:
  • term_id – a TermId or a CURIE str (e.g. ‘HP:0001250’).

  • name – term name (e.g. Seizure).

  • alt_term_ids – an iterable with term IDs that represent the alternative IDs of the term.

  • is_obsolete – True if the MinimalTerm has been obsoleted, or False otherwise.

Returns:

the created term.

abstract property alt_term_ids: Sequence[TermId]

Get a sequence of identifiers of the ontology concepts that were obsoleted and should be replaced by the concept represented by this Term.

property is_current: bool

Return True if the term is current (not obsolete) and False otherwise.

abstract property is_obsolete: bool

Return True if the term is obsolete (not current) and False otherwise.

class hpotk.Term[source]

Bases: MinimalTerm

A comprehensive representation of an ontology concept.

Term has all attributes of the MinimalTerm plus the following:

  • definition - an optional definition of the term, including a comprehensive description and cross-references

  • comment - an optional comment

  • synonyms - an optional sequence of term synonyms

  • cross-references - an optional sequence of cross-references

Most of the time, you should get terms from an hpotk.ontology.Ontology. However, if you absolutely must craft a term or two by hand, use the create_term() function.

static create_term(identifier: TermId | str, name: str, alt_term_ids: Iterable[TermId | str], is_obsolete: bool, definition: Definition | str | None, comment: str | None, synonyms: Iterable[Synonym] | None, xrefs: Iterable[TermId] | None)[source]

Create a Term from the components.

Parameters:
  • identifier – a TermId or a CURIE (e.g. ‘HP:0001250’).

  • name – term name (e.g. Seizure).

  • alt_term_ids – an iterable with term IDs that represent the alternative IDs of the term.

  • is_obsolete – True if the Term has been obsoleted, or False otherwise.

  • definition – an optional str with a definition of the term or a Definition with the full info.

  • comment – an optional comment of the term.

  • synonyms – an optional iterable with all synonyms of the term.

  • xrefs – an optional iterable with all the cross-references.

Returns:

the created term.

abstract property definition: Definition | None

Get the definition of the ontology concept.

abstract property comment: str | None

Get the comment string of the ontology concept.

abstract property synonyms: Sequence[Synonym] | None

Get a sequence of all synonyms (including obsolete) of the ontology concept or None if the concept has no synonyms.

current_synonyms() Iterable[Synonym][source]

Get an iterable with current synonyms of the ontology concept.

The iterable is empty if the concept has no current synonyms.

obsolete_synonyms() Iterable[Synonym][source]

Get an iterable with obsolete synonyms of the ontology concept.

The iterable is empty if the concept has no obsolete synonyms.

abstract property xrefs: Sequence[TermId] | None

Get a sequence of the cross-references of the ontology concept.

class hpotk.OntologyGraph[source]

Bases: Generic[NODE]

A simple graph with one node type and one edge type.

The graph is generic over a node type which must extend TermId. The graph must not be empty, it must consist of at least one node.

Note

OntologyGraph provides iterators for traversals instead of sets, lists, etc. See Iterators vs. collections to learn why.

abstract property root: NODE

Get the root node of the ontology graph.

abstract get_children(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE][source]

Get an iterator with the children of the source node.

Parameters:
  • source – a TermId, an item that has a TermId (Identified), or a curie str representing the source node.

  • include_source – True if the source should be included among the children, False otherwise.

Raises:

ValueError – if source is not present in the graph.

abstract get_descendants(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE][source]

Get an iterator with the descendants of the source node.

Parameters:
  • source – a TermId, an item that has a TermId (Identified), or a curie str representing the source node.

  • include_source – True if the source should be included among the descendants, False otherwise.

Raises:

ValueError – if source is not present in the graph.

abstract get_parents(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE][source]

Get an iterator with the parents of the source node.

Parameters:
  • source – a TermId, an item that has a TermId (Identified), or a curie str representing the source node.

  • include_source – True if the source should be included among the parents, False otherwise.

Raises:

ValueError – if source is not present in the graph.

abstract get_ancestors(source: str | NODE | Identified, include_source: bool = False) Iterator[NODE][source]

Get an iterator with the ancestors of the source node.

Parameters:
  • source – a TermId, an item that has a TermId (Identified), or a curie str representing the source node.

  • include_source – True if the source should be included among the ancestors, False otherwise.

Raises:

ValueError – if source is not present in the graph.
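The traversal methods above return iterators rather than collections. The following is a minimal sketch of that idea on a toy hierarchy; the node labels, the `CHILDREN` mapping, and the breadth-first visiting order are assumptions made for illustration and say nothing about how hpotk stores or walks its graph.

```python
from collections import deque
from typing import Dict, Iterator, List

# Hypothetical toy hierarchy (parent -> children); labels are placeholders.
CHILDREN: Dict[str, List[str]] = {
    'root': ['a', 'b'],
    'a': ['c'],
    'b': ['c'],
    'c': [],
}


def _check(source: str) -> None:
    if source not in CHILDREN:
        raise ValueError(f'{source} is not present in the graph')


def _walk(source: str, include_source: bool) -> Iterator[str]:
    if include_source:
        yield source
    seen = set()
    queue = deque(CHILDREN[source])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # a node with several parents is reported once
        seen.add(node)
        yield node
        queue.extend(CHILDREN[node])


def get_children(source: str, include_source: bool = False) -> Iterator[str]:
    _check(source)  # validate eagerly, before any iteration happens
    nodes = ([source] if include_source else []) + CHILDREN[source]
    return iter(nodes)


def get_descendants(source: str, include_source: bool = False) -> Iterator[str]:
    _check(source)
    return _walk(source, include_source)
```

A caller that needs a collection can materialize the iterator explicitly, e.g. `set(get_descendants('root'))`.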

is_leaf(node: str | NODE | Identified) bool[source]

Test if the node is a leaf - a node with no children.

Returns:

True if the node is a leaf node or False otherwise.

Raises:

ValueError – if node is not present in the graph.

is_parent_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool[source]

Return True if the subject sub is a parent of the object obj.

Parameters:
  • sub – a graph node.

  • obj – other graph node.

Returns:

True if the sub is a parent of the obj.

Raises:

ValueError – if obj is not present in the graph.

is_ancestor_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool[source]

Return True if the subject sub is an ancestor of the object obj.

Parameters:
  • sub – a graph node.

  • obj – other graph node.

Returns:

True if the sub is an ancestor of the obj.

Raises:

ValueError – if obj is not present in the graph.

is_child_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool[source]

Return True if the sub is a child of the obj.

Parameters:
  • sub – a graph node.

  • obj – other graph node.

Returns:

True if the sub is a child of the obj.

Raises:

ValueError – if obj is not present in the graph.

is_descendant_of(sub: str | NODE | Identified, obj: str | NODE | Identified) bool[source]

Return True if the sub is a descendant of the obj.

Parameters:
  • sub – a graph node.

  • obj – other graph node.

Returns:

True if the sub is a descendant of the obj.

Raises:

ValueError – if obj is not present in the graph.
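The relationship tests above can be expressed in terms of the traversals. The sketch below shows one way to do that on a toy hierarchy; the `PARENTS` mapping and the function bodies are assumptions for illustration only, not hpotk's implementation.

```python
from typing import Dict, Iterator, List

# Hypothetical toy hierarchy (child -> parents); labels are placeholders.
PARENTS: Dict[str, List[str]] = {
    'root': [],
    'a': ['root'],
    'b': ['root'],
    'c': ['a', 'b'],
}


def get_ancestors(source: str) -> Iterator[str]:
    if source not in PARENTS:
        raise ValueError(f'{source} is not present in the graph')
    seen = set()
    stack = list(PARENTS[source])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            yield node
            stack.extend(PARENTS[node])


def is_parent_of(sub: str, obj: str) -> bool:
    """True if `sub` is a direct parent of `obj`."""
    if obj not in PARENTS:
        raise ValueError(f'{obj} is not present in the graph')
    return sub in PARENTS[obj]


def is_ancestor_of(sub: str, obj: str) -> bool:
    """True if `sub` is reachable from `obj` by following parent links."""
    # any() stops advancing the iterator as soon as a match is found.
    return any(sub == ancestor for ancestor in get_ancestors(obj))
```

In this sketch a parent is also an ancestor, while a node is neither a parent nor an ancestor of itself. For nodes that are present in the graph, the child and descendant tests are the same relations read in the opposite direction.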

class hpotk.GraphAware[source]

Bases: Generic[NODE]

A mixin class for entities that have an OntologyGraph.

abstract property graph: OntologyGraph[NODE]

Get the ontology graph.

class hpotk.MinimalOntology[source]

Bases: Generic[ID, MINIMAL_TERM], GraphAware[ID], Versioned

MinimalOntology is a data structure for representing the ontology terms and the ontology hierarchy.

The typical way to load the ontology is by parsing an Obographs JSON file using hpotk.util.store.OntologyStore; see the Load ontology section for more info.

Here we will load a toy HPO shipped with the documentation:

>>> import os
>>> import hpotk
>>> fpath_hpo = os.path.join('docs', 'data', 'hp.toy.json')
>>> hpo = hpotk.load_minimal_ontology(fpath_hpo)

The ontology includes the following:

The ontology acts as a Python container of term IDs. We can check whether a term is in the ontology:

>>> seizure_curie = 'HP:0001250'
>>> seizure_curie in hpo
True

This works for term IDs too:

>>> seizure_id = hpotk.TermId.from_curie(seizure_curie)
>>> seizure_id in hpo
True

The ontology has length - the number of primary terms:

>>> len(hpo)
393

Note

The toy HPO has only 393 terms. The real-life HPO has many more terms.

The terms of MinimalOntology are instances of hpotk.model.MinimalTerm.

abstract property term_ids: Iterator[ID]

Get an iterator over term IDs of the primary AND obsolete ontology terms.

abstract property terms: Iterator[MINIMAL_TERM]

Get an iterator over current terms (not obsolete terms).

abstract get_term(term_id: str | TermId | Identified) MINIMAL_TERM | None[source]

Get the current term for a term_id.

>>> seizure = hpo.get_term('HP:0001250')
>>> seizure.name
'Seizure'

Parameters:

term_id – a CURIE str (e.g. ‘HP:1234567’), a hpotk.model.TermId or an hpotk.model.Identified entity that represents a current or an obsolete term.

Returns:

the current term or None if the ontology does not contain the term ID.

get_term_name(term_id: str | TermId | Identified) str | None[source]

Get the name of the term with a term_id.

>>> seizure_name = hpo.get_term_name('HP:0001250')
>>> seizure_name
'Seizure'

Parameters:

term_id – a CURIE str (e.g. ‘HP:1234567’), a hpotk.model.TermId or an hpotk.model.Identified entity that represents a current or an obsolete term.

Returns:

name of the term if the term is in ontology or None otherwise.
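Both lookups accept an ID of a current or an obsolete term and return data of the current term. One way to picture this is an index that maps every known ID, primary or alternate, to its primary term. The sketch below assumes exactly that; `SketchOntology` is a hypothetical class and its internals are not taken from hpotk. The Seizure IDs are reused from the create_minimal_term() example.

```python
from typing import Dict, Iterable, Optional, Tuple


class SketchOntology:
    """Hypothetical stand-in for an ontology; not part of hpotk."""

    def __init__(self, terms: Iterable[Tuple[str, str, Tuple[str, ...]]]):
        # Each item is (primary CURIE, name, alternate CURIEs).
        self._names: Dict[str, str] = {}
        self._primary_of: Dict[str, str] = {}
        for curie, name, alt_ids in terms:
            self._names[curie] = name
            self._primary_of[curie] = curie
            for alt in alt_ids:
                # An obsolete ID resolves to the term that replaced it.
                self._primary_of[alt] = curie

    def __len__(self) -> int:
        # Only primary terms are counted.
        return len(self._names)

    def get_term_name(self, curie: str) -> Optional[str]:
        primary = self._primary_of.get(curie)
        return None if primary is None else self._names[primary]


onto = SketchOntology([
    ('HP:0001250', 'Seizure', ('HP:0002279', 'HP:0002391')),
])
```

Here `onto.get_term_name('HP:0002279')` resolves the alternate ID to the name of the primary term, and an unknown ID yields None instead of raising.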

class hpotk.Ontology[source]

Bases: MinimalOntology[ID, TERM]

An ontology with all information available for terms.

The terms of Ontology are instances of hpotk.model.Term.

class hpotk.OntologyStore(store_dir: str, ontology_release_service: OntologyReleaseService, remote_ontology_service: RemoteOntologyService)[source]

Bases: object

OntologyStore stores versions of the supported ontologies.

load_minimal_ontology(ontology_type: OntologyType, release: str | None = None, **kwargs) MinimalOntology[source]

Load a release of a given ontology_type as a minimal ontology.

Parameters:
  • ontology_type – the desired ontology type, see OntologyType for a list of supported ontologies.

  • release – a str with the ontology release tag or None if the latest ontology should be fetched.

  • kwargs – key-value arguments passed to the low-level loader function (currently load_minimal_ontology()).

Returns:

a minimal ontology.

load_ontology(ontology_type: OntologyType, release: str | None = None, **kwargs) Ontology[source]

Load a release of a given ontology_type as an ontology.

Parameters:
  • ontology_type – the desired ontology type, see OntologyType for a list of supported ontologies.

  • release – a str with the ontology release tag or None if the latest ontology should be fetched.

  • kwargs – key-value arguments passed to the low-level loader function (currently load_ontology()).

Returns:

an ontology.

Raises:

ValueError – if the release corresponds to a non-existing ontology release.

property store_dir: str

Get a str with a platform-specific absolute path to the data directory.

The data directory points to $HOME/.hpo-toolkit on UNIX and $HOME/hpo-toolkit on Windows. The folder is created if it does not exist.
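The platform rule can be sketched as a small function. `default_store_dir` is a hypothetical helper written for illustration; treating every non-Windows system as UNIX is an assumption, and unlike the property described above, the sketch only computes the path and does not create the folder.

```python
import os
import platform
from typing import Optional


def default_store_dir(system: Optional[str] = None, home: Optional[str] = None) -> str:
    """Hypothetical helper: compute the default data directory path."""
    # Both arguments can be overridden, which keeps the function easy to test.
    system = platform.system() if system is None else system
    home = os.path.expanduser('~') if home is None else home
    # Hidden folder on UNIX-like systems, plain folder name on Windows.
    folder = 'hpo-toolkit' if system == 'Windows' else '.hpo-toolkit'
    return os.path.join(home, folder)
```

Calling `default_store_dir()` without arguments uses the current platform and the current user's home directory.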

load_minimal_hpo(release: str | None = None) MinimalOntology[source]

A convenience method for loading a specific HPO release.

Parameters:

release – an optional str with the desired HPO release (if None, the latest HPO will be provided).

Returns:

a hpotk.MinimalOntology with the HPO data.

Raises:

ValueError – if the release corresponds to a non-existing HPO release.

load_hpo(release: str | None = None) Ontology[source]

A convenience method for loading a specific HPO release.

Parameters:

release – an optional str with the desired HPO release (if None, the latest HPO will be provided).

Returns:

a hpotk.Ontology with the HPO data.

Raises:

ValueError – if the release corresponds to a non-existing HPO release.

clear(ontology_type: OntologyType | None = None)[source]

Clear all ontology resources or resources of selected ontology_type.

Parameters:

ontology_type – the ontology to be cleared or None if resources of all ontologies should be cleared.

resolve_store_path(ontology_type: OntologyType, release: str | None = None) str[source]

Resolve the path of the ontology resource (e.g. HPO hp.json file) within the ontology store.

Note that the path points to the location of the ontology resource in the local filesystem. The path may point to a non-existent file if the load function has not been run yet.

Example

>>> import hpotk
>>> store = hpotk.configure_ontology_store()
>>> store.resolve_store_path(hpotk.store.OntologyType.HPO, release='v2023-10-09')
'/home/user/.hpo-toolkit/HP/hp.v2023-10-09.json'

Parameters:
  • ontology_type – the desired ontology type, see OntologyType for a list of supported ontologies.

  • release – an optional str with the desired ontology release (if None, the latest ontology will be provided).

Returns:

a str with path to the ontology resource.
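The example path above suggests a layout of store directory, ontology identifier folder, and a file named after the lowercase identifier and the release tag. The sketch below reproduces only that one example; the layout is inferred from it, `sketch_store_path` is a hypothetical helper, and the handling of release=None (resolving the latest release) is left out.

```python
import posixpath


def sketch_store_path(store_dir: str, identifier: str, release: str) -> str:
    """Hypothetical helper: compose a path like the one in the example above."""
    # Layout inferred from a single example: <store_dir>/<ID>/<id>.<release>.json
    file_name = f'{identifier.lower()}.{release}.json'
    # posixpath keeps the forward slashes of the example on every platform.
    return posixpath.join(store_dir, identifier, file_name)


path = sketch_store_path('/home/user/.hpo-toolkit', 'HP', 'v2023-10-09')
```

Like resolve_store_path(), the sketch performs no filesystem access, so the returned path may not exist yet.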

class hpotk.OntologyType(value)[source]

Bases: Enum

Enum with the ontologies supported by the OntologyStore.

HPO = 'HPO'

Human Phenotype Ontology.

MAxO = 'MAxO'

Medical Action Ontology.

MONDO = 'MONDO'

Mondo Disease Ontology.

property identifier: str

Get a str with the ontology identifier (e.g. HP for HPO).

>>> from hpotk.store import OntologyType
>>> OntologyType.HPO.identifier
'HP'
>>> OntologyType.MAxO.identifier
'MAXO'

hpotk.configure_ontology_store(store_dir: str | None = None, ontology_release_service: OntologyReleaseService = <hpotk.store._github.GitHubOntologyReleaseService object>, remote_ontology_service: RemoteOntologyService = <hpotk.store._github.GitHubRemoteOntologyService object>) OntologyStore[source]

Configure and create the default ontology store.

Parameters:
  • store_dir – a str pointing to an existing directory for caching the ontology files or None if the platform-specific default folder should be used.

  • ontology_release_service – an OntologyReleaseService for fetching the ontology releases.

  • remote_ontology_service – a RemoteOntologyService responsible for fetching the ontology data from a remote location if we do not have the data locally.

Returns:

an OntologyStore.

Raises:

ValueError – if something goes wrong.

Subpackages