hpotk.annotations package

The hpotk.annotations module provides classes for working with the HPO annotation data that is distributed with HPO releases.

The module contains data classes that model the annotation data. The most notable classes are HpoDiseases, a container of diseases, and HpoDisease, a representation of the disease data.

The hpotk.annotations.load module contains code for loading the annotations from the HPO annotation format.

class hpotk.annotations.HpoDiseases[source]

Bases: AnnotatedItemContainer[HpoDiseaseAnnotation]

A container for HPO diseases that allows iteration over all diseases, knows about the number of diseases in the container, and supports retrieval of the disease by its identifier.
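The three capabilities named above (iteration, size, and lookup by identifier) can be illustrated with a minimal stand-in container. The sketch below does not use hpotk: `ToyDisease` and `ToyDiseases` are hypothetical names, identifiers are plain CURIE strings rather than TermId objects, and exposing lookup through a `get` method is an assumption for illustration, not the library's confirmed API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass(frozen=True)
class ToyDisease:
    """Hypothetical stand-in for a disease record (not an hpotk class)."""
    identifier: str  # CURIE, e.g. 'OMIM:256000'
    name: str


class ToyDiseases:
    """Hypothetical container: iterable, sized, and searchable by identifier."""

    def __init__(self, diseases: Iterable[ToyDisease]):
        # Index the diseases by identifier so that a lookup does not scan the corpus.
        self._by_id: Dict[str, ToyDisease] = {d.identifier: d for d in diseases}

    def __iter__(self) -> Iterator[ToyDisease]:
        return iter(self._by_id.values())

    def __len__(self) -> int:
        return len(self._by_id)

    def get(self, identifier: str) -> Optional[ToyDisease]:
        # Assumed lookup method; returns None for an unknown identifier.
        return self._by_id.get(identifier)


diseases = ToyDiseases([
    ToyDisease('OMIM:256000', 'LEIGH SYNDROME; LS'),
    ToyDisease('OMIM:154700', 'MARFAN SYNDROME; MFS'),
])

for disease in diseases:
    print(disease.identifier, disease.name)
```

Backing the container with a dictionary keeps lookup by identifier cheap while still preserving insertion order for iteration.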

class hpotk.annotations.HpoDisease[source]

Bases: AnnotatedItem[HpoDiseaseAnnotation], Identified, Named

HpoDisease represents a computational model of a rare disease.

The model includes attributes:

  • identifier - disease ID, e.g. OMIM:256000

  • name - human-readable disease name, e.g. LEIGH SYNDROME; LS

  • annotations - the phenotype annotations of the disease. See AnnotatedItem for more details on all annotation-related methods

  • modes_of_inheritance - a collection of the modes of inheritance associated with the disease

  • onsets - a collection of term IDs representing the onsets of the disease. The terms are descendants of the Onset term.

abstract property modes_of_inheritance: Collection[TermId]

Returns:

a collection of modes of inheritance associated with the disease.

abstract property onsets: Collection[TermId]

Returns:

a collection of onsets known for the disease.

class hpotk.annotations.HpoDiseaseAnnotation[source]

Bases: Identified, FrequencyAwareFeature

HpoDiseaseAnnotation models data of a single disease annotation.

The annotation has the following attributes:

  • identifier - annotation ID, e.g. HP:0001250

  • frequency-related attributes of the annotation, such as frequency (see FrequencyAwareFeature for more info)

  • onsets - onsets of the feature (descendants of HPO’s Onset)

  • references - a sequence of cross-references that support presence/absence of the annotation

  • modifiers - a sequence of clinical modifiers of the annotation, such as age of onset, severity, laterality, …

abstract property references: Sequence[AnnotationReference]

Returns:

a list of annotation references that support presence/absence of the disease annotation.

abstract property modifiers: Sequence[TermId]

Returns:

a list of disease annotation modifiers.

abstract property onsets: Collection[TermId]

Get the known onsets of the phenotypic feature.

The onsets are descendants of Onset.

Returns:

the collection of the onset term IDs.

abstract onset_counts(onset: str | ID) → Tuple[int, int] | None[source]

Get the count “n over m” of individuals annotated with the phenotypic feature with the onset of interest.

The onset should be a descendant of HPO’s Onset.

If the onset has descendants, the count will include the count of both direct (the term’s) and indirect (the descendants’) annotations. For instance, the count of individuals annotated with the feature with Antenatal onset will include the individuals with Fetal onset, because Fetal onset is a descendant of Antenatal onset.

Parameters:

onset – a str with the CURIE of the onset term, or a TermId of the onset term.

Returns:

a tuple with n individuals annotated with the onset out of m investigated individuals, or None if the information is not available.
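The aggregation over descendants described above can be sketched with a toy hierarchy. This is an illustration under stated assumptions, not the library's implementation: `CHILDREN`, `DIRECT`, `INVESTIGATED`, and `toy_onset_counts` are hypothetical names, onset labels stand in for term IDs, and the denominator m is assumed to be one fixed number of investigated individuals.

```python
from typing import Dict, List, Optional, Tuple

# Toy fragment of the Onset hierarchy: parent label -> child labels.
# Real code would key on term IDs; labels are used here for readability.
CHILDREN: Dict[str, List[str]] = {
    'Antenatal onset': ['Embryonal onset', 'Fetal onset'],
    'Embryonal onset': [],
    'Fetal onset': [],
}

# Hypothetical direct counts: individuals annotated with exactly this onset.
DIRECT: Dict[str, int] = {'Antenatal onset': 1, 'Fetal onset': 3}

# Hypothetical number of investigated individuals (the "m" of "n over m").
INVESTIGATED = 10


def toy_onset_counts(onset: str) -> Optional[Tuple[int, int]]:
    """Sum the direct count of `onset` and the counts of all its descendants."""
    if onset not in CHILDREN:
        return None  # unknown onset term
    n = 0
    found = False
    stack = [onset]
    while stack:
        current = stack.pop()
        if current in DIRECT:
            n += DIRECT[current]
            found = True
        stack.extend(CHILDREN.get(current, []))
    # None signals that no count is available for the term or its descendants.
    return (n, INVESTIGATED) if found else None


print(toy_onset_counts('Antenatal onset'))  # direct (1) plus Fetal onset (3)
```

In this toy data, querying Antenatal onset yields a larger n than querying Fetal onset, because the ancestor's count absorbs the descendant's.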

class hpotk.annotations.AnnotatedItem[source]

Bases: Generic[ANNOTATION], Identified

An item that can be annotated with ontology terms. For instance, a disease can be annotated with phenotypic features, which are terms of the HPO ontology.

abstract property annotations: Collection[ANNOTATION]

Returns:

a collection of ANNOTATION objects for the annotated item.

present_annotations() → Iterable[ANNOTATION][source]
Returns:

an iterable over present annotations.

absent_annotations() → Iterable[ANNOTATION][source]
Returns:

an iterable over absent annotations.
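Splitting annotations into present and absent ones can be sketched as a pair of filters. The example is a stand-in that does not use hpotk: `ToyAnnotation`, `toy_present`, and `toy_absent` are hypothetical names, and treating an annotation as absent when it was observed in 0 of the investigated individuals is an assumption made for this illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class ToyAnnotation:
    """Hypothetical annotation with an 'n over m' frequency."""
    identifier: str   # CURIE of the phenotypic feature
    numerator: int    # individuals in whom the feature was observed
    denominator: int  # individuals who were investigated

    @property
    def is_present(self) -> bool:
        # Assumption: a feature seen in at least one individual counts as present.
        return self.numerator > 0


def toy_present(annotations: Iterable[ToyAnnotation]) -> Iterator[ToyAnnotation]:
    return (a for a in annotations if a.is_present)


def toy_absent(annotations: Iterable[ToyAnnotation]) -> Iterator[ToyAnnotation]:
    return (a for a in annotations if not a.is_present)


annotations = [
    ToyAnnotation('HP:0001250', numerator=3, denominator=10),
    ToyAnnotation('HP:0001166', numerator=0, denominator=10),
]

print([a.identifier for a in toy_present(annotations)])
print([a.identifier for a in toy_absent(annotations)])
```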

annotation_by_id(query: str | TermId | Identified) ANNOTATION | None[source]

Find the annotation identified by the query.

Performs a linear search and finds the first match.

Parameters:

query – a str with a CURIE, a TermId, or an Identified item (an item with an identifier).

Returns:

an annotation or None if no such annotation exists.
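The linear search that returns the first match can be sketched as follows. This is a simplified stand-in, not the library's code: `ToyIdentified` and `toy_annotation_by_id` are hypothetical names, and identifiers are plain CURIE strings, so the TermId query type is not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union


@dataclass(frozen=True)
class ToyIdentified:
    """Hypothetical item that carries an identifier."""
    identifier: str


def toy_annotation_by_id(
    annotations: Sequence[ToyIdentified],
    query: Union[str, ToyIdentified],
) -> Optional[ToyIdentified]:
    # Normalize the query to a plain identifier string.
    query_id = query if isinstance(query, str) else query.identifier
    # Scan in order and return the first annotation with a matching identifier.
    for annotation in annotations:
        if annotation.identifier == query_id:
            return annotation
    return None


annotations = [ToyIdentified('HP:0001250'), ToyIdentified('HP:0001166')]
print(toy_annotation_by_id(annotations, 'HP:0001166'))
print(toy_annotation_by_id(annotations, ToyIdentified('HP:0001250')))
```

Because each call scans the whole collection in the worst case, code that performs many lookups on the same item may prefer to build a dictionary keyed by identifier once and query that instead.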

class hpotk.annotations.AnnotatedItemContainer[source]

Bases: Generic[ANNOTATED_ITEM], Iterable[ANNOTATED_ITEM], Sized, Versioned

Container for items that can be annotated with ontology terms.

For instance, if OMIM disease is an item type and phenotypic feature is the annotation type, then a corpus of OMIM diseases corresponds to a container of annotated items.

property items: Collection[ANNOTATED_ITEM]

Returns:

a collection of the container items.

item_ids() → Iterable[TermId][source]
Returns:

an iterable over all item identifiers.

class hpotk.annotations.EvidenceCode(value)[source]

Bases: Enum

An enumeration with evidence codes.

IEA = 1

Inferred from electronic evidence.

TAS = 2

Traceable author statement.

PCS = 3

Published clinical study.

static parse(value: str)[source]

Parse evidence code from str value.

Parameters:

value – a str with the evidence code.

Returns:

the parsed enum member or None if value is not a valid EvidenceCode value.
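The parse-or-None contract can be sketched with a stand-in enum. `ToyEvidenceCode` is a hypothetical name that mirrors the members listed above. Trimming whitespace and ignoring case are assumptions made for this sketch; the source does not say whether the library's parse does either.

```python
from enum import Enum
from typing import Optional


class ToyEvidenceCode(Enum):
    """Hypothetical stand-in that mirrors the documented members."""
    IEA = 1  # Inferred from electronic evidence
    TAS = 2  # Traceable author statement
    PCS = 3  # Published clinical study

    @staticmethod
    def parse(value: str) -> Optional['ToyEvidenceCode']:
        # Look the member up by name; an unknown name raises KeyError,
        # which is translated into None instead of propagating.
        try:
            return ToyEvidenceCode[value.strip().upper()]
        except KeyError:
            return None


print(ToyEvidenceCode.parse('PCS'))
print(ToyEvidenceCode.parse('unknown'))
```

Returning None instead of raising means that callers must check the result before using it, for example when reading evidence codes from an annotation file that may contain unexpected values.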

class hpotk.annotations.Sex(value)[source]

Bases: Enum

An enum representing values of apparent biological sex.

Note

We do not attempt to model all the contemporary complexities of sex.

UNKNOWN = 1

MALE = 2

FEMALE = 3

static parse(value: str)[source]

Parse Sex from a str value.

Parameters:

value – a str with the sex code.

Returns:

the parsed enum member or None if value is not a valid Sex value.

class hpotk.annotations.AnnotationReference(identifier: TermId, evidence_code: EvidenceCode)[source]

Bases: Identified

property identifier: TermId

Get the identifier.

property evidence_code: EvidenceCode

Subpackages