hpotk.algorithm.similarity package
- hpotk.algorithm.similarity.calculate_ic_for_annotated_items(items: AnnotatedItemContainer, ontology: MinimalOntology, base: float | None = None, module_root: TermId | None = None, use_pseudocount: bool = False) AnnotationIcContainer [source]
Calculate information content (IC) for each TermId based on a collection of annotated items.
The calculation can be done for an ontology module: only the descendants of the provided module_root will be included in the analysis. If use_pseudocount is True, then the count of every ontology/module term is set to at least 1, even for terms that do not annotate any of the items.
- Parameters:
items – a collection of hpotk.annotations.AnnotatedItem objects
ontology – ontology with concepts used to annotate the items
base – logarithm base for the information content, or None for e (produces IC in nats)
module_root – the root of the ontology module to calculate the IC for.
use_pseudocount – assume that each ontology term annotates at least one of the items.
- Returns:
a container with mappings from TermId to information content in nats, bits, or other units, depending on the base value
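The calculation described above can be illustrated with a minimal pure-Python sketch. It is not hpotk's implementation: the function name `calculate_ic`, the plain `dict` inputs, and the term IDs are hypothetical, and it assumes that an annotation with a term also counts for all of that term's ancestors and that IC is the logarithm of (number of items / number of items annotated with the term). Restricting the analysis to a module root is omitted for brevity.

```python
import math
from collections import Counter


def calculate_ic(annotations, parents, base=None, use_pseudocount=False):
    """Sketch of IC calculation (an assumption, not hpotk's code).

    annotations: mapping of item id -> iterable of term ids
    parents: mapping of term id -> iterable of parent term ids
    """
    if not annotations:
        raise ValueError("At least one annotated item is required")

    def ancestors_and_self(term):
        # Walk up the hierarchy, collecting the term and all its ancestors.
        seen, stack = set(), [term]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(parents.get(current, ()))
        return seen

    # Count each term at most once per item, including implied ancestors.
    counts = Counter()
    for terms in annotations.values():
        implied = set()
        for term in terms:
            implied |= ancestors_and_self(term)
        counts.update(implied)

    if use_pseudocount:
        # Pretend that every known term annotates at least one item.
        for term in parents:
            counts[term] = max(counts[term], 1)

    n_items = len(annotations)
    log = math.log if base is None else (lambda x: math.log(x, base))
    return {term: log(n_items / count) for term, count in counts.items() if count > 0}


# Hypothetical toy hierarchy: ROOT <- A <- C, ROOT <- B, ROOT <- D.
parents = {
    "HP:ROOT": [],
    "HP:A": ["HP:ROOT"],
    "HP:B": ["HP:ROOT"],
    "HP:C": ["HP:A"],
    "HP:D": ["HP:ROOT"],
}
annotations = {"d1": ["HP:C"], "d2": ["HP:A"], "d3": ["HP:B"], "d4": ["HP:C"]}

ic_bits = calculate_ic(annotations, parents, base=2)
ic_pseudo = calculate_ic(annotations, parents, base=2, use_pseudocount=True)
```

In this toy data, `HP:D` annotates no item, so it receives an IC only when `use_pseudocount=True`; passing `base=None` would yield values in nats instead of bits.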
- class hpotk.algorithm.similarity.AnnotationIcContainer[source]
Bases: Mapping[TermId, float], MetadataAware
A container for storing information content of item annotations.
- class hpotk.algorithm.similarity.SimpleAnnotationIcContainer(data: Mapping[TermId, float], metadata: Mapping[str, str] | None = None)[source]
Bases: AnnotationIcContainer
An implementation of AnnotationIcContainer that is backed by a dict.
- property metadata: MutableMapping[str, str]
Get a mapping with entity metadata.
- class hpotk.algorithm.similarity.SimilarityContainer(metadata: Mapping[str, str] | None = None)[source]
Bases: MetadataAware, Sized
A container for pre-calculated semantic similarity results.
- get_similarity(a: str, b: str) float [source]
Get similarity of two entries a and b.
- Parameters:
a – an item, e.g. HP:1234567
b – another item, e.g. HP:9876543
- Returns:
a non-negative semantic similarity
- set_similarity(a: str, b: str, sim: float)[source]
Set semantic similarity for items a and b.
- Parameters:
a – an item, e.g. HP:1234567
b – another item, e.g. HP:9876543
sim – a non-negative semantic similarity
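To make the get/set contract concrete, here is a minimal sketch of a dict-backed container with the same two methods. The class name `DictSimilarityContainer` is hypothetical and the code is not taken from hpotk. It assumes that similarity is symmetric, so each pair is stored once under an ordered key, and that a pair without a stored value has similarity 0; neither assumption is stated in the reference above.

```python
class DictSimilarityContainer:
    """Hypothetical dict-backed store for pairwise similarities."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _key(a, b):
        # Order the pair so that (a, b) and (b, a) share one entry.
        return (a, b) if a <= b else (b, a)

    def set_similarity(self, a, b, sim):
        if sim < 0:
            raise ValueError("Similarity must be non-negative")
        self._data[self._key(a, b)] = sim

    def get_similarity(self, a, b):
        # Assumption: pairs that were never set have zero similarity.
        return self._data.get(self._key(a, b), 0.0)

    def __len__(self):
        # Number of stored pairs.
        return len(self._data)


container = DictSimilarityContainer()
container.set_similarity("HP:1234567", "HP:9876543", 1.5)
```

Storing each pair once halves the memory needed for a symmetric measure, at the cost of one comparison per lookup.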
- hpotk.algorithm.similarity.precalculate_ic_mica_for_hpo_concept_pairs(ic: AnnotationIcContainer, hpo: MinimalOntology) SimilarityContainer [source]
Precalculate Resnik semantic similarity for HPO TermId pairs.
- Parameters:
ic – a mapping for obtaining the information content of a TermId
hpo – HPO ontology
- Returns:
a mapping with Resnik similarity for TermId pairs where the similarity \(s>0\).
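As an illustration of what such a precalculation involves, the sketch below computes Resnik similarity for all term pairs in a toy hierarchy. Resnik similarity of two terms is the IC of their most informative common ancestor (MICA). The function names, the plain `dict` inputs, and the IC values are hypothetical and do not reflect hpotk's internals; the sketch assumes a term counts as its own ancestor and keeps only pairs with \(s>0\), as in the return description above.

```python
from itertools import combinations_with_replacement


def ancestors_and_self(term, parents):
    # Collect the term together with all of its ancestors.
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def ic_mica(a, b, ic, parents):
    # IC of the most informative common ancestor; 0.0 if none is shared.
    common = ancestors_and_self(a, parents) & ancestors_and_self(b, parents)
    return max((ic.get(term, 0.0) for term in common), default=0.0)


def precalculate_ic_mica(ic, parents):
    # Visit each unordered pair once, including a term paired with itself.
    similarities = {}
    for a, b in combinations_with_replacement(sorted(ic), 2):
        sim = ic_mica(a, b, ic, parents)
        if sim > 0:
            similarities[(a, b)] = sim
    return similarities


# Hypothetical toy hierarchy and IC values.
parents = {
    "HP:ROOT": [],
    "HP:A": ["HP:ROOT"],
    "HP:B": ["HP:ROOT"],
    "HP:C": ["HP:A"],
}
ic = {"HP:ROOT": 0.0, "HP:A": 0.4, "HP:B": 2.0, "HP:C": 1.0}

similarities = precalculate_ic_mica(ic, parents)
```

Pairs whose only common ancestor is the root, such as `HP:B` and `HP:C` here, have a MICA with IC 0 and are therefore left out of the result.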