hpotk.util.sort package

The hpotk.util.sort package provides strategies for sorting term IDs in a meaningful way. See the Sorting term IDs section for more information.

class hpotk.util.sort.TermIdSorting[source]

Bases: object

TermIdSorting computes indices for sorting a sequence of identifiers or identified items.

abstract argsort(term_ids: Sequence[TermId | Identified]) Sequence[int][source]

Prepare indices for sorting a sequence of term IDs.

Parameters:

term_ids – a sequence of term IDs or identified entities to sort.

Returns:

a sequence of indices that, when used to index term_ids, yields the sorted order.
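The argsort contract can be illustrated with a minimal, self-contained sketch. The LexicographicSorting class and the apply_argsort helper below are hypothetical and not part of hpotk; plain CURIE strings stand in for TermId or Identified items, and the lexicographic ordering is only a placeholder for a real sorting strategy.

```python
from typing import List, Sequence


class LexicographicSorting:
    """Hypothetical stand-in for a TermIdSorting implementation (not part of hpotk)."""

    def argsort(self, term_ids: Sequence[str]) -> Sequence[int]:
        # Return positions into `term_ids`, ordered by the CURIE string.
        return sorted(range(len(term_ids)), key=lambda i: term_ids[i])


def apply_argsort(items: Sequence[str], indices: Sequence[int]) -> List[str]:
    # Reorder `items` by picking the elements at the computed indices.
    return [items[i] for i in indices]


curies = ["HP:0001250", "HP:0000118", "HP:0001166"]
indices = LexicographicSorting().argsort(curies)
ordered = apply_argsort(curies, indices)
```

Returning indices rather than a sorted copy lets the caller reorder any parallel sequence (labels, annotations, table rows) with the same permutation.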

class hpotk.util.sort.HierarchicalEdgeTermIdSorting(hpo: OntologyGraph | GraphAware)[source]

Bases: HierarchicalSorting

HierarchicalEdgeTermIdSorting uses hierarchical clustering to sort the term IDs. The clustering uses edge distance as a proxy for term similarity.

Notes:

  • The input order is not guaranteed to be preserved, even if the input is already sorted.

Parameters:

hpo – HPO ontology graph or a graph aware instance.
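The idea of clustering by edge distance can be sketched on a toy graph. Everything below is an assumption made for illustration: the PARENTS mapping, the breadth-first edge_distance function, and the single-linkage merge rule are not taken from hpotk, whose distance definition and linkage method are not stated here.

```python
from collections import deque
from itertools import combinations
from typing import Dict, List, Sequence

# Toy ontology, child -> parents (hypothetical, not real HPO terms).
PARENTS: Dict[str, List[str]] = {
    "A": ["root"], "B": ["root"],
    "A1": ["A"], "A2": ["A"], "B1": ["B"],
}


def neighbors(node: str) -> List[str]:
    # Treat edges as undirected: parents plus children.
    children = [c for c, ps in PARENTS.items() if node in ps]
    return PARENTS.get(node, []) + children


def edge_distance(a: str, b: str) -> int:
    # Breadth-first search for the shortest path length between two terms.
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError(f"{a} and {b} are not connected")


def argsort_by_clustering(terms: Sequence[str]) -> List[int]:
    # Single-linkage agglomerative clustering; each cluster keeps its leaf order.
    clusters = [[i] for i in range(len(terms))]
    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: min(
                edge_distance(terms[i], terms[j])
                for i in clusters[p[0]] for j in clusters[p[1]]
            ),
        )
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters[0] if clusters else []


order = argsort_by_clustering(["A1", "B1", "A2"])
```

In this sketch the siblings A1 and A2 are merged first because they are two edges apart, so they end up adjacent in the result. The position of each cluster in the output depends on merge order, which is one reason a clustering-based sort need not leave an already sorted input unchanged.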

class hpotk.util.sort.HierarchicalIcTermIdSorting(hpo: OntologyGraph | GraphAware, ic_source: Callable[[TermId], float])[source]

Bases: HierarchicalSorting

HierarchicalIcTermIdSorting uses hierarchical clustering to sort the term IDs. The clustering uses Resnik similarity as the measure of term similarity.

Notes:

  • The input order is not guaranteed to be preserved, even if the input is already sorted.

Parameters:

  • hpo – HPO ontology graph or a graph aware instance.

  • ic_source – a callable for getting the information content (IC) as a float for a term ID.
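As a rough illustration of what an ic_source callable and a Resnik score look like, the sketch below uses a toy ontology. The ANCESTORS and FREQUENCY tables, the base-2 logarithm, and the resnik helper are assumptions for this example and are not hpotk code; plain strings stand in for TermId.

```python
import math
from typing import Callable, Dict, Set

# Toy ancestor sets, each term includes itself (hypothetical, not real HPO data).
ANCESTORS: Dict[str, Set[str]] = {
    "root": {"root"},
    "A": {"A", "root"},
    "B": {"B", "root"},
    "A1": {"A1", "A", "root"},
    "A2": {"A2", "A", "root"},
    "B1": {"B1", "B", "root"},
}

# Assumed fraction of annotated items that carry each term or a descendant.
FREQUENCY: Dict[str, float] = {
    "root": 1.0, "A": 0.5, "B": 0.25, "A1": 0.25, "A2": 0.125, "B1": 0.125,
}


def ic_source(term_id: str) -> float:
    # Information content: IC(t) = log2(1 / p(t)); rarer terms score higher.
    return math.log2(1.0 / FREQUENCY[term_id])


def resnik(a: str, b: str, ic: Callable[[str], float]) -> float:
    # Resnik similarity: IC of the most informative common ancestor.
    common = ANCESTORS[a] & ANCESTORS[b]
    return max(ic(t) for t in common)


sim_siblings = resnik("A1", "A2", ic_source)
sim_distant = resnik("A1", "B1", ic_source)
```

The siblings A1 and A2 share the ancestor A and therefore score higher than A1 and B1, which share only the root. Any callable with a compatible signature, such as a dictionary lookup wrapped in a function, could play the role of ic_source.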

argsort(term_ids: Sequence[TermId | Identified]) Sequence[int][source]

Prepare indices for sorting a sequence of term IDs.

Parameters:

term_ids – a sequence of term IDs or identified entities to sort.

Returns:

a sequence of indices that, when used to index term_ids, yields the sorted order.

class hpotk.util.sort.HierarchicalSimilaritySorting(hpo: OntologyGraph | GraphAware, ic_source: Callable[[TermId], float])[source]

Bases: HierarchicalIcTermIdSorting

HierarchicalSimilaritySorting uses hierarchical clustering to sort the term IDs. The clustering uses Resnik similarity as the measure of term similarity.

Notes:

  • The input order is not guaranteed to be preserved, even if the input is already sorted.

Parameters:

  • hpo – HPO ontology graph or a graph aware instance.

  • ic_source – a callable for getting the information content (IC) as a float for a term ID.