hydrate_subsets

hydrate_subsets(subset_configuration: Mapping[str, list[NormalizedNamableReference]], *, show_progress: bool = True) → Mapping[str, list[NormalizedNamableReference]]

Convert a subset configuration dictionary into a subset artifact.

Parameters:
  • subset_configuration – A dictionary mapping each prefix to a list of parent terms

  • show_progress – Whether to show progress bars

Returns:

A dictionary mapping each prefix to its full term list, obtained by expanding the parent terms through the is-a hierarchy of the corresponding resource

Raises:

ValueError – If a prefix can’t be looked up with PyOBO
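To illustrate what expanding parent terms through an is-a hierarchy means, here is a minimal, self-contained sketch. It is not the implementation used by semra: the toy hierarchy, the `ex` prefix, and the helpers `expand_descendants` and `toy_hydrate` are all hypothetical, plain CURIE strings stand in for reference objects, and including the parent terms themselves in the output is an assumption.

```python
from collections import deque

# Hypothetical toy hierarchies, keyed by prefix. Each maps a parent CURIE
# to its direct is-a children. All identifiers are made up for illustration.
TOY_HIERARCHIES = {
    "ex": {
        "ex:cell": ["ex:blood-cell", "ex:neuron"],
        "ex:blood-cell": ["ex:erythrocyte", "ex:leukocyte"],
    },
}


def expand_descendants(parents, children):
    """Return the parents plus all of their transitive children, sorted."""
    seen = set()
    queue = deque(parents)
    while queue:
        term = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        # Terms without children are leaves and contribute nothing further.
        queue.extend(children.get(term, []))
    return sorted(seen)


def toy_hydrate(configuration):
    """Expand each prefix's parent terms using the toy hierarchies."""
    hydrated = {}
    for prefix, parents in configuration.items():
        if prefix not in TOY_HIERARCHIES:
            # Mirrors the documented behavior of raising on an unknown prefix.
            raise ValueError(f"no hierarchy available for prefix: {prefix}")
        hydrated[prefix] = expand_descendants(parents, TOY_HIERARCHIES[prefix])
    return hydrated


hydrated = toy_hydrate({"ex": ["ex:blood-cell"]})
```

Because the traversal follows child links from whatever parent it is given, a parent whose CURIE uses a different prefix than the vocabulary being expanded would work the same way in this sketch.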

To get all the cells from MeSH:

from semra import Reference
from semra.api import hydrate_subsets

configuration = {
    "mesh": [Reference.from_curie("mesh:D002477")],  # MeSH descriptor for "Cells"
    # ... (entries for other prefixes)
}
prefix_to_references = hydrate_subsets(configuration)

It’s also possible to use parents from outside the vocabulary, such as when searching for entity types in UMLS:

from semra import Reference
from semra.api import hydrate_subsets

configuration = {
    "umls": [
        # all children of https://uts.nlm.nih.gov/uts/umls/semantic-network/Pathologic%20Function
        Reference.from_curie("sty:T049"),  # cell or molecular dysfunction
        Reference.from_curie("sty:T047"),  # disease or syndrome
        Reference.from_curie("sty:T191"),  # neoplastic process
        Reference.from_curie("sty:T050"),  # experimental model of disease
        Reference.from_curie("sty:T048"),  # mental or behavioral dysfunction
    ],
    # ... (entries for other prefixes)
}
prefix_to_references = hydrate_subsets(configuration)
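
A hydrated dictionary is typically used to decide which terms belong to a subset. The following is only a sketch of that idea under stated assumptions, not the behavior of any semra function: the helpers `build_lookup` and `in_subset` are hypothetical, plain CURIE strings stand in for reference objects, and treating prefixes with no entry as unrestricted is an assumption.

```python
# Hypothetical hydrated result, shaped like a prefix-to-terms dictionary
# but using plain CURIE strings with made-up identifiers.
prefix_to_terms = {
    "ex": ["ex:blood-cell", "ex:erythrocyte", "ex:leukocyte"],
}


def build_lookup(prefix_to_terms):
    """Convert each term list to a frozenset for fast membership checks."""
    return {prefix: frozenset(terms) for prefix, terms in prefix_to_terms.items()}


def in_subset(curie, lookup):
    """Return True if the CURIE is allowed by the subset lookup.

    Assumption: a prefix with no entry in the lookup is unrestricted.
    """
    prefix, _, _ = curie.partition(":")
    allowed = lookup.get(prefix)
    return allowed is None or curie in allowed


lookup = build_lookup(prefix_to_terms)
candidates = ["ex:erythrocyte", "ex:neuron", "other:1"]
kept = [curie for curie in candidates if in_subset(curie, lookup)]
```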