infer_dbxref_mutations

infer_dbxref_mutations(mappings: Iterable[Mapping], pairs: dict[tuple[str, str], float] | Iterable[tuple[str, str]], confidence: float | None = None, progress: bool = False) → list[Mapping]

Upgrade database cross-references into exact matches for the given pairs.

Parameters:
  • mappings – A list of mappings

  • pairs – Either a dictionary mapping source/target prefix pairs to the confidence of upgrading their dbxrefs, or a plain collection of source/target prefix pairs. If a collection is given, every pair uses the value of confidence.

  • confidence – The confidence to use for every pair if pairs is given as a collection. Defaults to 0.7.

  • progress – Should a progress bar be shown? Defaults to false.

Returns:

A new list of mappings containing upgrades
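
As a rough illustration of how the two accepted forms of pairs relate to confidence, the sketch below normalizes either form into a dictionary of confidences. The helper name normalize_pairs is hypothetical and is not part of semra; the 0.7 fallback simply mirrors the documented default.

```python
from collections.abc import Iterable
from typing import Optional, Union


def normalize_pairs(
    pairs: Union[dict[tuple[str, str], float], Iterable[tuple[str, str]]],
    confidence: Optional[float] = None,
) -> dict[tuple[str, str], float]:
    """Return a dict from (source prefix, target prefix) to a confidence.

    Hypothetical helper for illustration only; not semra's implementation.
    """
    if isinstance(pairs, dict):
        # Per-pair confidences were given explicitly, so keep them as they are.
        return dict(pairs)
    if confidence is None:
        confidence = 0.7  # the documented default
    # A plain collection of pairs: every pair gets the same confidence.
    return {pair: confidence for pair in pairs}
```

Under these assumptions, passing a dictionary leaves the per-pair values untouched, while passing a list or set of pairs applies one shared confidence to all of them.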

In the following example, we use three different terms for cranioectodermal dysplasia from the Disease Ontology (DOID), Medical Subject Headings (MeSH), and Unified Medical Language System (UMLS). We use the prior knowledge that there’s a high confidence that dbxrefs from DOID to MeSH are actually exact matches. This lets us infer m3 from m1. We don’t make any assertions about DOID-UMLS or MeSH-UMLS mappings here, so the example mapping m2 comes along for the ride.

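>>> # Import locations below are assumptions; adjust them to your semra version
>>> from semra import DB_XREF, EXACT_MATCH, KNOWLEDGE_MAPPING, Mapping, ReasonedEvidence, Reference
>>> from semra.api import infer_dbxref_mutations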
>>> curies = "DOID:0050577", "mesh:C562966", "umls:C4551571"
>>> r1, r2, r3 = (Reference.from_curie(c) for c in curies)
>>> m1 = Mapping.from_triple((r1, DB_XREF, r2))
>>> m2 = Mapping.from_triple((r2, DB_XREF, r3))
>>> mappings = [m1, m2]
>>> pairs = {("DOID", "mesh"): 0.99}
>>> m3 = Mapping.from_triple(
...     (r1, EXACT_MATCH, r2),
...     evidence=[
...         ReasonedEvidence(
...             mappings=[m1], justification=KNOWLEDGE_MAPPING, confidence_factor=0.99
...         )
...     ],
... )  # this is what we are inferring
>>> assert infer_dbxref_mutations(mappings, pairs) == [m1, m3, m2]

This function is a thin wrapper around infer_mutations() where semra.DB_XREF is used as the “old” predicate and semra.EXACT_MATCH is used as the “new” predicate.
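
To make the wrapper relationship concrete, here is a minimal, library-free sketch. It assumes mappings can be reduced to plain subject/predicate/object triples of CURIE strings; Triple, upgrade_predicates, and upgrade_dbxrefs are hypothetical names, not semra's API. The sketch ignores evidence and confidence bookkeeping and only shows how fixing the “old” and “new” predicates specializes a general predicate-upgrading routine.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A simplified stand-in for a mapping (hypothetical, not semra.Mapping)."""

    subject: str  # a CURIE such as "DOID:0050577"
    predicate: str
    obj: str


def upgrade_predicates(triples, pairs, old, new):
    """Return each triple, followed by an upgraded copy when its prefix pair is listed."""
    result = []
    for triple in triples:
        result.append(triple)
        # The prefix is the part of the CURIE before the first colon.
        pair = (triple.subject.split(":")[0], triple.obj.split(":")[0])
        if triple.predicate == old and pair in pairs:
            result.append(triple._replace(predicate=new))
    return result


def upgrade_dbxrefs(triples, pairs):
    """Specialize the general routine: dbxref is "old", exact match is "new"."""
    return upgrade_predicates(
        triples, pairs, old="oboInOwl:hasDbXref", new="skos:exactMatch"
    )
```

With the two example mappings above expressed as triples, this sketch returns the DOID-MeSH dbxref, then its exact-match upgrade, then the untouched MeSH-UMLS dbxref, which mirrors the [m1, m3, m2] ordering in the doctest.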