Configuration

class Configuration(*, name: str, key: str, description: str | None = None, creators: list[bioregistry.reference.NormalizedNamableReference] = <factory>, inputs: list[semra.pipeline.Input], negative_inputs: list[semra.pipeline.Input] = <factory>, priority: list[str] = <factory>, mutations: list[semra.api.Mutation] = <factory>, subsets: collections.abc.Mapping[str, list[bioregistry.reference.NormalizedNamableReference]] | None = None, exclude_pairs: list[tuple[str, str]] = <factory>, remove_prefixes: list[str] | None = None, keep_prefixes: list[str] | None = None, post_remove_prefixes: list[str] | None = None, post_keep_prefixes: list[str] | None = None, remove_imprecise: bool = True, validate_raw: bool = False, directory: pathlib.Path, write_raw_neo4j: bool = False, neo4j_gzip: None | typing.Literal['during', 'after'] = 'during', add_labels: bool = False, zenodo_record: int | None = None)

Bases: BaseModel

Represents the steps taken during mapping assembly.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Attributes Summary

configuration_path

Get the path to the configuration file.

model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

priority_counts_path

Get the path to the priority counts summary TSV.

priority_graph_path

Get the path to the priority counts graph depiction.

priority_jsonl_path

Get the path to priority mappings as a gzipped JSON lines file.

priority_pickle_path

Get the path to priority mappings as a gzipped pickle file.

priority_sssom_path

Get the path to priority mappings as a gzipped SSSOM TSV file.

processed_counts_path

Get the path to the processed counts summary TSV.

processed_graph_path

Get the path to the processed counts graph depiction.

processed_jsonl_path

Get the path to processed mappings as a gzipped JSON lines file.

processed_landscape_histogram_path

Get the path to the processed landscape histogram plot.

processed_landscape_upset_path

Get the path to the processed landscape UpSet plot.

processed_neo4j_name

Get the name for the processed mappings Neo4j Docker image.

processed_neo4j_path

Get the path to the processed Neo4j directory.

processed_pickle_path

Get the path to processed mappings as a gzipped pickle file.

processed_sssom_path

Get the path to processed mappings as a gzipped SSSOM TSV file.

raw_counts_path

Get the path to the raw counts summary TSV.

raw_graph_path

Get the path to the raw counts graph depiction.

raw_jsonl_path

Get the path to raw mappings as a gzipped JSON lines file.

raw_neo4j_name

Get the name for the raw mappings Neo4j Docker image.

raw_neo4j_path

Get the path to the raw Neo4j directory.

raw_pickle_path

Get the path to raw mappings as a gzipped pickle file.

raw_sssom_path

Get the path to raw mappings as a gzipped SSSOM TSV file.

readme_path

Get the path to the summary README file.

source_summary_path

Get the path to the source summary TSV file.

stats_path

Get the path to the statistics summary JSON file.

Methods Summary

cli(*args[, write_summary, ...])

Get and run a command line interface for this configuration.

ensure_zenodo(key, *[, metadata, processed])

Ensure a Zenodo record.

from_prefixes(*, key, name, prefixes[, ...])

Get a configuration from ontology prefixes.

get_cli(*[, write_summary, ...])

Get a command line interface for this configuration.

get_hydrated_subsets(*[, show_progress])

Get the full subset filter lists based on the parent configuration.

get_mappings()

Run assembly based on this configuration; see assemble().

has_priority_path()

Check if the configuration has cached priority mappings.

has_processed_path()

Check if the configuration has cached processed mappings.

has_raw_path()

Check if the configuration has cached raw mappings.

infer_priority(values)

Infer the priority from the input list if not given.

read_priority_mappings(*[, show_progress])

Read priority mappings from pickle, if already cached.

read_processed_mappings(*[, show_progress])

Read processed mappings from pickle, if already cached.

read_raw_mappings(*[, show_progress])

Read raw mappings from pickle, if already cached.

upload_zenodo([processed])

Upload a Zenodo record.

zenodo_url()

Get the Zenodo URL, if available.

Attributes Documentation

configuration_path

Get the path to the configuration file.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

priority_counts_path

Get the path to the priority counts summary TSV.

priority_graph_path

Get the path to the priority counts graph depiction.

priority_jsonl_path

Get the path to priority mappings as a gzipped JSON lines file.

priority_pickle_path

Get the path to priority mappings as a gzipped pickle file.

priority_sssom_path

Get the path to priority mappings as a gzipped SSSOM TSV file.

processed_counts_path

Get the path to the processed counts summary TSV.

processed_graph_path

Get the path to the processed counts graph depiction.

processed_jsonl_path

Get the path to processed mappings as a gzipped JSON lines file.

processed_landscape_histogram_path

Get the path to the processed landscape histogram plot.

processed_landscape_upset_path

Get the path to the processed landscape UpSet plot.

processed_neo4j_name

Get the name for the processed mappings Neo4j Docker image.

processed_neo4j_path

Get the path to the processed Neo4j directory.

processed_pickle_path

Get the path to processed mappings as a gzipped pickle file.

processed_sssom_path

Get the path to processed mappings as a gzipped SSSOM TSV file.

raw_counts_path

Get the path to the raw counts summary TSV.

raw_graph_path

Get the path to the raw counts graph depiction.

raw_jsonl_path

Get the path to raw mappings as a gzipped JSON lines file.

raw_neo4j_name

Get the name for the raw mappings Neo4j Docker image.

raw_neo4j_path

Get the path to the raw Neo4j directory.

raw_pickle_path

Get the path to raw mappings as a gzipped pickle file.

raw_sssom_path

Get the path to raw mappings as a gzipped SSSOM TSV file.

readme_path

Get the path to the summary README file.

source_summary_path

Get the path to the source summary TSV file.

stats_path

Get the path to the statistics summary JSON file.

Methods Documentation

cli(*args: Any, write_summary: bool = True, copy_to_landscape: bool = False, hooks: list[Callable[[Configuration, MappingPack], None]] | None = None) -> None

Get and run a command line interface for this configuration.

ensure_zenodo(key: str, *, metadata: zenodo_client.Metadata | None = None, processed: bool = True, **kwargs: Any) -> requests.Response

Ensure a Zenodo record.

classmethod from_prefixes(*, key: str, name: str, prefixes: Iterable[str], include_biomappings: bool = True, include_gilda: bool = True, directory: Path) -> Self

Get a configuration from ontology prefixes.
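As a rough usage sketch based only on the signature above, a configuration could be built from a handful of prefixes as follows. The import path `semra.pipeline` is an assumption inferred from the type annotations in the class signature, and the helper name `build_configuration`, the key, the name, and the prefixes are hypothetical placeholders. The import is deferred into the function so the snippet can be loaded even where the library is not installed.

```python
from pathlib import Path


def build_configuration(directory: Path):
    """Build a Configuration from prefixes (hypothetical helper, not part of the library)."""
    # Assumption: Configuration is importable from semra.pipeline, as suggested
    # by the ``semra.pipeline.Input`` annotations in the class signature.
    from semra.pipeline import Configuration

    return Configuration.from_prefixes(
        key="example",  # placeholder key
        name="Example Mappings",  # placeholder human-readable name
        prefixes=["doid", "mesh"],  # placeholder ontology prefixes
        directory=directory,  # required keyword-only argument
    )
```

Calling `build_configuration(Path("output"))` would then return an instance whose path attributes and `get_mappings()` method are the ones documented on this page.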

get_cli(*, write_summary: bool = True, copy_to_landscape: bool = False, hooks: list[Callable[[Configuration, MappingPack], None]] | None = None) -> Command

Get a command line interface for this configuration.

get_hydrated_subsets(*, show_progress: bool = True) -> Mapping[str, list[NormalizedNamableReference]]

Get the full subset filter lists based on the parent configuration.

get_mappings(*, refresh_raw: bool = False, refresh_processed: bool = False, refresh_source: bool = False, return_type: Literal[AssembleReturnType.none] = AssembleReturnType.none) -> None
get_mappings(*, refresh_raw: bool = False, refresh_processed: bool = False, refresh_source: bool = False, return_type: Literal[AssembleReturnType.all] = AssembleReturnType.all) -> MappingPack
get_mappings(*, refresh_raw: bool = False, refresh_processed: bool = False, refresh_source: bool = False, return_type: Literal[AssembleReturnType.priority] = AssembleReturnType.priority) -> list[Mapping]

Run assembly based on this configuration; see assemble().
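The three signatures listed for get_mappings() read as typing overloads: the value passed as return_type selects what comes back. The following self-contained sketch illustrates that pattern only. `ReturnType`, the toy `get_mappings` function, and the string data are stand-ins invented for illustration; they are not the library's classes or its implementation.

```python
from __future__ import annotations

from enum import Enum
from typing import Literal, overload


class ReturnType(Enum):
    """Stand-in enum with the three member names shown in the signatures."""

    none = "none"
    all = "all"
    priority = "priority"


@overload
def get_mappings(*, return_type: Literal[ReturnType.none] = ...) -> None: ...
@overload
def get_mappings(*, return_type: Literal[ReturnType.all]) -> dict[str, list[str]]: ...
@overload
def get_mappings(*, return_type: Literal[ReturnType.priority]) -> list[str]: ...


def get_mappings(*, return_type: ReturnType = ReturnType.none):
    """Toy assembly: the return value depends on ``return_type``."""
    # Toy data standing in for the raw, processed, and priority mappings.
    pack = {
        "raw": ["a:1=b:1", "a:1=c:1", "a:2=b:2"],
        "processed": ["a:1=b:1", "a:2=b:2"],
        "priority": ["a:1=b:1"],
    }
    if return_type is ReturnType.none:
        return None  # run for side effects only (e.g. writing caches)
    if return_type is ReturnType.all:
        return pack  # everything, analogous to a MappingPack
    return pack["priority"]  # only the priority mappings


priority_only = get_mappings(return_type=ReturnType.priority)
```

With overloads declared this way, a type checker can infer that `priority_only` is a list, while a call with no arguments is known to return nothing.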

has_priority_path() -> bool

Check if the configuration has cached priority mappings.

has_processed_path() -> bool

Check if the configuration has cached processed mappings.

has_raw_path() -> bool

Check if the configuration has cached raw mappings.

classmethod infer_priority(values: dict[str, Any]) -> dict[str, Any]

Infer the priority from the input list if not given.
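The exact inference rule is not spelled out here. One plausible reading, sketched below, is that when no priority is supplied, the prefixes of the inputs are used in their order of first appearance. The function `infer_priority_sketch` and the representation of inputs as plain dictionaries with a "prefix" key are assumptions made for illustration, not the library's implementation.

```python
from typing import Any


def infer_priority_sketch(values: dict[str, Any]) -> dict[str, Any]:
    """Fill in ``priority`` from ``inputs`` when it is missing or empty."""
    if values.get("priority"):
        return values  # an explicit priority always wins

    inferred: list[str] = []
    for item in values.get("inputs", []):
        prefix = item.get("prefix")  # assumed shape: {"prefix": "..."}
        # Keep the first occurrence of each prefix, preserving input order.
        if prefix is not None and prefix not in inferred:
            inferred.append(prefix)

    # Return a new dictionary rather than mutating the caller's data.
    return {**values, "priority": inferred}


example = infer_priority_sketch(
    {"inputs": [{"prefix": "doid"}, {"prefix": "mesh"}, {"prefix": "doid"}]}
)
```

Under these assumptions, `example["priority"]` lists each prefix once, in the order the inputs were given.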

read_priority_mappings(*, show_progress: bool = False) -> list[Mapping]

Read priority mappings from pickle, if already cached.

read_processed_mappings(*, show_progress: bool = False) -> list[Mapping]

Read processed mappings from pickle, if already cached.

read_raw_mappings(*, show_progress: bool = False) -> list[Mapping]

Read raw mappings from pickle, if already cached.
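The has_*_path() checks and read_*_mappings() readers suggest a check-then-read caching pattern over gzipped pickle files. The sketch below shows that general pattern with the standard library only. The helper names, the file name `raw.pkl.gz`, and the tuple data are hypothetical; this is not the library's code.

```python
import gzip
import pickle
import tempfile
from pathlib import Path
from typing import Callable


def write_cache(path: Path, mappings: list) -> None:
    """Write mappings to a gzipped pickle file."""
    with gzip.open(path, "wb") as file:
        pickle.dump(mappings, file)


def read_cache(path: Path) -> list:
    """Read mappings back from a gzipped pickle file."""
    with gzip.open(path, "rb") as file:
        return pickle.load(file)


def load_or_build(path: Path, build: Callable[[], list]) -> list:
    """Return cached mappings if the file exists, else build and cache them."""
    if path.is_file():  # analogous to a has_*_path() check
        return read_cache(path)
    mappings = build()
    write_cache(path, mappings)
    return mappings


with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "raw.pkl.gz"  # hypothetical file name
    first = load_or_build(cache_path, lambda: [("doid:1", "mesh:2")])
    # The second builder is never called because the cache file now exists.
    second = load_or_build(cache_path, lambda: [])
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.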

upload_zenodo(processed: bool = True, **kwargs: Any) -> Response

Upload a Zenodo record.

zenodo_url() -> str | None

Get the Zenodo URL, if available.