Configuration

class Configuration(*, name: str, key: str, description: str | None = None, creators: list[~bioregistry.reference.NormalizedNamableReference] = <factory>, inputs: list[~semra.pipeline.Input], negative_inputs: list[~semra.pipeline.Input] = <factory>, priority: list[str] = <factory>, mutations: list[~semra.api.Mutation] = <factory>, subsets: ~collections.abc.Mapping[str, list[~bioregistry.reference.NormalizedNamableReference]] | None = None, exclude_pairs: list[tuple[str, str]] = <factory>, remove_prefixes: list[str] | None = None, keep_prefixes: list[str] | None = None, post_remove_prefixes: list[str] | None = None, post_keep_prefixes: list[str] | None = None, remove_imprecise: bool = True, validate_raw: bool = False, directory: ~pathlib.Path, write_raw_neo4j: bool = False, neo4j_gzip: None | ~typing.Literal['during', 'after'] = 'during', add_labels: bool = False, zenodo_record: int | None = None)

Bases: BaseModel

Represents the steps taken during mapping assembly.

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

Attributes Summary

`configuration_path`	Get the path to the configuration file.
`model_config`	Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.
`priority_counts_path`	Get the path to the priority counts summary TSV.
`priority_graph_path`	Get the path to the priority counts graph depiction.
`priority_jsonl_path`	Get the path to priority mappings as a gzipped JSON lines file.
`priority_pickle_path`	Get the path to priority mappings as a gzipped pickle file.
`priority_sssom_path`	Get the path to priority mappings as a gzipped SSSOM TSV file.
`processed_counts_path`	Get the path to the processed counts summary TSV.
`processed_graph_path`	Get the path to the processed counts graph depiction.
`processed_jsonl_path`	Get the path to processed mappings as a gzipped JSON lines file.
`processed_landscape_histogram_path`	Get the path to the processed landscape histogram plot.
`processed_landscape_upset_path`	Get the path to the processed landscape UpSet plot.
`processed_neo4j_name`	Get the name for the processed mappings Neo4j Docker image.
`processed_neo4j_path`	Get the path to the processed Neo4j directory.
`processed_pickle_path`	Get the path to processed mappings as a gzipped pickle file.
`processed_sssom_path`	Get the path to processed mappings as a gzipped SSSOM TSV file.
`raw_counts_path`	Get the path to the raw counts summary TSV.
`raw_graph_path`	Get the path to the raw counts graph depiction.
`raw_jsonl_path`	Get the path to raw mappings as a gzipped JSON lines file.
`raw_neo4j_name`	Get the name for the raw mappings Neo4j Docker image.
`raw_neo4j_path`	Get the path to the raw Neo4j directory.
`raw_pickle_path`	Get the path to raw mappings as a gzipped pickle file.
`raw_sssom_path`	Get the path to raw mappings as a gzipped SSSOM TSV file.
`readme_path`	Get the path to the summary README file.
`source_summary_path`	Get the path to the source summary TSV file.
`stats_path`	Get the path to the statistics summary JSON file.

Methods Summary

`cli`(*args[, write_summary, ...])	Get and run a command line interface for this configuration.
`ensure_zenodo`(key, *[, metadata, processed])	Ensure a Zenodo record.
`from_prefixes`(*, key, name, prefixes[, ...])	Get a configuration from ontology prefixes.
`get_cli`(*[, write_summary, ...])	Get a command line interface for this configuration.
`get_hydrated_subsets`(*[, show_progress])	Get the full subset filter lists based on the parent configuration.
`get_mappings`()	Run assembly based on this configuration, see `assemble()`.
`has_priority_path`()	Check if the configuration has cached priority mappings.
`has_processed_path`()	Check if the configuration has cached processed mappings.
`has_raw_path`()	Check if the configuration has cached raw mappings.
`infer_priority`(values)	Infer the priority from the input list if not given.
`read_priority_mappings`(*[, show_progress])	Read priority mappings from pickle, if already cached.
`read_processed_mappings`(*[, show_progress])	Read processed mappings from pickle, if already cached.
`read_raw_mappings`(*[, show_progress])	Read raw mappings from pickle, if already cached.
`upload_zenodo`([processed])	Upload a Zenodo record.
`zenodo_url`()	Get the Zenodo URL, if available.

Attributes Documentation

configuration_path: Get the path to the configuration file.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

priority_counts_path: Get the path to the priority counts summary TSV.

priority_graph_path: Get the path to the priority counts graph depiction.

priority_jsonl_path: Get the path to priority mappings as a gzipped JSON lines file.

priority_pickle_path: Get the path to priority mappings as a gzipped pickle file.

priority_sssom_path: Get the path to priority mappings as a gzipped SSSOM TSV file.

processed_counts_path: Get the path to the processed counts summary TSV.

processed_graph_path: Get the path to the processed counts graph depiction.

processed_jsonl_path: Get the path to processed mappings as a gzipped JSON lines file.

processed_landscape_histogram_path: Get the path to the processed landscape histogram plot.

processed_landscape_upset_path: Get the path to the processed landscape UpSet plot.

processed_neo4j_name: Get the name for the processed mappings Neo4j docker image.

processed_neo4j_path: Get the path to the processed neo4j directory.

processed_pickle_path: Get the path to processed mappings as a gzipped pickle file.

processed_sssom_path: Get the path to processed mappings as a gzipped SSSOM TSV file.

raw_counts_path: Get the path to the raw counts summary TSV.

raw_graph_path: Get the path to the raw counts graph depiction.

raw_jsonl_path: Get the path to raw mappings as a gzipped JSON lines file.

raw_neo4j_name: Get the name for the raw mappings Neo4j docker image.

raw_neo4j_path: Get the path to the raw neo4j directory.

raw_pickle_path: Get the path to raw mappings as a gzipped pickle file.

raw_sssom_path: Get the path to raw mappings as a gzipped SSSOM TSV file.

readme_path: Get the path to the summary README file.

source_summary_path: Get the path to the source summary TSV file.

stats_path: Get the path to the statistics summary JSON file.

Methods Documentation

cli(*args: Any, write_summary: bool = True, copy_to_landscape: bool = False, hooks: list[Callable[[Configuration, MappingPack], None]] | None = None) → None: Get and run a command line interface for this configuration.

ensure_zenodo(key: str, *, metadata: zenodo_client.Metadata | None = None, processed: bool = True, **kwargs: Any) → requests.Response: Ensure a Zenodo record.

classmethod from_prefixes(*, key: str, name: str, prefixes: Iterable[str], include_biomappings: bool = True, include_gilda: bool = True, directory: Path) → Self: Get a configuration from ontology prefixes.
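
A minimal usage sketch that relies only on the keyword arguments in the signature above. The key, name, prefixes, and output directory are illustrative values, and importing `Configuration` from `semra.pipeline` is an assumption based on the module shown for `Input` in the class signature. The import is deferred so that the argument-building helper can be inspected without the package installed.

```python
from pathlib import Path
from typing import Any


def example_from_prefixes_kwargs(directory: Path) -> dict[str, Any]:
    """Build keyword arguments matching the documented ``from_prefixes`` signature.

    Every value here is an illustrative assumption, not a library default.
    """
    return {
        "key": "demo",
        "name": "Demo Disease Mappings",
        "prefixes": ["doid", "mesh"],
        "include_biomappings": True,
        "include_gilda": False,
        "directory": directory,
    }


def build_configuration(directory: Path):
    """Create a configuration from prefixes; requires ``semra`` to be installed."""
    # Assumed import location; deferred so this module loads without semra.
    from semra.pipeline import Configuration

    return Configuration.from_prefixes(**example_from_prefixes_kwargs(directory))
```

All arguments are keyword-only, so passing them as a dictionary keeps the call explicit about which documented parameter each value fills.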

get_cli(*, write_summary: bool = True, copy_to_landscape: bool = False, hooks: list[Callable[[Configuration, MappingPack], None]] | None = None) → Command: Get a command line interface for this configuration.

get_hydrated_subsets(*, show_progress: bool = True) → Mapping[str, list[NormalizedNamableReference]]: Get the full subset filter lists based on the parent configuration.

get_mappings(*, refresh_raw: bool = False, refresh_processed: bool = False, refresh_source: bool = False, return_type: Literal[AssembleReturnType.none] = AssembleReturnType.none) → None
get_mappings(*, refresh_raw: bool = False, refresh_processed: bool = False, refresh_source: bool = False, return_type: Literal[AssembleReturnType.all] = AssembleReturnType.all) → MappingPack
get_mappings(*, refresh_raw: bool = False, refresh_processed: bool = False, refresh_source: bool = False, return_type: Literal[AssembleReturnType.priority] = AssembleReturnType.priority) → list[Mapping]: Run assembly based on this configuration, see assemble().
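
The three overloads differ only in `return_type`, which selects what the call returns. The pattern can be sketched with stand-in types: `ReturnType` is a hypothetical substitute for `AssembleReturnType`, and lists of strings plus a plain dict stand in for `Mapping` and `MappingPack`, whose real definitions are not shown on this page.

```python
from enum import Enum
from typing import Literal, overload


class ReturnType(Enum):
    """Hypothetical stand-in for ``AssembleReturnType``."""

    none = "none"
    all = "all"
    priority = "priority"


@overload
def get_mappings(*, return_type: Literal[ReturnType.none] = ...) -> None: ...
@overload
def get_mappings(*, return_type: Literal[ReturnType.all]) -> dict[str, list[str]]: ...
@overload
def get_mappings(*, return_type: Literal[ReturnType.priority]) -> list[str]: ...


def get_mappings(*, return_type: ReturnType = ReturnType.none):
    """Dispatch on ``return_type``; the mapping lists are placeholder data."""
    raw = ["a", "b", "c"]
    processed = ["a", "b"]
    priority = ["a"]
    if return_type is ReturnType.none:
        # Run for side effects only (for example, writing caches).
        return None
    if return_type is ReturnType.all:
        # Placeholder for a pack holding all three stages.
        return {"raw": raw, "processed": processed, "priority": priority}
    return priority
```

With literal-typed overloads, a static type checker can narrow the return type from the argument value, so callers do not need to cast the result.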

has_priority_path() → bool: Check if the configuration has cached priority mappings.

has_processed_path() → bool: Check if the configuration has cached processed mappings.

has_raw_path() → bool: Check if the configuration has cached raw mappings.

classmethod infer_priority(values: dict[str, Any]) → dict[str, Any]: Infer the priority from the input list if not given.
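
The exact inference rule is not stated here. One plausible reading, sketched below with plain dictionaries instead of the real `Input` model, is that a missing or empty `priority` is filled with the prefixes of the inputs in their listed order. The `prefix` key and the de-duplication step are assumptions made for illustration.

```python
from typing import Any


def infer_priority(values: dict[str, Any]) -> dict[str, Any]:
    """Fill in ``priority`` from the inputs when it is missing or empty.

    Assumption: each input is a dict with an optional ``prefix`` key.
    """
    if values.get("priority"):
        # An explicit priority list always wins.
        return values
    seen: set[str] = set()
    ordered: list[str] = []
    for inp in values.get("inputs", []):
        prefix = inp.get("prefix")
        # Skip inputs without a prefix and keep only the first occurrence.
        if prefix is not None and prefix not in seen:
            seen.add(prefix)
            ordered.append(prefix)
    # Return a new dict rather than mutating the caller's data.
    return {**values, "priority": ordered}
```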

read_priority_mappings(*, show_progress: bool = False) → list[Mapping]: Read priority mappings from pickle, if already cached.

read_processed_mappings(*, show_progress: bool = False) → list[Mapping]: Read processed mappings from pickle, if already cached.

read_raw_mappings(*, show_progress: bool = False) → list[Mapping]: Read raw mappings from pickle, if already cached.
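
The `has_*_path()` and `read_*_mappings()` methods together describe a check-then-read cache backed by gzipped pickle files. The general pattern can be sketched with the standard library alone. The helper names, the file name, and raising `FileNotFoundError` on a cache miss are all assumptions; the library's actual behavior when no cache exists is not documented here.

```python
import gzip
import pickle
import tempfile
from pathlib import Path
from typing import Any


def has_cache(path: Path) -> bool:
    """Check whether a cached file exists (hypothetical helper)."""
    return path.is_file()


def write_cache(path: Path, mappings: list[Any]) -> None:
    """Write mappings as a gzipped pickle (hypothetical helper)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wb") as file:
        pickle.dump(mappings, file, protocol=pickle.HIGHEST_PROTOCOL)


def read_cache(path: Path) -> list[Any]:
    """Read mappings from a gzipped pickle; assumed to fail if not cached."""
    if not has_cache(path):
        raise FileNotFoundError(f"no cached mappings at {path}")
    with gzip.open(path, "rb") as file:
        return pickle.load(file)


def demo() -> list[Any]:
    """Round-trip placeholder mappings through a temporary cache file."""
    with tempfile.TemporaryDirectory() as directory:
        path = Path(directory) / "raw.pkl.gz"  # illustrative file name
        if not has_cache(path):
            write_cache(path, [("doid:1", "mesh:2")])
        return read_cache(path)
```

Pickle files can execute arbitrary code when loaded, so a cache like this should only be read from a directory you control.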

upload_zenodo(processed: bool = True, **kwargs: Any) → Response: Upload a Zenodo record.

zenodo_url() → str | None: Get the Zenodo URL, if available.