SeMRA Raw Semantic Mappings Database

The SeMRA Raw Semantic Mappings Database contains unprocessed semantic mappings assembled from hundreds of ontologies and databases through pyobo.

Reproduction

The SeMRA Raw Semantic Mappings Database can be rebuilt with the following commands:

$ uv pip install semra
$ semra build

The semra build command downloads and processes all resources and constructs a database of unprocessed mappings.

Note

Downloading raw data resources can take on the order of hours to tens of hours depending on your internet connection and the reliability of the resources’ respective servers.

Processing and analysis can be run overnight on commodity hardware (e.g., a 2023 MacBook Pro with 36GB RAM).

Web Application

The SeMRA Raw Semantic Mappings Database can be downloaded from Zenodo. After downloading all files and unzipping them, a web application wrapping the SeMRA Raw Semantic Mappings Database can be run locally on Docker with:

$ sh run_on_docker.sh

Navigate to http://localhost:8773 to see the web application.

Domain-specific Processed Mapping Databases

The domain-specific processed mapping databases and meta-landscape analysis can be reconstructed with the following commands:

$ uv pip install "semra[landscape]"
$ semra landscape

The semra landscape command runs all pre-configured domain-specific mapping database construction, landscape analyses, and the meta-landscape analysis.

Note

Downloading raw data resources can take on the order of hours to tens of hours depending on your internet connection and the reliability of the resources’ respective servers.

Processing and analysis can be run overnight on commodity hardware (e.g., a 2023 MacBook Pro with 36GB RAM).

The results can be found here:

| Domain | Docs and Reproduction | Database Download | Analysis |
| --- | --- | --- | --- |
| Disease | semra.landscape.disease | disease | Disease Analysis |
| Cell and Cell Line | semra.landscape.cell | cell | Cell Analysis |
| Anatomy | semra.landscape.anatomy | anatomy | Anatomy Analysis |
| Protein Complex | semra.landscape.complex | complex | Complex Analysis |
| Gene | semra.landscape.gene | gene | Gene Analysis |
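
The module names in the table correspond one-to-one to python -m invocations, so the per-domain rebuilds can be scripted. The following is a minimal sketch and not part of SeMRA: DOMAINS and build_command are hypothetical names introduced here, and the commands are only printed because each build can take hours.

```python
import sys

# Domain keys taken from the table above; each has a semra.landscape.<key> module.
DOMAINS = ["disease", "cell", "anatomy", "complex", "gene"]


def build_command(domain: str) -> list[str]:
    """Return the argv that rebuilds one domain-specific database."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    return [sys.executable, "-m", f"semra.landscape.{domain}"]


if __name__ == "__main__":
    for domain in DOMAINS:
        # Printed rather than executed, since each build can take hours.
        print(" ".join(build_command(domain)))
```

To actually run a build, the returned list can be passed to subprocess.run with check=True.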

Disease

The SeMRA Disease Mappings Database assembles semantic mappings to the following resources:

| Prefix | Name |
| --- | --- |
| doid | Human Disease Ontology |
| mondo | Mondo Disease Ontology |
| efo | Experimental Factor Ontology |
| mesh | Medical Subject Headings |
| ncit | NCI Thesaurus |
| orphanet | Orphanet |
| orphanet.ordo | Orphanet Rare Disease Ontology |
| umls | Unified Medical Language System Concept Unique Identifier |
| omim | Online Mendelian Inheritance in Man |
| omim.ps | OMIM Phenotypic Series |
| gard | Genetic and Rare Diseases Information Center |
| icd10 | International Classification of Diseases, 10th Revision |
| icd10cm | International Classification of Diseases, 10th Revision, Clinical Modification |
| icd10pcs | International Classification of Diseases, 10th Revision, Procedure Coding System |
| icd11 | International Classification of Diseases, 11th Revision (Foundation Component) |
| icd11.code | ICD 11 Codes |
| icd9 | International Classification of Diseases, 9th Revision |
| icd9cm | International Classification of Diseases, 9th Revision, Clinical Modification |
| icdo | International Classification of Diseases for Oncology |
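
A prefix list like the one above can be used to check whether a mapping falls inside the disease domain. The sketch below is an illustration only: curie_prefix and in_disease_scope are hypothetical helpers, the identifiers in the usage note are illustrative, and whether SeMRA applies exactly this rule internally is not stated here.

```python
# Prefixes copied from the table above.
DISEASE_PREFIXES = {
    "doid", "mondo", "efo", "mesh", "ncit", "orphanet", "orphanet.ordo",
    "umls", "omim", "omim.ps", "gard", "icd10", "icd10cm", "icd10pcs",
    "icd11", "icd11.code", "icd9", "icd9cm", "icdo",
}


def curie_prefix(curie: str) -> str:
    """Return the lowercased prefix of a CURIE such as 'doid:162'."""
    prefix, sep, _ = curie.partition(":")
    if not sep:
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix.lower()


def in_disease_scope(subject: str, obj: str) -> bool:
    """True when both ends of a mapping use a prefix from the table."""
    return (
        curie_prefix(subject) in DISEASE_PREFIXES
        and curie_prefix(obj) in DISEASE_PREFIXES
    )
```

For example, in_disease_scope("doid:162", "mesh:D009369") is true under this rule, while a pair with an hgnc identifier on one side is not, because hgnc is not in the table.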

Results

The SeMRA Disease Mappings Database is available for download on Zenodo as SSSOM, as JSON, and in a format ready for loading into a Neo4j graph database.

A summary of the results can be viewed on the SeMRA GitHub repository in the landscape/disease folder.
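
SSSOM files in TSV form are tab-separated, optionally preceded by metadata lines that start with "#". The following sketch reads such a file with only the Python standard library. The inline SAMPLE document and its single row are made up for illustration, and read_sssom is a hypothetical helper, not a SeMRA function.

```python
import csv
import io

# A tiny, made-up SSSOM TSV document standing in for a downloaded file.
SAMPLE = (
    "#curie_map:\n"
    "#  doid: http://purl.obolibrary.org/obo/DOID_\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "doid:162\tskos:exactMatch\tmesh:D009369\tsemapv:ManualMappingCuration\n"
)


def read_sssom(handle):
    """Yield one dict per mapping row, skipping '#'-prefixed metadata lines."""
    data_lines = (line for line in handle if not line.startswith("#"))
    yield from csv.DictReader(data_lines, delimiter="\t")


rows = list(read_sssom(io.StringIO(SAMPLE)))
print(rows[0]["subject_id"], rows[0]["predicate_id"], rows[0]["object_id"])
```

With a real download, the io.StringIO object would be replaced by an open file handle. For large files it may be preferable to iterate over read_sssom directly instead of building a list.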

Reproduction

The SeMRA Disease Mappings Database can be rebuilt with the following commands:

$ git clone https://github.com/biopragmatics/semra.git
$ cd semra
$ uv pip install ".[landscape]"
$ python -m semra.landscape.disease

Note

Downloading raw data resources can take on the order of hours to tens of hours depending on your internet connection and the reliability of the resources’ respective servers.

Processing and analysis can be run overnight on commodity hardware (e.g., a 2023 MacBook Pro with 36GB RAM).

Web Application

The pre-built artifacts for this mapping database can be downloaded from Zenodo and unzipped. The web application can be run locally on Docker from inside the folder where the data was unzipped with:

$ sh run_on_docker.sh

If you reproduced the database yourself, you can cd to the right folder and run with:

$ cd ~/.data/semra/case-studies/disease
$ sh run_on_docker.sh

Finally, navigate in your web browser to http://localhost:8773 to see the web application.
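
The container may need a moment before it accepts connections. As a minimal sketch, assuming the application listens on the port named above, the hypothetical helper wait_for_port below polls a TCP port until it opens or a timeout elapses. It only checks that something is listening, not that the page renders.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a server is accepting connections.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.5)
```

Calling wait_for_port("localhost", 8773) after starting run_on_docker.sh returns True once the port is open, and False if nothing is listening by the time the timeout expires.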

Variables

DISEASE_CONFIGURATION

Configuration for the disease mappings database

Cell and Cell Line

The SeMRA Cell and Cell Line Mappings Database assembles semantic mappings to the following resources:

| Prefix | Name |
| --- | --- |
| mesh | Medical Subject Headings |
| efo | Experimental Factor Ontology |
| cellosaurus | Cellosaurus |
| ccle | Cancer Cell Line Encyclopedia Cells |
| depmap | DepMap Cell Lines |
| bto | BRENDA Tissue Ontology |
| cl | Cell Ontology |
| clo | Cell Line Ontology |
| ncit | NCI Thesaurus |
| umls | Unified Medical Language System Concept Unique Identifier |

Results

The SeMRA Cell and Cell Line Mappings Database is available for download on Zenodo as SSSOM, as JSON, and in a format ready for loading into a Neo4j graph database.

A summary of the results can be viewed on the SeMRA GitHub repository in the landscape/cell folder.

Reproduction

The SeMRA Cell and Cell Line Mappings Database can be rebuilt with the following commands:

$ git clone https://github.com/biopragmatics/semra.git
$ cd semra
$ uv pip install ".[landscape]"
$ python -m semra.landscape.cell

Note

Downloading raw data resources can take on the order of hours to tens of hours depending on your internet connection and the reliability of the resources’ respective servers.

Processing and analysis can be run overnight on commodity hardware (e.g., a 2023 MacBook Pro with 36GB RAM).

Web Application

The pre-built artifacts for this mapping database can be downloaded from Zenodo and unzipped. The web application can be run locally on Docker from inside the folder where the data was unzipped with:

$ sh run_on_docker.sh

If you reproduced the database yourself, you can cd to the right folder and run with:

$ cd ~/.data/semra/case-studies/cell
$ sh run_on_docker.sh

Finally, navigate in your web browser to http://localhost:8773 to see the web application.
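
The self-built output folders follow the pattern ~/.data/semra/case-studies/<domain> shown in the cd command above. The sketch below resolves that folder and checks for the launcher script. Both function names are hypothetical, and the path assumes the default location from the instructions was not changed.

```python
from pathlib import Path
from typing import Optional


def case_study_dir(domain: str, home: Optional[Path] = None) -> Path:
    """Return the folder the instructions above cd into for a given domain."""
    base = home if home is not None else Path.home()
    return base / ".data" / "semra" / "case-studies" / domain


def has_launcher(folder: Path) -> bool:
    """True when run_on_docker.sh is present in the folder."""
    return (folder / "run_on_docker.sh").is_file()
```

For instance, has_launcher(case_study_dir("cell")) gives a quick check that a build finished and left the script in place before Docker is started.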

Variables

CELL_CONFIGURATION

Configuration for the cell and cell line mappings database

Protein Complex

The SeMRA Protein Complex Mappings Database assembles semantic mappings to the following resources:

| Prefix | Name |
| --- | --- |
| complexportal | Complex Portal |
| fplx | FamPlex |
| go | Gene Ontology |
| chembl.target | ChEMBL target |
| wikidata | Wikidata |
| scomp | Selventa Complexes |
| signor | Signaling Network Open Resource |
| intact | IntAct protein interaction database |

Results

The SeMRA Protein Complex Mappings Database is available for download on Zenodo as SSSOM, as JSON, and in a format ready for loading into a Neo4j graph database.

A summary of the results can be viewed on the SeMRA GitHub repository in the landscape/complex folder.

Reproduction

The SeMRA Protein Complex Mappings Database can be rebuilt with the following commands:

$ git clone https://github.com/biopragmatics/semra.git
$ cd semra
$ uv pip install ".[landscape]"
$ python -m semra.landscape.complex

Note

Downloading raw data resources can take on the order of hours to tens of hours depending on your internet connection and the reliability of the resources’ respective servers.

Processing and analysis can be run overnight on commodity hardware (e.g., a 2023 MacBook Pro with 36GB RAM).

Web Application

The pre-built artifacts for this mapping database can be downloaded from Zenodo and unzipped. The web application can be run locally on Docker from inside the folder where the data was unzipped with:

$ sh run_on_docker.sh

If you reproduced the database yourself, you can cd to the right folder and run with:

$ cd ~/.data/semra/case-studies/complex
$ sh run_on_docker.sh

Finally, navigate in your web browser to http://localhost:8773 to see the web application.

Variables

COMPLEX_CONFIGURATION

Configuration for the protein complex mappings database

Gene

The SeMRA Gene Mappings Database assembles semantic mappings to the following resources:

| Prefix | Name |
| --- | --- |
| ncbigene | NCBI Gene |
| hgnc | HUGO Gene Nomenclature Committee |
| mgi | Mouse Genome Informatics |
| rgd | Rat Genome Database |
| cgnc | Chicken Gene Nomenclature Consortium |
| wormbase | WormBase |
| flybase | FlyBase Gene |
| sgd | Saccharomyces Genome Database |
| omim | Online Mendelian Inheritance in Man |
| civic.gid | CIViC gene |
| umls | Unified Medical Language System Concept Unique Identifier |
| ncit | NCI Thesaurus |
| wikidata | Wikidata |

Results

The SeMRA Gene Mappings Database is available for download on Zenodo as SSSOM, as JSON, and in a format ready for loading into a Neo4j graph database.

A summary of the results can be viewed on the SeMRA GitHub repository in the landscape/gene folder.
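
Independent of the published summary, one quick way to get a feel for a downloaded mapping set is to count mappings per pair of prefixes. This is a rough sketch and not SeMRA's own landscape analysis: prefix_pair_counts is a hypothetical helper, and the three mappings are illustrative placeholders that use prefixes from the table above.

```python
from collections import Counter

# Illustrative (subject, object) CURIE pairs using prefixes from the table above.
MAPPINGS = [
    ("hgnc:5", "ncbigene:1"),
    ("hgnc:7", "ncbigene:2"),
    ("mgi:87853", "ncbigene:11287"),
]


def prefix_pair_counts(mappings):
    """Count mappings per unordered pair of prefixes."""
    counts = Counter()
    for subject, obj in mappings:
        # Sort the two prefixes so (a, b) and (b, a) land in the same bucket.
        pair = tuple(sorted((subject.split(":", 1)[0], obj.split(":", 1)[0])))
        counts[pair] += 1
    return counts


counts = prefix_pair_counts(MAPPINGS)
for (left, right), n in sorted(counts.items()):
    print(f"{left} <-> {right}: {n}")
```

Treating the pairs as unordered is a design choice made here for brevity. A directed count would keep subject and object prefixes in their original order.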

Reproduction

The SeMRA Gene Mappings Database can be rebuilt with the following commands:

$ git clone https://github.com/biopragmatics/semra.git
$ cd semra
$ uv pip install ".[landscape]"
$ python -m semra.landscape.gene

Note

Downloading raw data resources can take on the order of hours to tens of hours depending on your internet connection and the reliability of the resources’ respective servers.

Processing and analysis can be run overnight on commodity hardware (e.g., a 2023 MacBook Pro with 36GB RAM).

Web Application

The pre-built artifacts for this mapping database can be downloaded from Zenodo and unzipped. The web application can be run locally on Docker from inside the folder where the data was unzipped with:

$ sh run_on_docker.sh

If you reproduced the database yourself, you can cd to the right folder and run with:

$ cd ~/.data/semra/case-studies/gene
$ sh run_on_docker.sh

Finally, navigate in your web browser to http://localhost:8773 to see the web application.

Variables

GENE_CONFIGURATION

Configuration for the gene mappings database

Anatomy

The SeMRA Anatomy Mappings Database assembles semantic mappings to the following resources:

| Prefix | Name |
| --- | --- |
| uberon | Uber Anatomy Ontology |
| mesh | Medical Subject Headings |
| bto | BRENDA Tissue Ontology |
| caro | Common Anatomy Reference Ontology |
| ncit | NCI Thesaurus |
| umls | Unified Medical Language System Concept Unique Identifier |

Results

The SeMRA Anatomy Mappings Database is available for download on Zenodo as SSSOM, as JSON, and in a format ready for loading into a Neo4j graph database.

A summary of the results can be viewed on the SeMRA GitHub repository in the landscape/anatomy folder.

Reproduction

The SeMRA Anatomy Mappings Database can be rebuilt with the following commands:

$ git clone https://github.com/biopragmatics/semra.git
$ cd semra
$ uv pip install ".[landscape]"
$ python -m semra.landscape.anatomy

Note

Downloading raw data resources can take on the order of hours to tens of hours depending on your internet connection and the reliability of the resources’ respective servers.

Processing and analysis can be run overnight on commodity hardware (e.g., a 2023 MacBook Pro with 36GB RAM).

Web Application

The pre-built artifacts for this mapping database can be downloaded from Zenodo and unzipped. The web application can be run locally on Docker from inside the folder where the data was unzipped with:

$ sh run_on_docker.sh

If you reproduced the database yourself, you can cd to the right folder and run with:

$ cd ~/.data/semra/case-studies/anatomy
$ sh run_on_docker.sh

Finally, navigate in your web browser to http://localhost:8773 to see the web application.

Variables

ANATOMY_CONFIGURATION

Configuration for the anatomy mappings database