Querying with Cypher
SeMRA constructs data artifacts and Docker configuration for locally deploying a Neo4j
graph database and a web application via semra.io.write_neo4j() (for example
outputs, see semra.database or semra.landscape). The resulting graph
database can be queried directly with the Cypher query language in one of the following
ways:
By connecting with a client via the bolt protocol on port 7687, which is exposed in the Dockerfile.
By navigating to http://localhost:7474 in the web browser to use Neo4j’s built-in graphical front-end, where you can type in Cypher queries and interact with the results.
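As a rough sketch of the first option, the snippet below connects over bolt with the official neo4j Python driver (assumed to be installed separately) and runs a query. The helper names bolt_uri and run_cypher are hypothetical, and passing auth=None assumes the local deployment has authentication disabled; pass a (user, password) tuple otherwise.

```python
from typing import Any, Optional


def bolt_uri(host: str = "localhost", port: int = 7687) -> str:
    """Build a bolt URI; 7687 is the port named in the text above."""
    return f"bolt://{host}:{port}"


def run_cypher(
    query: str,
    parameters: Optional[dict[str, Any]] = None,
    uri: Optional[str] = None,
    auth: Optional[tuple[str, str]] = None,  # assumption: no auth by default
) -> list[dict[str, Any]]:
    """Run a Cypher query and return each record as a plain dict."""
    # Imported lazily so this module loads even without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri or bolt_uri(), auth=auth) as driver:
        with driver.session() as session:
            result = session.run(query, parameters or {})
            # Consume the result while the session is still open.
            return [record.data() for record in result]
```

With the database running, a call such as run_cypher("MATCH (n:concept) WHERE n.curie = $curie RETURN n.name AS name", {"curie": "cellosaurus:0440"}) would return a list of dictionaries keyed by "name".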
The contents of the graph database have the following schema:
Below, some example Cypher queries are given to show what is possible by direct querying of the database.
Lookup by CURIE
The following Cypher queries allow for looking up concepts, mappings, evidences, and mapping sets.
Look up a concept (e.g., a cell line) by its CURIE:
MATCH (n:concept)
WHERE n.curie = "cellosaurus:0440"
RETURN n
The same is possible for mappings, evidences, and mapping sets. Each of these three types of entities has SeMRA-specific CURIE generation. For a mapping:
MATCH (m:mapping)
WHERE m.curie = "..."
RETURN m
For an evidence:
MATCH (e:evidence)
WHERE e.curie = "..."
RETURN e
For a mapping set:
MATCH (s:mappingset)
WHERE s.curie = "..."
RETURN s
Cypher also lets you return only selected fields from each record. The fields available for each node type (concept, mapping, evidence, and mapping set) are the column headers listed in the Neo4j Output Reference below.
For example, you can look up a concept by its CURIE and return specific parts, such as the name:
MATCH (n:concept)
WHERE n.curie = "cellosaurus:0440"
RETURN n.name
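The lookups above differ only in the node label and the returned fields, so they can be generated from one template. The sketch below is one possible way to do that; lookup_query is a hypothetical helper, not part of SeMRA. Because Cypher cannot take a node label as a query parameter, the label is checked against the four labels used above, while the CURIE itself is left as the $curie parameter.

```python
import re

# Node labels used by the lookup queries above.
NODE_LABELS = {"concept", "mapping", "evidence", "mappingset"}
# Conservative pattern for property names that are safe to interpolate.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def lookup_query(label: str, fields: tuple[str, ...] = ()) -> str:
    """Build a parameterized lookup-by-CURIE query for one node label.

    With no fields, the whole node is returned; otherwise only the
    requested properties (e.g. ``("name",)``) are returned.
    """
    if label not in NODE_LABELS:
        raise ValueError(f"unknown node label: {label!r}")
    for field in fields:
        if not _IDENTIFIER.fullmatch(field):
            raise ValueError(f"invalid field name: {field!r}")
    returned = ", ".join(f"n.{field}" for field in fields) or "n"
    return f"MATCH (n:{label}) WHERE n.curie = $curie RETURN {returned}"
```

The resulting string is meant to be executed together with a parameter dictionary such as {"curie": "cellosaurus:0440"}, which avoids quoting problems in the CURIE.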
Traversing Mappings
Get all targets for exact match mappings where cellosaurus:0440 is the source:
MATCH
(source:concept)-[:`skos:exactMatch`]->(target:concept)
WHERE source.curie = "cellosaurus:0440"
RETURN target
The same query can be reified using owl:annotatedSource, owl:annotatedTarget,
and the mapping node type:
MATCH
(m:mapping)-[:`owl:annotatedSource`]->(source:concept) ,
(m)-[:`owl:annotatedTarget`]->(target:concept)
WHERE source.curie = "cellosaurus:0440" and m.predicate = "skos:exactMatch"
RETURN target
After reifying, you can extend the query to return evidences. In the interactive view, returning multiple elements will also automatically show the edges between them:
MATCH
(m:mapping)-[:`owl:annotatedSource`]->(source:concept) ,
(m)-[:`owl:annotatedTarget`]->(target:concept) ,
(m)-[:hasEvidence]->(e:evidence)
WHERE source.curie = "cellosaurus:0440" and m.predicate = "skos:exactMatch"
RETURN source, target, m, e
Reification is useful for doing complex filters, e.g., on mapping justification. The
following query returns exact matches to cellosaurus:0440 that have manual mapping
justification:
MATCH
(m:mapping)-[:`owl:annotatedSource`]->(source:concept) ,
(m)-[:`owl:annotatedTarget`]->(target:concept) ,
(m)-[:hasEvidence]->(e:evidence)
WHERE
source.curie = "cellosaurus:0440"
and m.predicate = "skos:exactMatch"
and e.mapping_justification = "semapv:ManualMappingCuration"
RETURN target
The previous query can be reformulated to filter for minimum confidence:
MATCH
(m:mapping)-[:`owl:annotatedSource`]->(source:concept) ,
(m)-[:`owl:annotatedTarget`]->(target:concept) ,
(m)-[:hasEvidence]->(e:evidence)
WHERE
source.curie = "cellosaurus:0440"
and m.predicate = "skos:exactMatch"
and e.confidence > 0.3
RETURN target
It can also be extended to return the authors of the evidences:
MATCH
(m:mapping)-[:`owl:annotatedSource`]->(source:concept) ,
(m)-[:`owl:annotatedTarget`]->(target:concept) ,
(m)-[:hasEvidence]->(e:evidence) ,
(e)-[:hasAuthor]->(author:concept)
WHERE
source.curie = "cellosaurus:0440"
and m.predicate = "skos:exactMatch"
and e.mapping_justification = "semapv:ManualMappingCuration"
RETURN target, author
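The justification and confidence filters shown above can also be combined programmatically. The following is a minimal sketch, assuming the property names used in the queries above (predicate, mapping_justification, confidence); exact_match_query is a hypothetical helper, and the use of RETURN DISTINCT is an added choice so that a target supported by several evidences appears only once.

```python
from typing import Any, Optional


def exact_match_query(
    source_curie: str,
    justification: Optional[str] = None,
    min_confidence: Optional[float] = None,
) -> tuple[str, dict[str, Any]]:
    """Build a reified exact-match query plus its parameter dictionary."""
    conditions = ["source.curie = $source", "m.predicate = $predicate"]
    params: dict[str, Any] = {
        "source": source_curie,
        "predicate": "skos:exactMatch",
    }
    # Each optional filter adds one condition and one parameter.
    if justification is not None:
        conditions.append("e.mapping_justification = $justification")
        params["justification"] = justification
    if min_confidence is not None:
        conditions.append("e.confidence > $min_confidence")
        params["min_confidence"] = min_confidence
    query = "\n".join(
        [
            "MATCH",
            "  (m:mapping)-[:`owl:annotatedSource`]->(source:concept),",
            "  (m)-[:`owl:annotatedTarget`]->(target:concept),",
            "  (m)-[:hasEvidence]->(e:evidence)",
            "WHERE " + "\n  AND ".join(conditions),
            "RETURN DISTINCT target",
        ]
    )
    return query, params
```

For example, exact_match_query("cellosaurus:0440", justification="semapv:ManualMappingCuration", min_confidence=0.3) yields a query that applies both filters at once. Note that Cypher compares with a single =, not ==.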
The following query gets all mappings (with associated evidences, mapping sets, and
authors) where cellosaurus:0440 is the source, with optional matches for mapping
sets and authors:
MATCH
(m:mapping)-[:`owl:annotatedSource`]->(source:concept) ,
(m:mapping)-[:`owl:annotatedTarget`]->(target:concept) ,
(m)-[:hasEvidence]->(e:evidence)
WHERE source.curie = "cellosaurus:0440"
OPTIONAL MATCH
(e)-[:fromSet]->(mset:mappingset)
OPTIONAL MATCH
(e)-[:hasAuthor]->(author:concept)
RETURN source, target, m, e, mset, author
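Because of the OPTIONAL MATCH clauses, mset and author may be null in any returned row, so client code has to tolerate missing values. The sketch below shows one way to collapse such rows into a per-target summary. It assumes each row is a dictionary mapping the returned names to property dictionaries (or None), and that mapping sets and authors carry a curie property as in the lookups earlier; summarize_rows and the example CURIEs in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Any, Optional

Row = dict[str, Optional[dict[str, Any]]]


def summarize_rows(rows: list[Row]) -> dict[str, dict[str, set[str]]]:
    """Group rows by target CURIE, collecting mapping sets and authors."""
    summary: dict[str, dict[str, set[str]]] = defaultdict(
        lambda: {"mapping_sets": set(), "authors": set()}
    )
    for row in rows:
        entry = summary[row["target"]["curie"]]
        # OPTIONAL MATCH yields None when no mapping set or author exists.
        if row.get("mset") is not None:
            entry["mapping_sets"].add(row["mset"]["curie"])
        if row.get("author") is not None:
            entry["authors"].add(row["author"]["curie"])
    return dict(summary)
```

Given two made-up rows for the same target, one with mset {"curie": "example:set1"} and no author, and one with no mset and author {"curie": "example:author1"}, the summary for that target contains one mapping set and one author.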
Neo4j Output Reference
I/O for Neo4j.
Variables
The column headers for the concept nodes in the SeMRA Neo4j graph database export
The predicate used in the graph data model connecting a reasoned evidence
The column headers for properties attached to simple mappings
for extra edges that aren't mapping edges, such as those with
The column headers for evidence nodes in the SeMRA Neo4j graph database export
The predicate used in the graph data model connecting an evidence node to a mapping set node
node to the mapping node(s) from which it was derived
The predicate used in the graph data model connecting a mapping node to an evidence node
The column headers for the mapping nodes in the SeMRA Neo4j graph database export
Built-in mutable sequence.