Not able to fetch path

navratan22jan · May 14, 2018, 4:41am

Hi,

I created a Stardog database with OpenNLP entity extraction for persons, then loaded a text document into the database's document store.

stardog-admin db create -o docs.opennlp.models.path=D:\Setups\Stardog\stardog-5.2.3\opennlp -n testDB1

stardog doc put --rdf-extractors tika,entities testDB1 D:\Sample\article1.txt

The text file has the following data:

Navratan knows Mukesh as both are colleagues in ABC. They also have lunch together and are working on the same project for cleint.

Now I am trying to check the path between the entities, but I am not getting any path as the output.

stardog query -f text testDB1 "PATHS START ?x = :Navratan END ?y VIA ?p"

+---+---+---+
| x | p | y |
+---+---+---+
+---+---+---+

Any idea?

pedro · May 14, 2018, 11:32am

Hi Navratan,

Before we dig into the actual paths query, let’s debug which entities are being extracted from the text. Two questions:

1. What is the content of the D:\Setups\Stardog\stardog-5.2.3\opennlp folder?
2. What is the output of the following query: select * where { graph ?g { ?s ?p ?o }}

-pedro

navratan22jan · May 15, 2018, 3:30am

Hi pedro,

D:\Setups\Stardog\stardog-5.2.3\opennlp folder has the following nlp models:

en-ner-person.bin
en-sent.bin
en-token.bin

Following is the output of the query “select * where { graph ?g { ?s ?p ?o }}” when run on database testDB1:

| g | s | p | o |
| stardog:docs:testDB1:article1.txt | stardog:docs:testDB1:article1.txt | rdf:type | stardog:docs:Document |
| stardog:docs:testDB1:article1.txt | stardog:docs:testDB1:article1.txt | rdf:type | owl:Thing |

Kindly suggest

Thanks
Navratan

pedro · May 15, 2018, 11:51am

Hi Navratan,

That last query returns all the data in the database and, as you can see, there is nothing there besides simple metadata about the document itself.
The issue here is that the en-ner-person model doesn't recognize either of the two names in your example (it is specialized for English names). We provide other, more general person-identification models, e.g., built from DBpedia, but can't guarantee that they will work with most non-English names, since they are trained on English-language texts.
Models for other languages and domains are easy to train, and I can provide some pointers if needed.

For now, I would recommend changing the content of your example and executing that select * query again. That will give you an idea of what kind of information is being extracted.

-pedro

navratan22jan · May 16, 2018, 5:28am

Hey Pedro,

I changed the content of the file and then executed the select * query again. Now it has detected the person names as per the NLP model:

| g | s | p | o |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | stardog:docs:hasEntity | stardog:docs:entity:f06574bbbfa1a5b474f276714e769027 |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | stardog:docs:hasEntity | stardog:docs:entity:679a56e43cd3beace9e4ba690824b055 |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | DCMI: Format | text/plain; charset=ISO-8859-1 |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | stardog:docs:fileSize | 132 |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | DCMI: Identifier | article1.txt |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | rdfs:label | article1.txt |
| stardog:docs:pathdb:article1.txt | stardog:docs:entity:f06574bbbfa1a5b474f276714e769027 | rdfs:label | Nick |
| stardog:docs:pathdb:article1.txt | stardog:docs:entity:679a56e43cd3beace9e4ba690824b055 | rdfs:label | Mike |
| stardog:docs:pathdb:article1.txt | stardog:docs:entity:f06574bbbfa1a5b474f276714e769027 | rdf:type | stardog:docs:ner:person |
| stardog:docs:pathdb:article1.txt | stardog:docs:entity:679a56e43cd3beace9e4ba690824b055 | rdf:type | stardog:docs:ner:person |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | rdf:type | FOAF Vocabulary Specification |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | rdf:type | stardog:docs:Document |
| stardog:docs:pathdb:article1.txt | stardog:docs:pathdb:article1.txt | rdf:type | owl:Thing |
| stardog:docs:pathdb:article1.txt | stardog:docs:entity:f06574bbbfa1a5b474f276714e769027 | rdf:type | owl:Thing |
| stardog:docs:pathdb:article1.txt | stardog:docs:entity:679a56e43cd3beace9e4ba690824b055 | rdf:type | owl:Thing |

Then I ran the path query again:

stardog query -f text pathdb "PATHS START ?x = :Navratan END ?y VIA ?p"

but it is still returning 0 paths.

pedro · May 16, 2018, 9:07am

Hi Navratan,

As you can see in the results of the select * query, there is no :Navratan object in the database, therefore the PATHS query won’t be able to return any results.

I’m not sure what you are trying to achieve with the paths query, specifically because entities are leaf nodes in the graph. If you simply want to find which entities are present in the same document, a select query like this will work:

select * where {
    graph ?doc {
        ?doc stardog:docs:hasEntity [ rdfs:label ?label ].
    }
}

navratan22jan · May 16, 2018, 9:38am

Pedro,

I want to see the relationships between the various entities in my data, which is why I was trying to do that with a path query. Can you suggest how I can check the relationships between entities extracted from my text?

pedro · May 16, 2018, 10:19am

The only relationship you can extract is that two entities are in the same document, which is what the query I previously shared does.

If you wanted to automatically extract semantic relationships from your text, such as A knows B and A works at ABC, that task is called relation extraction, and is something that we don't support at the moment.
Your only option here would be to implement a custom extractor with the logic to extract such relationships.
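
To illustrate what "two entities are in the same document" amounts to, here is a minimal sketch in plain Python, not Stardog API. It assumes the (document, label) bindings have already been fetched with a select query; the function name `cooccurring_pairs` and the sample rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurring_pairs(rows):
    """Group entity labels by document and return every unordered pair
    of distinct labels that share a document, as (doc, a, b) tuples."""
    by_doc = defaultdict(set)
    for doc, label in rows:
        by_doc[doc].add(label)

    pairs = set()
    for doc, labels in by_doc.items():
        # sorted() makes the order inside each pair deterministic
        for a, b in combinations(sorted(labels), 2):
            pairs.add((doc, a, b))
    return sorted(pairs)


# Hypothetical bindings shaped like the (doc, label) results of the query.
rows = [
    ("article1.txt", "Nick"),
    ("article1.txt", "Mike"),
]

for doc, a, b in cooccurring_pairs(rows):
    print(f"{a} and {b} are both mentioned in {doc}")
```

Note that this only says the two names occur together; it says nothing about how they are related, which is the relation extraction problem described above.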

navratan22jan · May 16, 2018, 10:52am

OK Pedro, understood for unstructured data.
Is it the same for structured data that is saved in the Stardog database as triples?

pedro · May 16, 2018, 11:04am

If your data is structured as triples, you have a graph, and therefore can write all kinds of queries to find relationships between entities, including path queries.

navratan22jan · May 16, 2018, 11:07am

Ok

Below is the link to a TTL file which I have loaded into a Stardog database.

https://raw.githubusercontent.com/stardog-union/stardog-examples/develop/examples/docs/blog/person_movie.ttl

Now I want to find the relationships between the entities in this file. Please advise what the query for that should be.

Thanks

lorenz_b · May 17, 2018, 6:55am

Not sure if I understand correctly, but shouldn’t you do entity linking before you can use the mentioned entities in the knowledge graph? Entity extraction is just finding parts of the text that denote entities like persons, places, etc.

Once you have done this, a SPARQL query like

SELECT ?e1 ?p ?e2 {
 ?e1 ?p ?e2 .
}

is all you need. You might first need to get the entities themselves from a particular graph:

select ?mention ?entity where {
  graph <tag:stardog:api:docs:movies:article.txt> {
    ?s rdfs:label ?mention ;
       <http://purl.org/dc/terms/references> ?entity .
  }
}

and finally you could wrap this in a combined query.

Mukesh · May 17, 2018, 7:50am

Basically, we are looking to find relationships like "A knows B" and "B works for organization XYZ" within both structured and unstructured data. For example, if we have added only the following TTL file to a Stardog database and want to find the relationships mentioned above, what would the query be?

https://raw.githubusercontent.com/stardog-union/stardog-examples/develop/examples/docs/blog/person_movie.ttl

Also, if we have uploaded only the following article file and want to find the relationship between George Clooney and Matt Charman, what would the query be? Please provide a complete example.

https://raw.githubusercontent.com/stardog-union/stardog-examples/develop/examples/docs/blog/article.txt

We are stuck on these points, so it would be really helpful if you could provide complete examples and queries for both structured and unstructured data.

Thanks,
Mukesh Gupta

stephen · May 17, 2018, 3:59pm

Hi Mukesh,

To retrieve relationships from person_movie.ttl you will first have to describe the relationships you’re looking to find. If you’re looking for explicit relationships such as :actor, :author, :director, you can just write a SPARQL query:

SELECT ?title WHERE {
  ?tom a :Person ;
    rdfs:label "Tom Hanks" .
  ?movie :actor ?tom ;
    rdfs:label ?title
}
ORDER BY ?title

If, however, you’re looking to infer a relationship, you need to define it so the reasoner can find it. For example, if I wanted to define :20sActor as an actor who starred in a 1920s movie, I can do that with a rule:

IF {
  ?movie :actor ?actor ;
    :copyrightYear ?year .
  FILTER(?year >= 1920 && ?year < 1930)
}
THEN {
  ?actor a :20sActor
}

Once that rule is inserted into the DB, I can query (with reasoning) to find instances of :20sActor without that data needing to be stored explicitly in my DB:

SELECT ?name WHERE {
  ?actor a :20sActor ;
    rdfs:label ?name
}
ORDER BY ?name
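
To make the rule's logic concrete, here is a rough sketch in plain Python of what it derives. This is not how the Stardog reasoner is implemented; the triples, the movie and actor names, and the function name `infer_20s_actors` are all hypothetical placeholders.

```python
def infer_20s_actors(triples):
    """Mimic the IF/THEN rule: any actor of a movie whose copyrightYear
    falls in [1920, 1930) is classified as a 20sActor."""
    # IF part: movies with a copyright year in the 1920s
    twenties_movies = {
        s for s, p, o in triples
        if p == "copyrightYear" and 1920 <= o < 1930
    }
    # THEN part: collect the actors of those movies
    return sorted(
        {o for s, p, o in triples if p == "actor" and s in twenties_movies}
    )


# Hypothetical data; not taken from person_movie.ttl.
triples = [
    ("movieA", "actor", "actorX"),
    ("movieA", "copyrightYear", 1925),
    ("movieB", "actor", "actorY"),
    ("movieB", "copyrightYear", 1994),
]

print(infer_20s_actors(triples))
```

The point is that the 20sActor classification is computed from existing triples at query time rather than stored.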

As for the unstructured data, you will need to do as Pedro suggested and get/use/create extractors that can retrieve the data you’re looking for. For example, I loaded article.txt with the English tika,entity extractors, and can now query over what it found:

select ?type ?label { 
  graph <tag:stardog:api:docs:rtfm:article.txt> { 
  ?s <tag:stardog:api:docs:hasEntity> [ rdf:type ?type; rdfs:label ?label ] 
  }
}

+---------------------------------------+------------------+
|                 type                  |      label       |
+---------------------------------------+------------------+
| tag:stardog:api:docs:ner:organization | "Watergate"      |
| tag:stardog:api:docs:ner:person       | "George Clooney" |
| tag:stardog:api:docs:ner:date         | "last year"      |
| tag:stardog:api:docs:ner:person       | "Grant Heslov"   |
| tag:stardog:api:docs:ner:person       | "Matt Charman"   |
+---------------------------------------+------------------+

lorenz_b · May 18, 2018, 6:57am

As far as I understood, the use-case is to have

1. a Turtle file which contains entities with triples about them, and
2. a text file which might contain mentions of those entities.

This needs two steps:

1. entity extraction, which finds the parts of the text that mention entities
2. entity linking, i.e. mapping those entity mentions to the RDF entities in the loaded KB

Once both steps are done, a SPARQL query could be used to get relationships between entities mentioned in the given text.
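
As a rough illustration of the linking step, here is a minimal sketch in plain Python that matches extracted mentions against KB labels by exact, case-insensitive comparison. This is a deliberate simplification (real entity linking usually needs disambiguation), and the IRIs, the function name `link_mentions`, and the sample data are hypothetical.

```python
def link_mentions(mentions, kb_labels):
    """Map each mention string to the list of KB IRIs whose label matches
    it exactly, ignoring case. Unmatched mentions map to an empty list."""
    index = {}
    for iri, label in kb_labels:
        index.setdefault(label.casefold(), []).append(iri)
    return {m: index.get(m.casefold(), []) for m in mentions}


# Hypothetical KB labels, as if read from rdfs:label triples.
kb_labels = [
    ("urn:example:GeorgeClooney", "George Clooney"),
    ("urn:example:GrantHeslov", "Grant Heslov"),
]

# Mentions, as if produced by the entity extraction step.
mentions = ["George Clooney", "Matt Charman"]

for mention, iris in link_mentions(mentions, kb_labels).items():
    print(mention, "->", iris if iris else "no match in KB")
```

Only mentions that resolve to a KB entity can then take part in a query over the KB's relationships.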

Mukesh · May 21, 2018, 10:57am

Hi Stephen,

Thanks for sharing your input. Could you please suggest how to find relationships between entities, for example that A and B both work for organization XYZ and both live in the same location? Basically, we want to get the links between nodes, as in the following diagram:

If we have structured data (saved as triples in a Stardog DB), should we use PATHS queries, GraphQL, or something else to achieve this? Please also share a complete example for finding relationships like those in the image above.

Also, for unstructured data, we are trying to create a custom extractor. So, please suggest:

1. Can we create a custom extractor in a language other than Java?
2. Please share the steps to create a custom extractor. We were following the example and faced issues, so it would be great if you could share complete steps to implement the example below:

https://github.com/stardog-union/stardog-examples/blob/develop/examples/docs/test/src/com/complexible/stardog/examples/docs/WordCountExtractorTest.java

Thanks,
Mukesh Gupta

mike · May 21, 2018, 7:25pm

Mukesh,

If you want the actual paths from one node to another, a PATHS query is most appropriate. GraphQL will only return you objects matching your specified criteria, so in effect, you’d hard code graph structure into your GraphQL template.

Regarding the extractor, you don’t have to use Java, but it would have to be a language that runs on the JVM, such as Kotlin. If you wanted to use something native, or a web service, you’d want a thin wrapper that calls out to the service.

Regarding the example, it’s hard to suggest solutions to whatever problems you’re facing without knowing what issues you’re running into. The example you reference is self-contained, so there’s nothing more to it than what is outlined there.

navratan22jan · May 22, 2018, 4:54am

Hi Mike,

For the example, will I have to first run the build.gradle file and then the java code?

I am trying to run the build file with Gradle, but I am getting the following error:

D:\Setups\gradle\gradle-4.7\bin>gradle -q

FAILURE: Build failed with an exception.

Where:
Build file 'D:\Setups\gradle\gradle-4.7\bin\build.gradle' line: 4
What went wrong:
A problem occurred evaluating root project 'bin'.

Could not find method compile() for arguments [com.complexible.stardog:server:5.2.3] on object of type org.gradle.api.internal.artifacts.dsl.dependencies.DefaultDependencyHandler.

Or is there any other step that I need to implement first? Am I missing out on something?

Thanks
Navratan

Mukesh · May 22, 2018, 7:00am

Hey Mike,

We are trying to find the relationships within the following file (loaded into a Stardog DB):

http://www.learningsparql.com/2ndeditionexamples/ex069.ttl

For example, if we want to know all the relationships of "Richard" (first name), what should the path query be to find all paths and the shortest path?

Also, regarding the following example, which finds all the people Alice is connected to and how she is connected to them:

Could you please share the TTL file for this example, so that we can correlate it with our data?

Thanks,
Mukesh Gupta

mike · May 22, 2018, 2:24pm

@navratan22jan

You should be running gradle from the checkout of the stardog-examples repo, not from the gradle installation directory.

@Mukesh

Here’s the snippet of data from the example:

@prefix : <http://api.stardog.com/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix stardog: <tag:stardog:api:> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix paths: <urn:paths:> .

<urn:paths:Alice> <urn:paths:knows> <urn:paths:Bob> .

<urn:paths:Bob> <urn:paths:knows> <urn:paths:David> ;
    <urn:paths:worksWith> <urn:paths:Charlie> .

<urn:paths:Charlie> <urn:paths:parentOf> <urn:paths:Eve> .

<urn:paths:Eve> <urn:paths:knows> <urn:paths:David> .
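
To show what kind of answer a path search over this data produces, here is a minimal sketch in plain Python that runs a breadth-first search over the same five edges. It is only an illustration of the idea, not Stardog's implementation; the function name `shortest_paths_from` is hypothetical, local names stand in for the full urn:paths: IRIs, and it keeps just one shortest path per reachable node.

```python
from collections import deque

# The five triples from the snippet above, with local names only.
EDGES = [
    ("Alice", "knows", "Bob"),
    ("Bob", "knows", "David"),
    ("Bob", "worksWith", "Charlie"),
    ("Charlie", "parentOf", "Eve"),
    ("Eve", "knows", "David"),
]


def shortest_paths_from(start, edges):
    """Breadth-first search along edge direction. Returns a dict mapping
    each node reachable from start to one shortest path, given as a list
    of (subject, predicate, object) edges."""
    outgoing = {}
    for s, p, o in edges:
        outgoing.setdefault(s, []).append((s, p, o))

    found = {}
    seen = {start}
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        for edge in outgoing.get(node, []):
            target = edge[2]
            if target in seen:
                continue  # already reached by a path that is no longer
            seen.add(target)
            found[target] = path + [edge]
            queue.append((target, found[target]))
    return found


for target, path in shortest_paths_from("Alice", EDGES).items():
    hops = " , ".join(f"{s} -{p}-> {o}" for s, p, o in path)
    print(f"{target}: {hops}")
```

Starting from a node that does not exist in the data, such as the :Navratan node from the first post, yields no paths at all, which matches the empty result seen earlier in the thread.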
