Similarity Search Model Creation issue

nicky508 · October 29, 2018, 4:39pm

Dear,

I have been working with the Similarity Search function for a while. I created a Java library that generates a similarity search model (an insert query) and a select query for the model. However, I ran into a puzzling issue. If I run the insert query from Stardog Studio, the query is accepted, and the select query then runs perfectly.
However, when I run the exact same query from Java using the Jena bindings, the insert query is accepted and a model is created, but when I run the select query I get the following error:

QueryEval: com.complexible.common.rdf.model.StardogBNode cannot be cast to org.openrdf.model.Literal

Looking at the models, I see some triples in the model generated from Stardog Studio. In the model created through Java I see 1694881 blank nodes, which I cannot explain; compared with the model created by Stardog Studio, I assume this is not correct.

What could be the issue? The queries are exactly the same, but it seems Stardog does something different internally.

The model generated via stardog studio: https://cloudbox.netage.nl/f/38195a0ec3/?dl=1
The model generated via java-jena: Private Seafile

jess · October 29, 2018, 4:41pm

Hi,

Thanks for the report. Can you include the error message from the server's log file?

Jess

nicky508 · October 30, 2018, 8:42am

Sure, you can find the log here -> Private Seafile

jess · October 30, 2018, 10:09pm

Thanks for sharing the log file. It helps to see what's going on.

I see in your other post that you originally had this code (using createRemote() with an endpoint):

UpdateProcessor updateExec = UpdateExecutionFactory.createRemote(request, endpoint);

but changed it to (using create() with a Dataset):

SDJenaFactory.createDataset(this.aConn);
UpdateProcessor updateExec = UpdateExecutionFactory.create(UpdateFactory.create(arg0), getDataset());

The latter does not work because the update is processed by the Jena engine, which executes the query part separately from the insert. This is also less efficient. I tested the former code and it sends the entire query to Stardog, which is what model training requires to work properly. Please give it a try.

Best,
Jess

semanticfire · October 30, 2018, 10:16pm

Which makes me wonder: is there any benefit to using the Jena Models with the Stardog Dataset over using plain SPARQL (besides this edge case)?

jess · October 30, 2018, 10:26pm

The Jena support is a compatibility layer. As you've seen, it obscures the way Stardog works. Using the native API or SPARQL via HTTP will give you the best results.
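
For illustration only, here is a minimal sketch of the SPARQL-via-HTTP route using nothing but the JDK's java.net.http client (Java 11+). It follows the SPARQL 1.1 Protocol convention of POSTing the update text with the content type application/sparql-update, so the whole INSERT ... WHERE reaches the server as one request. The class name, the method name, the Basic-auth handling, and the example endpoint path in the usage message are assumptions and not taken from this thread; check your server's documentation for the actual update URL.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Stardog or Jena.
class SparqlUpdateOverHttp {

    // Builds a "direct POST" SPARQL update request: the update string is the request body.
    public static HttpRequest buildUpdateRequest(String updateEndpoint, String user,
                                                 String password, String update) {
        // HTTP Basic authentication header (assumption: the server accepts Basic auth).
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(updateEndpoint))
                .header("Content-Type", "application/sparql-update")
                .header("Authorization", "Basic " + credentials)
                .POST(HttpRequest.BodyPublishers.ofString(update, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            // The URL below is only an example shape, not a verified endpoint.
            System.out.println("usage: <updateEndpoint> <user> <password> <update>");
            System.out.println("e.g.   http://localhost:5820/myDb/update admin admin \"INSERT DATA { ... }\"");
            return;
        }
        HttpRequest request = buildUpdateRequest(args[0], args[1], args[2], args[3]);
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // A 2xx status means the server accepted the update.
        System.out.println(response.statusCode());
    }
}
```

Because the request is built separately from being sent, the insert query string produced by your own code can be passed in unchanged, with no client-side SPARQL engine in between.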

semanticfire · October 30, 2018, 10:32pm

In other words, do everything via the API and wrap the results in Jena objects ourselves for our applications?

nicky508 · October 31, 2018, 10:01am

Thanks for your response.
I had indeed changed the code, so I tried changing it back:

AggregateRegistry.register("tag:stardog:api:analytics:set", (agg, distinct) -> AggNull.createAccNull(), NodeConst.nodeNil);
UpdateProcessor updateExec = UpdateExecutionFactory.createRemote(
    UpdateFactory.create(similarityModel.createInsertModelQuery(
        m,
        "http://data.resc.info/kro",
        modelName.replaceAll("[-+.^:,]","").replace(" ", "_"))),
    "http://stardog.netage.nl:5820/annex/kro/sparql/query");
updateExec.execute();

The query is generated by the "createInsertModelQuery" method and looks like this:

prefix spa: <tag:stardog:api:analytics:>
prefix : <http://schema.org/>

INSERT {
  graph spa:model {
    :basic_model a spa:SimilarityModel ;
      spa:arguments (?bouwbest2 ?functie2 ?status2 ?bag_oppvlk2 ?bouwjaar2 ?maximale_hoogte2 ?gemiddelde_hoogte2 ?pandHoogte2 ?bouwlagen2) ;
      spa:predict ?object .
  }
}
WHERE {
  SELECT
    (spa:set(?bouwbest) as ?bouwbest2)
    (spa:set(?functie) as ?functie2)
    (spa:set(?status) as ?status2)
    (spa:set(?bag_oppvlk) as ?bag_oppvlk2)
    (spa:set(?bouwjaar) as ?bouwjaar2)
    (spa:set(?maximale_hoogte) as ?maximale_hoogte2)
    (spa:set(?gemiddelde_hoogte) as ?gemiddelde_hoogte2)
    (spa:set(?pandHoogte) as ?pandHoogte2)
    (spa:set(?bouwlagen) as ?bouwlagen2)
    ?object
  {
    GRAPH <http://data.resc.info/kro> {
      ?object <http://vocab.netage.nl/kro#hasWOZ> ?hasWOZ.
      ?hasWOZ <http://vocab.netage.nl/kro#bouwbest> ?bouwbest.
      ?object <http://vocab.netage.nl/kro#hasBuilding> ?hasBuilding.
      ?hasBuilding <http://vocab.netage.nl/kro#functie> ?functie.
      ?hasBuilding <http://vocab.netage.nl/kro#status> ?status.
      ?hasBuilding <http://vocab.netage.nl/kro#bag_oppvlk> ?bag_oppvlk.
      ?hasBuilding <http://vocab.netage.nl/kro#bouwjaar> ?bouwjaar.
      ?object <http://vocab.netage.nl/kro#hasAHN> ?hasAHN.
      ?hasAHN <http://vocab.netage.nl/kro#maximale_hoogte> ?maximale_hoogte.
      ?hasAHN <http://vocab.netage.nl/kro#gemiddelde_hoogte> ?gemiddelde_hoogte.
      ?hasAHN <http://vocab.netage.nl/kro#pandHoogte> ?pandHoogte.
      ?hasAHN <http://vocab.netage.nl/kro#bouwlagen> ?bouwlagen.
    }
  }
  GROUP BY ?object
}

The result is the same: still a model with lots of blank nodes (1799800 triples).
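
For readers who want to reproduce the shape of this query, the sketch below shows one way the repetitive parts could be assembled from a list of feature names. It is not the actual createInsertModelQuery implementation, which is not shown in this thread; the class and method names are hypothetical, and only the name clean-up expression is copied from the code above.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper illustrating the query's repetitive parts; not the poster's library.
class SimilarityQuerySketch {

    // Same clean-up as in the post: drop - + . ^ : , and turn spaces into underscores.
    public static String sanitizeModelName(String modelName) {
        return modelName.replaceAll("[-+.^:,]", "").replace(" ", "_");
    }

    // Builds "(spa:set(?x) as ?x2) (spa:set(?y) as ?y2) ..." for the SELECT clause.
    public static String aggregateProjection(List<String> features) {
        return features.stream()
                .map(f -> "(spa:set(?" + f + ") as ?" + f + "2)")
                .collect(Collectors.joining(" "));
    }

    // Builds the "(?x2 ?y2 ...)" list used as the object of spa:arguments.
    public static String argumentList(List<String> features) {
        return features.stream()
                .map(f -> "?" + f + "2")
                .collect(Collectors.joining(" ", "(", ")"));
    }

    public static void main(String[] args) {
        List<String> features = List.of("bouwbest", "functie", "status");
        System.out.println(sanitizeModelName("basic model-1.0"));
        System.out.println(argumentList(features));
        System.out.println(aggregateProjection(features));
    }
}
```

Keeping the aggregate projection and the spa:arguments list derived from the same feature list avoids the two drifting apart when a feature is added or removed.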

nicky508 · October 31, 2018, 4:06pm

Small update: I sent the query to Stardog with a simple HTTP GET, which works, so the problem is indeed somewhere in Jena. I also tried the Stardog API, which works as well. We made some small changes to the code and now it works:

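// Connection from the native Stardog API (configuration elided)
private Connection aConn = ConnectionConfiguration...

// Run the update inside an explicit transaction
aConn.begin();
UpdateQuery q = aConn.update(arg0); // arg0 is the insert query string
q.execute();
aConn.commit();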

Thanks for your help!

system · November 14, 2018, 4:06pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
