Erroneous query planning for queries with FROM NAMED declaration and GRAPH variable

We encountered a severe query planning error that can be reproduced
as follows:

Given an empty database, load the following TriG data

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<urn:Beatles> {

<urn:Beatles> a foaf:Group .

<urn:John> a foaf:Person .
<urn:Paul> a foaf:Person .
<urn:George> a foaf:Person .
<urn:Ringo> a foaf:Person .
}

Then execute query A:

SELECT  ?relation ?object ?graph
FROM NAMED <urn:Beatles>
WHERE { 
  GRAPH ?graph { 
    <urn:Beatles> ?relation  ?object
  }
}

Query A yields the expected result set with one solution.

Then execute query B:

SELECT  ?relation ?object ?graph
FROM NAMED <urn:Beatles>
WHERE { 
  GRAPH ?graph { 
    <urn:John> ?relation  ?object
  }
}

The expected result set would again contain one solution (the rdf:type statement for John),
but the actual result set is empty. As the query plan shows, Stardog decides for some reason to
scan the wrong named graph:

From Named <urn:Beatles>
Projection(?relation, ?object, ?graph) [#1]
`-- Bind(<urn:John> AS ?graph) [#1]
    `-- Scan[SPO](<urn:John>, ?relation, ?object){<urn:John>} [#1]

Additional observations:

This error does not occur if query B is executed at least once before query A.

This error also does not occur if the ?graph variable is eliminated from the queries, e.g.

SELECT  ?relation ?object
FROM NAMED <urn:Beatles>
WHERE { 
  GRAPH <urn:Beatles> { 
    <urn:Beatles> ?relation  ?object
  }
}

and

SELECT  ?relation ?object
FROM NAMED <urn:Beatles>
WHERE { 
  GRAPH <urn:Beatles> { 
    <urn:John> ?relation  ?object
  }
}

I was able to reproduce this error on versions 5.3.6, 6.1.0 and 6.2.1.

Hi, and welcome!

Thanks for the very detailed bug report. I seem to be able to reproduce the issue and am opening a ticket to resolve it.

As a workaround, try dropping the database and recreating it with the metadata option query.plan.reuse=NEVER

Hi Stephen, thanks for the speedy reply. Switching query plan reuse off does indeed circumvent the error.

Can you give a rough estimate or rule of thumb for the performance degradation to expect when giving up query plan reuse? I could imagine, for example, that benchmarks were run on the performance gained when query plan reuse was introduced, which could give an inkling.

Hi Markus,

This greatly depends on how complex your queries are to optimize (as opposed to evaluate), which roughly corresponds to the number and size of BGPs in the queries. Stardog tries to reuse query plans for structurally equivalent queries, i.e. those which differ only in constants and not in structure.
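To make "structurally equivalent" concrete, here is a minimal sketch in Python. It is only an illustration, not how Stardog computes its cache keys: the function `structural_key` and the placeholder `<?>` are hypothetical, and the only assumption is that two queries count as equivalent when they match after every IRI constant is masked.

```python
import re

# Matches an IRI constant such as <urn:Beatles>.
IRI = re.compile(r"<[^<>\s]+>")

def structural_key(query: str) -> str:
    """Mask every IRI constant and collapse whitespace (hypothetical helper)."""
    without_constants = IRI.sub("<?>", query)
    return " ".join(without_constants.split())

QUERY_A = """
SELECT ?relation ?object ?graph
FROM NAMED <urn:Beatles>
WHERE { GRAPH ?graph { <urn:Beatles> ?relation ?object } }
"""

QUERY_B = """
SELECT ?relation ?object ?graph
FROM NAMED <urn:Beatles>
WHERE { GRAPH ?graph { <urn:John> ?relation ?object } }
"""

# Same shape, different constants: both queries map to one key.
print(structural_key(QUERY_A) == structural_key(QUERY_B))
```

Under this simplified notion, queries A and B from the report share a key, so the second one would be a candidate for reusing the first one's plan.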

You can run your queries multiple times and see the difference between the first and second evaluations. Of course, not all of the difference can be attributed to reusing the cached query plan, as evaluation also becomes faster due to warmer data caches, OS caches, and disk caches, but you'll get some idea. You can also try that with and without query.plan.reuse=NEVER to see the performance difference.
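A small timing harness along these lines could do the measuring. This is a sketch in Python under stated assumptions: `run_query` is a placeholder for whatever function sends your query to the database (nothing here calls a Stardog API), and `time_runs` and `summarize` are hypothetical helper names.

```python
import time
from typing import Callable, Dict, List

def time_runs(run_query: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return wall-clock seconds for each of `repeats` consecutive executions."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings

def summarize(timings: List[float]) -> Dict[str, float]:
    """Compare the first run with the mean of all later runs."""
    first, rest = timings[0], timings[1:]
    later_mean = sum(rest) / len(rest) if rest else first
    return {"first": first, "later_mean": later_mean, "difference": first - later_mean}

# Stand-in workload; replace it with a function that executes your query.
def fake_query() -> int:
    return sum(range(100_000))

print(summarize(time_runs(fake_query)))
```

Running the same harness once against a database created with query.plan.reuse=NEVER and once against a default one gives two summaries to compare. The caveat about warm caches still applies to both.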

This bug will definitely be fixed before the July release. We might be able to suggest other workarounds after we dig into the issue.

Cheers,
Pavel

Hi Markus,

Just a quick follow-up to this after we've looked deeper into the issue. The problem can only arise after running a query in which the same constant occurs both in FROM NAMED and somewhere else in the query. You can still use the plan cache but you'd need to disable one particular optimization which inlines FROM NAMED constants into graph variables. Such things are possible in Stardog by using special hints:

SELECT  ?relation ?object
FROM NAMED <urn:Beatles>
WHERE {
  #pragma optimizer.inline.from off 
  GRAPH <urn:Beatles> { 
    <urn:Beatles> ?relation  ?object
  }
}

Try this if disabling the cache causes a noticeable drop in query throughput; otherwise Stephen's suggestion should work OK for you until July.
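For intuition about why a constant shared between FROM NAMED and the rest of the query matters, here is a toy model in Python. It is a guess that happens to match the reported plan and the order dependence, not Stardog's actual code: the plan dictionary, `plan_with_inlining`, and `reuse_by_value` are all hypothetical, and the assumed fault is that a cached plan is adapted by substituting constants by value rather than by position.

```python
from typing import Dict, List

Plan = Dict[str, str]

def plan_with_inlining(from_named: str, subject: str) -> Plan:
    """Plan 'FROM NAMED g ... GRAPH ?graph { s ?p ?o }' from scratch.

    With a single FROM NAMED graph, ?graph can only be that graph, so the
    constant is inlined: ?graph is bound to it and only that graph is scanned.
    """
    return {"bind_graph": from_named, "scan_subject": subject, "scan_graph": from_named}

def reuse_by_value(cached: Plan, old_constants: List[str], new_constants: List[str]) -> Plan:
    """ASSUMED faulty reuse: substitute constants by value, not by position."""
    by_value = dict(zip(old_constants, new_constants))  # later pairs overwrite earlier ones
    return {slot: by_value.get(value, value) for slot, value in cached.items()}

BEATLES, JOHN = "<urn:Beatles>", "<urn:John>"

# Constants in query order: [FROM NAMED graph, triple subject].
constants_a = [BEATLES, BEATLES]
constants_b = [BEATLES, JOHN]

# Query A first, then query B reuses A's plan: the graph is rewritten too.
plan_a = plan_with_inlining(*constants_a)
print(reuse_by_value(plan_a, constants_a, constants_b))

# Query B first, then query A reuses B's plan: the result is correct.
plan_b = plan_with_inlining(*constants_b)
print(reuse_by_value(plan_b, constants_b, constants_a))
```

In this model the first case ends up scanning the graph <urn:John>, like the plan in the report, while the reverse order produces a correct plan, which is consistent with the observation that running query B before query A avoids the error.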

Thanks again for the report,
Pavel

Is this problem fixed in the 6.2.2 release?

We have fixed it in the development branch, but it didn't get into 6.2.2 (it was reported just a day before 6.2.2 went out). It will be in the July release.

Best,
Pavel