Hi, I'm trying to run the following federated query on our internal triplestore:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
PREFIX qa: <http://www.wdaqua.eu/qa#>
prefix fqaac: <urn:fqaac:>
prefix prov: <http://www.w3.org/ns/prov#>
prefix qado: <urn:qado#>
SELECT DISTINCT ?answerCandidate ?qId ?verbalizedText ?isCorrect ?questionText
WHERE {
  fqaac:experiment:qanswer:qald-9-plus-wikidata:test:en prov:generated ?answerCandidate .
  ?answerCandidate fqaac:hasNaturalLanguageRepresentation ?nl ;
                   fqaac:qaF1Score ?qaF1score ;
                   fqaac:relatedTo ?qId .
  ?nl fqaac:algorithm "2022" ;
      fqaac:text ?verbalizedText .
  SERVICE <http://user:pass@host:40100/RDFized-datasets/query> {
    VALUES ?hasQuestion { qado:correctedQuestion qado:hasQuestion qado:questionEng qado:questionText }
    ?qId ?hasQuestion ?questionText .
    FILTER(LANG(?questionText) = 'en')
  }
  BIND (IF(?qaF1score = 1.0, "True", "False") as ?isCorrect) .
  FILTER(LANG(?verbalizedText) = "en")
}
LIMIT 500
This query works fine. However, if I increase the LIMIT to 5000, it fails. Here are the query plans for both cases.
With LIMIT 500:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix oa: <http://www.w3.org/ns/openannotation/core/>
prefix qa: <http://www.wdaqua.eu/qa#>
prefix fqaac: <urn:fqaac:>
prefix prov: <http://www.w3.org/ns/prov#>
prefix qado: <urn:qado#>
Slice(offset=0, limit=500) [#500]
`─ Distinct [#500]
   `─ Projection(?answerCandidate, ?qId, ?verbalizedText, ?isCorrect, ?questionText) [#500]
      `─ Bind(IF(?qaF1score = "1.0"^^xsd:decimal, "True", "False") AS ?isCorrect) [#500]
         `─ ServiceJoin [#500]
            +─ Service <http://user:pass@host:40100/RDFized-datasets/query> {
            │  +─ Filter("en" = Lang(?questionText))
            │  +─ `─ {
            │  +─ `─ Scan[SPO](?qId, ?hasQuestion, ?questionText)
            │  +─ `─ VALUES (?hasQuestion) {
            │  +─ +─ ( <urn:qado#correctedQuestion> )
            │  +─ +─ ( <urn:qado#hasQuestion> )
            │  +─ +─ ( <urn:qado#questionEng> )
            │  +─ `─ ( <urn:qado#questionText> )
            │  +─ }
            │  +─ }
            │  }
            `─ Filter("en" = Lang(?verbalizedText)) [#1]
               `─ MergeJoin(?answerCandidate) [#1.1K]
                  +─ Scan[SPOC](<urn:fqaac:experiment:qanswer:qald-9-plus-wikidata:test:en>, prov:generated, ?answerCandidate) [#8.1K]
                  `─ MergeJoin(?answerCandidate) [#2.6K]
                     +─ Scan[PSOC](?answerCandidate, fqaac:qaF1Score, ?qaF1score) [#17K]
                     `─ BindJoin(?nl) [#1.3K]
                        +─ MergeJoin(?answerCandidate) [#1.7K]
                        │  +─ Scan[PSOC](?answerCandidate, fqaac:relatedTo, ?qId) [#16K]
                        │  `─ Scan[PSOC](?answerCandidate, fqaac:hasNaturalLanguageRepresentation, ?nl) [#2.1K]
                        `─ MergeJoin(?nl) [#1.7K]
                           +─ Scan[POSC](?nl, fqaac:algorithm, "2022") [#1.7K]
                           `─ Scan[PSOC](?nl, fqaac:text, ?verbalizedText) [#1.7K]
And with LIMIT 5000:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix oa: <http://www.w3.org/ns/openannotation/core/>
prefix qa: <http://www.wdaqua.eu/qa#>
prefix fqaac: <urn:fqaac:>
prefix prov: <http://www.w3.org/ns/prov#>
prefix qado: <urn:qado#>
Slice(offset=0, limit=5000) [#1]
`─ Distinct [#1]
   `─ Projection(?answerCandidate, ?qId, ?verbalizedText, ?isCorrect, ?questionText) [#1]
      `─ Bind(IF(?qaF1score = "1.0"^^xsd:decimal, "True", "False") AS ?isCorrect) sortedBy=?answerCandidate [#1]
         `─ MergeJoin(?answerCandidate) [#1]
            +─ MergeJoin(?answerCandidate) [#8.5K]
            │  +─ Scan[SPOC](<urn:fqaac:experiment:qanswer:qald-9-plus-wikidata:test:en>, prov:generated, ?answerCandidate) [#8.1K]
            │  `─ Scan[PSOC](?answerCandidate, fqaac:qaF1Score, ?qaF1score) [#17K]
            `─ Sort(?answerCandidate) [#1]
               `─ Filter("en" = Lang(?verbalizedText)) [#1]
                  `─ MergeJoin(?nl) [#82]
                     +─ MergeJoin(?nl) [#1.7K]
                     │  +─ Scan[POSC](?nl, fqaac:algorithm, "2022") [#1.7K]
                     │  `─ Scan[PSOC](?nl, fqaac:text, ?verbalizedText) [#1.7K]
                     `─ Sort(?nl) [#1.0K]
                        `─ HashJoin(?qId) [#1.0K]
                           +─ Service <http://user:pass@host:40100/RDFized-datasets/query> {
                           │  +─ Filter("en" = Lang(?questionText))
                           │  +─ `─ {
                           │  +─ `─ Scan[SPO](?qId, ?hasQuestion, ?questionText)
                           │  +─ `─ VALUES (?hasQuestion) {
                           │  +─ +─ ( <urn:qado#correctedQuestion> )
                           │  +─ +─ ( <urn:qado#hasQuestion> )
                           │  +─ +─ ( <urn:qado#questionEng> )
                           │  +─ `─ ( <urn:qado#questionText> )
                           │  +─ }
                           │  +─ }
                           │  }
                           `─ MergeJoin(?answerCandidate) [#1.7K]
                              +─ Scan[PSOC](?answerCandidate, fqaac:relatedTo, ?qId) [#16K]
                              `─ Scan[PSOC](?answerCandidate, fqaac:hasNaturalLanguageRepresentation, ?nl) [#2.1K]
I noticed in the visual representation of the plan that some of the steps are marked in red when using LIMIT 5000:
In comparison, with LIMIT 500 (the plan is slightly different, though):