Federated recursive query

Hi,

I am trying to run the following query for combining data from Stardog with data from another SPARQL Endpoint:

Prefix ex:    <https://example.org/> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
 
 SELECT * WHERE  {
             ?subj ex:category "StatusCat";
                 rdfs:label ?status.
 
 SERVICE <http://endpoint1:3030/sparql> {		
        	?uri rdfs:label ?status.
       }
 }

Basically, this SPARQL endpoint is built on top of a plain JSON Web API, and the JSON response is converted to RDF triples on the fly. The service takes the value of ?status as its parameter, so the query has to be executed iteratively: for each value of ?status, a separate call should be made to <http://endpoint1:3030/sparql>.

When I run directly:

 Prefix ex: <https://example.org/> 
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  
  SELECT * WHERE  {
        SERVICE <http://endpoint1:3030/sparql> {		
         	?uri rdfs:label "status1".
         }
  }

it works as expected, but this is only for one value.

I also tried the following variants, without success:

Prefix ex: <https://example.org/> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT * WHERE  {
        SERVICE <http://endpoint1:3030/sparql> {		
        	?uri rdfs:label ?status.
             {
                 select  ?status where{ 
                   ?subj ex:category "StatusCat";
                    rdfs:label ?status.
                 }
             }
        } 
 }

and

Prefix ex: <https://example.org/> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT * WHERE  {

      ?subj ex:category "StatusCat".

       SERVICE <http://endpoint1:3030/sparql> {		
        	?uri rdfs:label ?status.

             {
                   ?subj rdfs:label ?status.
             }
        }
     
 }

The issue seems to be the same or similar to the one posted here: https://community.stardog.com/t/federated-query-with-inline-data-returns-empty-result/487/2

Maybe I am doing something wrong.

Best,
Lav

Hey Lav,
Are you implementing Stardog's Java-based Service API? There's a method on the query called getRequiredInputBindings which lets you specify which input bindings are required. In this case you should return the ?status variable.
Jess

Hi Jess,

thanks for your reply.

No, I am using another solution, which uses RDF to describe the Web API service and maps its required parameter(s) directly to the desired RDF properties, in this case "rdfs:label". So whatever literal value appears as the object of rdfs:label is passed to the Web API service, which returns a JSON response. This response is then converted to triples using a SPARQL CONSTRUCT query, which lets me define a customized RDF graph the way I need it.

Lav,

We do this type of optimization in certain situations, but we would send a batch of ?status bindings to minimize the number of queries. The batch would be sent as a VALUES list, which I'm guessing might not play well with your SPARQL interpreter on the remote end.

Jess

Jess,

I already tried with a VALUES list, but unfortunately (as you also said) that doesn't work for this scenario, since the Web API accepts only one value at a time. It would be possible to write some code on top of the SPARQL interpreter, but that wouldn't be the clean solution we are looking for.

Anyway, could you post the optimization you mentioned that uses a VALUES list? I would like to compare it with the way I am doing it; maybe there is something that can help temporarily.

There's no straightforward way to force this at the moment. It's possible we'll add a hint for this in the not too distant future. Given a query that includes a SERVICE pattern and another arbitrary solution-producing pattern, we would implement it as follows.

Query:

{
  ?x a :LocalThing
  SERVICE <...> {
    ?x :someAttribute ?y
  }
}

The naive approach here would send select ?x ?y where { ?x :someAttribute ?y } to the remote endpoint and join the results with the solutions produced by the local pattern.

When the above-mentioned optimization is applied, we send bindings of ?x to the remote endpoint to avoid returning all possible bindings of ?x. If the local database contains :X1 a :LocalThing and :X2 a :LocalThing, we might send the following query:

select ?x ?y {
  ?x :someAttribute ?y
  VALUES (?x) { (:X1) (:X2) }
}

If there were only one binding of ?x, we could potentially replace instances of the variable with the constant. It's conceivable that the hint would allow specifying the batch size, which could be set to 1 in your case, leading to the behavior you need.
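
To make the batching idea concrete, here is a rough Python sketch of how such remote queries could be built on the client side. This is not Stardog's implementation: the function name `batched_values_queries` and its parameters are made up for illustration, and the variable substitution is a simple text replacement that assumes the variable does not appear inside a string literal.

```python
import re
from typing import Iterator, List


def batched_values_queries(pattern: str, var: str, bindings: List[str],
                           batch_size: int) -> Iterator[str]:
    """Yield one remote query string per batch of bindings for `var`.

    Hypothetical helper: `pattern` is the body of the SERVICE block,
    `var` is the join variable (e.g. "?x"), and `bindings` are RDF terms
    already serialized as SPARQL (e.g. ":X1" or '"status1"').
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Match the variable only when it is not a prefix of a longer name.
    var_re = re.compile(re.escape(var) + r"(?![A-Za-z0-9_])")
    for start in range(0, len(bindings), batch_size):
        batch = bindings[start:start + batch_size]
        if len(batch) == 1:
            # Single binding: replace the variable with the constant.
            # The caller has to re-attach the binding to the results.
            term = batch[0]
            body = var_re.sub(lambda _m: term, pattern)
            yield "select * { " + body + " }"
        else:
            rows = " ".join(f"({term})" for term in batch)
            yield f"select * {{ {pattern} VALUES ({var}) {{ {rows} }} }}"
```

With a batch size of 1 and the pattern from Lav's query, `batched_values_queries("?uri rdfs:label ?status", "?status", ['"status1"', '"status2"'], 1)` would produce one query per status, each with the literal in place of ?status, which is the shape the remote endpoint already handles.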

Jess

Hi Jess,

thank you for your explanation. I can see that this kind of issue falls a bit outside SPARQL's traditional pattern-matching and binding model and is tricky to implement. In the meantime, I will have to think of a workaround.

Best,
Lav

Can you modify your service to extract the VALUES expression and iterate on the service side? It would be a lot more performant than a network round trip for each iteration.
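
As a rough illustration only, the service-side iteration could look something like the Python sketch below. Everything in it is an assumption: `extract_values` and `answer_per_value` are hypothetical names, `call_api` stands in for whatever invokes your single-value Web API, and the regex handles only the simple one-variable form `VALUES (?var) { (term) (term) }` without parentheses or braces inside literals.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Matches only the one-variable form: VALUES (?var) { (t1) (t2) ... }
VALUES_RE = re.compile(r"VALUES\s*\(\s*(\?\w+)\s*\)\s*\{(.*?)\}",
                       re.IGNORECASE | re.DOTALL)
ROW_RE = re.compile(r"\(\s*(.*?)\s*\)")


def extract_values(query: str) -> Optional[Tuple[str, List[str]]]:
    """Return (variable, terms) from the first VALUES clause, or None."""
    match = VALUES_RE.search(query)
    if match is None:
        return None
    return match.group(1), ROW_RE.findall(match.group(2))


def answer_per_value(query: str,
                     call_api: Callable[[str], List[Dict[str, str]]]
                     ) -> List[Dict[str, str]]:
    """Call the single-value API once per VALUES term and merge the rows."""
    extracted = extract_values(query)
    if extracted is None:
        return []
    var, terms = extracted
    results = []
    for term in terms:
        # `call_api` is a placeholder for the real Web API invocation.
        for row in call_api(term):
            results.append({var: term, **row})
    return results
```

The loop still calls the Web API once per value, but it does so next to the API rather than across the federation boundary, so Stardog only makes one request per batch.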

Yes, this was one idea that I already mentioned here: Federated recursive query - #5 by lavhal. In principle it is possible, but it would not be a clean solution, and it would be tightly coupled to each use case we have.

You can use the example here to get started on your masterpiece.

Thank you, I will have a look.