Feasibility for Combining Multiple Virtual Graphs

Hi Team,

Trying to experiment with the interoperability through virtual graph by connecting multiple data sources.

Could able to see that we can combine multiple virtual graphs in the SPARQL query to get the required result.

But, if we need to combine data from multiple data sources (i.e.., 10 or 20) each connected to an different virtual graphs, hope it's not feasible to mention all the virtual graphs (10 or 20 graphs) in the sparql query .

So, is there any way we can create an implicit connected multiple virtual graphs in the backend and expose only a explicit single virtual graph (connected multiple virtual graphs in the backend ) to the consuming application so that there's no chaos in writing SPARQL query by mentioning multiple virtual graphs in a query ?

Could anybody please help on this clarification ?

Hi,

You might be interested in using named graph aliases. You can create, for example, an alias A that would point to VG1, VG2, VG3, and so on. When you query alias A, you would then be issuing the query to all the virtual graphs (and/or other named graphs) that the alias points to. You can read more about this in our documentation:

Cheers,
Noah

1 Like

Hi @noahgorstein ,

Thanks for your input. By following this link,

I Could able to create alias for named graphs for data in the Stardog db , by executing a below command by selecting the stardog DB which has named graph in the dropdown from stardog studio

INSERT DATA {
 graph <tag:stardog:api:graph:aliases> {
  :AlAr <tag:stardog:api:graph:alias> :Album, :Artist .
}
}

But how can we create alias for a virtual graph ?

Tried to issue the same format of insert command for a virtual graph, after enabling the virtual.transparency property in the stardog db.

"sqlds" -> name of virtual graph
But, if i query it, i'm unable to see the results..

Could please advice on procedure to create a proper alias for virtual graphs ?, as i could not able to find anything related to this in the documentation .

Hi,

Please be sure you enabled the graph.alias database configuration option.

Additionally, to create a named graph alias for a virtual graph, see my example below. The name of the virtual graph (employees in my example) should be appended to virtual://:

INSERT DATA {
    GRAPH <tag:stardog:api:graph:aliases> {
        :myAlias <tag:stardog:api:graph:alias> <virtual://employees> .
    }
}

Let us know if that helps.

Best,
Noah

thanks @noahgorstein .

With your input i'm able to create a alias for Virtual graphs.

But facing a different issue,
(VG -> virtual graph)
If i create an alias1 for a VG1 , i'm able to issue a query and get the response.
if i create an alias 2 for a VG1 and VG2 , i'm not able to get response for the same query that i issued against with alias 1 .

Below is the scenario:

There are 2 virtual graphs, one connected with mongodb and the other with mysql db

VG1 - "moviesds"
VG2 - "sqlds" ( default auto generated mapping )

Creating alias as follows,

INSERT DATA {
    GRAPH <tag:stardog:api:graph:aliases> {
        :alias2 <tag:stardog:api:graph:alias> <virtual://sqlds>, <virtual://moviesds> .
    }
}
INSERT DATA {
    GRAPH <tag:stardog:api:graph:aliases> {
        :alias1 <tag:stardog:api:graph:alias> <virtual://moviesds> .
    }
}

for the below query, the response i could see the data of both the virtual graphs

Whereas for alias1 query there's only one virtual graph's data that's completely normal.

Now changing the query as follows

select * from :alias1{
    ?s a :Movie ;
       :genre ?name
}

Able to get perfect response for alias 1 which is associated only to one virtual graph as follows :

The below same query for alias2 should also return the same, as both virtual graphs are aliased to "alias2".

select * from :alias2{
    ?s a :Movie ;
       :genre ?name
}

but getting no values in the response as below, when querying the alias which is linked to 2 virtual graphs

References :
Mapping:

Attached mapping file and data file

mongomap.ttl (2.9 KB)
sqlmap.ttl (731 Bytes)
mongo_data.json (1.4 KB)
image

Requesting for advice if i'm missing or need to correct anything, it would be a great help.

Hi Muthu,

You will need to enable the virtual.transparency option if your Stardog version is before 7.7.0. There was a minor change documented in 7.7.0 Release (2021-07-07).

Otherwise, please click "Show Plan" and choose the text format and share the result here.

Jess

Thanks @jess. i'm using version 7.7.1 .
Query1:

select * from :alias1{
    ?s a :Movie ;
       :genre ?name
}

Query Plan 1:

From <virtual://moviesds>
Projection(?s, ?name) [#3]
`─ Slice(offset=0, limit=1000) [#3]
   `─ Projection(?s, ?name) [#3]
      `─ Projection(?s, ?name) [#3]
         `─ ServiceJoin [#3]
            +─ VirtualGraphMongoDB<virtual://moviesds> [#2] {
            │  +─ RelNode=
            │  +─    LogicalFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
            │  +─      UnwoundRel(coll=[movies], unwind=[[$genre]], projection=[movieId=$_id, genre=$genre])
            │  +─ Query=
            │  +─    {$unwind: "$genre"},
            │  +─    { $match : {$and : [ {"_id" : {$exists : true, $ne : null}}, {"genre" : {$exists : true, $ne : null}} ]} }
            │  +─ Vars=
            │  +─    ?s <- TEMPLATE({movieId/0})
            │  +─    ?name <- COLUMN($1)^^xsd:string
            │  +─    ?var100 <- COLUMN($0)^^xsd:string
            │  }
            `─ VirtualGraphMongoDB<virtual://moviesds> [#1] {
               +─ RelNode=
               +─    LogicalProject(movieId=[$0])
               +─      LogicalFilter(condition=[IS NOT NULL($0)])
               +─        UnwoundRel(coll=[movies], key=[[$_id]], projection=[movieId=$_id, boxOffice=$boxOffice, datePublished=$datePublished, description=$description, name=$name])
               +─ Query=
               +─    { $match : {"_id" : {$exists : true, $ne : null}} },
               +─    {$project: {movieId: '$_id'} }
               +─ Vars=
               +─    ?var100 <- COLUMN($0)^^xsd:string
               }

Query 2: (empty result)

select * from :alias2{
    ?s a :Movie ;
       :genre ?name
}

Query Plan 2:

From <virtual://moviesds>
From <virtual://sqlds>
Slice(offset=0, limit=1000) [#1]
`─ Projection(?s, ?name) [#1]
   `─ ServiceJoin [#1]
      +─ VirtualGraphMongoDB<virtual://moviesds> [#1] {
      │  +─ RelNode=
      │  +─    LogicalProject(movieId=[$0])
      │  +─      LogicalFilter(condition=[IS NOT NULL($0)])
      │  +─        UnwoundRel(coll=[movies], key=[[$_id]], projection=[movieId=$_id, boxOffice=$boxOffice, datePublished=$datePublished, description=$description, name=$name])
      │  +─ Query=
      │  +─    { $match : {"_id" : {$exists : true, $ne : null}} },
      │  +─    {$project: {movieId: '$_id'} }
      │  +─ Vars=
      │  +─    ?s <- TEMPLATE({movieId/0})
      │  }
      `─ VirtualGraphMongoDB<virtual://moviesds> [#2] {
         +─ RelNode=
         +─    LogicalFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
         +─      UnwoundRel(coll=[movies], unwind=[[$genre]], projection=[movieId=$_id, genre=$genre])
         +─ Query=
         +─    {$unwind: "$genre"},
         +─    { $match : {$and : [ {"_id" : {$exists : true, $ne : null}}, {"genre" : {$exists : true, $ne : null}} ]} }
         +─ Vars=
         +─    ?s <- TEMPLATE({movieId/0})
         +─    ?name <- COLUMN($1)^^xsd:string
         }

Any help would be greatful :slight_smile:

Hi @jess @noahgorstein ,

Is there any suggestions for this ?

Thanks.

Hi,

I think you're hitting a bug with MongoDB in one of the join implementations. Can you try this query, and share the updated query plan?

select * from :alias2
{ #pragma join.service off
    ?s a :Movie ;
       :genre ?name
}

Jess

Thanks @jess

The query that you gave has given me a valid output.

Could you please advise/brief how the addition of some comments `#pragma join.service off' has given a required answer ? Is there any documentation referring to these kind of stuffs ?

Below is the query plan .

From <virtual://moviesds>
From <virtual://sqlds>
Slice(offset=0, limit=1000) [#3]
`─ Projection(?s, ?name) [#3]
   `─ HashJoin(?s) [#3]
      +─ VirtualGraphMongoDB<virtual://moviesds> [#2] {
      │  +─ RelNode=
      │  +─    LogicalFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
      │  +─      UnwoundRel(coll=[movies], unwind=[[$genre]], projection=[movieId=$_id, genre=$genre])
      │  +─ Query=
      │  +─    {$unwind: "$genre"},
      │  +─    { $match : {$and : [ {"_id" : {$exists : true, $ne : null}}, {"genre" : {$exists : true, $ne : null}} ]} }
      │  +─ Vars=
      │  +─    ?s <- TEMPLATE({movieId/0})
      │  +─    ?name <- COLUMN($1)^^xsd:string
      │  }
      `─ VirtualGraphMongoDB<virtual://moviesds> [#1] {
         +─ RelNode=
         +─    LogicalProject(movieId=[$0])
         +─      LogicalFilter(condition=[IS NOT NULL($0)])
         +─        UnwoundRel(coll=[movies], key=[[$_id]], projection=[movieId=$_id, boxOffice=$boxOffice, datePublished=$datePublished, description=$description, name=$name])
         +─ Query=
         +─    { $match : {"_id" : {$exists : true, $ne : null}} },
         +─    {$project: {movieId: '$_id'} }
         +─ Vars=
         +─    ?s <- TEMPLATE({movieId/0})
         }

Muthu,

The comment is a special comment called a query hint. You can find details in the documentation sections Query Hints and List of Query Hints. The "service" join control hint was recently added and looks to be missing from the documentation. We'll get this updated shortly.

The service join is an efficiency decision made by Stardog to avoid retrieving large amounts of data from remote sources. In this case, the service join fails due to a known issue with MongoDB queries. This is logged as bug [PLAT-448].

Hope this helps.
Jess

1 Like

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.