Query response time very high

Hi I'm running following query on dataset of count 44,000 (mongodb)

SELECT ?seqId {
GRAPH virtual://aus {
?images :existence "0707" ;
:seqId ?seqId .
}
}
LIMIT 50

Its taking almost 2 minutes to get the response. Is this the usual behavior or is there something i have to do to make it run faster...?

What version of stardog are you running and can share the query plan?

Stardog query explain

I'm using 7.0.1 @zachary.whitley

Plan is
Slice(offset=0, limit=50) [#50]
`─ Projection(?seqId) [#18K]
   `─ Projection(?images, ?seqId) [#18K]
      `─ ServiceJoin [#18K]
         +─ VirtualGraphMongoDB<virtual://austria> [#36212] {
         │  +─ Query=
         │  +─    { $match : {$and : [ {"frameId" : {$exists : true, $ne : null}}, {"seqId" : {$exists : true, $ne : null}} ]} },
         │  +─    {$project: {frameId: '$frameId', seqId: '$seqId'} }
         │  +─ Vars=
         │  +─    ?seqId <- COLUMN($1)^^xsd:string
         │  +─    ?var100 <- COLUMN($0)^^xsd:string
         │  }
         `─ VirtualGraphMongoDB<virtual://austria> [#9053] {
            +─ Query=
            +─    {$unwind: "$existence"},
            +─    { $match : {$and : [ {"existence" : {$eq : "0707"}}, {"frameId" : {$exists : true, $ne : null}} ]} },
            +─    {$project: {frameId: '$frameId'} }
            +─ Vars=
            +─    ?images <- TEMPLATE(http://austria.com/images/{frameId/0})
            +─    ?var100 <- COLUMN($0)^^xsd:string
            }

It looks like it's breaking it into two separate queries and doing a ServiceJoin which probably isn't very performant. I'm not sure why. You might want to include you mapping if you can so you can avoid a round trip when someone more knowledgable about this takes a look at this. You might also want to add some details about your mongodb setup, schema, indexes, etc

Okay.. I have shared mapping file and schema.. let me know if anything else is required

Mapping
    FROM JSON {
      "aus":{
        "_id": "?imageId",
        "frameId": "?frameId",
        "frameNo": "?frameNo",
        "seqId": "?seqId", 
        "existence": ["?existence"],
        "edge": ["?edge"],
        "population": ["?population"],
        "nearEdge": ["?nearEdge"],
        "farEdge": ["?farEdge"],
        "centerDirection": ["?centerDirection"],
        "leftDirection":["?leftDirection"],
        "rightDirection": ["?rightDirection"], 
        "dualTruncation":["?dualTruncation"],
        "leftTruncation": ["?leftTruncation"],
        "rightTruncation": ["?rightTruncation"],
        "entropy": "?entropy",
        "roadEntropy": "?roadEntropy"
      }
    }
    TO {
    ?image a :Images;
            :_id  ?imageId;
            :frameId ?frameId;
            :frameNo "?frameNo"^^xsd:integer;
            :seqId ?seqId;
            :existence ?existence;
            :edge ?edge;
            :population ?population;
            :nearEdge ?nearEdge;
            :farEdge ?farEdge;
            :centerDirection ?centerDirection;
            :leftDirection ?leftDirection;
            :rightDirection ?rightDirection;
            :dualTruncation ?dualTruncation;
            :leftTruncation ?leftTruncation;
            :rightTruncation ?rightTruncation;
            :entropy "?entropy"^^xsd:integer;
            :roadEntropy "?roadEntropy"^^xsd:integer;
        }
    WHERE {
      BIND (template("http://austria.com/images/{frameId}") AS ?image)
    }

Have attached screen shot of schema

Any help on this?
cc @zachary.whitley

Hi Fasal,

This happens because frameId is not known to be a unique field. frameId is the field that's used to create the IRI for ?image, which in turn is the variable we're joining on in the query.

If frameId is not a unique field in the aus collection, then Stardog needs this join to create the cross product of existence and seqId fields per frameId.

If this is not what's intended, you have a couple options, depending on your data, what it means, and how you wish to model it.

If the combination of the _id and frameId fields represent some concept (an image element, say) you could change your model such that each aus document represents an "ImageFrame" with an IRI who's template includes both the _id and frameId fields. See this blog post for a detailed explanation: https://www.stardog.com/blog/mapping-denormalized-data/

If there is a 1 to 1 relationship between frames and documents (between the frameId field and the _id field), then you have a couple choices.

  1. You could use the _id field in place of the frameId field in the image template.
  2. You could indicate to Stardog that frameId is a unique field in aus collection using the unique.key.sets virtual graph property.

In this example, the property would be set to:
unique.key.sets=(aus.[].frameId)

See https://www.stardog.com/docs/#_setting_unique_keys_manually for details.

-Paul

Hi @PaulJackson,

Thanks for the response..
frameId is unique in my dataset but as you suggested I tried following things:

  1. I did set unique.key.sets=(aus.[].frameId) in my stardog studio virtual graph configuration options under other options as property & value keys..
    Output:- I couldn't find any improvement in the response time of the query.

  2. Then i tried using _id in place of frameId field to create IRI
    WHERE { BIND (template("http://austria.com/images/{_id}") AS ?image) }

In this case simple query like

SELECT * {GRAPH <virtual://austria> { ?image a :Images ; :frameId ?frameId ;}}

fetches around 44k records in just a second which is what i'm looking for.. But now when i add one more condition to this query I don't get any output.. This happens when I try to add a condition from any of the array fields (in this query existence is an array object)..

SELECT * {GRAPH <virtual://austria> {?image a :Images ; :frameId ?frameId ; :existence "0707" .}}

I just see header fields as image and frameId without any data.

Observation: In 2nd case queries are working fine if I query only on int and string fields but when I'm including array objects not able to see any result..

  1. Other option you suggested was combination of frameId and _id for IRI, again with this change query is behaving like 2nd case..

Not sure where i'm going wrong!!!

Hi Fasal,

The inclusion of [] in the unique.key.sets property indicates an array.

What was the result of using _id in the subject? Can you share the query plan?

I was wrong about the []. Can you include the query plan for both changes you tried?

I have updated my response above @jess please have a re-look

Hi Fasal,

It sounds like you're describing a case where you query for images where one of the values of the existence field is "0707". You used this query as an example:

SELECT * {GRAPH <virtual://austria> {
?image a :Images ; :frameId ?frameId ; :existence "0707" .}}

Can you share the query plan for this query?

What result do you get if you replace the constant value for existence with a variable, eg:

SELECT * {GRAPH <virtual://austria> {
?image a :Images ; :frameId ?frameId ; :existence ?existence .}}

What values are bound to the ?existence variable?

@jess query plan

   ` Projection(?image, ?frameId) [#9.1K]
    `─ Projection(?image, ?frameId) [#9.1K]
       `─ ServiceJoin [#9.1K]
          +─ VirtualGraphMongoDB<virtual://austria> [#36212] {
          │  +─ Query=
          │  +─    { $match : {$and : [ {"_id" : {$exists : true, $ne : null}}, {"frameId" : {$exists : true, $ne : null}} ]} },
          │  +─    {$project: {_id: '$_id', frameId: '$frameId'} }
          │  +─ Vars=
          │  +─    ?image <- TEMPLATE(http://austria.com/images/{_id/0})
          │  +─    ?frameId <- COLUMN($1)^^xsd:string
          │  +─    ?var100 <- COLUMN($0)^^xsd:string
          │  }
          `─ VirtualGraphMongoDB<virtual://austria> [#9053] {
             +─ Query=
             +─    {$unwind: "$existence"},
             +─    { $match : {$and : [ {"existence" : {$eq : "0707"}}, {"_id" : {$exists : true, $ne : null}} ]} },
             +─    {$project: {_id: '$_id'} }
             +─ Vars=
             +─    ?var100 <- COLUMN($0)^^xsd:string
             } `

for this query
SELECT * {GRAPH virtual://austria {
?image a :Images ; :frameId ?frameId ; :existence ?existence .}}
Nothing gets displayed

Deleted!!!!!!!!!!!!!!!!!!!!!!

Do you get results for this query:

?image :existence ?existence

@jess No results... nothing gets displayed .

query plan follows:

Projection(?image, ?frameId, ?existence) [#72K]
─ Projection(?image, ?frameId, ?existence) [#72K] ─ ServiceJoin [#72K]
+─ VirtualGraphMongoDBvirtual://austria [#54319] {
│ +─ Query=
│ +─ {$unwind: "$existence"},
│ +─ { $match : {$and : [ {"_id" : {$exists : true, $ne : null}}, {"existence" : {$exists : true, $ne : null}} ]} }
│ +─ Vars=
│ +─ ?existence <- COLUMN($1)^^xsd:string
│ +─ ?var100 <- COLUMN($0)^^xsd:string
│ }
`─ VirtualGraphMongoDBvirtual://austria [#36212] {
+─ Query=
+─ { $match : {$and : [ {"_id" : {$exists : true, $ne : null}}, {"frameId" : {$exists : true, $ne : null}} ]} },
+─ {project: {_id: '_id', frameId: '$frameId'} }
+─ Vars=
+─ ?image <- TEMPLATE(http://austria.com/images/{_id/0})
+─ ?frameId <- COLUMN($1)^^xsd:string
+─ ?var100 <- COLUMN($0)^^xsd:string
}

SELECT * {GRAPH <virtual://austria> {
?image :existence ?existence .}}

Please use this exact query. Can you share the query plan?

Oh okay!! yeah i got the results for

SELECT * {GRAPH virtual://austria {
?image :existence ?existence .}}

plan is

Projection(?image, ?existence) [#54K]
`─ VirtualGraphMongoDBvirtual://austria [#54319] {
+─ Query=
+─ {$unwind: "$existence"},
+─ { $match : {$and : [ {"_id" : {$exists : true, $ne : null}}, {"existence" : {$exists : true, $ne : null}} ]} }
+─ Vars=
+─ ?image <- TEMPLATE(http://austria.com/images/{_id/0})
+─ ?existence <- COLUMN($1)^^xsd:string
}

Great, and what about if you include the constant:

SELECT * {GRAPH <virtual://austria> {
?image :existence "0707" .}}

Yes records are displayed following is the query plan

Projection(?image) [#9.1K]
`─ VirtualGraphMongoDBvirtual://austria [#9053] {
+─ Query=
+─ {$unwind: "$existence"},
+─ { $match : {$and : [ {"existence" : {$eq : "0707"}}, {"_id" : {$exists : true, $ne : null}} ]} },
+─ {project: {_id: '_id'} }
+─ Vars=
+─ ?image <- TEMPLATE(http://austria.com/images/{_id/0})
}