Option clause very slow

Hi all, I observed the option clauses in the query below makes it very slow.

Any suggestion on how to improve it ?

SELECT ?addition ?character ?number ?number_end ?postal_code ?thoroughfare ?bag_id ?city ?city_id ?country ?country_id ?municipality ?municipality_id ?state ?state_id ?lat ?long

WHERE {
 
            VALUES ?address
      {
          <http://geophy.io/addresses/1129>
         
}
 
     ?address address:number ?number .
          ?address address:thoroughfare ?thoroughfare  .
 ?address wgs84:lat ?lat .
          ?address wgs84:long ?long .
 ?address address:city ?city .
          ?city ontologies:id ?city_id .
   ?city gn:locatedIn   ?municipality .
 ?municipality a geotmp:municipality .
         ?municipality ontologies:id ?municipality_id .
   ?municipality gn:locatedIn ?state .
 ?state a geotmp:state .        
 ?state ontologies:id ?state_id .
   ?state gn:locatedIn ?country .
 ?country a geospatial:country .
  
            ?country ontologies:id ?country_id .
OPTIONAL {?address address:addition ?addition .}
OPTIONAL { ?address address:character ?character .}
 OPTIONAL {           ?address address:number_end ?number_end .}
  OPTIONAL {        ?address address:postal_code ?postal_code .}
    OPTIONAL {                ?address tmp:bag_id ?bag_id .}
       

                 }

Hi, it’d be good to know the following:

  • version of Stardog
  • query plan with and without the optionals
  • how much time it takes with and without the optionals.

Do you mean that adding all optionals make it slow or some optionals?

Thanks,
Pavel

Query plan with OPTIONALS

$STARDOG/bin/stardog query explain coredb "SELECT ?addition ?character ?number ?number_end ?postal_code ?thoroughfare ?bag_id ?city ?city_id ?country ?country_id ?municipality ?municipality_id ?state ?state_id ?lat ?long WHERE { VALUES ?address { http://geophy.io/addresses/1129 } ?address address:number ?number . ?address address:thoroughfare ?thoroughfare . ?address wgs84:lat ?lat . ?address wgs84:long ?long . ?address address:city ?city . ?city ontologies:id ?city_id . ?city gn:locatedIn ?municipality . ?municipality a geotmp:municipality . ?municipality ontologies:id ?municipality_id . ?municipality gn:locatedIn ?state . ?state a geotmp:state . ?state ontologies:id ?state_id . ?state gn:locatedIn ?country . ?country a geospatial:country . ?country ontologies:id ?country_id . OPTIONAL {?address address:addition ?addition .} OPTIONAL { ?address address:character ?character .} OPTIONAL { ?address address:number_end ?number_end .} OPTIONAL { ?address address:postal_code ?postal_code .} OPTIONAL { ?address tmp:bag_id ?bag_id .} }"
Explaining Query:

SELECT ?addition ?character ?number ?number_end ?postal_code ?thoroughfare ?bag_id ?city ?city_id ?country ?country_id ?municipality ?municipality_id ?state ?state_id ?lat ?long WHERE { VALUES ?address { http://geophy.io/addresses/1129 } ?address address:number ?number . ?address address:thoroughfare ?thoroughfare . ?address wgs84:lat ?lat . ?address wgs84:long ?long . ?address address:city ?city . ?city ontologies:id ?city_id . ?city gn:locatedIn ?municipality . ?municipality a geotmp:municipality . ?municipality ontologies:id ?municipality_id . ?municipality gn:locatedIn ?state . ?state a geotmp:state . ?state ontologies:id ?state_id . ?state gn:locatedIn ?country . ?country a geospatial:country . ?country ontologies:id ?country_id . OPTIONAL {?address address:addition ?addition .} OPTIONAL { ?address address:character ?character .} OPTIONAL { ?address address:number_end ?number_end .} OPTIONAL { ?address address:postal_code ?postal_code .} OPTIONAL { ?address tmp:bag_id ?bag_id .} }

The Query Plan:

From all

Projection(?addition, ?character, ?number, ?number_end, ?postal_code, ?thoroughfare, ?bag_id, ?city, ?city_id, ?country, ?country_id, ?municipality, ?municipality_id, ?state, ?state_id, ?lat, ?long) [#9223372036854.8M]
`─ MergeJoinOuter(?address) [#9223372036854.8M]
   +─ MergeJoinOuter(?address) [#9223372036854.8M]
   │  +─ MergeJoinOuter(?address) [#9223372036854.8M]
   │  │  +─ MergeJoinOuter(?address) [#9223372036854.8M]
   │  │  │  +─ MergeJoinOuter(?address) [#9223372036854.8M]
   │  │  │  │  +─ Sort(?address) [#40679.6M]
   │  │  │  │  │  `─ Bind(<http://geophy.io/addresses/1129> AS ?address) [#40679.6M]
   │  │  │  │  │     `─ LoopJoin(_) [#40679.6M]
   │  │  │  │  │        +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://www.w3.org/2003/01/geo/wgs84_pos#long>, ?long) [#1]
   │  │  │  │  │        `─ LoopJoin(_) [#40679.6M]
   │  │  │  │  │           +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://geophy.io/ontologies/address#number>, ?number) [#1]
   │  │  │  │  │           `─ LoopJoin(_) [#40679.6M]
   │  │  │  │  │              +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://geophy.io/ontologies/address#thoroughfare>, ?thoroughfare) [#1]
   │  │  │  │  │              `─ LoopJoin(_) [#40679.6M]
   │  │  │  │  │                 +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://www.w3.org/2003/01/geo/wgs84_pos#lat>, ?lat) [#1]
   │  │  │  │  │                 `─ HashJoin(?state) [#40679.6M]
   │  │  │  │  │                    +─ MergeJoin(?country) [#2.0M]
   │  │  │  │  │                    │  +─ Sort(?country) [#972]
   │  │  │  │  │                    │  │  `─ NaryJoin(?state) [#972]
   │  │  │  │  │                    │  │     +─ Scan[POSC](?state, rdf:type, <http://geophy.io/ontologies/geotmp#state>) [#972]
   │  │  │  │  │                    │  │     +─ Scan[PSOC](?state, <http://geophy.io/ontologies#id>, ?state_id) [#11.1M]
   │  │  │  │  │                    │  │     `─ Scan[PSOC](?state, <http://www.geonames.org/ontology#locatedIn>, ?country) [#27.7M]
   │  │  │  │  │                    │  `─ MergeJoin(?country) [#252]
   │  │  │  │  │                    │     +─ Scan[POSC](?country, rdf:type, <http://geophy.io/ontologies/geospatial#country>) [#252]
   │  │  │  │  │                    │     `─ Scan[PSOC](?country, <http://geophy.io/ontologies#id>, ?country_id) [#11.1M]
   │  │  │  │  │                    `─ MergeJoin(?municipality) [#415]
   │  │  │  │  │                       +─ Sort(?municipality) [#1]
   │  │  │  │  │                       │  `─ MergeJoin(?city) [#1]
   │  │  │  │  │                       │     +─ Scan[PSOC](?city, <http://www.geonames.org/ontology#locatedIn>, ?municipality) [#27.7M]
   │  │  │  │  │                       │     `─ MergeJoin(?city) [#1]
   │  │  │  │  │                       │        +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://geophy.io/ontologies/address#city>, ?city) [#1]
   │  │  │  │  │                       │        `─ Scan[PSOC](?city, <http://geophy.io/ontologies#id>, ?city_id) [#11.1M]
   │  │  │  │  │                       `─ NaryJoin(?municipality) [#19K]
   │  │  │  │  │                          +─ Scan[POSC](?municipality, rdf:type, <http://geophy.io/ontologies/geotmp#municipality>) [#19K]
   │  │  │  │  │                          +─ Scan[PSOC](?municipality, <http://geophy.io/ontologies#id>, ?municipality_id) [#11.1M]
   │  │  │  │  │                          `─ Scan[PSOC](?municipality, <http://www.geonames.org/ontology#locatedIn>, ?state) [#27.7M]
   │  │  │  │  `─ Scan[PSOC](?address, <http://geophy.io/ontologies/address#addition>, ?addition) [#532K]
   │  │  │  `─ Scan[PSOC](?address, <http://geophy.io/ontologies/address#character>, ?character) [#898K]
   │  │  `─ Scan[PSOC](?address, <http://geophy.io/ontologies/address#number_end>, ?number_end) [#74K]
   │  `─ Scan[PSOC](?address, <http://geophy.io/ontologies/address#postal_code>, ?postal_code) [#5.4M]
   `─ Scan[PSOC](?address, <http://geophy.io/ontologies/tmp#bag_id>, ?bag_id) [#5.4M]

Query plan without OPTIONALS:

samur@samur-XPS-15-9550:~/development/coredb-ontologies$ $STARDOG/bin/stardog query explain coredb "SELECT ?addition ?character ?number ?number_end ?postal_code ?thoroughfare ?bag_id ?city ?city_id ?country ?country_id ?municipality ?municipality_id ?state ?state_id ?lat ?long WHERE { VALUES ?address { http://geophy.io/addresses/1129 } ?address address:number ?number . ?address address:thoroughfare ?thoroughfare . ?address wgs84:lat ?lat . ?address wgs84:long ?long . ?address address:city ?city . ?city ontologies:id ?city_id . ?city gn:locatedIn ?municipality . ?municipality a geotmp:municipality . ?municipality ontologies:id ?municipality_id . ?municipality gn:locatedIn ?state . ?state a geotmp:state . ?state ontologies:id ?state_id . ?state gn:locatedIn ?country . ?country a geospatial:country . ?country ontologies:id ?country_id . }"
Explaining Query:

SELECT ?addition ?character ?number ?number_end ?postal_code ?thoroughfare ?bag_id ?city ?city_id ?country ?country_id ?municipality ?municipality_id ?state ?state_id ?lat ?long WHERE { VALUES ?address { http://geophy.io/addresses/1129 } ?address address:number ?number . ?address address:thoroughfare ?thoroughfare . ?address wgs84:lat ?lat . ?address wgs84:long ?long . ?address address:city ?city . ?city ontologies:id ?city_id . ?city gn:locatedIn ?municipality . ?municipality a geotmp:municipality . ?municipality ontologies:id ?municipality_id . ?municipality gn:locatedIn ?state . ?state a geotmp:state . ?state ontologies:id ?state_id . ?state gn:locatedIn ?country . ?country a geospatial:country . ?country ontologies:id ?country_id . }

The Query Plan:

From all

Projection(?addition, ?character, ?number, ?number_end, ?postal_code, ?thoroughfare, ?bag_id, ?city, ?city_id, ?country, ?country_id, ?municipality, ?municipality_id, ?state, ?state_id, ?lat, ?long) [#68.1M]
`─ Bind(<http://geophy.io/addresses/1129> AS ?address) [#68.1M]
   `─ LoopJoin(_) [#68.1M]
      +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://www.w3.org/2003/01/geo/wgs84_pos#long>, ?long) [#1]
      `─ LoopJoin(_) [#68.1M]
         +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://geophy.io/ontologies/address#number>, ?number) [#1]
         `─ LoopJoin(_) [#68.1M]
            +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://geophy.io/ontologies/address#thoroughfare>, ?thoroughfare) [#1]
            `─ LoopJoin(_) [#68.1M]
               +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://www.w3.org/2003/01/geo/wgs84_pos#lat>, ?lat) [#1]
               `─ HashJoin(?state) [#68.1M]
                  +─ MergeJoin(?country) [#2.0M]
                  │  +─ Sort(?country) [#972]
                  │  │  `─ NaryJoin(?state) [#972]
                  │  │     +─ Scan[POSC](?state, rdf:type, <http://geophy.io/ontologies/geotmp#state>) [#972]
                  │  │     +─ Scan[PSOC](?state, <http://geophy.io/ontologies#id>, ?state_id) [#11.1M]
                  │  │     `─ Scan[PSOC](?state, <http://www.geonames.org/ontology#locatedIn>, ?country) [#27.7M]
                  │  `─ MergeJoin(?country) [#252]
                  │     +─ Scan[POSC](?country, rdf:type, <http://geophy.io/ontologies/geospatial#country>) [#252]
                  │     `─ Scan[PSOC](?country, <http://geophy.io/ontologies#id>, ?country_id) [#11.1M]
                  `─ MergeJoin(?municipality) [#1]
                     +─ Sort(?municipality) [#1]
                     │  `─ MergeJoin(?city) [#1]
                     │     +─ Scan[PSOC](?city, <http://www.geonames.org/ontology#locatedIn>, ?municipality) [#27.7M]
                     │     `─ MergeJoin(?city) [#1]
                     │        +─ Scan[SPOC](<http://geophy.io/addresses/1129>, <http://geophy.io/ontologies/address#city>, ?city) [#1]
                     │        `─ Scan[PSOC](?city, <http://geophy.io/ontologies#id>, ?city_id) [#11.1M]
                     `─ NaryJoin(?municipality) [#19K]
                        +─ Scan[POSC](?municipality, rdf:type, <http://geophy.io/ontologies/geotmp#municipality>) [#19K]
                        +─ Scan[PSOC](?municipality, <http://geophy.io/ontologies#id>, ?municipality_id) [#11.1M]
                        `─ Scan[PSOC](?municipality, <http://www.geonames.org/ontology#locatedIn>, ?state) [#27.7M]

You might be interested in these three blog posts

Hi Samur,

Thanks for the plans. I have one guess what might cause the slowness but it’s hard to tell without data or further information. Could you answer the remaining questions and also tell how many results the query without the optionals return?

Also, we’d really appreciate if you send plans in an attached text file next time, that’d preserve the formatting.

Cheers,
Pavel

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.