Scalability issues in Stardog Cloud

I'm having performance issues with queries that yield many results. My database contains 11 million triples in two named graphs. All queries against the larger graph, which contains 7 million triples, take 400 milliseconds or more when limited to 1000 results in Stardog Studio, and several seconds without the limit in my application.

I created the database with these settings:

search.enabled=true
spatial.enabled=true
query.all.graphs=true
search.wildcard.search.enabled=true
preserve.bnode.ids=true
search.default.query.operator=AND
search.index.properties.included=https://data.spraksamlingane.no/stadnamn/archive/bsn/navn,https://data.spraksamlingane.no/stadnamn/archive/bsn/navnform,https://data.spraksamlingane.no/stadnamn/archive/bsn/oppslord,https://data.spraksamlingane.no/stadnamn/archive/bsn/pf_navn,https://data.spraksamlingane.no/stadnamn/archive/hord/alternatiForm,https://data.spraksamlingane.no/stadnamn/archive/hord/namn,https://data.spraksamlingane.no/stadnamn/archive/hord/normertForm,rdfs:label
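For reference, options like these can also be passed at database creation time via the CLI. A sketch only: the database name is a placeholder, and the exact `-o` syntax may vary by Stardog version (options can alternatively be supplied via a properties file):

```shell
# Hypothetical example: create a database with search/spatial options set.
stardog-admin db create \
  -o search.enabled=true \
  -o spatial.enabled=true \
  -o query.all.graphs=true \
  -n mydb
```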

The graph contains many blank nodes.

My use case is a semantic web portal that queries a SPARQL endpoint and loads all instances in a dataset onto a map. A demo website of the framework I'm using, SampoUI, has achieved acceptable performance using Fuseki instead of Stardog, on a dataset larger than mine:
https://sampo-ui.demo.seco.cs.aalto.fi/en/perspective3/faceted-search/map

Hi Henrik,

Can you share a couple of queries with poor performance? You can use the query profiler (from CLI, your application, or directly in Studio with a larger LIMIT) and share its output with us.

Best,
Pavel

I thought it was an issue with any SPARQL query, even the most basic ones:

SELECT * FROM <http://data.stadnamn.uib.no/stedsnavn/rygh> WHERE { ?s ?p ?o } 

When querying the same data in Fuseki however, I found that Stardog performs better than Fuseki as the limit increases. I suspect the Fuseki server of the demo website I linked to caches the results of the query that returns more than 300 000 coordinates for the cluster map. Is there some way to accomplish this with Stardog?

Fuseki does however perform better for queries that return few results. When adding LIMIT 100 to the query above, it takes 29ms in Fuseki and 390ms in Stardog.

In my application, I would like the following query to perform well when getting 180 000 coordinates:

PREFIX  wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX  dct:  <http://purl.org/dc/terms/>
PREFIX hord: <https://data.spraksamlingane.no/stadnamn/archive/hord/>

SELECT DISTINCT ?id ?lat ?long 
 FROM <https://data.spraksamlingane.no/stadnamn/archive/hord> {
    ?row a hord:Row ;
         ^dct:hasPart ?id .

    ?id dct:hasPart ?coordinates .
    ?coordinates wgs84:lat ?lat ;
                 wgs84:long ?long .
}

With a limit of 200 000 it has taken between 5 and 12 seconds to run the query in Stardog Studio. This is the query plan:

From local
From named local named
Slice(offset=0, limit=200000) [#13K]
`─ Distinct [#13K]
   `─ Projection(?id, ?lat, ?long) [#13K]
      `─ MergeJoin(?row) [#13K]
         +─ Scan[POS](?row, <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, hord:Row){<https://data.spraksamlingane.no/stadnamn/archive/hord>} [#110K]
         `─ Sort(?row) [#57K]
            `─ MergeJoin(?id) [#57K]
               +─ Scan[PSO](?id, dct:hasPart, ?row){<https://data.spraksamlingane.no/stadnamn/archive/hord>} [#290K]
               `─ Sort(?id) [#38K]
                  `─ MergeJoin(?coordinates) [#38K]
                     +─ Scan[POS](?id, dct:hasPart, ?coordinates){<https://data.spraksamlingane.no/stadnamn/archive/hord>} [#290K]
                     `─ MergeJoin(?coordinates) [#89K]
                        +─ Scan[PSO](?coordinates, wgs84:long, ?long){<https://data.spraksamlingane.no/stadnamn/archive/hord>} [#150K]
                        `─ Scan[PSO](?coordinates, wgs84:lat, ?lat){<https://data.spraksamlingane.no/stadnamn/archive/hord>} [#150K]
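For loading results like these into the application, a minimal sketch of extracting coordinates from the standard SPARQL 1.1 JSON results format (what an endpoint returns for `Accept: application/sparql-results+json`; the sample bindings below are made up):

```python
def coordinates_from_results(results):
    """Extract (id, lat, long) tuples from a SPARQL 1.1 JSON results document."""
    out = []
    for binding in results["results"]["bindings"]:
        out.append((
            binding["id"]["value"],
            float(binding["lat"]["value"]),
            float(binding["long"]["value"]),
        ))
    return out

# Made-up sample in the standard results format, for illustration only.
sample = {
    "head": {"vars": ["id", "lat", "long"]},
    "results": {"bindings": [
        {"id": {"type": "uri", "value": "https://example.org/place/1"},
         "lat": {"type": "literal", "value": "60.39"},
         "long": {"type": "literal", "value": "5.32"}},
    ]},
}
```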

When querying the same data in Fuseki however, I found that Stardog performs better than Fuseki as the limit increases. I suspect the Fuseki server of the demo website I linked to caches the results of the query that returns more than 300 000 coordinates for the cluster map. Is there some way to accomplish this with Stardog?

No, Stardog doesn't provide any functionality for caching query results. I imagine it can be set up externally with something like Memcached.
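As a rough illustration, an external result cache can be as simple as an in-process TTL cache keyed by query text, sitting in front of whatever client executes the query. This is only a sketch; a shared setup would use Memcached or Redis instead, and `run_query` is a placeholder for your own client call:

```python
import hashlib
import time

class TTLCache:
    """Tiny in-process result cache keyed by query text."""
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def _key(self, query):
        return hashlib.sha256(query.encode("utf-8")).hexdigest()

    def get(self, query):
        entry = self._store.get(self._key(query))
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:  # stale entry: drop and miss
            del self._store[self._key(query)]
            return None
        return value

    def put(self, query, value):
        self._store[self._key(query)] = (time.monotonic() + self.ttl, value)

def cached_select(cache, run_query, query):
    """Return cached results if still fresh; otherwise execute and cache."""
    results = cache.get(query)
    if results is None:
        results = run_query(query)
        cache.put(query, results)
    return results
```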

Fuseki does however perform better for queries that return few results. When adding LIMIT 100 to the query above, it takes 29ms in Fuseki and 390ms in Stardog.

390ms definitely sounds like a lot for reading 100 triples. How exactly did you measure it? Did you warm up the system or average over multiple runs? Did you measure on the client side (e.g. Studio) or use the profiler to measure on the server side?
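For a fairer client-side measurement, something along these lines helps: discard a few warm-up runs, then report the median and spread over repeated executions. A sketch only; `run_query` is a placeholder for whatever executes the query against the endpoint:

```python
import statistics
import time

def measure(run_query, warmup=3, runs=10):
    """Time a zero-arg callable: warm up first, then collect per-run latencies."""
    for _ in range(warmup):
        run_query()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append((time.perf_counter() - start) * 1000.0)  # milliseconds
    return {
        "median_ms": statistics.median(timings),
        "min_ms": min(timings),
        "max_ms": max(timings),
    }
```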

With a limit of 200 000 it has taken between 5 and 12 seconds to run the query in Stardog Studio.

First of all, 5 to 12 seconds is a very large variance; we need to establish its source. I suggest running this query multiple times with query explain --profile (or the built-in Studio profiler) and sharing all the outputs with us. You can also access the profiler through the Java API from your application. Then we can tell what is happening on the server side. At this point, there's not enough evidence that this issue and the 390ms for the simple query have much in common.

Best,
Pavel

I didn't do any systematic benchmarking; I only looked at the timing shown next to the number of results in Stardog Studio for a few runs.

I've now looped this command (in Windows Subsystem for Linux), and I can't reproduce the variation I saw yesterday:

stardog query explain --profile -u $STARDOG_USERNAME -p $STARDOG_AUTH https://sd-33591800.stardog.cloud:5820/archive "PREFIX  wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#> PREFIX  dct:  <http://purl.org/dc/terms/> PREFIX hord: <https://data.spraksamlingane.no/stadnamn/archive/hord/> SELECT DISTINCT ?id ?lat ?long FROM <https://data.spraksamlingane.no/stadnamn/archive/hord> { ?row a hord:Row ; ^dct:hasPart ?id . ?id dct:hasPart ?coordinates . ?coordinates wgs84:lat ?lat ; wgs84:long ?long . }" >> explain.txt

Output:
explain.txt (48.9 KB)

Variation aside, are these execution times longer than normal?

Well, "normal" depends on the environment, particularly hardware. I don't see any particular problems in the query plan, other than ~25% of execution time being spent sorting data (which is a CPU-bound activity).

For now let's get back to the time variance on the client side. Do I understand correctly that when you use the CLI and the profiler, the server-side execution time is consistently around 2.5s but when you measure in Studio, i.e. on the client, it varies between 5 and 12s?

Cheers,
Pavel

In Stardog Studio it is now consistently around 5s. I had a few execution times above 10s yesterday before it dropped to 5s. Unfortunately, I didn't look at the query profiler output when that happened.

Hi Henrik,

If you run a traceroute from your current location to Stardog Cloud, I believe you'll find either a few significantly slow network hops along the way or a very long series of hops, since Stardog Cloud is currently hosted on US West cloud servers. We expect to offer something more local for our EU friends in the not too distant future.

If you have demand for a potential enterprise deal, enterprise prospects may be approved for a trial license that would let you run the Stardog server locally. Otherwise, Stardog Cloud typically provides a good customer journey from learning on Free, to prototyping on Essentials, to larger enterprise environments later on. Note that the front-end tools you are using in Stardog Cloud are all single-page applications that run in your browser, so that part already runs locally for you rather than around the world.

A very low-cost, short-term solution may be to try a VPN client on your machine and see whether any of the VPNs offer a better series of hops.

Thanks,
Al