We are experiencing performance problems with Stardog requests (about 500 000 ms minimum to get an answer). We followed the Debian Based Systems installation described in the Stardog documentation and have a stardog service installed in our Ubuntu VM.
Given that only the Stardog server is installed in this VM, with 8G of JVM heap memory and 20G of direct memory for Java, is it normal to see 1.9G of memory in use when nothing is running and 4.1G while the query is in progress?
Profiling results:
Query executed in 430029 ms and returned 17334 result(s)
Total used memory: 9.4M
Pre-execution time: 16 ms (0.0%)
Post-processing time: 13 ms (0.0%)
According to the provided profile, the high query execution time is not related to the memory settings. Instead, the query plan optimizer seems to select a sub-optimal operator to retrieve data from the other database (db://johndoe_DICO). You can see in the profile that the ServiceJoin operator accounts for the largest share of the total execution time:
ServiceJoin [#5.0K], results: 101K, wall time: 428910 ms (99.7%)
This can occur due to stale statistics or cardinality estimation errors. Therefore, as a first step, I suggest running db optimize to update the statistics. If this does not help, you can use the following query hint to force the optimizer not to use a ServiceJoin:
#pragma join.service off
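As a rough sketch of where the hint could go: Stardog query hints are written as SPARQL comments, and the example below assumes the hint is placed inside the WHERE group that contains the SERVICE call. The ex: prefix, the predicates, and the variable names are hypothetical placeholders, not taken from the actual query; only the hint and the db://johndoe_DICO address come from this thread.

```sparql
# Hypothetical prefix and predicates, for illustration only
PREFIX ex: <http://example.org/>

SELECT ?item ?label
WHERE {
  # Hint from the answer above; its placement inside this group is an assumption
  #pragma join.service off

  ?item ex:hasTerm ?term .

  # The other database, addressed the same way as in the profile
  SERVICE <db://johndoe_DICO> {
    ?term ex:label ?label .
  }
}
```

Profiling the query again after adding the hint should show whether the ServiceJoin operator has disappeared from the plan.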
According to your profile, you seem to be using an older version of Stardog. Updating the version can also be helpful as there have recently been updates that should improve the query plans for this type of query (e.g., #PLAT-2650).
My version of Stardog Server : 8.1.1
I optimized the two databases used by the query, then added #pragma join.service off to the query and ran the profiler. The profiler returns results very quickly, but running the query itself is still slow.
Profiler time: 678 ms. Query time: 468 737 ms.
Profiling results:
Query executed in 678 ms and returned 17334 result(s)
Total used memory: 11M
Pre-execution time: 12 ms (1.8%)
Post-processing time: 12 ms (1.8%)
The profiling results that you shared look good. However, it is unexpected that running the query (not profiling) still yields the high execution times. To rule out that this is caused by the query plan cache, you can take the DBs offline and then bring them online again, which clears the plan cache. Alternatively, you can add #pragma plan.cache off to the query to make sure that no cached plans are reused.
Regardless, I suggest updating to the latest release (v8.2.1). As previously mentioned, the newer version includes improvements when querying federations (e.g., other databases). This should alleviate the problems without requiring the query hint.