I’m profiling our use of ICV because it adds a large overhead (3x) to our data processing. After some digging, I see that the biggest difference between using ICV with reasoning and without it is how much time is spent on the query plan.
With reasoning: com.complexible.stardog.plan.eval.QueryEngine.getExecutablePlan(Query) 47%
Without reasoning: com.complexible.stardog.plan.eval.QueryEngine.getExecutablePlan(Query) 16%
This seems to be caused by createOptimized, which uses 41.9% of the time for ICV with reasoning, vs. 5.7% without reasoning.
Stack trace, sample count, and percentage:
WITH REASONING: com.complexible.stardog.plan.eval.ExecutablePlanFactory.createOptimized(ExecutionContext, Plan): 1,044 samples, 41.961%
WITHOUT REASONING: com.complexible.stardog.plan.eval.ExecutablePlanFactory.createOptimized(ExecutionContext, Plan): 113 samples, 5.762%
Is there anything I can do here?
Cheers,
Håvard
PS: I can send the Java Flight Recorder files if that helps.
This is the step in which reasoning (query rewriting) happens. Stardog tries to reuse query plans even when reasoning is enabled, but that is not always possible. One obvious example is when the schema changes. But there can be more subtle situations, for example, when some predicates (classes or properties) which used to be empty (i.e. there were no data assertions for them in the database) become non-empty, or vice versa. That means some optimizations which were applied when the plan was previously generated can no longer be applied, and the query has to be rewritten again.
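To make the reuse conditions concrete, here is a rough sketch of a plan cache that re-optimizes only when the schema version changes or when the set of empty predicates used by the query changes. This is purely illustrative and is not Stardog's actual code: PlanCache, CachedPlan, get_plan, and the schema_version counter are all hypothetical names, and a string stands in for the optimized plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedPlan:
    plan: str                    # stand-in for an optimized, rewritten plan
    schema_version: int          # schema version when the plan was built
    empty_predicates: frozenset  # query predicates that had no data at that time


class PlanCache:
    """Hypothetical plan cache; an assumption for illustration only."""

    def __init__(self):
        self._plans = {}
        self.optimizations = 0   # counts how often the expensive step ran

    def get_plan(self, query, predicates, schema_version, nonempty):
        # Predicates mentioned by the query that currently have no assertions.
        empty_now = frozenset(p for p in predicates if p not in nonempty)
        cached = self._plans.get(query)
        if (cached is not None
                and cached.schema_version == schema_version
                and cached.empty_predicates == empty_now):
            return cached.plan   # cheap path: the old plan is still valid
        # Expensive path: rewrite and optimize the query again.
        self.optimizations += 1
        plan = f"plan#{self.optimizations} for {query}"
        self._plans[query] = CachedPlan(plan, schema_version, empty_now)
        return plan


cache = PlanCache()
preds = {":Person", ":worksFor"}
cache.get_plan("q1", preds, 1, {":Person"})               # first call: optimize
cache.get_plan("q1", preds, 1, {":Person"})               # unchanged: reuse
cache.get_plan("q1", preds, 1, {":Person", ":worksFor"})  # :worksFor now has data: re-optimize
cache.get_plan("q1", preds, 2, {":Person", ":worksFor"})  # schema changed: re-optimize
print(cache.optimizations)
```

Under this model, a workload whose transactions keep flipping predicates between empty and non-empty would pay the optimization cost on nearly every validation, even though the schema itself never changes.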
It’s also possible that plans aren’t being reused for the wrong reason. Is there any chance you could create the smallest update sequence (2 transactions against the same constraints) that doesn’t update the schema but for which createOptimized takes the same time on both updates? We can then take a look at the obfuscated data.