Interesting performance issue

Has anyone noticed this kind of behaviour:

Stop and start the server, then run a query (I use the command line). I get the result in 7 seconds.
Now I repeat the same query 10 times. The average is 11 seconds, ranging from 10.5 to 12.8.

Strange. Normally the first query is slower, and the later runs are faster due to caching.

Yes, this is indeed unusual. Normally the first run is slower.

Can you share some more details, like the query plan, what other features you might be using (reasoning, search, spatial, etc.), and the Stardog version? Does the query plan remain the same after a server restart?

Thanks,
Pavel

Hi Pavel, all,
Yes. The query is complex, but I can certainly give all the details because the data is public. It comes from NIH articles. The database has 300 000 documents (when I posted the question there were a bit fewer). Most documents have a section listing the article's references to other documents. The complex query is the only one with this behaviour. All query output files are correct and identical.

I'm not saying there's anything wrong, just curious about what happens.

for i in $(seq 1 10); do time bash bin/stardog query myDB "select ?c (count(?c) AS ?total) { ?a <http://purl.org/dc/terms/references> ?c } group by ?c order by ?total limit 100" > /dev/null; done

10.267
19.241
17.633
17.532
19.138
19.416
18.226
16.356
19.275
19.195
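
To compare the first run with the repeats, the timings can be summarized with a small helper. This is only a sketch: `summarize_timings` is a hypothetical function (not part of Stardog), and it assumes one wall-clock time in seconds per line, with the first line being the run made right after the restart.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one timing (seconds) per line from stdin.
# The first non-empty line is treated as the run right after the restart;
# the remaining lines are summarized as mean, min and max.
summarize_timings() {
  awk '
    NF == 0 { next }                       # skip blank lines
    !seen   { first = $1; seen = 1; next } # remember the first run
    {
      sum += $1; n++
      if (n == 1 || $1 < min) min = $1
      if (n == 1 || $1 > max) max = $1
    }
    END {
      if (n == 0) { print "need at least two timings"; exit 1 }
      printf "first=%.3f later_mean=%.3f later_min=%.3f later_max=%.3f\n", first, sum / n, min, max
    }'
}

# The ten timings listed above, in order.
printf '%s\n' 10.267 19.241 17.633 17.532 19.138 19.416 18.226 16.356 19.275 19.195 | summarize_timings
```

Separating the first run from the rest makes the gap easier to see than a single average over all ten runs would.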

It's difficult to say what's going on; some environmental factors may be in play here. If you share the data, we'll be happy to try to reproduce this behaviour locally and provide a detailed answer.

Thanks,
Pavel

Sure! Here is a sample of 100 000 RDF documents that match the query:
http://natto.mooo.com/r1.zip
