ICV slow when graph without ICV has 30 million triples - Part II

Hi @stephen

My previous post was automatically closed after 14 days, so I'm continuing here.

I’ve been looking more into why ICV was so slow and I think I’ve discovered something. I had reasoning.consistency.automatic=true in my database. I’ve turned it off, and now ICV is much snappier again.

With reasoning.consistency.automatic=true I could insert data into my database and then run the following query with reasoning:

# Constraint: AxiomConstraint{bruker:Innsynskrav rdfs:subClassOf (bruker:forsendelsesmåte min 1 owl:Thing)}

SELECT DISTINCT *
FROM <http://data.einnsyn.no/innsynskravGraph>
FROM <http://www.arkivverket.no/standarder/noark5/arkivstruktur/ontologyGraph>
FROM <http://data.einnsyn.no/osloKommuneVirksomheterGraph>
FROM <http://data.einnsyn.no/brukereGraph>
FROM <http://data.einnsyn.no/virksomheterGraph>
WHERE {
   ?x0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.einnsyn.no/brukermeta/Innsynskrav> .
   FILTER NOT EXISTS {
      ?x0 <http://data.einnsyn.no/brukermeta/forsendelsesmåte> ?x1 .
   }
}

Which would take around 7 seconds on a test database.

After setting reasoning.consistency.automatic=false and following the same procedure (upload the data, then run the query), the query now takes about 50 ms.
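For anyone who wants to repeat the comparison on their own data, here is a minimal sketch of how the check query above can be rebuilt for any class/property pair and timed. The function names `min_one_violation_query` and `time_call` are my own, not part of Stardog, and the function that actually sends the query to the server is left to you (for example, a SPARQL client call with reasoning enabled).

```python
import time

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def min_one_violation_query(cls_iri, prop_iri, graphs):
    """Build a SELECT listing instances of cls_iri that have no prop_iri value.

    Mirrors the shape of the query shown above for a
    "cls subClassOf (prop min 1 owl:Thing)" constraint. Hypothetical helper.
    """
    from_clauses = "\n".join(f"FROM <{g}>" for g in graphs)
    return (
        "SELECT DISTINCT *\n"
        f"{from_clauses}\n"
        "WHERE {\n"
        f"   ?x0 <{RDF_TYPE}> <{cls_iri}> .\n"
        "   FILTER NOT EXISTS {\n"
        f"      ?x0 <{prop_iri}> ?x1 .\n"
        "   }\n"
        "}"
    )


def time_call(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


query = min_one_violation_query(
    "http://data.einnsyn.no/brukermeta/Innsynskrav",
    "http://data.einnsyn.no/brukermeta/forsendelsesmåte",
    ["http://data.einnsyn.no/innsynskravGraph"],
)
print(query)
```

Passing your own query-running function to `time_call` once with each setting, right after the same data upload, gives the two numbers to compare.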

@stephen, do you know why reasoning.consistency.automatic=true makes that query so slow after inserting data?

Mind you, I had declared most of the classes in my ontology as disjoint as a safeguard.

Hi Håvard,

With reasoning.consistency.automatic enabled, the database runs an automatic consistency check across the entire database on each commit. Having that many disjoints will certainly slow that down.

What you likely want is instead icv.consistency.automatic, which will only do the check as part of ICV, and only over the icv.active.graphs. That should help with performance, especially in the case where you’re inserting large data sets into graphs that aren’t included in icv.active.graphs, though the large number of disjoints could still have some impact on that.

The automatic reasoning consistency has been on all along. It’s only in conjunction with ICV that it’s affected the performance of any read queries.
