java.lang.OutOfMemoryError: Java heap space

Hi,
I loaded 8.2 billion triples into Stardog 4.2.4. When I issued a SPARQL query, such as one listing all the classes in the data, no result was obtained and the following message appeared in stardog.log:
java.lang.OutOfMemoryError: Java heap space
I set the following environment variable:
STARDOG_JAVA_ARGS=-Xms16g -Xmx16g -XX:MaxDirectMemorySize=256g
So, could you tell me what I should do to work around this issue?

The following are the memory options recorded in the log file:
[main] com.complexible.stardog.cli.impl.ServerStart:call(250): Min Heap Size: 16G
[main] com.complexible.stardog.cli.impl.ServerStart:call(251): Max Heap Size: 15G
[main] com.complexible.stardog.cli.impl.ServerStart:call(252): Max Off-Heap : 256G
[main] com.complexible.stardog.cli.impl.ServerStart:call(253): System Memory: 1.2T

Regards.

You may want to consider allocating more heap space.

We’ve had a lot of success with larger heap sizes. For our workload we haven’t seen much off-heap allocation. I would recommend experimenting, perhaps starting with a 64GB heap and 128GB off-heap.

This may no longer be a problem once Stardog 5 comes out.

I can’t wait for the release! Could you tell me when it will be available?

Well, since this document (Home | Stardog Documentation Latest) says that a 16GB JVM heap and 256GB of off-heap memory are recommended for 50 billion triples, I set the above parameters even though our dataset has only 8.2 billion triples, but I have learned that it needs more. This is impractical for us to use.

What method were you using to load the data? For a billion triples or more, the recommended way is to use the db create command with the files on the server side.
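For example, a run might look like this (the database name and file paths below are just placeholders; the files are assumed to already be on the server):

  # MyDB and the paths are hypothetical; the .nt files live on the server
  stardog-admin db create -n MyDB /data/part1.nt /data/part2.nt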

Best,
Evren

I loaded the data as follows:
stardog data add -g <graph> --server-side http://localhost:1111/MyDB <.nt files>
Since there are many named graphs, I cannot load all of the data at once when creating the DB.

You can still use the db create command if you put the named graph IRIs into the files themselves, for example by using the TriG format. It’s likely to be faster than data add.
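For illustration, a TriG file wraps the triples in the named graph they belong to (the prefix, graph IRI, and triple below are hypothetical placeholders):

  # hypothetical prefix, graph IRI, and triple for illustration only
  @prefix ex: <http://example.org/> .

  ex:graph1 {
      ex:subject ex:predicate ex:object .
  }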

Cheers,
Pavel

You can also use the API, which allows you to define the named graph for each file. See the example code here:

Best,
Evren

The problem I have is getting an OutOfMemoryError when I issue a SPARQL query. Would loading the 8.2 billion triple RDF dataset when creating the DB be the solution to this?

Sorry, no, loading and query answering are not related. I mixed up this thread with the other question you were asking about loading.

The very large direct memory recommendations in the documentation are for getting the best loading speed. During query answering, Stardog 4 does not use direct memory very much, so you should increase the heap memory instead. You will see a warning about -XX:+UseCompressedOops when you increase the heap size to more than 32GB, but you can ignore it.
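For example, following the heap/off-heap split suggested earlier in this thread (these values are only a starting point to experiment with, not a definitive recommendation):

  # illustrative values only; tune for your workload
  STARDOG_JAVA_ARGS="-Xms64g -Xmx64g -XX:MaxDirectMemorySize=128g"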

It is also possible that something in your query could be improved for better performance and/or memory usage. If you can share the query along with the query plan printed by the stardog query explain command, we can provide some suggestions.
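For example (assuming a database named MyDB and the query saved in a file named query.rq):

  # database name and query file are hypothetical
  stardog query explain MyDB query.rq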

Best,
Evren

PS: The Stardog 5 beta will be released in a week or so. Stardog 5 uses direct memory heavily for query answering, and it will have different modes for memory usage. We will provide detailed information about that in the documentation.

Thank you for the information.

After modifying the memory settings, the OutOfMemoryError no longer occurs, but I got the following errors when issuing some SPARQL queries, including the one in [Stardog query plan for query 2 · GitHub]:

ERROR 2017-04-24 11:44:04,751 [SPEC-Server-1-12] com.complexible.common.protocols.server.rpc.ServerHandler:exceptionCaught(401): exceptionCaughtServerHandler

ERROR 2017-04-24 11:44:04,755 [StardogServer.WorkerGroup-11] com.complexible.stardog.protocols.http.server.HttpMessageEncoder:createErrorResponse(293): The result encoder received an error message it could not encode, error was:
java.io.IOException: Connection reset by peer

ERROR 2017-04-24 12:40:00,013 [StardogServer.WorkerGroup-7] com.complexible.stardog.protocols.http.server.HttpMessageEncoder:write(171): There was an error writing the HTTP response
org.openrdf.query.QueryEvaluationException: com.complexible.stardog.plan.eval.operator.OperatorException: Query execution cancelled: Execution time exceeded query timeout 300000

Best regards

This means the query execution time exceeded the default timeout value and the server killed it.

This might indicate either a hard query (given the amount of data) or a sub-optimal query plan. If you send us the output of the query explain command for that query, we might be able to tell. If it’s sensitive, feel free to send privately.

You may also disable the timeout (or set it to a higher value) using the query.timeout property in stardog.properties. query.timeout=0m will disable it.
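For example, in stardog.properties (the 30m value is just an illustration):

  # raise the query timeout to 30 minutes
  query.timeout=30m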

Cheers,
Pavel

Thanks, and here is the query plan I got.

Best regards

The problem is the loop joins, which usually indicate a disconnected query (i.e. there are parts of the query that do not share a join condition). Can we see the query?

This is the query.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX faldo: <http://biohackathon.org/resource/faldo#>

SELECT DISTINCT ?parent_label ?label ?begin_location ?end_location ?seq_length ?comment (GROUP_CONCAT(?substitution; SEPARATOR = ", ") AS ?substitutions) ?seq ?feature_identifier
FROM <http://togogenome.org/graph/uniprot>
FROM <http://togogenome.org/graph/tgup>
WHERE {
  {
    SELECT ?gene
    {
      <http://togogenome.org/gene/103690:PCC7120DELTA_RS09085> skos:exactMatch ?gene .
    } ORDER BY ?gene LIMIT 1
  }
  <http://togogenome.org/gene/103690:PCC7120DELTA_RS09085> skos:exactMatch ?gene ;
    rdfs:seeAlso ?id_upid .
  ?id_upid rdfs:seeAlso ?protein .
  ?protein a up:Protein ;
           up:annotation ?annotation .

  ?annotation rdf:type ?type .
  ?type rdfs:label ?label .

  ?type rdfs:subClassOf* ?parent_type .
  ?parent_type rdfs:subClassOf up:Sequence_Annotation ;
               rdfs:label ?parent_label .

  ?annotation up:range ?range .
  OPTIONAL { ?annotation rdfs:comment ?comment . }
  ?range faldo:begin/faldo:position ?begin_location ;
         faldo:end/faldo:position ?end_location .

  ?protein up:sequence ?isoform .
  BIND( REPLACE( STR(?protein), "http://purl.uniprot.org/uniprot/", "") AS ?up_id)
  FILTER( REGEX(STR(?isoform), ?up_id))
  ?isoform rdf:value ?value .

  OPTIONAL {
    ?annotation up:substitution ?substitution . 
    ?isoform rdf:value ?seq .
  }
  OPTIONAL {
    ?isoform rdf:value ?seq_txt .
    BIND (STRLEN(?seq_txt) AS ?seq_length) .
  }
  OPTIONAL {
    ?annotation rdf:type ?type .
    BIND (STR(?annotation) AS ?feature_identifier) .
    FILTER REGEX(STR(?annotation), "http://purl.uniprot.org/annotation")
  }
}
GROUP BY ?parent_label ?label ?begin_location ?end_location ?seq_length ?comment ?seq ?feature_identifier
ORDER BY ?parent_label ?label ?begin_location ?end_location

The main problem is this part:

  OPTIONAL {
    ?annotation up:substitution ?substitution . 
    ?isoform rdf:value ?seq .
  }

Currently Stardog evaluates this pattern separately from the rest, which results in a Cartesian product. This is a known problem and we have an open ticket for it.

In most cases, however, the intention would be to do

  OPTIONAL {
    ?annotation up:substitution ?substitution . 
  }
  OPTIONAL {
    ?isoform rdf:value ?seq .
  }

(possibly in the reverse order). That would eliminate the loop join and make the query faster (I can’t say how much faster without trying, though, since there are other complex parts in the query, e.g. grouping on many variables).

PS. There’s already a ?isoform rdf:value ?seq_txt . in another OPTIONAL in your query, so maybe you can drop ?isoform rdf:value ?seq completely.
PPS. ?annotation rdf:type ?type . appears both in the main pattern and in an OPTIONAL block, which is unnecessary. If you remove it from the OPTIONAL, it will eliminate a potentially expensive join.


Thank you very much for the valuable information.
The performance has now improved.
