Stardog Server 5.3.3 crashed while loading a file larger than the configured MaxDirectMemorySize.
WARN 2018-09-04 05:26:11,700 [Stardog.Executor-3] com.complexible.stardog.dht.impl.PagedDiskHashTable:createPage(1123): Available direct memory is low, will try to reduce memory usage
WARN 2018-09-04 05:26:13,817 [Stardog.Executor-3] com.complexible.common.rdf.rio.RDFStreamProcessor:setException(586): Error during loading /data/Downloads/yago3.1_entire_ttl/yagoSources.ttl: java.lang.OutOfMemoryError: Direct buffer memory
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
then
INFO 2018-09-04 05:26:13,842 [stardog-user-1] com.complexible.stardog.index.Index:stop(326): Parsing triples: 100% complete in 11:37:25 (475.0M triples - 11.4K triples/sec)
INFO 2018-09-04 05:26:13,842 [stardog-user-1] com.complexible.stardog.index.Index:stop(329): Parsing triples finished in 11:37:25.777
ERROR 2018-09-04 05:26:13,856 [stardog-user-1] com.stardog.http.server.undertow.ErrorHandling:writeError(138): Unexpected error on the server
java.lang.NoClassDefFoundError: Could not initialize class io.netty.util.internal.Cleaner0
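For reference, an error like this is usually addressed by raising the JVM's direct memory limit before retrying the load. A minimal sketch, assuming the server picks up its JVM flags from the STARDOG_SERVER_JAVA_ARGS environment variable; the sizes and the database name yago are placeholders, not settings from this thread:

# Sketch: raise heap and direct memory before restarting the server.
# Sizes are illustrative; heap + direct memory + JVM overhead must fit in physical RAM.
export STARDOG_SERVER_JAVA_ARGS="-Xms8g -Xmx8g -XX:MaxDirectMemorySize=16g"

# Restart so the new JVM arguments take effect, then retry the bulk load.
./stardog-admin server stop
./stardog-admin server start
./stardog-admin db create -n yago /data/Downloads/yago3.1_entire_ttl/yagoSources.ttl

On a 32GB machine like the one described below, whatever split is chosen also has to leave room for the OS and file cache.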
Can we get some more details on what you’re trying to do? Can you also let us know how big the file is and how much direct/total memory there is on the machine?
The biggest file (yagoSources.ttl) is 30GB.
The computer has 32GB of RAM plus 16GB of swap. I try not to use swap; it is there to prevent OOM killing.
The primary drive is a 256GB NVMe; the data drive is a 2TB spinning disk.
Alex, just reporting my experience with YAGO and other large KBs:
Yes (to your original question), it's definitely possible. In experimental (= non-production) mode, I at one point loaded both a substantial subset of YAGO and the entirety of Freebase (the last publicly available version) on a machine with 48GB of RAM. Stardog is not an in-memory DB (by default, at least), and "loading" doesn't mean that every bit of deserialized data needs to be in memory at the same time. I wasn't interested in seeing how low you can go on RAM, so I don't know whether it would have worked with 32 or 16GB, but there you have it.
In my case, I found that 48GB was barely enough for loading all that data. Queries, on the other hand, were much less happy (but still perfectly possible once you tweaked the environment parameters Stephen is pointing to). For actual usage, I eventually expanded the RAM on the virtual machine to 92GB and set about 3/4 of that as direct memory.
Data loading was much more efficient after changing the default memory mode parameter to the value for "optimized for updates" (it's in the docs), and that was really helpful in practical terms.
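For concreteness, the memory mode is a server property set in stardog.properties under STARDOG_HOME and read at startup. A minimal sketch of how that might look, with the value names taken from the Stardog 5.x documentation rather than from this thread:

# Sketch: select a memory mode in stardog.properties (read at server startup).
# Documented values in Stardog 5.x: default, bulk_load, read_optimized, write_optimized.
echo "memory.mode = bulk_load" >> "$STARDOG_HOME/stardog.properties"

# Restart the server so the property is picked up.
./stardog-admin server stop
./stardog-admin server start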
Hi Stephen, I use 12GB for heap and 12GB for external (direct) memory… somehow it ends up using 29GB.
I also use bulk_load.
Should I use write_optimized instead? It looks like it loads the first 30% very fast, and then it becomes non-stop memory management.
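As a rough accounting (an assumption about general JVM behavior, not something measured in this thread), 12GB of heap plus 12GB of direct memory does not bound the process footprint: metaspace, thread stacks, the code cache, and GC bookkeeping sit on top of both, which is consistent with seeing roughly 29GB of resident memory:

# Back-of-the-envelope for the observed ~29GB resident size (numbers illustrative):
#   heap (-Xmx)                         12 GB
#   direct (-XX:MaxDirectMemorySize)    12 GB
#   metaspace, threads, code cache, GC  ~3-5 GB
#                                       --------
#   total resident                      ~27-29 GB
# Check the actual resident size of the server process (substitute the real PID):
ps -o rss,command -p <stardog-server-pid>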