Can Stardog load files larger than the DirectMemorySize limit? How?

Stardog Server 5.3.3 crashed while loading a file larger than the MaxDirectMemorySize limit.

WARN 2018-09-04 05:26:11,700 [Stardog.Executor-3] com.complexible.stardog.dht.impl.PagedDiskHashTable:createPage(1123): Available direct memory is low, will try to reduce memory usage
WARN 2018-09-04 05:26:13,817 [Stardog.Executor-3] com.complexible.common.rdf.rio.RDFStreamProcessor:setException(586): Error during loading /data/Downloads/yago3.1_entire_ttl/yagoSources.ttl: java.lang.OutOfMemoryError: Direct buffer memory
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory

followed by:

: Parsing triples: 100% complete in 11:37:25 (475.0M triples - 11.4K triples/sec)
INFO 2018-09-04 05:26:13,842 [stardog-user-1] com.complexible.stardog.index.Index:stop(326):
INFO 2018-09-04 05:26:13,842 [stardog-user-1] com.complexible.stardog.index.Index:stop(329): Parsing triples finished in 11:37:25.777
ERROR 2018-09-04 05:26:13,856 [stardog-user-1] com.stardog.http.server.undertow.ErrorHandling:writeError(138): Unexpected error on the server
java.lang.NoClassDefFoundError: Could not initialize class io.netty.util.internal.Cleaner0

Hi Alex,

Can we get some more details on what you’re trying to do? Can you also let us know how big the file is and how much direct/total memory there is on the machine?

Hi Stephen,
I am trying to load YAGO.

How big is the yagoSources.ttl file you’re loading? How much memory does your machine have? How much memory is allocated to Stardog?

The biggest file (yagoSources.ttl) is 30 GB.
The computer has 32 GB of RAM, plus 16 GB of swap. I try not to use swap; it is there to prevent the OOM killer from terminating the process.
The primary drive is a 256 GB NVMe SSD; the data drive is a 2 TB spinning disk.

How about the memory (heap/direct) allocated to the Stardog process, either through STARDOG_JAVA_ARGS or STARDOG_SERVER_JAVA_ARGS?
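For reference, those are environment variables the server reads at startup, and heap and direct memory are set through the standard JVM flags inside them. Something like the following (the sizes here are purely illustrative, not a recommendation for your machine):

    # Illustrative sizes only -- tune to your machine.
    # -Xms/-Xmx set the JVM heap; -XX:MaxDirectMemorySize caps off-heap (direct)
    # memory, which is the pool the "Direct buffer memory" OOM above ran out of.
    export STARDOG_SERVER_JAVA_ARGS="-Xms8g -Xmx8g -XX:MaxDirectMemorySize=16g"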

Alex, just reporting my experience with YAGO and other large KBs:

Yes (to your original question), it's definitely possible. In experimental (i.e., non-production) mode, I at one point loaded both a substantial subset of YAGO and the entirety of Freebase (the last publicly available version) on a machine with 48 GB of RAM. Stardog is not an in-memory DB (by default, at least), and "loading" doesn't mean that every bit of deserialized data needs to be in memory at the same time. I wasn't interested in seeing how low you can go on RAM, so I don't know whether it would have worked in 32 or 16 GB, but there you have it.

In my case, I found that 48 GB was barely enough for loading all that data. Queries, on the other hand, were much less happy (but still perfectly workable once I tweaked the memory parameters Stephen is pointing to). For actual usage, I eventually expanded the virtual machine's RAM to 92 GB and set about 3/4 of that aside for direct memory.

Data loading was much more efficient once I changed the default memory mode parameter to the value for "optimized for updates" (it's in the docs), and that was really helpful in practical terms.
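Concretely, that's the memory.mode setting in stardog.properties; a minimal sketch, assuming the Stardog 5.x mode names (check the memory-management section of the docs for the exact values in your version):

    # stardog.properties (server-wide setting, read at server startup)
    # Assumes the Stardog 5.x memory modes: default, read_optimized,
    # write_optimized (i.e. "optimized for updates"), and bulk_load.
    memory.mode=write_optimized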

Hi Stephen, I use 12 GB for heap and 12 GB for direct (off-heap) memory… Somehow it ends up using 29 GB.
I also use bulk_load.
Should I use write_optimized instead? It looks like the first 30% loads very fast, and then it turns into non-stop memory management.

What do you have your swappiness set to? (The kernel parameter, not a Stardog setting.)

The swap file is 16 GB, but I can increase it; there is plenty of space on the NVMe drive.

Not the amount of swap, but the vm.swappiness kernel parameter. You might want to set it to 1.

NVMe is fast, but you still want to avoid swapping if you can help it.
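If you want to change it, on most Linux systems something like this works:

    # Check the current value
    sysctl vm.swappiness
    # Set it for the running kernel
    sudo sysctl -w vm.swappiness=1
    # Persist it across reboots
    echo "vm.swappiness=1" | sudo tee -a /etc/sysctl.conf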
