We are trying to upload a few RDF files to Stardog, but we get an "RDFParseException" at what I believe is close to the end of the process, and the whole operation is rolled back.
The triple parsing seems to finish successfully, as I get:
INFO 2018-08-08 12:26:39,298 [stardog-user-21] com.complexible.stardog.index.Index:printInternal(314): Parsing triples: 100% complete in 00:53:12 (128.6M triples - 40.3K triples/sec)
INFO 2018-08-08 12:26:39,299 [stardog-user-21] com.complexible.stardog.index.Index:stop(326):
INFO 2018-08-08 12:26:39,299 [stardog-user-21] com.complexible.stardog.index.Index:stop(329): Parsing triples finished in 00:53:12.761
But then immediately after that I get the following error:
ERROR 2018-08-08 12:26:39,299 [stardog-user-21] com.complexible.tx.api.impl.DefaultTransaction:computePrepareResult(451): There was a fatal failure during preparation of 8b0e8a55-231d-4350-92dd-78c2a6436e0c
java.lang.RuntimeException: org.openrdf.rio.RDFParseException: Unexpected end of file
at com.complexible.stardog.index.IndexWriterDataMapImpl.waitForUpdates(IndexWriterDataMapImpl.java:129) ~[stardog-5.3.3.jar:?]
at com.complexible.stardog.index.IndexWriterDataMapImpl.applyChanges(IndexWriterDataMapImpl.java:238) ~[stardog-5.3.3.jar:?]
at com.complexible.stardog.index.IndexWriterImpl.prepare(IndexWriterImpl.java:234) ~[stardog-5.3.3.jar:?]
I am not sure what is causing this issue, especially since it happens after the successful parsing of triples.
I am still waiting to hear whether I can share a file, but I did a little more digging and tried uploading some of the files to Stardog individually. It worked with some of the smaller files, but when I tried the biggest TriG file (3.2 GB), I got a "Not enough memory to allocate buffers for rehashing" error during the 'Computing statistics' stage.
The log is as follows:
INFO 2018-08-16 14:30:51,255 [stardog-user-16] com.complexible.stardog.index.Index:stop(329): Parsing triples finished in 00:02:45.721
INFO 2018-08-16 14:30:57,280 [Stardog.Executor-800] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 40% complete in 00:00:06 (1018.7K triples/sec)
INFO 2018-08-16 14:30:58,300 [Stardog.Executor-910] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 48% complete in 00:00:07 (1045.5K triples/sec)
INFO 2018-08-16 14:30:59,331 [Stardog.Executor-792] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 56% complete in 00:00:08 (1064.0K triples/sec)
INFO 2018-08-16 14:31:00,371 [Stardog.Executor-792] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 64% complete in 00:00:09 (1077.1K triples/sec)
INFO 2018-08-16 14:31:01,503 [Stardog.Executor-792] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 73% complete in 00:00:10 (1092.9K triples/sec)
INFO 2018-08-16 14:31:02,515 [Stardog.Executor-800] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 81% complete in 00:00:11 (1103.7K triples/sec)
INFO 2018-08-16 14:31:03,763 [Stardog.Executor-919] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 88% complete in 00:00:12 (1079.4K triples/sec)
INFO 2018-08-16 14:31:04,949 [Stardog.Executor-919] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 94% complete in 00:00:13 (1053.1K triples/sec)
INFO 2018-08-16 14:31:06,176 [Stardog.Executor-910] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 98% complete in 00:00:14 (1007.7K triples/sec)
INFO 2018-08-16 14:31:07,373 [Stardog.Executor-910] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 100% complete in 00:00:16 (951.9K triples/sec)
INFO 2018-08-16 14:31:11,758 [stardog-user-16] com.complexible.stardog.index.Index:printInternal(314): Indexing triples: 100% complete in 00:00:20 (748.3K triples/sec)
INFO 2018-08-16 14:31:11,758 [stardog-user-16] com.complexible.stardog.index.Index:stop(326):
INFO 2018-08-16 14:31:11,759 [stardog-user-16] com.complexible.stardog.index.Index:stop(329): Indexing triples finished in 00:00:20.503
INFO 2018-08-16 14:32:33,073 [Stardog.Executor-927] com.complexible.stardog.index.Index:printInternal(314): Computing statistics: 20% complete in 00:00:07 (424.3K triples/sec)
INFO 2018-08-16 14:32:48,679 [Stardog.Executor-927] com.complexible.stardog.index.Index:printInternal(314): Computing statistics: 21% complete in 00:00:22 (141.1K triples/sec)
INFO 2018-08-16 14:33:09,374 [Stardog.Executor-924] com.complexible.stardog.index.Index:printInternal(314): Computing statistics: 22% complete in 00:00:43 (77.5K triples/sec)
INFO 2018-08-16 14:33:11,025 [Stardog.Executor-924] com.complexible.stardog.index.Index:printInternal(314): Computing statistics: 23% complete in 00:00:45 (78.1K triples/sec)
INFO 2018-08-16 14:33:57,660 [Stardog.Executor-923] com.complexible.stardog.index.Index:printInternal(314): Computing statistics: 23% complete in 00:01:31 (38.4K triples/sec)
ERROR 2018-08-16 14:34:16,425 [Stardog.Executor-792] com.complexible.stardog.index.statistics.ConcurrentCharacteristicSetStatisticsBuilder:build(121): Error computing the selectivity statistics
com.complexible.stardog.index.IndexException: com.complexible.common.base.Streams$UncheckedException: java.util.concurrent.ExecutionException: com.carrotsearch.hppc.BufferAllocationException: Not enough memory to allocate buffers for rehashing: 1,024 -> 2,048
at com.complexible.stardog.index.statistics.ConcurrentCharacteristicSetStatisticsBuilder.build(ConcurrentCharacteristicSetStatisticsBuilder.java:93) ~[stardog-5.3.3.jar:?]
at com.complexible.stardog.index.disk.statistics.DiskCharacteristicSetsStatisticsBuilder.build(DiskCharacteristicSetsStatisticsBuilder.java:114) ~[stardog-5.3.3.jar:?]
at com.complexible.stardog.index.AbstractIndex.updateStats(AbstractIndex.java:410) ~[stardog-5.3.3.jar:?]
at com.complexible.stardog.index.AbstractIndex.lambda$scheduleStatsUpdate$1(AbstractIndex.java:389) ~[stardog-5.3.3.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.util.concurrent.ExecutionException: com.complexible.common.base.Streams$UncheckedException: java.util.concurrent.ExecutionException: com.carrotsearch.hppc.BufferAllocationException: Not enough memory to allocate buffers for rehashing: 1,024 -> 2,048
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_181]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_181]
at com.complexible.common.util.concurrent.ExecutionGroup$1.executeAndWait(ExecutionGroup.java:78) ~[stardog-utils-common-5.3.3.jar:?]
at com.complexible.stardog.index.statistics.ConcurrentCharacteristicSetStatisticsBuilder.build(ConcurrentCharacteristicSetStatisticsBuilder.java:88) ~[stardog-5.3.3.jar:?]
... 8 more
Suppressed: java.util.concurrent.ExecutionException: com.complexible.stardog.index.IndexException: java.lang.OutOfMemoryError: GC overhead limit exceeded
Yes. As described in the docs, Stardog requires more memory as your data grows, and generally the more memory you can give it, the better. Try setting STARDOG_SERVER_JAVA_ARGS="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=8g" and see if that helps your load succeed.
So I tried higher memory settings as suggested, both STARDOG_SERVER_JAVA_ARGS="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=8g" and STARDOG_SERVER_JAVA_ARGS="-Xms16g -Xmx16g -XX:MaxDirectMemorySize=32g", and now it just spends a long time on the 'Computing statistics' step while initializing Stardog. What might be the problem here? The log is as follows:
INFO 2018-08-21 14:03:00,149 [Stardog-Linux-Memory-Monitor] com.complexible.stardog.api.LinuxMemoryMonitor:run(118): Memory monitor thread shutdown
Waiting for running tasks to complete.....
INFO 2018-08-21 14:04:33,701 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(176): Initializing R2RML registry
INFO 2018-08-21 14:04:33,907 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(185): Loaded R2RML registry with 0 sources
INFO 2018-08-21 14:04:36,429 [main] com.complexible.stardog.StardogKernel:start(2333): Initializing Stardog
Computing statistics: 29% complete in 00:00:06 (728.8K triples/sec)
Computing statistics: 31% complete in 00:00:08 (577.4K triples/sec)
Computing statistics: 33% complete in 00:00:10 (487.5K triples/sec)
Computing statistics: 33% complete in 00:01:37 (51.8K triples/sec)
Computing statistics: 34% complete in 00:01:39 (52.3K triples/sec)
Computing statistics: 39% complete in 00:01:50 (54.3K triples/sec)
Computing statistics: 40% complete in 00:01:54 (53.7K triples/sec)
Computing statistics: 43% complete in 00:01:58 (55.6K triples/sec)
Computing statistics: 45% complete in 00:02:02 (56.5K triples/sec)
Computing statistics: 47% complete in 00:02:05 (57.3K triples/sec)
Computing statistics: 50% complete in 00:02:28 (51.6K triples/sec)
Computing statistics: 52% complete in 00:02:35 (51.4K triples/sec)
Computing statistics: 54% complete in 00:02:39 (51.9K triples/sec)
Computing statistics: 57% complete in 00:10:09 (14.4K triples/sec)
Computing statistics: 58% complete in 00:17:53 (8.3K triples/sec)
Computing statistics: 59% complete in 00:18:09 (8.3K triples/sec)
Computing statistics: 60% complete in 00:25:38 (6.0K triples/sec)
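To make the slowdown in a log like the one above easier to pin down, here is a minimal sketch in Python that reads the "Computing statistics" progress lines and ranks the gaps between percentages by how many seconds each percent took. It is not part of Stardog; the function names (parse_progress, slowest_intervals) are made up for illustration, and it assumes the progress lines keep exactly the format shown in this thread.

```python
import re

# Matches progress lines in the format seen in this thread, e.g.:
#   Computing statistics: 57% complete in 00:10:09 (14.4K triples/sec)
PROGRESS = re.compile(
    r"Computing statistics: (\d+)% complete in (\d+):(\d+):(\d+)"
)


def parse_progress(lines):
    """Return (percent, elapsed_seconds) pairs for every progress line found."""
    points = []
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            pct, hours, minutes, seconds = (int(g) for g in match.groups())
            points.append((pct, hours * 3600 + minutes * 60 + seconds))
    return points


def slowest_intervals(points, top=3):
    """Rank gaps between reported percentages by seconds spent per percent.

    Only the first time each percentage was reported is kept, so a stall
    at one percentage is charged to the step that follows it.
    Returns tuples of (seconds_per_percent, from_pct, to_pct, seconds).
    """
    first_seen = {}
    for pct, elapsed in points:
        first_seen.setdefault(pct, elapsed)
    ordered = sorted(first_seen.items())

    intervals = []
    for (p0, t0), (p1, t1) in zip(ordered, ordered[1:]):
        intervals.append(((t1 - t0) / (p1 - p0), p0, p1, t1 - t0))
    return sorted(intervals, reverse=True)[:top]


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Computing statistics: 54% complete in 00:02:39 (51.9K triples/sec)",
    "Computing statistics: 57% complete in 00:10:09 (14.4K triples/sec)",
    "Computing statistics: 58% complete in 00:17:53 (8.3K triples/sec)",
    "Computing statistics: 59% complete in 00:18:09 (8.3K triples/sec)",
    "Computing statistics: 60% complete in 00:25:38 (6.0K triples/sec)",
]

if __name__ == "__main__":
    for rate, p0, p1, seconds in slowest_intervals(parse_progress(SAMPLE)):
        print(f"{p0}% -> {p1}%: {seconds}s ({rate:.0f}s per percent)")
```

Running it over the full stardog.log instead of the sample lines (for example by passing an open file object to parse_progress) would show whether the time is lost in a few long pauses or spread evenly, which may help when comparing runs with different memory settings.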