File name can cause file not found exceptions during bulk load

I am using Stardog 5.0.1 on Ubuntu. While loading Turtle (.ttl) files into my database, certain file names cause the bulk load to fail. The command looks like this:

./bin/stardog data add irish-gen ~/irishGens/irish-gen/LL/*.ttl

The error looks like this:

Adding data from file: /home/cyocum/irishGens/irish-gen/LL/aisneidem_di_araill.ttl
Adding data from file: /home/cyocum/irishGens/irish-gen/LL/na_n_dese_breg.ttl
Adding data from file: /home/cyocum/irishGens/irish-gen/LL/h_celicain.ttl
An error occurred adding RDF to the index: /home/cyocum/irishGens/irish-gen/LL/na_fortÃÂșath.ttl

The file name is actually "LL/na_fortúath.ttl".

The log has this in it:

ERROR 2017-07-30 18:05:40,817 [XNIO-1 task-21] com.stardog.http.server.undertow.ErrorHandling:writeError(179): Unexpected error on the server
com.complexible.stardog.db.DatabaseException: An error occurred adding RDF to the index: /home/cyocum/irishGens/irish-gen/LL/na_fortÃÂșath.ttl
at com.complexible.stardog.db.DatabaseConnectionImpl.throwDatabaseException(DatabaseConnectionImpl.java:662) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DatabaseConnectionImpl.add(DatabaseConnectionImpl.java:685) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DelegatingDatabaseConnection.add(DelegatingDatabaseConnection.java:213) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DatabaseImpl$DBConnectionWrapper.add(DatabaseImpl.java:1355) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DelegatingDatabaseConnection.add(DelegatingDatabaseConnection.java:213) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.StardogKernel$KernelDbConnection.add(StardogKernel.java:3172) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DelegatingDatabaseConnection.add(DelegatingDatabaseConnection.java:213) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.StardogKernel$NotifyingDatabaseConnection.add(StardogKernel.java:3268) ~[stardog-5.0.1.jar:?]
at java.util.stream.Streams$StreamBuilderImpl.forEachRemaining(Streams.java:419) ~[?:1.8.0_131]
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) ~[?:1.8.0_131]
at com.complexible.stardog.protocols.http.server.TransactionService.add(TransactionService.java:170) ~[stardog-protocols-http-server-5.0.1.jar:?]
at com.stardog.http.server.undertow.jaxrs.ExtractRoutes.lambda$handleIt$91(ExtractRoutes.java:186) ~[stardog-protocols-http-server-5.0.1.jar:?]
at org.apache.shiro.subject.support.SubjectRunnable.doRun(SubjectRunnable.java:120) ~[shiro-core-1.2.3.jar:1.2.3]
at org.apache.shiro.subject.support.SubjectRunnable.run(SubjectRunnable.java:108) ~[shiro-core-1.2.3.jar:1.2.3]
at com.stardog.http.server.undertow.ErrorHandling.lambda$safeDispatch$46(ErrorHandling.java:70) ~[stardog-protocols-http-server-5.0.1.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: com.complexible.stardog.StardogException: An error occurred adding RDF to the index: /home/cyocum/irishGens/irish-gen/LL/na_fortÃÂșath.ttl
at com.complexible.stardog.db.index.ConnectableIndexRWConnectionImpl$IndexResourceTransaction.add(ConnectableIndexRWConnectionImpl.java:640) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.index.ConnectableIndexRWConnectionImpl.add(ConnectableIndexRWConnectionImpl.java:400) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DatabaseConnectionImpl.add(DatabaseConnectionImpl.java:681) ~[stardog-5.0.1.jar:?]
... 16 more
Caused by: com.complexible.stardog.index.IndexException: /home/cyocum/irishGens/irish-gen/LL/na_fortÃÂșath.ttl
at com.complexible.stardog.index.IndexWriterDataMapImpl.update(IndexWriterDataMapImpl.java:119) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.IndexWriterDataMapImpl.add(IndexWriterDataMapImpl.java:70) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.IndexWriterImpl.add(IndexWriterImpl.java:145) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.impl.DelegatingIndexWriter.add(DelegatingIndexWriter.java:51) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.Indexes.add(Indexes.java:375) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.index.ConnectableIndexRWConnectionImpl$IndexResourceTransaction.add(ConnectableIndexRWConnectionImpl.java:636) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.index.ConnectableIndexRWConnectionImpl.add(ConnectableIndexRWConnectionImpl.java:400) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DatabaseConnectionImpl.add(DatabaseConnectionImpl.java:681) ~[stardog-5.0.1.jar:?]
... 16 more
Caused by: java.nio.file.NoSuchFileException: /home/cyocum/irishGens/irish-gen/LL/na_fortÃÂșath.ttl
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:1.8.0_131]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:1.8.0_131]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:1.8.0_131]
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[?:1.8.0_131]
at java.nio.file.Files.newByteChannel(Files.java:361) ~[?:1.8.0_131]
at java.nio.file.Files.newByteChannel(Files.java:407) ~[?:1.8.0_131]
at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) ~[?:1.8.0_131]
at java.nio.file.Files.newInputStream(Files.java:152) ~[?:1.8.0_131]
at com.complexible.common.rdf.rio.RDFStreamBuilder$RDFFileStream.openInputStream(RDFStreamBuilder.java:369) ~[stardog-utils-rdf-5.0.1.jar:?]
at com.complexible.common.rdf.rio.RDFStreamBuilder$RDFAbstractStream.parse(RDFStreamBuilder.java:196) ~[stardog-utils-rdf-5.0.1.jar:?]
at com.complexible.common.rdf.rio.RDFStreamProcessor$ConcurrentLoadManagerImpl.add(RDFStreamProcessor.java:489) ~[stardog-utils-rdf-5.0.1.jar:?]
at com.complexible.common.rdf.rio.RDFStreamProcessor.add(RDFStreamProcessor.java:208) ~[stardog-utils-rdf-5.0.1.jar:?]
at com.complexible.stardog.index.IndexWriterDataMapImpl.update(IndexWriterDataMapImpl.java:114) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.IndexWriterDataMapImpl.add(IndexWriterDataMapImpl.java:70) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.IndexWriterImpl.add(IndexWriterImpl.java:145) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.impl.DelegatingIndexWriter.add(DelegatingIndexWriter.java:51) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.index.Indexes.add(Indexes.java:375) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.index.ConnectableIndexRWConnectionImpl$IndexResourceTransaction.add(ConnectableIndexRWConnectionImpl.java:636) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.index.ConnectableIndexRWConnectionImpl.add(ConnectableIndexRWConnectionImpl.java:400) ~[stardog-5.0.1.jar:?]
at com.complexible.stardog.db.DatabaseConnectionImpl.add(DatabaseConnectionImpl.java:681) ~[stardog-5.0.1.jar:?]
... 16 more

This seems to happen with files that have accented characters in their names, but not with all of them: other such files have bulk loaded fine and appear correctly on the console. Additionally, my collaborator on the project works on Windows, and I have seen this more often with files that come from his machine. I work exclusively on Linux and have not seen the problem with my own files.

A small niggle is that this error seems to stop bulk loading so nothing else is done after it is encountered. It would be helpful if the bulk load just skipped these files then produced a report at the end with a list of files that it couldn't process. That way I could identify the files not loaded and do something with them.
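
In the meantime, a small wrapper could approximate this skip-and-report behaviour by loading one file per CLI call and collecting the failures. This is only a sketch under assumptions: `load_one` and `load_all` are hypothetical names, and it assumes the `stardog data add` command exits with a non-zero status when a file fails to load, which I have not verified. Loading file by file also means one transaction per file, so it will likely be slower than a single bulk call.

```python
import subprocess


def load_one(database, path, stardog="./bin/stardog"):
    """Load a single file with the CLI command from the post.

    Assumption: a non-zero exit status means the load failed.
    """
    result = subprocess.run([stardog, "data", "add", database, path])
    return result.returncode == 0


def load_all(database, paths, loader=load_one):
    """Try every file, keep going after a failure, and return the failed paths."""
    failed = []
    for path in paths:
        try:
            ok = loader(database, path)
        except OSError:
            # For example, the stardog executable itself could not be started.
            ok = False
        if not ok:
            failed.append(path)
    return failed
```

Calling `load_all("irish-gen", sorted(glob.glob(...)))` with the expanded `*.ttl` list and printing the returned paths would give the end-of-run report described above.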

You can look at the files on my GitHub page: cyocum/irish-gen (Traditional Irish genealogies represented as TRiG RDF named graphs).

Thanks for all your hard work on the 5.0 release!

Hi,

Thanks for the bug report. This is indeed a bug on the client where we aren’t correctly setting the charset. It appears that it only affects files whose NAMES contain accented characters. This will be addressed in a future release.
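
To illustrate the kind of garbling a charset mismatch produces, here is a minimal sketch. It assumes the name is encoded as UTF-8 on one side and decoded as ISO-8859-1 on the other; the charset actually used by the client is not stated here, so treat that pairing as an assumption. The helper names `garble` and `ungarble` are hypothetical.

```python
def garble(name):
    """Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    return name.encode("utf-8").decode("iso-8859-1")


def ungarble(name):
    """Reverse the mistake: recover the bytes, then decode them as UTF-8."""
    return name.encode("iso-8859-1").decode("utf-8")


print(garble("na_fortúath.ttl"))   # na_fortÃºath.ttl
print(garble("h_celicain.ttl"))    # h_celicain.ttl (pure ASCII is unaffected)
```

The server then looks for a file with the garbled name, which does not exist, which is consistent with the `NoSuchFileException` at the bottom of the stack trace.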

As a workaround, you can upload the affected files separately via the HTTP API, setting Content-Type to text/turtle; charset=utf-8.
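
A rough sketch of building such a request with Python's standard library follows. The `/{database}/{tx_id}/add` path, the example base URL, and the helper name `build_add_request` are assumptions for illustration; check them against the HTTP API documentation for your Stardog version. The transaction still has to be begun and committed with separate calls, and authentication is left out.

```python
import urllib.request
from pathlib import Path

TURTLE_UTF8 = "text/turtle; charset=utf-8"


def build_add_request(base_url, database, tx_id, ttl_path):
    """Build a POST request carrying the file's contents as the body.

    Assumptions: the endpoint layout below, and that the file on disk
    is UTF-8 encoded Turtle.
    """
    body = Path(ttl_path).read_bytes()
    url = f"{base_url}/{database}/{tx_id}/add"  # assumed endpoint layout
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": TURTLE_UTF8},
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_add_request(...)) as response:
#       print(response.status)
```

Because the file is read locally and sent as the request body, the server never has to resolve the accented file name itself.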

Hi,

Great. Thank you very much for looking into this.