Server not responding after enabling search.enabled and datetime parsing error

Hi

Using 7.6.4, I loaded a large number of triples into a db and then took it offline to enable search (search.enabled=true). However, when I tried to put it back online it seemed to hang. After I left it for some time, nothing had appeared in the log; there was some minimal CPU usage but no additional disk space being used. I was able to stop the server and restart it without any errors or warnings in the log. However, on startup the last line in the log is "Initializing Stardog". I assume it may be working on the indexing, but CPU usage is again low (not quite one core's worth), memory is creeping up very slowly, and the log has no further info after 20 minutes.

Should it be logging anything whilst indexing? None of the CLI tools can connect to it for any kind of status info. Is there an increased logging level that will provide some information?

Thanks

Tony

Talk about timing. After trying restarts etc. and then just waiting, it finally started doing something: I'm seeing indexes created under the waldo directory, and the log finally started getting messages. First line:

INFO 2021-07-12 13:41:31,801 [ForkJoinPool.commonPool-worker-15] com.complexible.stardog.search.waldo.DefaultIndexer:print(386): Indexing text: 2.0M literals in 00:26:17.535

So it was 26 minutes before it logged anything. The server still seems to be unreachable, though, so the only way to monitor it is via the log file, and it would seem I cannot do anything else with the server whilst this is running.
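Since the log is the only window into the indexer, here is a minimal Python sketch for turning a progress line into a rough throughput figure. The pattern is inferred from the single "Indexing text: ... literals in ..." line quoted above, so treat the format (including the K/M/B suffixes) as an assumption; other Stardog versions may log differently. The function names are my own and not part of Stardog.

```python
import re

# Pattern inferred from the one progress line quoted in this thread (assumption).
PROGRESS = re.compile(
    r"Indexing text: (?P<count>[\d.]+)(?P<unit>[KMB]?) literals in "
    r"(?P<h>\d+):(?P<m>\d+):(?P<s>\d+(?:\.\d+)?)"
)
# Suffix meanings are assumed: K = thousand, M = million, B = billion.
MULTIPLIER = {"": 1, "K": 1e3, "M": 1e6, "B": 1e9}


def parse_progress(line):
    """Return (literals, elapsed_seconds), or None if this is not a progress line."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    literals = float(match["count"]) * MULTIPLIER[match["unit"]]
    elapsed = int(match["h"]) * 3600 + int(match["m"]) * 60 + float(match["s"])
    return literals, elapsed


def literals_per_minute(line):
    """Average indexing rate implied by one progress line, or None."""
    parsed = parse_progress(line)
    if parsed is None:
        return None
    literals, elapsed = parsed
    return literals / elapsed * 60
```

Feeding it each new line of the log file gives the cumulative average rate so far. Lines that do not carry a literal count, such as the final "Finished in ..." message, return None.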

Tony

So it had been indexing for over 6 hours, with no indication of how far along it was, when it just stopped and threw an error related to the problem I'd previously reported with Stardog's compatibility with RDF 1.1 and XML Schema 1.1 and the datetime datatype. I'd turned off strict parsing so I could load the data. It seems the same data caused a problem during indexing:

com.complexible.stardog.db.DatabaseException: com.complexible.stardog.db.ConnectableException: java.lang.IllegalArgumentException: Year = 0, Month = 1, Day = 1, Hour = 0, Minute = 0, Second = 0, fractionalSecond = -2,147,483,648, Timezone = 0 , is not a valid representation of an XML Gregorian Calendar value.

This was caused by:

Caused by: java.lang.IllegalArgumentException: Year = 0, Month = 1, Day = 1, Hour = 0, Minute = 0, Second = 0, fractionalSecond = -2,147,483,648, Timezone = 0 , is not a valid representation of an XML Gregorian Calendar value.
	at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl.<init>(Unknown Source) ~[xercesImpl-2.12.1.jar:?]
	at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl.createDateTime(Unknown Source) ~[xercesImpl-2.12.1.jar:?]

The documentation (Full-text Search | Stardog Documentation Latest) says that only xsd:string and rdf:langString are indexed by default.

I assume the error is likely the same datetime validation that fails during loading when strict.parsing is left enabled. Is there any way around this? Also, given how long it was taking (approx. 1M literals per minute), is there any way to speed it up on a subsequent run?
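In case it helps anyone hitting the same exception, here is a rough Python sketch for locating candidate literals in the source data before indexing. It rests on assumptions the stack trace does not confirm: that the offending values are xsd:dateTime literals whose year is 0000 (which XML Schema 1.0 disallows and XML Schema 1.1 permits), and that the data is available as N-Triples. The function names are made up for illustration and are not part of Stardog.

```python
import re

XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"

# Matches "lexical"^^<xsd:dateTime> in an N-Triples line (assumed input format).
LITERAL = re.compile(r'"(?P<lex>[^"\\]*)"\^\^<' + re.escape(XSD_DATETIME) + r">")
# Leading year of a dateTime lexical form, with an optional minus sign.
YEAR = re.compile(r"^-?(?P<year>\d{4,})-")


def has_year_zero(lexical):
    """True if the lexical form starts with year 0000 (or -0000)."""
    match = YEAR.match(lexical)
    return match is not None and int(match["year"]) == 0


def find_year_zero_datetimes(lines):
    """Return (line_number, lexical_form) for each year-zero xsd:dateTime literal."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in LITERAL.finditer(line):
            if has_year_zero(match["lex"]):
                hits.append((number, match["lex"]))
    return hits
```

Passing an open file handle as `lines` scans the file lazily, so it should cope with large dumps. This only finds the suspect values; whether to rewrite or drop them is a separate decision.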

Thanks

Tony

Second go at this reply.....

I was able to get the indexing to finish by making some changes. Strangely, the last log message before completion was the same as on the previous attempt, when it crashed. I checked the config, and the search index datatypes option was set only to xsd:string and rdf:langString, as per the documentation, so I don't quite understand why it would have failed previously on the date string. The indexing finished with the following messages in the log, followed by the usual banner and startup messages:

INFO  2021-07-13 16:22:21,315 [ForkJoinPool.commonPool-worker-11] com.complexible.stardog.search.waldo.DefaultIndexer:stop(392): Indexing text: Finished in 06:51:32.891
Loading Databases: 100% complete in 06:51:36
INFO  2021-07-13 16:22:23,303 [main] com.complexible.stardog.StardogKernel:write(77): Loading Databases: 100% complete in 06:51:36
Loading Databases: 100% complete in 06:51:36
INFO  2021-07-13 16:22:23,303 [main] com.complexible.stardog.StardogKernel:write(77): Loading Databases: 100% complete in 06:51:36

I went through this twice (at roughly 6.5 hours each), because when it finished the first time I tried putting the database into online mode and it output the following error:

Lock held by this virtual machine: /data/stardog/dataset/waldo/write.lock

When I checked the waldo directory, there were no files in it except write.lock. At this point I checked the status and config parameters, and after I restarted the server it just started indexing all over again, so I again couldn't use the server. When it finished the second time, I checked the waldo directory and it had a set of Lucene index files along with search.metadata and write.lock files. I took a backup in case they got deleted, so that I might be able to restore them. Again, on trying to put the db online, it gave the same error and removed all the files. I put a copy back, but on restarting the server it started the indexing process over.

So currently I cannot use the server, and I cannot disable indexing unless I let it go through the whole thing again, and even then I'm not guaranteed to be able to disable it. I couldn't find any documented startup options to skip indexing or anything like that. Are there any such options?

Thanks

Tony
