Loading invalid dates

We have some triples we'd like to load into Stardog, but the import crashes as soon as it hits an invalid date, such as November 31st. Is there any way to have it skip these invalid lines? MarkLogic loads the same files without complaints or errors.

There was a fatal failure during preparation of 626a6519-5522-48ed-acfd-288250913acf com.stardog.stark.io.InvalidRDF: '1989-11-31' is not a valid value for datatype http://www.w3.org/2001/XMLSchema#date [L12671941]

This is Stardog's default behavior, but you can configure it to ignore the invalidity and load anyway by setting strict.parsing=false when creating the database or by including that line in your stardog.properties file.

Yeah, it worked when I created the database via the command line, with a few caveats:

  • In Stardog Studio, strict parsing was shown as off, even though databases get created with the option defaulting to on, as per the docs and the observed behavior. This is version 6.1.0.
  • The actual command line doesn't work unless a trailing -- is added. Per the man page it should work without it, since no files are being loaded at the same time.

/opt/stardog/bin/stardog-admin db create -n KnowledgeBase -i --options strict.parsing=false --

I wrote a small user-defined function for just this situation. Once the data is loaded, you still have invalid dates in it (I saw this exact problem a lot with DBpedia: the 31st day of a month that only has 30 days), and some of them, like leap days, aren't easy to spot. It's a Boolean function, isValidXsd(), that just returns whether a value is valid, so you can identify the bad dates and clean up your data. Let me know if you're interested and I'll find it. It was probably built against an old version of Stardog, so it might take a little work to get it building against the latest version.
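
In the meantime, the same kind of check can be done outside Stardog before loading. The following is only a minimal Python sketch, not the isValidXsd() UDF mentioned above. It assumes the data is in N-Triples with one triple per line, and the names is_valid_xsd_date and find_invalid_dates are made up for illustration. It checks the calendar part of an xsd:date (month range, days per month, leap years) but does not validate the timezone offset range.

```python
import re

# Lexical shape of xsd:date: optional minus sign, year of 4+ digits,
# two-digit month and day, and an optional timezone (Z or +hh:mm / -hh:mm).
_XSD_DATE = re.compile(r"^(-?\d{4,})-(\d{2})-(\d{2})(Z|[+-]\d{2}:\d{2})?$")

# Typed literal as it appears in N-Triples, e.g. "1989-11-31"^^<...#date>
_DATE_LITERAL = re.compile(
    r'"([^"]*)"\^\^<http://www\.w3\.org/2001/XMLSchema#date>'
)


def _days_in_month(year: int, month: int) -> int:
    """Number of days in the given month, using Gregorian leap-year rules."""
    if month == 2:
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        return 29 if leap else 28
    return 30 if month in (4, 6, 9, 11) else 31


def is_valid_xsd_date(lexical: str) -> bool:
    """Return True if the string looks like a real calendar date in xsd:date form.

    Simplification (assumption): the timezone offset is only checked for
    shape, not for its allowed range.
    """
    match = _XSD_DATE.match(lexical)
    if not match:
        return False
    year, month, day = int(match.group(1)), int(match.group(2)), int(match.group(3))
    if not 1 <= month <= 12:
        return False
    return 1 <= day <= _days_in_month(year, month)


def find_invalid_dates(lines):
    """Return (line_number, value) pairs for every invalid xsd:date literal."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        for value in _DATE_LITERAL.findall(line):
            if not is_valid_xsd_date(value):
                bad.append((lineno, value))
    return bad
```

Passing an open file to find_invalid_dates (for example, `with open("data.nt") as f: find_invalid_dates(f)`, where data.nt is a placeholder file name) gives the line numbers to fix or drop before running the import with strict parsing left on.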
