Hi,
I am pretty new to Stardog. I am using SNARL to create and load triples into Stardog. The triples are generated from a Java data structure which is the end result of a UIMA pipeline. I am having two specific issues: handling nulls that crop up in the data, and logging.
First of all, how do you enable logging for inserts made through SNARL? My stardog.log file only has records of inserts through the admin interface, but nothing for inserts that come from SNARL. That meant it took much longer to find the source of the following error.
What was happening was that Stardog was intermittently throwing a NullPointerException. I traced it down to a place in the underlying data that contained a null value. It is hard to do anything about that at the source because there is a lot of data, so I am wondering if there is a standard SNARL way of handling null values before they get into the Statement structure. Here is the code; the NullPointerException is thrown at the line "Model ctgraph = Models2.newModel(stmts);"
private void writeToStardog(ClinicalTrialInfo trial)
{
    System.err.println("about to write to stardog");
    System.err.println("Trial is " + trial.getNctId());
    // establish a connection to the ctkr database
    Connection aConn = ConnectionConfiguration
        .to("ctkr")
        .server("http://localhost:5820")
        .credentials("admin", "admin")
        .connect();
    // put in data
    aConn.begin();
    ArrayList<Statement> stmts = new ArrayList<Statement>();
    stmts.add(Values.statement(Values.iri(CT, trial.getNctId()),
                               Values.iri(RDF, "type"),
                               Values.iri(CT, "Trial")));
    stmts.add(Values.statement(Values.iri(CT, trial.getNctId()),
                               Values.iri(CT, "hasNCT"),
                               literal(trial.getNctId())));
    ArrayList<Statement> interventionStmts = makeInterventionRep(trial);
    ArrayList<Statement> criteriaStmts = makeCriteriaRep(trial);
    stmts.addAll(interventionStmts);
    stmts.addAll(criteriaStmts);
    System.err.println("dumping stmts");
    for (Statement curr : stmts)
        System.err.println(curr);
    Model ctgraph = Models2.newModel(stmts);
    aConn.add().graph(ctgraph);
    aConn.commit();
    System.err.println("wrote triples for trial " + trial.getNctId() + " to Stardog");
    aConn.close();
}
But the root cause is elsewhere, in this code snippet:
String criterionType = criterion.getCriterionType();
System.err.println("criterion type is " + criterionType);
if (criterionType != null)
{
    cStmts.add(Values.statement(criteriaIRI,
                                Values.iri(CT, "criteriaType"),
                                literal(criterionType)));
}
else
    System.err.println("criterion type is not valid");
You can see that I have wrapped this with a test to make sure the criterion type is not null, since that happens every so often in the data. But this is ugly, so I am wondering if there is a better way to handle this. And why does Stardog fail so abjectly when this happens? This is the actual exception
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:401)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:308)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:150)
at ecproj.krgen.PipelineSystem.<init>(PipelineSystem.java:84)
at ecproj.krgen.PipelineSystem.main(PipelineSystem.java:99)
Caused by: java.lang.NullPointerException
at com.complexible.common.rdf.model.AbstractStardogLiteral.hashCode(AbstractStardogLiteral.java:53)
at com.complexible.common.rdf.model.StardogStringLiteral.hashCode(StardogStringLiteral.java:14)
at java.util.HashMap.hash(HashMap.java:338)
at java.util.HashMap.get(HashMap.java:556)
at org.openrdf.model.impl.LinkedHashModel.asNode(LinkedHashModel.java:543)
at org.openrdf.model.impl.LinkedHashModel.add(LinkedHashModel.java:171)
at org.openrdf.model.impl.AbstractModel.add(AbstractModel.java:49)
at org.openrdf.model.impl.AbstractModel.addAll(AbstractModel.java:139)
at com.google.common.collect.Iterables.addAll(Iterables.java:352)
at com.complexible.common.openrdf.model.Models2.newModel(Models2.java:90)
at ecproj.krgen.KRWriter.writeToStardog(KRWriter.java:328)
at ecproj.krgen.KRWriter.process(KRWriter.java:184)
at org.apache.uima.analysis_component.CasAnnotator_ImplBase.process(CasAnnotator_ImplBase.java:56)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:385)
... 9 more
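From the trace, my reading (a guess, not Stardog's actual code) is that literal(null) constructs a literal whose underlying label is null, and the failure only surfaces later when LinkedHashModel puts the literal into a HashMap and its hashCode() dereferences the null label. A minimal reproduction of that failure mode, with a made-up class standing in for the Stardog literal:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a literal class whose hashCode derives
// from a field that was never checked for null at construction time.
public class NullLabelLiteral {
    private final String label;

    NullLabelLiteral(String label) {
        this.label = label; // no null check, mirroring my reading of the trace
    }

    @Override
    public int hashCode() {
        return label.hashCode(); // NPE here when label is null
    }

    public static void main(String[] args) {
        Map<NullLabelLiteral, String> m = new HashMap<>();
        try {
            // HashMap.hash() calls hashCode(), which dereferences the null label
            m.put(new NullLabelLiteral(null), "x");
        } catch (NullPointerException e) {
            System.out.println("NPE from hashCode, as in the stack trace");
        }
    }
}
```

That would explain why the exception appears far from the literal(null) call: the bad value is accepted silently and only detonates when the Model is built.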
It took a couple of hours to find the culprit - with logging, it might have been much quicker, so I would really like to know if it is possible to log inserts that come from SNARL. And again, what is best practice for handling nulls?
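In the meantime, the best I have come up with is to centralize the null check in one small helper instead of wrapping every add in an if. This is just a sketch with plain Strings standing in for SNARL's Values/Statement types; the names are mine, not a Stardog API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a null-guarding helper (hypothetical names; real code would
// build Values.statement(...) triples instead of String arrays).
public class NullSafeTriples {
    static final List<String[]> stmts = new ArrayList<>();

    // Add a statement only when the literal value is actually present,
    // so the null check lives in one place instead of around every add.
    static void addLiteralIfPresent(String subject, String predicate, String value) {
        if (value == null || value.isEmpty()) {
            return; // skip silently, or log the skipped field here
        }
        stmts.add(new String[] { subject, predicate, value });
    }

    public static void main(String[] args) {
        addLiteralIfPresent("ct:NCT001", "ct:criteriaType", "inclusion");
        addLiteralIfPresent("ct:NCT001", "ct:criteriaType", null); // skipped
        System.out.println(stmts.size()); // prints 1
    }
}
```

It would still be good to know whether SNARL has a sanctioned way to do this, or whether literal() could reject nulls up front instead of producing a literal that fails later.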