Stardog has transitioned from using openrdf to their own stark api. The extractors have been updated to use the new api but you'll have to checkout the repo and build the latest code from master.
I'm building it now. What version of Stardog are you running? It looks like it's building against 6.2.0 so if you're running 7.x it may need some minor updates to run against 7
thanks a lot!. it works in Linux. I was downloading in window and change extension not working.
i am able to restart Stardog server after that.
Do you have doc on how to use stardog for nlp? I created a document with following sentence: "The Orioles are a professional baseball team based in Baltimore."
run the extractor as
stardog doc put --rdf-extractors CoreNLPMentionRDFExtractor documents test.doc
return edu/stanford/nlp/pipeline/CoreDocument
I try to query any triple in database, but there is nothing returned.
@zachary.whitley you just built the "normal" Jar file, but I'm pretty sure you'll need the fat Jar. BITES uses NLP based on Stanford CoreNLP which uses very huge pre-trained models. Given that BITES is an extension, those models are not shipped with the standard Stardog direstribution.
I built it now for Stardog 7.0.2 , the whole Jar file is ~1.8GB on my machine
Thanks!
it works for CoreNLPMentionRDFExtractor . The extracted triples (Entities) are loaded in the named graph.
However, got java heap space error if I run
stardog doc put --rdf-extractors CoreNLPRelationRDFExtractor documents test.doc
Stardog.log:
ERROR 2019-10-18 19:57:27,165 [stardog-user-2] com.stardog.http.server.undertow.ErrorHandling:writeError(138): Unexpected error on the server
java.lang.OutOfMemoryError: Java heap space
at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:661) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:643) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.parser.nndep.DependencyParser.initialize(DependencyParser.java:1186) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.parser.nndep.DependencyParser.loadModelFile(DependencyParser.java:630) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.parser.nndep.DependencyParser.loadFromModelFile(DependencyParser.java:499) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.DependencyParseAnnotator.(DependencyParseAnnotator.java:57) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.AnnotatorImplementations.dependencies(AnnotatorImplementations.java:240) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$57(StanfordCoreNLP.java:559) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$1305/836250417.apply(Unknown Source) ~[?:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$69(StanfordCoreNLP.java:625) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$1314/285508813.get(Unknown Source) ~[?:?]
at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.util.Lazy.get(Lazy.java:31) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:495) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP.(StanfordCoreNLP.java:201) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP.(StanfordCoreNLP.java:194) ~[bites-corenlp-all-1.2.jar:?]
at edu.stanford.nlp.pipeline.StanfordCoreNLP.(StanfordCoreNLP.java:181) ~[bites-corenlp-all-1.2.jar:?]
at com.complexible.stardog.docs.corenlp.CoreNLPRelationRDFExtractor.getPipeline(CoreNLPRelationRDFExtractor.java:77) ~[bites-corenlp-all-1.2.jar:?]
at com.complexible.stardog.docs.corenlp.CoreNLPRelationRDFExtractor.extractFromText(CoreNLPRelationRDFExtractor.java:87) ~[bites-corenlp-all-1.2.jar:?]
at com.complexible.stardog.docs.extraction.tika.TextProvidingRDFExtractor.extract(TextProvidingRDFExtractor.java:46) ~[stardog-bites-core-7.0.1.jar:?]
at com.complexible.stardog.docs.extraction.tika.TextProvidingRDFExtractor.extract(TextProvidingRDFExtractor.java:28) ~[stardog-bites-core-7.0.1.jar:?]
at com.complexible.stardog.docs.db.ConnectableBitesConnectionImpl.lambda$extract$2(ConnectableBitesConnectionImpl.java:162) ~[stardog-bites-core-7.0.1.jar:?]
at com.complexible.stardog.docs.db.ConnectableBitesConnectionImpl$$Lambda$1260/369201145.apply(Unknown Source) ~[?:?]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_222]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_222]
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:1.8.0_222]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_222]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_222]
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_222]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_222]
at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:541) ~[?:1.8.0_222]
The text in document is from sample: "The Orioles are a professional baseball team based in Baltimore."
Are you able to run CoreNLPRelationRDFExtractor for entities and links?
Anything i need to configure to avoid java heap memory issue for this simple text?
The NLP models would require additional heap space so you'll need to allocate more heap space by adjusting your STARDOG_SERVER_JAVA_ARGS environment variable.