IllegalArgumentException in entityExtractor SPARQL Service


(Nolan Nichols) #1

Hi,

I'm trying to use the entityExtractor SPARQL service, but I'm running into an out-of-bounds error. Here is the query I'm running:

select * {
  ?iri dct:description ?text
  service docs:entityExtractor {
    []  docs:text ?text ;
        docs:mention ?mention .
  }
} 

Here is part of the error from stardog.log:

Caused by: java.lang.IllegalArgumentException: The span [227..243) is outside the given text which has length 197!
	at opennlp.tools.util.Span.getCoveredText(Span.java:231) ~[opennlp-tools-1.9.0.jar:1.9.0]
	at opennlp.tools.util.Span.spansToStrings(Span.java:351) ~[opennlp-tools-1.9.0.jar:1.9.0]
	at opennlp.tools.tokenize.AbstractTokenizer.tokenize(AbstractTokenizer.java:25) ~[opennlp-tools-1.9.0.jar:1.9.0]
	at opennlp.tools.tokenize.TokenizerME.tokenize(TokenizerME.java:76) ~[opennlp-tools-1.9.0.jar:1.9.0]
	at com.complexible.stardog.docs.nlp.impl.OpenNLPDocumentParser.apply(OpenNLPDocumentParser.java:120) ~[stardog-bites-core-6.0.0.jar:?]

If I limit the query to a single result, it occasionally succeeds and returns that result. Any help understanding what's going on would be much appreciated.
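
Since the exception reports a text of length 197, one way to narrow down which ?text value is involved is to export the descriptions and filter them by length. The following is only a rough Python sketch: the helper names and the sample data are hypothetical, and it assumes the reported length is Java's String.length(), which counts UTF-16 code units rather than Unicode code points.

```python
def java_length(text: str) -> int:
    """Length as Java's String.length() would report it (UTF-16 code units)."""
    return len(text.encode("utf-16-le")) // 2


def texts_with_length(texts, target=197):
    """Return (index, text) pairs whose Java-style length equals target."""
    return [(i, t) for i, t in enumerate(texts) if java_length(t) == target]


# Placeholder data, not taken from the actual database.
sample = ["a" * 197, "short", "b" * 196 + "\U0001F600"]
matches = texts_with_length(sample)
print([i for i, _ in matches])
```

Any value this flags is only a candidate; the exception itself does not say which binding was being processed.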


(Jess Balint) #2

Hey Nolan,

Are you able to share one of the ?text values that causes this exception? If necessary you can email it to me at jess@stardog.com. Thanks.

Jess


(Nolan Nichols) #3

Thanks for following up with me on this via email. As we discussed, I was able to submit each of the individual ?text values via a script without error.

The issue seems to occur only when ?text is passed to the service as a bound variable from the query.
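
For anyone reading the stack trace, the message itself describes a simple bounds check: a token span ending at offset 243 was applied to a text only 197 characters long. The Python sketch below mimics that check to show how a span that is valid for one string fails against a shorter one. The function name and test strings are made up for illustration, and the thread does not establish why the span and the text end up mismatched.

```python
def covered_text(start: int, end: int, text: str) -> str:
    """Return text[start:end], rejecting spans that run past the end of text."""
    if end > len(text):
        raise ValueError(
            f"The span [{start}..{end}) is outside the given text "
            f"which has length {len(text)}!"
        )
    return text[start:end]


long_text = "x" * 250   # a span ending at 243 fits here
short_text = "y" * 197  # the same span does not fit here

ok = covered_text(227, 243, long_text)

try:
    covered_text(227, 243, short_text)
    message = None
except ValueError as exc:
    message = str(exc)

print(len(ok))
print(message)
```

The same span succeeds against the longer string and fails against the shorter one, which is the shape of the error in the log.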

