CoreNLP extractors only work for Stardog V7?

@zachary.whitley I run on OSX, but the windows scripts DO handle STARDOG_EXT inside of helpers.bat

Ah, I had been looking at helpers.sh. On a side note, I've often thought it would be nice if Stardog included some information on which extensions were loaded, so you could check or verify that an expected extension was picked up.


That should be a reminder to me that hacks are almost never a good idea. I should have just suggested fixing it properly in the first place. I'd go back in and fix up the build like I suggested previously, or maybe I'm completely wrong. Either way I'm out of here. I'm taking a break. Good luck.

@zachary.whitley Thanks for your suggestions and enjoy your break. This is an update to document my follow-up on your suggestions. I believe you are right about it not finding the .jar file. I made the changes you suggested within stardog-admin.bat. I could only find one reference to classpath. I get the same issue: it cannot find the jar.

Looking through stardog-admin.bat, I see that it looks for .jar files in /dbms, so on a lark I put the .jar in there. Sure enough, it appears to find it, and I am back to the same error I had when I put the file in /ext, which seems to be a reference to the wrong version of Lucene:

C:\Stardog\stardog-6.1.0\bin>stardog-admin.bat server start
An unexpected error occurred.
java.lang.NoSuchFieldError: LUCENE_5_1_0

@stephen : Thoughts on next steps?

I am also on break until Tuesday, 21 Jan. Actually on break now, but this problem is intriguing. :slight_smile:

@NovasTaylor I would still try setting STARDOG_EXT to a completely exterior folder that isn't otherwise on the classpath of the server, putting the corenlp jar in there, and removing it from all of the server/* directories. That would come closest to replicating the 6.1.0 install that I am working with, where this is working.

I moved the "rogue version 1.1" jar into a folder entirely outside of the Stardog server path and outside of known classpaths:

C:\_downloads\StarDog\nlpTestJars\rogueV11\bites-corenlp-all-1.1.jar

I set the environment variable STARDOG_EXT to the rogueV11 folder.
Proof:

C:\Stardog\stardog-6.1.0\importSource>dir %STARDOG_EXT%
 Volume in drive C is Windows
 Volume Serial Number is 980A-2B1D

 Directory of C:\_downloads\StarDog\nlpTestJars\rogueV11Jar

01/21/2020  07:39 PM    <DIR>          .
01/21/2020  07:39 PM    <DIR>          ..
01/15/2020  02:10 PM     1,896,131,699 bites-corenlp-all-1.1.jar

Started Stardog server.

Ran the extractor command:

stardog doc put --rdf-extractors CoreNLPMentionRDFExtractor testDB foo_.docx

Result:
Unknown extractor name: CoreNLPMentionRDFExtractor

Is Stardog not recognizing the STARDOG_EXT location? (How could I even tell?) Could there be a configuration issue with my Stardog install? For example, something as subtle as a case issue of STARDOG_EXT vs. stardog_ext? I know from past use of mixed Linux and Windows environments that case can get hairy...

T

If you use Process Explorer or something to look at the command line which is used to start Stardog, you would see the STARDOG_EXT added to the classpath. On Windows, it can be a little bit confusing to understand the environments. Are you starting Stardog from the command line? Can you see the value of STARDOG_EXT in that shell?
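As a complement to Process Explorer, here is a minimal sketch for checking what a given shell actually sees. The helper name `describe_ext_dir` is made up for illustration and is not part of Stardog. It reports only the environment of the process that runs it, so it should be run from the same command prompt that will start the server. Two general Windows facts are relevant here: environment variable names are case-insensitive, so STARDOG_EXT vs. stardog_ext should not matter, and a command prompt opened before a system variable was changed keeps the old value until it is reopened.

```python
import os
from pathlib import Path


def describe_ext_dir(env=None, var="STARDOG_EXT"):
    """Return (value, jar_names) for the extension directory variable.

    value is None when the variable is not set in the given environment;
    jar_names is a sorted list of the *.jar file names in that directory.
    """
    env = os.environ if env is None else env
    value = env.get(var)
    if not value:
        return None, []
    folder = Path(value)
    if not folder.is_dir():
        # The variable is set but does not point at an existing directory.
        return value, []
    return value, sorted(p.name for p in folder.glob("*.jar"))


if __name__ == "__main__":
    value, jars = describe_ext_dir()
    print("STARDOG_EXT =", value)
    for name in jars:
        print("  ", name)
```

If the printed value differs from what is configured in the system settings, a stale shell is one possible explanation.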

That said, it really shouldn't have any impact at this point. Putting it in server/dbms is working, in the sense that the jar is picked up and you're getting the Lucene NoSuchFieldError. Changing the location isn't going to fix this.

It looks to me like the issue is that Lucene is pulled into the bites-corenlp jar. The Lucene version in the jar file is 4.10. Stardog 6.1.0 is using Lucene 5.something. There is a conflict in the classpath. First thing to try would be removing Lucene from the big jar and see if it works.
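One way to confirm that Lucene classes are bundled inside the big jar is to list its entries, since a jar is an ordinary zip archive. This is a small sketch using only Python's standard `zipfile` module; the function name `find_bundled_packages` is hypothetical, and the path passed to it is whatever copy of the jar you are testing.

```python
import zipfile


def find_bundled_packages(jar_path, package_prefix="org/apache/lucene/"):
    """Return the names of jar entries located under package_prefix."""
    with zipfile.ZipFile(jar_path) as jar:
        return [n for n in jar.namelist() if n.startswith(package_prefix)]
```

A non-empty result for `find_bundled_packages("bites-corenlp-all-1.1.jar")` would indicate that the jar carries its own copy of Lucene, which can then clash with the Lucene jars that ship with the server.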

Jess

Interesting. I am starting Stardog from command line and using Process Explorer found that the value for STARDOG_EXT was:
C:\Stardog\stardog-6.1.0\test
I missed deleting this test folder that contained the bites-corenlp-all-1.1.jar supplied by Stephen for earlier testing.

The path is not the same as my STARDOG_EXT environment variable so I am unsure where Stardog is getting the old value of C:\Stardog\stardog-6.1.0\test.

I will clean this up, do some more testing tomorrow morning, and provide an update. Suggestions for next steps are welcome.
T

I suggest you stop using STARDOG_EXT until it works with the jar present in server/dbms.

@jess:
I performed the following:

  1. Removed system environment variable STARDOG_EXT
  2. Placed bites-corenlp-all-1.1.jar supplied by Stephen into /server/dbms
  3. Started Stardog from command line.
    stardog-admin.bat server start

Result:

An unexpected error occurred.
java.lang.NoSuchFieldError: LUCENE_5_1_0

This ends my testing for tonight. Can try additional steps tomorrow.

Using some zip file editor, remove the org/apache/lucene directory from the zip file.

Steps taken:

  1. Used 7Zip to remove org/apache/lucene directory from bites-corenlp-all-1.1.jar supplied by Stephen.
  2. Placed .jar in /server/dbms
  3. Attempted server start from command line:

stardog-admin.bat server start

  4. Result: A new error.
C:\Stardog>stardog-admin.bat server start
        An unexpected error occurred.
        java.util.ServiceConfigurationError: An SPI class of type org.apache.lucene.codecs.Codec with classname org.apache.lucene.codecs.lucene3x.Lucene3xCodec does not exist, please fix the file 'META-INF/services/org.apache.lucene.codecs.Codec' in your classpath.
        Waiting for running tasks to complete...done. Executor service has been shut down.
        Stardog server 6.1.0 shutdown on Wed Jan 22 08:19:12 EST 2020.

A screenshot of the lucene files in /server/dbms is attached.


Removal of the modified .jar allows Stardog server start without error.
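The ServiceConfigurationError above points at a provider file under META-INF/services that still names a class which is no longer in the jar. The sketch below looks for such leftovers in a single jar. It assumes the standard `java.util.ServiceLoader` file format (one fully qualified class name per line, `#` starting a comment), and the function name `missing_service_providers` is made up for illustration. It checks only the given jar, so a class reported here could still be supplied by another jar on the classpath.

```python
import zipfile


def missing_service_providers(jar_path):
    """Map each META-INF/services file to provider classes absent from the jar."""
    prefix = "META-INF/services/"
    missing = {}
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
        for entry in sorted(names):
            if not entry.startswith(prefix) or entry.endswith("/"):
                continue
            absent = []
            for line in jar.read(entry).decode("utf-8").splitlines():
                class_name = line.split("#", 1)[0].strip()  # drop comments
                if not class_name:
                    continue
                # com.example.Foo is stored in the jar as com/example/Foo.class
                if class_name.replace(".", "/") + ".class" not in names:
                    absent.append(class_name)
            if absent:
                missing[entry] = absent
    return missing
```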

Try removing the aforementioned file (META-INF/services/org.apache.lucene.codecs.Codec) from the jar.

please remove this file
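For anyone who would rather script the two removals than do them by hand in 7Zip, here is a rough sketch that writes a copy of the jar without the unwanted entries. The function name `strip_entries` and the output file name in the usage note are hypothetical. Each entry is read fully into memory, which is fine for class files but worth keeping in mind for a jar of this size.

```python
import zipfile


def strip_entries(src_jar, dst_jar, prefixes):
    """Copy src_jar to dst_jar, skipping entries whose name starts with any prefix.

    Returns the list of entry names that were left out of the copy.
    """
    removed = []
    with zipfile.ZipFile(src_jar) as src, \
         zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.startswith(tuple(prefixes)):
                removed.append(info.filename)
                continue
            # Reuse the original entry metadata and copy the bytes unchanged.
            dst.writestr(info, src.read(info))
    return removed
```

For the case in this thread the call would look something like `strip_entries("bites-corenlp-all-1.1.jar", "bites-corenlp-nolucene.jar", ["org/apache/lucene/", "META-INF/services/org.apache.lucene.codecs.Codec"])`. Whether those two prefixes are enough for other versions of the jar or of Stardog is not something this thread establishes.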

You've cracked it! :grinning: :grinning:

Result:

C:\Stardog\stardog-6.1.0\importSource>stardog doc put --rdf-extractors CoreNLPMentionRDFExtractor testDB foo_.docx

Successfully put document in the document store: tag:stardog:api:docs:testDB:foo_.docx

Thanks for your help! I appreciate the troubleshooting and patience from everyone on this thread. Now I can start exploring NLP in Stardog! :+1:

Glad it's working. Thanks for your patience. We'll fix this in the "bites-corenlp" release to improve compatibility in the future.
