SQLite Revisited

@zachary.whitley you posted something back in 2017 about using SQLite: "NPE on virtual graph w/o username or password".

Like you, I would like to use SQLite3 to get a little extra functionality that is not available when using CSVs.

I was wondering if it's still something that works? The JDBC driver that you used does not seem to be available anymore. Also, do you simply drop the JDBC driver in the stardog/server/dbms folder?

Thanks
-Serge

I was just playing around with it recently. The driver I use is this one: GitHub - xerial/sqlite-jdbc: SQLite JDBC Driver

You can put it into the directory you mentioned and then restart Stardog. The only gotcha is you need to set sql.default.schema=main in your mapping properties file.

We've also used it successfully for some projects internally. Just out of curiosity, what are you lacking from CSV import?

For example, let's assume we have 3 CSV files:

1- One for Entity1, one for Entity2, and the other for foreign keys. Being able to join them is great (a rough sketch of what I mean is below)
2- What if I want only a subset of the data, easily defined by a select query?
3- ...
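
To make the first point concrete, here is a rough sketch of the kind of SMS2 mapping I have in mind once the CSVs are in SQLite (the Entity1/Entity2/ForeignKeys tables, their columns, and the IRIs are just hypothetical placeholders):

PREFIX : <http://example.com/>

MAPPING
FROM SQL {
  SELECT e1.id AS entity1_id, e2.id AS entity2_id
  FROM Entity1 e1
  JOIN ForeignKeys fk ON fk.entity1_id = e1.id
  JOIN Entity2 e2 ON e2.id = fk.entity2_id
}
TO {
  ?entity1 :relatedTo ?entity2 .
}
WHERE {
  BIND(template("http://example.com/entity1/{entity1_id}") AS ?entity1)
  BIND(template("http://example.com/entity2/{entity2_id}") AS ?entity2)
}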

thanks,

So I guess my properties file should look something like this:

jdbc.url=jdbc:sqlite:/tmp/test.db
jdbc.username=blah
jdbc.password=blah
jdbc.driver=org.sqlite.JDBC
sql.default.schema=main

I assume I need to put dummy username/password based on your earlier post.

@zachary.whitley

I am getting this error

Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Object 'ABILITY' not found within 'main'; did you mean 'Ability'?

The log file is not really useful. Any hint on how I can resolve this? :-)

You can quote the table names to preserve case sensitivity: select * from "Ability"

You can either change your table name to upper case or quote the table name in the mappings.
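
For example, in the FROM SQL part of your mapping you could write something like this (just a sketch; the columns are whatever your Ability table actually has):

FROM SQL {
  SELECT id, name FROM "Ability"
}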

One other thing, in addition to what @serge mentioned about using SQLite: I've found it handy when you're still working on your mappings, so you don't have to keep reimporting all the data every time you change them.

Does the new Studio CSV import make that easier?

thank you @zachary.whitley, @jess

I knew that; I've just been on a 9-month hiatus from Stardog and these little things get forgotten :-).


@jess I did look at the new CSV import before deciding to proceed with SQLite, but it somewhat loads the file as a simple table (for lack of a better term).

It's true that it's easy to load a CSV file, but then you would need to run SPARQL statements to transform the data into the format you want. While that would work, we are trying to do all transformations using SMS2.

However, if you could do something like

MAPPING
FROM SPARQL {
   sparql query
}
TO {
}
WHERE {
}

Then loading the CSV as is would be useful, since it would remove the need for SQLite: I could simply use Stardog as a staging area. Installing SQLite is not always possible in some companies (machines are locked down). I could use SPARQL to perform my joins, and also use functions such as regex, concat, and replace, which are more feature-rich in SPARQL than in SQLite, to pre-process the data.

food for thought.

(I guess if the SQL interface were configured automatically when you loaded the CSV, that would also work.)

The use case you previously mentioned was a join, which would not be possible directly in mappings unless multiple CSV mappings were used in a single import step. I don't really see much of a distinction between importing multiple CSV files into SQLite and then doing a virtual import over a SQL join, versus just importing all the files into Stardog and generating the data you need there. Some practical limitations like "machine lock down" might necessitate the former. If you had something in mind for this "FROM SPARQL" that doesn't involve a join to another CSV file, it's something we can consider.

Jess

I've been thinking about this for a bit because it's something I've come across as well. One reason, which @serge mentioned, is the possible access to more functions, which I've tried to address by writing Stardog functions for common database functions. The other reason I've come across is that if you have a messy CSV file and need to do a lot of iterations on the mapping file, it's much faster than having to materialize the CSV each time.

I can see how an alternative would be to just materialize the CSV with a simple mapping and do the transformations with SPARQL update queries. The reason I don't do that is that eventually I'm going to want to back out an update, or I'll be unsure about exactly which transforms I've applied, and I'll have to rematerialize to get back to a known state.

There's also no way to attach or automatically apply the update queries on import, which I think is what @serge might be suggesting with the "mapping from sparql". A mapping from SPARQL looks a lot like a SPARQL update query, but it might be nice to be able to do something like stardog-admin virtual import myDB mappings.ttl input.csv transforms.ru, where transforms.ru would be a collection of SPARQL update queries to run after import.
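
Just to make that concrete, transforms.ru would be nothing more than ordinary SPARQL Update queries, something like this sketch (the ex:rawName / rdfs:label transform is made up for the example):

PREFIX ex: <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# clean up a raw CSV value into a label, then drop the raw triple
DELETE { ?s ex:rawName ?raw }
INSERT { ?s rdfs:label ?label }
WHERE {
  ?s ex:rawName ?raw .
  BIND(REPLACE(UCASE(?raw), "\\s+", " ") AS ?label)
}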

The only situation that I can think of where using sqlite might still be a good option would be if the simple mapping materialized a large number of triples that were subsequently deleted after the transform.

