Stardog virtual graph bulkload csv

Gaurav_Malhotra · March 16, 2020, 8:28am

Hi all,
I am new to the virtual graph concept in Stardog and recently started working with it. It works like a charm. I have successfully added a CSV file to a virtual graph as described in the guide.
However, I am curious to know whether we can do a bulk load into a virtual graph. I couldn't find any documentation on that. Can anyone help me with this?

I am using Stardog 7.2.

Thanks,
Gaurav Malhotra

zachary.whitley · March 16, 2020, 12:17pm

Are you asking about bulk importing CSV files or something else? I believe that you can pass multiple CSV files when importing, although they'd all have to have the same columns, or at least compatible columns. I don't know what might happen if one had additional unmapped columns. If that doesn't work, it should be fairly easy to combine multiple CSV files into a single monster file. Now, if you're talking about importing many CSV files with multiple mappings, I think you'd have to script it.

I looked into using CSVW (the one good use that I can think of for that insane spec) to let you specify information about a CSV and import it by pointing to that metadata, but they only seem to have anticipated the metadata being specified by the data publisher, and not that I might want to provide the metadata about someone else's CSV file.

Gaurav_Malhotra · March 16, 2020, 12:59pm

I am looking to bulk import CSV files. I agree with you that we have to provide different SMS files if the CSVs are not identical. I tried it out on CSVs with the same columns and it worked as expected. However, when I tried importing without an SMS file (I gave only a .properties file for the CSV instead), it threw an error saying the primary key was not found. I am wondering how I could do that.

PaulJackson · March 16, 2020, 3:00pm

Gaurav,

The multi-file support was broken when we switched to SMS2 support. I'll create a ticket for this and get it fixed.

In the meantime, assuming a Linux client, you can use the following bash command with the CLI:

for n in multi1.csv multi2.csv ; do bin/stardog-admin virtual import csv multicsv.properties multi.sms2 "$n" ; done
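
For a longer list of files, the same loop could be wrapped in a small function that quotes each file name and stops at the first failed import. This is only a sketch: the function name import_csvs and the STARDOG_ADMIN override variable are made up for illustration, and the argument order is copied as-is from the one-liner above.

```shell
# Sketch: import each CSV in turn, reusing one properties file and one mapping file.
# STARDOG_ADMIN is a hypothetical override for the CLI path (defaults to bin/stardog-admin).
import_csvs() {
  props=$1
  mapping=$2
  shift 2
  for f in "$@"; do
    # Same arguments as the one-liner above, with the file name quoted.
    "${STARDOG_ADMIN:-bin/stardog-admin}" virtual import csv "$props" "$mapping" "$f" || {
      echo "import failed for $f" >&2
      return 1
    }
  done
}
```

It would be called as: import_csvs multicsv.properties multi.sms2 multi1.csv multi2.csv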

Thank you for raising the issue.

-Paul

Gaurav_Malhotra · March 16, 2020, 3:13pm

This is a big help... Thanks, Paul. I'll surely try this out.

zachary.whitley · March 16, 2020, 3:19pm

I might also suggest that you simply strip the headers and concatenate the files if possible. Depending on the number of files and their size, importing them one at a time might be really slow just due to the JVM startup time for each file.
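
A minimal sketch of that concatenation, assuming every file has the same single-line header and that the import still expects one header row at the top of the combined file (so the first header is kept and the rest are dropped). The function name concat_csvs is made up for illustration.

```shell
# Sketch: merge CSV files that share one header line into a single file.
# Usage: concat_csvs OUTPUT INPUT...
concat_csvs() {
  out=$1
  shift
  # FNR is the line number within the current file, NR the overall line number:
  # skip line 1 of every file except the very first line read.
  awk 'FNR == 1 && NR != 1 { next } { print }' "$@" > "$out"
}
```

For example, concat_csvs combined.csv multi1.csv multi2.csv would produce a single combined.csv that can then be imported with one CLI call instead of one call per file.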

