Use pandas to import dataframe into a virtual graph

(Radu Marian) #1

It would be nice to be able to import data from a pandas dataframe right inside a Jupyter notebook.

This capability exists at the command line: stardog-admin import ttl csv

  • But why should I write the dataframe to a CSV file when I could import it directly into a virtual graph while looping through the dataframe?

Regards,
Radu

(zachary.whitley) #2

You can try using pandas.DataFrame.to_sql. You'd need to set up a relational database, but it would be more flexible since you wouldn't be required to import it.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html
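
A minimal sketch of that to_sql step, assuming an in-memory SQLite database (via Python's built-in sqlite3 module) stands in for the relational database. The table name splunk_results and the sample rows are hypothetical, and connecting Stardog to the database is not shown:

```python
import sqlite3

import pandas as pd

# Hypothetical rows standing in for real query results.
rows = [
    {"host": "web01", "status": "200"},
    {"host": "web02", "status": "500"},
]
df = pd.DataFrame(rows, columns=list(rows[0].keys()))

# An in-memory SQLite database stands in for the relational database.
conn = sqlite3.connect(":memory:")

# Write the dataframe to a table (the table name is an assumption).
df.to_sql("splunk_results", conn, if_exists="replace", index=False)

# Read the table back to confirm the rows were written.
round_trip = pd.read_sql(
    "SELECT host, status FROM splunk_results ORDER BY host", conn
)
conn.close()
```

With a server-based database you would typically pass a SQLAlchemy engine to to_sql instead of a sqlite3 connection.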

(Radu Marian) #3

Well then I have to fully describe my use case.

I am querying Splunk with the Python SDK in a Jupyter environment. Here is a code fragment:
...
res_list = list(splresults.ResultsReader(job.results(count=0)))

# Build a dataframe only when the search returned results
if len(res_list) > 0:
    df = pd.DataFrame(res_list, columns=res_list[0].keys())
...
df.to_csv('some.csv')
# Later, on the command line, import the CSV as a virtual graph using stardog-admin. But why?

Can I instead just say?

df.to_virtualgraph('mydb', 'csv.ttl')
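
pandas has no such method, but the idea can be sketched as a wrapper that hides the CSV step. Everything here is hypothetical: the helper name to_virtualgraph, its parameters, and especially the stardog-admin subcommand and argument order, which are assumptions to check against your Stardog version's help output.

```python
import os
import subprocess
import tempfile

import pandas as pd


def to_virtualgraph(df, database, mapping_file, runner=subprocess.run):
    """Hypothetical helper: write df to a temporary CSV and pass it to stardog-admin."""
    fd, csv_path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        df.to_csv(csv_path, index=False)
        # Assumed subcommand and argument order; verify before relying on it.
        cmd = ["stardog-admin", "virtual", "import", database, mapping_file, csv_path]
        runner(cmd, check=True)
    finally:
        # Remove the temporary CSV whether or not the import succeeded.
        os.remove(csv_path)
    return cmd
```

It would be called as to_virtualgraph(df, 'mydb', 'csv.ttl'). The CSV file still exists briefly, but the notebook user never has to manage it.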

(zachary.whitley) #4

You could get something very close to what you're looking for by using the Stardog REST API, the to_sql call, and a relational database. I don't know exactly what the Stardog folks use for importing CSV files, but my guess is that they fire up an embedded Java database such as H2, Derby, or HSQLDB, load the CSV into it, import it like any other relational source, and then throw the database away. That's just a guess, though; they might run the CSV directly through the mapper.