I am a real newbie to the whole topic of knowledge graphs and graphs in general, so hopefully my question is reasonable.
For mapping data sources like CSV, JSON, NoSQL, and SQL, the Stardog Mapping Syntax is used for both virtual graphs and graphs with imported (materialized) data.
So far so good. Now my use case is that I want to include legacy systems such as ALM, PLM, or CRM systems, which have their own databases, data models, and data types. So there has to be some kind of SMS2 mapping for those systems, as you did, for example, with SAP HANA for SAP ERP systems, right?
My question now is whether I would be able to build my own connectors / SMS2 mappings for the specific systems I want to include in my graph, e.g. Siemens Teamcenter. For me this is a relevant feature / requirement for the Data as a Service use case. I wasn't able to find anything in the docs regarding this.
Maybe somebody can help me out?
I am very thankful for some help, thank you very much in advance!
Welcome to the world of knowledge graphs. It's a good question that you raised. It's impossible for us to natively support integration with every possible data source. We currently expose a "service" layer inside Stardog to add connectors to third party systems. You can see an example here. One caveat is that this uses the internal SPARQL query representation. This is a very powerful approach but also presents a steep learning curve. We're working on developing a simpler API, still Java-based, which allows looking up data in a restricted format from an external system. This would allow you, for example, to look up individual items and relationships in Teamcenter and return the results in a SPARQL query. Can you share a little bit about the exact types of queries you would need to perform against Teamcenter? Are you going to be joining that data to another source, such as data stored in Stardog?
Really cool to hear that there is the possibility of writing our own connectors to include and query data from third-party systems. Yes, the SPARQL syntax is kind of complex at first, but the steep learning curve and your future work give me hope and trust in the Stardog solution!
Yeah, I can provide some more information. It's mainly about getting product data out of Teamcenter (the possible configurations, materials, and variants) and matching that with production systems to provide a digital twin. The integration of CRM data is also interesting, especially for matching reported errors with product variants for further analysis. So first of all, defining semantic models is relevant. Maybe a second question on that, where I also wasn't able to find any info:
Does Stardog support automatically creating or recommending graph structures from table structures, so that a user acting as knowledge engineer wouldn't have to deal too much with modeling the graph of one system, but could focus on connecting the created models/data with other systems' models and data? In other words: is there some AI/procedure to mine RDF models from connected systems?
After that, including/mapping the real data is one more obstacle to pass, but should be doable...
Thanks for the details. That sounds relatively straightforward for dealing with CRM and PLM data. The new API would allow bulk retrieval of data, or individual lookups given some identifying information. In the case of multiple lookups, the API call would be invoked individually for each one.
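To make the lookup behavior concrete, here is a minimal Python sketch of the idea. It is not the Stardog API (which is Java-based and was still in development when this was written); every name in it (`FAKE_PLM`, `lookup_item`, `bulk_retrieve`, `join_with_lookup`) is hypothetical and the data is invented. It only illustrates the two access patterns described above: bulk retrieval, and one lookup call per identifier when joining with other data.

```python
from typing import Callable, Dict, Iterable, List, Optional

Binding = Dict[str, str]

# Hypothetical stand-in for an external system such as a PLM.
# The contents are invented for illustration only.
FAKE_PLM: Dict[str, Binding] = {
    "item-1": {"material": "steel", "variant": "A"},
    "item-2": {"material": "aluminium", "variant": "B"},
}

# Records every lookup so the one-call-per-identifier behavior is visible.
calls: List[str] = []


def lookup_item(item_id: str) -> Optional[Binding]:
    """Individual lookup: fetch one item by its identifier."""
    calls.append(item_id)
    return FAKE_PLM.get(item_id)


def bulk_retrieve() -> List[Binding]:
    """Bulk retrieval: return every item the external system knows about."""
    return [{"item": item_id, **fields} for item_id, fields in FAKE_PLM.items()]


def join_with_lookup(
    left: Iterable[Binding],
    key: str,
    lookup: Callable[[str], Optional[Binding]],
) -> List[Binding]:
    """Join local rows with external data, invoking the lookup once per row."""
    results: List[Binding] = []
    for row in left:
        found = lookup(row[key])
        if found is not None:  # rows without a match are dropped (inner join)
            results.append({**row, **found})
    return results


# Local rows, e.g. error reports that reference product items.
local_rows = [
    {"item": "item-1", "error": "E42"},
    {"item": "item-2", "error": "E7"},
    {"item": "item-3", "error": "E9"},
]
joined = join_with_lookup(local_rows, "item", lookup_item)
```

With three local rows, `lookup_item` is called three times, once per identifier. That is the cost to keep in mind when the join produces many lookups, and it is where bulk retrieval can be the better choice.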
Regarding the creation of a graph model from tables, we do provide that in Stardog's Virtual Graph feature. The documentation on it is a little thin, but essentially you create a virtual graph by pointing Stardog to a database (this is easiest with Stardog Studio), giving an optional list of tables to include/exclude, and it will generate a mapping to RDF which will allow instantly querying the data with SPARQL.
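As a rough illustration of what generating a graph model from tables means, the Python sketch below turns table rows into RDF-style triples. It is loosely modeled on the conventions of the W3C Direct Mapping (one subject per row, one predicate per column) and is not Stardog's actual generator. The mapping Stardog produces may use different IRIs and datatypes, and the function name, base IRI, and sample row are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import quote

Triple = Tuple[str, str, str]
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def table_to_triples(
    base: str,
    table: str,
    primary_key: str,
    rows: List[Dict[str, Optional[object]]],
) -> List[Triple]:
    """Map each row to a subject IRI and each non-null column to one triple."""
    triples: List[Triple] = []
    class_iri = f"{base}{quote(table)}"
    for row in rows:
        # Subject IRI is built from the table name and the primary key value.
        key_value = quote(str(row[primary_key]))
        subject = f"{class_iri}/{quote(primary_key)}={key_value}"
        triples.append((subject, RDF_TYPE, class_iri))
        for column, value in row.items():
            if value is None:
                continue  # NULL columns produce no triple
            predicate = f"{class_iri}#{quote(column)}"
            triples.append((subject, predicate, str(value)))
    return triples


# Invented sample row from a hypothetical "part" table.
rows = [{"id": 1, "name": "Gearbox", "material": None}]
triples = table_to_triples("http://example.org/", "part", "id", rows)
for triple in triples:
    print(triple)
```

The point is that the table name becomes a class, each row becomes a node, and each column becomes a property, so the generated model mirrors the relational schema. Connecting it to the models of other systems is still a modeling task on top.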