parentTriplesMap

timmisset · July 27, 2018, 7:09am

Hi,

I’m trying to use a virtual graph import from SQL Server with parent-child relationships. I have one table that describes the parents, which should all be subjects with some predicate-object maps. Another table holds the children, which also have to be added as subjects with some singular predicate-object maps and one with a parentTriplesMap to the parent.
The join is made successfully and I get a triple for every parent-child relationship. However, the child predicate-object maps are also added as many times as there are parent-child relationships.
From an SQL perspective, this is actually normal behavior (a one-to-many join returns one row per match, so the "one" side is replicated to match the "many"). However, for triples this shouldn’t be necessary and is actually counterintuitive: why would I ever want the exact same triple in my graph multiple times?

Is this something that I can configure in Stardog or in the R2RML format? (Sorry, I have not used SMS yet; if that provides a solution I will definitely switch.)

Example data:
Parent = sample container. Child = sample. A sample can be present in multiple containers. A container can only contain a single sample.

Parent works fine, only the correct triples are loaded
Child
predicateObjectMap --> samplename
predicateObjectMap --> samplebarcode
predicateObjectMap --> hasOwner --> Parent
If the sample is present in 5 containers, the samplename and samplebarcode triples are also replicated 5 times in the graph.
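
The replication described above can be reproduced outside Stardog. The following is a minimal sketch using Python's built-in `sqlite3` module, not the R2RML engine itself; the table names, column names, and IRIs are hypothetical stand-ins for the sample/container data. It generates triples row by row from a one-to-many join, which is roughly what a mapping over a joined result does.

```python
import sqlite3

# Hypothetical schema: one sample that is present in five containers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT, barcode TEXT);
    CREATE TABLE container (id INTEGER PRIMARY KEY, sample_id INTEGER);
    INSERT INTO sample VALUES (1, 'S1', 'BC-001');
    INSERT INTO container (id, sample_id)
        VALUES (10, 1), (11, 1), (12, 1), (13, 1), (14, 1);
""")

# The one-to-many join repeats the sample columns once per container.
rows = conn.execute("""
    SELECT s.id, s.name, s.barcode, c.id
    FROM sample AS s
    JOIN container AS c ON c.sample_id = s.id
    ORDER BY c.id
""").fetchall()


def triples_from_rows(joined_rows):
    """Emit three triples per joined row, without any de-duplication."""
    out = []
    for sample_id, name, barcode, container_id in joined_rows:
        subject = f"sample/{sample_id}"
        out.append((subject, "samplename", name))
        out.append((subject, "samplebarcode", barcode))
        out.append((subject, "hasOwner", f"container/{container_id}"))
    return out


generated = triples_from_rows(rows)
name_triple = ("sample/1", "samplename", "S1")

print(len(generated))               # 15 triples emitted (5 rows x 3)
print(generated.count(name_triple)) # 5 copies of the samplename triple
print(len(set(generated)))          # 7 distinct triples
```

Only the `hasOwner` triples differ per row; the name and barcode triples are identical copies produced by the join.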

I know I can simply create a totally separate <#TriplesMap> based on the same table which adds the parent-child relationships, and only add samplename and samplebarcode in the current <#TriplesMap>, but the way I understand the R2RML syntax, this shouldn’t be required.

Hope someone can help.
Kr, Tim

zachary.whitley · July 27, 2018, 12:56pm

You can't have the exact same triple in a graph multiple times. Adding the same triple multiple times only results in a single triple. Now, what it actually does and what the performance impact might be is another story. I think I see what you're getting at, and I've often wondered what the difference is between providing an explicit join and simply mapping the values and letting the query planner do it. My guess is that technically the query planner would come up with the exact same query plan and the explicit join acts like a query hint, but that's just a guess.
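
As a small illustration of the set semantics mentioned above: an RDF graph is defined as a set of triples, so a Python `set` of tuples is a reasonable stand-in. This is only a model of the abstract behavior, not of how Stardog stores or materializes data, and the IRIs are hypothetical.

```python
# Model a graph as a set of (subject, predicate, object) tuples.
graph = set()
triple = ("sample/1", "samplename", "S1")

# Add the identical triple five times, as a replicated join row would.
for _ in range(5):
    graph.add(triple)

print(len(graph))  # 1
```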

I don't think that SMS provides any more expressiveness than R2RML. It's mostly easier to read and write. (it might be somewhat less expressive as I believe there were some issues with mappings and blank nodes but that may have been fixed)

timmisset · July 30, 2018, 11:19am

Hi Zachary,

Thanks for your answer. The issue described actually occurred when using virtual add, not virtual import. When using virtual import, the triple duplication is indeed negated because the triple constantly overwrites itself.
I have discussed this with support, and it is a performance decision not to include DISTINCT in this situation. I will simply split this part into two maps, separating the predicate-object maps that are supposed to be singular from the others.
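
The effect of that split can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumed table and column names, not the actual R2RML mapping: one query reads the singular attributes from the sample table alone, and a second query uses the join only for the relationship, so no duplicate triples are generated and no DISTINCT is needed.

```python
import sqlite3

# Hypothetical schema: one sample that is present in five containers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT, barcode TEXT);
    CREATE TABLE container (id INTEGER PRIMARY KEY, sample_id INTEGER);
    INSERT INTO sample VALUES (1, 'S1', 'BC-001');
    INSERT INTO container (id, sample_id)
        VALUES (10, 1), (11, 1), (12, 1), (13, 1), (14, 1);
""")

# Map 1: singular attributes come from the sample table only, no join.
attribute_rows = conn.execute(
    "SELECT id, name, barcode FROM sample"
).fetchall()

# Map 2: the join is used only for the sample-to-container relationship.
link_rows = conn.execute("""
    SELECT s.id, c.id
    FROM sample AS s
    JOIN container AS c ON c.sample_id = s.id
    ORDER BY c.id
""").fetchall()

triples = []
for sample_id, name, barcode in attribute_rows:
    subject = f"sample/{sample_id}"
    triples.append((subject, "samplename", name))
    triples.append((subject, "samplebarcode", barcode))
for sample_id, container_id in link_rows:
    triples.append((f"sample/{sample_id}", "hasOwner", f"container/{container_id}"))

print(len(triples))                       # 7
print(len(triples) == len(set(triples)))  # True
```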

Kr, Tim
