About Stardog Rule Syntax - aggregate

Everyone,

I have triples describing cars and, in this case, rides.
Rides are described by several properties, but three matter here: the UID, the car of the ride, and the distance traveled.
An example:

Ride number 1234 - car number 5820 - distance of 20.52 miles. Please see below a sample of the triples.

<http://www.fluidops.com/resource/V2TRAJETS201706/20469> a <http://www.comp.eu/VoitureConnectee#Rides> ;
	:uid "20469"^^xsd:string ;
	<http://www.comp.eu/VoitureConnectee#carID> "608642"^^xsd:string ;
	<http://www.comp.eu/VoitureConnectee#distanceMade> "0.7300000191" .

What I want to do is fairly simple: I want to find the cars which traveled more than 300 miles.
The problem is that a ride is unique, but carID isn’t: a car can have more than one ride. For example, car 608642 traveled 0.73 miles in the ride above, 53.73 miles in another ride, and other distances in further rides.

What I thought of was creating a rule to aggregate this data. What I tried is below (let’s assume the right prefixes are declared, to keep it readable):

IF {
	?ride :carID ?carID ;
	      :distanceMade ?distanceMade.
	 BIND( SUM(?distanceMade) as ?sum)
	 FILTER(?sum >=300)
}

But it doesn’t work (the data import fails).

Moreover, I tried removing the filter. The import then works, but when I query with reasoning enabled, I get: Internal Server Error

Maybe you have an idea, because I do believe this is not the best way to do what I want to create.

Thanks a lot,
Clément

Aggregation is not supported in rules. You’d have to do it in queries instead.

Please see [1] for what can and cannot be used in rules.

Best,
Pavel

[1] http://www.stardog.com/docs/#_rule_limitations_gotchas

Thanks for the answer.

So what I have to do is move to SWRL, isn’t it?

Regards,
Clément

No, aggregation is not supported in SWRL (otherwise it’d be supported in Stardog rules). You need to use aggregation in queries, not in rules.

Best,
Pavel

Right.

But my goal here is to make Stardog deduce something like “the big travelers” (using Stardog’s reasoning option). We can imagine a query like “SELECT * WHERE { ?client a :bigTraveler}”.

So the question is: where do I put this aggregation? In queries, I understood, but do the queries go in the .ttl file?

Currently I’m working with triples in a Turtle file, and my rules are in that same file.

Thanks,
Clément

You cannot deduce that someone is a big traveller. You can only query for it by using aggregation functions directly in your SPARQL query (instead of the :bigTraveler predicate), not in your rules.

Best,
Pavel

I’m a bit frustrated.

Let’s talk about my case: if I want to find the big travelers, reasoning isn’t the best way? I just have to write a basic SPARQL query without reasoning but with an aggregation, right?

Thanks,
Clément

Yes, the aggregation part of the computation needs to be done in queries. This is a pretty standard restriction for rule languages. There are other restrictions too, for example related to negation (FILTER NOT EXISTS cannot be used in rules either). I sent you a link to the doc section which describes those. Just because the Stardog rule syntax looks like SPARQL, it doesn’t mean that you can use arbitrary SPARQL patterns there.

Deciding what goes into rules and what needs to go into queries is a basic part of modeling. Even when you have to aggregate and thus have to do it in a query, aggregation can work over inferred facts. There are cases, of course, which are all about aggregation, in which case SWRL rules cannot help.
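To illustrate aggregation over inferred facts, here is a rough sketch (not taken from the docs). The class :LongRide and the 50-mile threshold are made up for this example, and it assumes :distanceMade is stored as a numeric literal. The rule classifies individual rides, which needs no aggregation, and the query then counts the inferred rides per car.

```sparql
# Rule (in the Turtle file): classify one ride at a time, no aggregation involved.
# :LongRide and the threshold 50 are hypothetical names/values for illustration.
IF {
	?ride :distanceMade ?distanceMade .
	FILTER(?distanceMade >= 50)
}
THEN {
	?ride a :LongRide .
}

# Query (run with reasoning enabled): aggregate over the inferred :LongRide facts.
SELECT ?carID (COUNT(?ride) AS ?longRides)
WHERE {
	?ride a :LongRide ;
	      :carID ?carID .
}
GROUP BY ?carID
```

Without reasoning enabled, the query would return nothing, since no :LongRide triples are stored explicitly.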

Best,
Pavel

I see.

I was trying to show how powerful reasoning is, but I can’t find an example for my data. Indeed, in my case, everything can be (and I think should be) done in SPARQL queries…

Anyway, thanks for all @pavel
Clément

I think not being able to use aggregates in a rule might itself be an example of how powerful rules are. With great power comes great responsibility and, as stated in the documentation, rules are a double-barreled shotgun that you don’t want pointed at your feet. Not being able to use aggregates is one of these foot protection mechanisms. If you just think of it in isolation it might seem arbitrarily limiting (“Why can’t I do that? Seems simple enough.”), but you start to see some of the potential problems when you take a step back and look at the entire system. Rules and reasoning work as a complete system so that answers are computable (it isn’t going to do you any good if you get the answer in 100 years) and sound.

Even if you could do your aggregation with a rule, an argument against it would be that you’d have to recompute it every time you query for it, which wouldn’t be very efficient if you have a lot of rides. You’d probably want to keep a running total that you update when you add new data. Then your rule simply checks the total ride distance. You might even be able to do it with a custom transaction listener plugin.
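A minimal sketch of that idea, under several assumptions: the property :totalDistance and the car IRI scheme are invented for this example, the xsd prefix is declared, and the update simply recomputes every total rather than incrementing it. A SPARQL update stores the per-car sum as an ordinary triple, and the rule then only has to compare one number.

```sparql
# Update to run after loading new rides: store each car's total as a plain triple.
# :totalDistance and the "#car-" IRI scheme are hypothetical.
DELETE { ?car :totalDistance ?old }
INSERT { ?car :totalDistance ?total }
WHERE {
	{
		# carID is a string literal, so the sum is grouped on it
		# and a car IRI is minted from it below.
		SELECT ?carID (SUM(xsd:double(?distanceMade)) AS ?total)
		WHERE {
			?ride :carID ?carID ;
			      :distanceMade ?distanceMade .
		}
		GROUP BY ?carID
	}
	BIND(IRI(CONCAT("http://www.comp.eu/VoitureConnectee#car-", ?carID)) AS ?car)
	OPTIONAL { ?car :totalDistance ?old }
}

# Rule (in the Turtle file): no aggregation left, just a comparison.
IF {
	?car :totalDistance ?total .
	FILTER(?total >= 300)
}
THEN {
	?car a :bigTraveler .
}
```

With this in place, a query for `?car a :bigTraveler` with reasoning enabled would return the minted car IRIs, at the cost of having to rerun the update whenever rides change.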

@zachary.whitley , thanks for the answer.

I wasn’t seeing rules from that point of view. I thought that if you have, let’s say, 3 rules and every rule takes around 10 seconds, then if you call rule number 2, it will only take 10 seconds, and not recompute all the rules (which would take 30 seconds).

From what I understood, rules are like “predefined queries” whose results you store in a new (or existing) class.

Regards,
Clément

The way that reasoning works in Stardog is via query rewriting and not materialization. This means that any given rule will only be invoked when you run a query (with reasoning enabled) that requires it. The “results” of a given rule are not technically stored anywhere; they’re calculated on the fly, ensuring that the data are always up to date.
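As a purely conceptual illustration (this is not Stardog’s actual query plan, and :LongRide with its 50-mile threshold is a made-up rule for the example), a query over an inferred class behaves as if the rule body had been substituted into it at query time:

```sparql
# Hypothetical rule in the database.
IF {
	?ride :distanceMade ?distanceMade .
	FILTER(?distanceMade >= 50)
}
THEN {
	?ride a :LongRide .
}

# What you write (with reasoning enabled):
SELECT ?ride WHERE { ?ride a :LongRide }

# Roughly what gets evaluated: explicit :LongRide triples, if any,
# plus everything matching the rule body. Nothing is stored.
SELECT ?ride WHERE {
	{ ?ride a :LongRide }
	UNION
	{ ?ride :distanceMade ?distanceMade .
	  FILTER(?distanceMade >= 50) }
}
```

A rule that no query touches costs nothing, which is why only the rules relevant to a given query matter for its running time.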

On the fly, that’s what I wanted to say.

Again, I don’t see the power of rules.
It’s like: if a rule is only run when a query calls for it (and not all rules are run), why would aggregation in rules be suicide?

I am a little stuck in my view of things, and I cannot see why rules are powerful. But I know they are powerful, and I know there is at least one reason!

Thanks,
Clément

I don’t know why you’re now discussing something like the “power of rules”…rules, or to be more specific deduction rules, have an obvious purpose, which in fact is to encode implicit knowledge. How this knowledge is taken into account doesn’t matter. The obvious advantage is that you don’t have to add facts explicitly as stated in the conclusion of a rule based on the premise.

I’m just starting to learn about ontologies, and I still don’t see the “purpose”. Or rather, I see it, but I don’t see the usefulness.

Take the well-known example of reasoning: the family (deduce whether two persons are brothers/sisters from their parents).

To deduce that, a very simple rule will work well. But so will a simple SPARQL SELECT, won’t it? That’s why I can’t understand the “purpose”, as you say, of rules.

Thanks @lorenz_b
Regards,
Clément

I think it might help to think of rules as truth statements. Trying to focus on them firing or not firing puts too much emphasis on the implementation and not the result. Rules work together with your model (axioms, TBox, ontology or whatever you want to call it) to be logically consistent. Maybe your rule firing causes something else to be inferred and causes another rule to fire, etc. You don’t really need to worry about any of that, or about whether the outcome is logical. Stardog handles all of that. You just tell Stardog how the world should work and it will handle the consequences. So you’re not so much telling Stardog what to do as you’re telling it how the world should work.

At the risk of making Pavel’s head explode with my lame explanation, I think the problem with aggregates is that your model needs to be monotonic, which means that you can’t add information and end up with fewer logical consequences. Or thinking about it another way, what is true is true, and learning more information later won’t make it untrue. I think an aggregation like the one you describe could make it non-monotonic. Stardog doesn’t know anything about the semantics of the aggregation, just that it’s an aggregation, so the sum could just as easily add negative distances. You could add a new negative distance that would force you to retract your previous assertion that someone was a :bigTraveler. It’s not such a great example, because summing over non-negative numbers and comparing with greater-than would be monotonic, but that’s a lot of conditions and you’d have to have some way of telling Stardog all these things.

When you do it yourself with a select statement, you’re responsible for the logical consequences, which in your case, as stated previously, is probably OK, but there is no way for Stardog to guarantee that it would be OK in every case, so it restricts its use.

So instead of querying for

SELECT ?client WHERE { ?client a :bigTraveler}

you would select for

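SELECT ?client (SUM(xsd:double(?distanceMade)) AS ?sum)
WHERE {
	?ride :carID ?client ;
	      :distanceMade ?distanceMade .
}
GROUP BY ?client
# The cast is needed because :distanceMade is a plain literal in the sample data.
HAVING (SUM(xsd:double(?distanceMade)) >= 300)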

The logic is just now contained in your query rather than in the database, so someone who was given arbitrary access to the database would never see :bigTraveler and would need to know how to construct a query to get them. One of the nice things about rules is that they put that information into the database.
