Hierarchy level identification using only SPARQL


With SPARQL, we can easily go through a hierarchy recursively. But it is a lot harder to identify the level of each node in that hierarchy.

Anyone has a full SPARQL solution for that?


Do you have a Sparql query for your hierarchy we can start from?

In the following query:

  1. There is a reportsTo relation between employees. Each employee has a direct manager.
  2. An employee is selected by a values clause.
  3. We get all levels of managers, direct and indirect, for the selected employee.
  4. We get direct managers with the two last patterns.
select ?empl ?directMgr ?allMgr
where {
    values(?empl) {(<http://www.web.ca/data/person#525>)}
        ?empl org:reportsTo+ ?allMgr .
        ?directMgr org:reportsTo ?allMgr .
        ?empl org:reportsTo* ?directMgr .
order by   ?empl ?directMgr ?allMgr

The result looks, almost, like the following. In fact the ORDER BY clause is not able to produce that order by itself, since it would necessitate some form of windowing to refer to the previous row (like in SQL).

empl directMgr allMgr
525 525 842
525 842 225
525 225 551
525 551 626
525 626 608
525 608 555
525 555 987
525 987 202
525 202 857

We could count the number of direct managers by employee. Top managers have less. But that means many levels of sub-queries. If that is the only option, I will do it. But I would like to know if there is a shortcut.

Also, if we remove the ORDER BY CLAUSE, the order produced is valid. It goes from bottom to top.

I wonder if we can always count on that? But we are still missing the level.

Do you have some data that you can share?

So this isn't a tree but more like some matrixed org? How many managers does this poor person have?

I'm not quite clear on what you're looking for from your query. Are you looking for the shortest/longest distance from an employee to some node designated as the top of the hierarchy like the CEO?

a - reportsTo -> b - reportsTo -> c -reportsTo-> CEO
a - reportsTo -> d --------------reportsTo-----------> CEO

if you're looking for min to return

a | 2

Thanks Zachary for your answer.

Sharing data is always hard. I am sending you something that matches the query and results I have posted. It is fake data but still valid.constructEmplHierarchy_sparql_results.ttl (835 Bytes)

Each employee has a single direct manager. But his manager has himself a manager and so on. So each employee can have a second level manager who is the manager of his direct manager.

I am trying to get for each employee the hierarchy above him and to identify the level of each hierarchy node.

525 reportsTo 842 at level 1 (direct manager)
and indirectly reportsTo 225 at level 2
and indirectly reportsTo 551 at level 3
and so on

I am able to easily produce that with my query except the level. That's my goal and with multiple sub-queries I should be able to produce it.

But I wanted to know if someone had a shortcut.

Is this what you're looking for?

prefix org: <http://www.w3.org/ns/org#>
SELECT ?subordinate ?superior (count(?manager) AS ?level) WHERE {
    ?subordinate org:reportsTo+ ?manager .
    ?manager org:reportsTo+ ?superior .
} GROUP BY ?subordinate ?superior order by ?subordinate ?level

Your question reminded me of this SO post https://stackoverflow.com/questions/17523804/is-it-possible-to-get-the-position-of-an-element-in-an-rdf-collection-in-sparql

The query can be simplified with the use of the reasoner.

1 Like

Thanks Zachary, that's very smart! You are counting the number of managers between the subordinates and their superior.

It took only a very small change to make it work completely. We need a star on ?subordinate org:reportsTo* ?manager to make sure we don't skip the first level.

Thanks again