Specify Path Length in Path Queries

I enjoy the Path query feature a lot. It is powerful, and it (or something like it) should be integrated into the core SPARQL specification (hopefully it will be in 1.2).

Most graph algorithm implementations let you select a property attribute to use as the length metric (admittedly, they usually operate on a property graph, where that is more convenient). However, unless I'm missing something, Stardog's path queries are limited in that the only possible metric for path length is the number of "hops" made, whether each hop is specified by a single predicate, a SPARQL property path, or a SPARQL query pattern.

In cases where the path is specified by a SPARQL pattern, it would be very useful if you could bind an additional variable to use as the length metric instead of always using the number of hops.

For example, consider a database of airline flights. If I wanted to find the "shortest" route from New York to San Francisco, I would do something like the following:

PATHS
START ?x = :JFK
END ?y = :SFO
VIA {
  ?flight a :Flight ;
    :from ?x ;
    :to ?y ;
  .
}

This query would work, but only if my length metric of interest was the number of connecting flights. I cannot easily find the shortest route by any other critical metric, including but not limited to:

  • Cost
  • Flight time
  • Distance traveled
  • Total travel time (the query would have to be modified to support layovers, but the point still stands)

I could imagine something like the following, which extends the query above to use cost, for example, as the length metric instead of the number of hops, by naming a variable bound in the query pattern as the length value:

PATHS
START ?x = :JFK
END ?y = :SFO
VIA {
  ?flight a :Flight ;
    :from ?x ;
    :to ?y ;
    :price ?price ;
  .
}
LENGTH ?price
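
To make the difference between the two metrics concrete, here is a minimal Python sketch of what a weighted shortest-path search would compute, run outside Stardog. It is only an illustration: the `FLIGHTS` data is made up, `cheapest_path` is a hypothetical helper, and Dijkstra's algorithm is my assumption about how such a metric could be evaluated, not a description of how Stardog works or would implement `LENGTH`.

```python
import heapq

# Hypothetical toy data: (origin, destination, price). Not real fares.
FLIGHTS = [
    ("JFK", "SFO", 450),
    ("JFK", "ORD", 120),
    ("ORD", "SFO", 180),
    ("JFK", "DEN", 200),
    ("DEN", "SFO", 150),
]


def cheapest_path(flights, start, end, weight=lambda f: f[2]):
    """Dijkstra's algorithm; `weight` maps a flight to a non-negative length."""
    graph = {}
    for flight in flights:
        graph.setdefault(flight[0], []).append(flight)

    # Heap entries: (total length so far, airport, airports visited so far)
    queue = [(0, start, [start])]
    settled = set()
    while queue:
        total, airport, path = heapq.heappop(queue)
        if airport == end:
            return total, path
        if airport in settled:
            continue
        settled.add(airport)
        for flight in graph.get(airport, []):
            destination = flight[1]
            if destination not in settled:
                heapq.heappush(
                    queue, (total + weight(flight), destination, path + [destination])
                )
    return None  # no route exists


# Every flight counts as 1: this is the "number of hops" metric.
by_hops = cheapest_path(FLIGHTS, "JFK", "SFO", weight=lambda f: 1)
# Every flight counts as its price: the metric the LENGTH idea would enable.
by_price = cheapest_path(FLIGHTS, "JFK", "SFO")

print(by_hops)   # (1, ['JFK', 'SFO'])
print(by_price)  # (300, ['JFK', 'ORD', 'SFO'])
```

With this toy data the hop-based search picks the direct flight, while the price-based search prefers the connection through ORD, which is exactly the kind of answer the hop metric cannot produce.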

I suppose it would be possible to save the current path query as a stored query and then use it as a subquery, calculating the path length with whatever metric you want after the fact, but that can't be the most efficient or reliable method. Furthermore, it would be much more convenient if you could use a Path query as a subquery directly without having to deal with the stored query.
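
As a rough illustration of why the after-the-fact approach worries me, here is a small Python sketch of the same idea outside Stardog: enumerate every cycle-free route first, then sum a metric per route and pick the minimum. The `FLIGHTS` data is made up and `all_simple_paths` is a hypothetical helper; it stands in for whatever the path query would return and is not Stardog's API.

```python
# Hypothetical toy data: (origin, destination, price). Not real fares.
FLIGHTS = [
    ("JFK", "SFO", 450),
    ("JFK", "ORD", 120),
    ("ORD", "SFO", 180),
    ("JFK", "DEN", 200),
    ("DEN", "SFO", 150),
]


def all_simple_paths(flights, current, end, visited=None):
    """Yield every cycle-free list of flights leading from `current` to `end`."""
    visited = (visited or set()) | {current}
    if current == end:
        yield []
        return
    for flight in flights:
        origin, destination, _price = flight
        if origin == current and destination not in visited:
            for rest in all_simple_paths(flights, destination, end, visited):
                yield [flight] + rest


def total_price(path):
    """Sum the price of every flight on a path."""
    return sum(price for _, _, price in path)


# Step 1: materialize every route, regardless of how expensive it is.
paths = list(all_simple_paths(FLIGHTS, "JFK", "SFO"))
# Step 2: only now apply the metric of interest and keep the best route.
cheapest = min(paths, key=total_price)

print(len(paths), total_price(cheapest))
```

The search cannot prune anything by cost, so every route has to be produced before the cheapest one is known, and the number of routes can grow very quickly on a realistic flight network.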

Also, being able to specify a minimum length would make this even more useful, as discussed here: Variable Path Queries - Support - Stardog Community

Hi Matt,

Yes, you're correct on all points. It'd be a good feature, and we've talked about it internally for a while, but so far we haven't gotten around to adding it to the product. We may well do so later, depending on customer demand.

Also, thanks for the thought on path subqueries. That might indeed be something for us to consider. It would probably require a syntax extension to the standard SPARQL query forms (e.g. SELECT), something we have avoided so far.

Thanks,
Pavel