Named Graphs demo feedback

Hi,
I'm writing an article on Named Graphs with Stardog as the main focus. Since this topic can be very confusing to people, I'm trying to show, step by step, how named graphs behave under different circumstances and configuration settings in Stardog.
I would appreciate your feedback on it before I increase the visibility of the article.

Thanks so much,
Regards,
Marcelo.

Hi Marcelo,

Thanks for the heads-up. I'll do that. Today is difficult but I should get it done tomorrow.

Best,
Pavel

Hi Marcelo,

This is a great post and we're sure it'll be very useful to Stardog users. We'll link to it from our Stardog Labs site. Most of my comments are pretty minor; please find them below.

The first time you mention "RDF Dataset", it's not fully clear that you mean the dataset associated with a SPARQL query, as per Section 13 of the spec. I know that many people confuse "RDF dataset" as a collection of RDF quads in the database with "RDF dataset" as a collection of graphs in the query. The next paragraph makes it clearer.

Exactly one default graph: The default graph does not have a name and may not contain any triples.

I suggest rephrasing to "may be empty in the data" so it doesn't give the false impression that the default graph is not allowed to contain any triples.

When you talk about FROM vs FROM NAMED, it'd be good to say explicitly that basic graph patterns (BGPs) outside of GRAPH ?g {} are evaluated against the default part of the RDF Dataset (i.e. the part defined using FROM), while BGPs within GRAPH {} are evaluated for each graph in the named part of the RDF Dataset (i.e. the part defined using FROM NAMED).
Misunderstanding that point is the number one reason people get confused about empty query results. You say it in

The GRAPH keyword is used to make the active graph one of all of the named graphs in the dataset for part of the query

but maybe showing a query with FROM (and no FROM NAMED) and graph {} would be a good example of a query which cannot return results.
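
A minimal sketch of what I mean, assuming a hypothetical graph IRI <http://example.org/g1> that does contain triples in the data:

```sparql
# FROM puts <http://example.org/g1> (a hypothetical IRI) into the default part
# of the RDF Dataset. There is no FROM NAMED, so the named part is empty.
SELECT ?g ?s ?p ?o
FROM <http://example.org/g1>
WHERE {
    # GRAPH ranges over the named part only, which is empty here,
    # so this pattern has nothing to match.
    GRAPH ?g { ?s ?p ?o }
}
```

Under the spec's dataset semantics this should return no rows even though the graph has triples. Changing FROM to FROM NAMED puts the graph into the named part, and the same pattern can then match.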

Counting triples in the default graph (Stardog extension)

This example is correct but looks a little strange to me. You could achieve the same with just

SELECT (count(*) as ?size)
WHERE
     { ?s ?p ?o }

(assuming query.all.graphs=false) or, if you don't want to rely on that option then

SELECT (count(*) as ?size)
FROM stardog:context:default
WHERE
     { ?s ?p ?o }

In other words, I am not sure what ORDER BY and GROUP BY are bringing to the table here since there's only one graph.

The following query is equivalent to the previous one, however it doesn’t rely on the Stardog extension.

True but with a caveat. The query

SELECT ?g (count(*) as ?size)
WHERE {
     {
          GRAPH ?g {?s ?p ?o}
     } UNION {
          ?s ?p ?o
          BIND("default" AS ?g)
     }
}
GROUP BY
    ?g
ORDER BY
    asc(?size)

does not specify its RDF Dataset. To produce the same result as above, it has to be evaluated over an RDF Dataset whose default part is the default graph in the data and whose named part is all named graphs in the data. This is a reasonable default (Stardog uses it when query.all.graphs = false), but the spec doesn't mandate it.

The query below can be used to achieve the same result. It will search for book1 across all graphs, named and default and union the results.

I'd swap the BIND and FILTER in this query to make it clearer to people less familiar with the SPARQL semantics that ?searchString is assigned before it's used in the filter.
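
Roughly this ordering. This is only a sketch and not the article's actual query; the UNION shape, the CONTAINS test on the subject, and the variable names other than ?searchString are assumptions for illustration:

```sparql
SELECT ?g ?s ?p ?o
WHERE {
    # Assign the search term first...
    BIND("book1" AS ?searchString)
    # ...match triples in the named graphs and in the default graph...
    { GRAPH ?g { ?s ?p ?o } } UNION { ?s ?p ?o }
    # ...and only then use the term in the filter.
    FILTER(CONTAINS(STR(?s), ?searchString))
}
```

A FILTER applies to the whole group it appears in, wherever it is written, so moving it does not change the results. The swap is purely for readability.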

Stardog has a database property called “Query All Graphs” (query.all.graphs), which provides the same behaviour as the stardog:context:all, but set at the database level

I'd be more explicit here. By this point you've defined the RDF Dataset as a structure with two parts (the default part and the named part), so you can say exactly how query.all.graphs affects it. With false, the default dataset will be <context:default, context:named>. With true, the default dataset will be <context:all, context:named>. And of course that option applies only when the query does not use any FROM or FROM NAMED and the dataset is not set through the SPARQL Protocol either (which is what happens when you select the graph in that drop-down list in Studio).
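
To illustrate, here is a sketch of the explicit form that a query without FROM or FROM NAMED should correspond to when query.all.graphs = false. It assumes the stardog: prefix is available in the database namespaces, as in the queries above:

```sparql
# Explicit equivalent of the default dataset with query.all.graphs = false:
# <context:default, context:named>
SELECT ?g ?s ?p ?o
FROM stardog:context:default
FROM NAMED stardog:context:named
WHERE {
    GRAPH ?g { ?s ?p ?o }
}
# With query.all.graphs = true, the default part would instead be
# FROM stardog:context:all, with the named part unchanged.
```

Spelling the dataset out like this also makes the query independent of the database option.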

Again :+1: and thanks for your work,

Pavel
