I started exploring the new "Similarity Search" feature. Before using it on my own data, I thought it would be good to try it first on the movie dataset, so I tried to replicate the steps from the blog post "Similarity Search | Stardog". However, I did not succeed. I loaded the movie dataset into a brand new repository in Stardog 5.3.3.
Then I ran the INSERT query from the blog post:
prefix spa: <tag:stardog:api:analytics:>
INSERT {
    graph spa:model {
        :simModel a spa:SimilarityModel ;
            spa:arguments (?genres ?directors ?writers ?producers ?metaCritic) ;
            spa:predict ?movie .
    }
}
WHERE {
    SELECT
        (agg:spa:set(?genre) as ?genres)
        (agg:spa:set(?director) as ?directors)
        (agg:spa:set(?writer) as ?writers)
        (agg:spa:set(?producer) as ?producers)
        ?metaCritic
        ?movie
    {
        ?movie :genre ?genre ;
            :director ?director ;
            :author ?writer ;
            :productionCompany ?producer ;
            :metaCritic ?metaCritic .
    }
    GROUP BY ?movie ?metaCritic
}
And the select query:
prefix spa: <tag:stardog:api:analytics:>
SELECT ?similarMovieLabel ?confidence
WHERE {
    graph spa:model {
        :simModel spa:arguments (?genres ?directors ?writers ?producers ?metaCritic) ;
            spa:confidence ?confidence ;
            spa:parameters [ spa:limit 5 ] ;
            spa:predict ?similarMovie .
    }
    { ?similarMovie rdfs:label ?similarMovieLabel }
    {
        SELECT
            (agg:spa:set(?genre) as ?genres)
            (agg:spa:set(?director) as ?directors)
            (agg:spa:set(?writer) as ?writers)
            (agg:spa:set(?producer) as ?producers)
            ?metaCritic
            ?movie
        {
            ?movie :genre ?genre ;
                :director ?director ;
                :author ?writer ;
                :productionCompany ?producer ;
                :metaCritic ?metaCritic .
            VALUES ?movie { t:tt0118715 } # The Big Lebowski
        }
        GROUP BY ?movie ?metaCritic
    }
}
ORDER BY DESC(?confidence)
Something that might be relevant: when running the query in Stardog I get an "unknown prefix" error on agg:. Therefore I added "PREFIX agg: <urn:aggregate>" to both queries.
After running the select query I get 0 results. Do I have to adjust some settings in the repository, or should it work right out of the box?
The "unknown prefix" error on agg: seems to indicate that you are using the deprecated webconsole. The recommended way to query Stardog is Stardog Studio, which is actively maintained and doesn't suffer from this kind of error.
With the webconsole, you need to add prefix agg: <urn:aggregate> to the query, and it should work (just tested it). The webconsole has some issues with prefixes, so I recommend using Studio instead. If that's not possible, can you share the results of running this query on your database?
SELECT
    (agg:spa:set(?genre) as ?genres)
    (agg:spa:set(?director) as ?directors)
    (agg:spa:set(?writer) as ?writers)
    (agg:spa:set(?producer) as ?producers)
    ?metaCritic
    ?movie
{
    ?movie :genre ?genre ;
        :director ?director ;
        :author ?writer ;
        :productionCompany ?producer ;
        :metaCritic ?metaCritic .
    VALUES ?movie { t:tt0118715 } # The Big Lebowski
}
GROUP BY ?movie ?metaCritic
Can’t replicate that behaviour locally. Maybe add a prefix : <http://schema.org/> to the query? It’s hard to say. As I said, webconsole has serious issues with prefixes, and it will be gone from Stardog very soon. Let me know if you encounter the same issues when running it with Studio.
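A quick way to test the prefix hypothesis is a minimal query that declares the default prefix explicitly. This is only a sketch; it assumes the loaded data uses the http://schema.org/ namespace for :genre, as suggested above.

```sparql
# Declare the default prefix explicitly instead of relying on
# the namespaces stored in the database.
prefix : <http://schema.org/>

SELECT ?movie ?genre
WHERE {
  ?movie :genre ?genre .
}
LIMIT 5
```

If this returns rows while the same query without the prefix line returns nothing (or fails), the default namespace is the likely culprit.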
OK, I will try the same steps in Studio. What are the future plans for the webconsole? Will it eventually be removed from the online environment?
I started over in Stardog Studio and got the same empty results. I created a new repository, loaded the movie data, and ran the insert query to create the model, followed by the select query.
Solved it by adding prefix : <http://schema.org/>, which was apparently the issue!
Ah, I assumed that you had bulk loaded the movies.ttl file, which would import the namespaces in the file. By doing a data add, they are not added to the list of default namespaces, so they have to be manually specified in the query.
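To see which namespace the loaded data actually uses, one option is to list the predicates with their full IRIs. A minimal sketch that needs no prefix declarations at all:

```sparql
# List distinct predicates as full IRIs; this does not depend on
# any namespaces stored in the database.
SELECT DISTINCT ?p
WHERE {
  ?s ?p ?o .
}
LIMIT 50
```

If the results show IRIs such as http://schema.org/genre, then :genre in a query only matches when : is bound to http://schema.org/, either in the database's namespaces or via an explicit prefix line in the query.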