Performance issue with textmatch

joergunbehauen · March 27, 2019, 6:11pm

Hi,

I want to query Stardog (6.1.2) on a ~35M BSBM dataset with a faceted-browsing-like query, e.g.

PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>



SELECT  distinct ?o ?label ?desc {

    
    FILTER EXISTS {
        ?resource a bsbm:Product.
        ?resource bsbm:productFeature ?feature.
        ?resource bsbm:productPropertyTextual1 ?pf1.
        ?resource bsbm:productPropertyTextual2 ?pf2.
        ?review bsbm:reviewFor ?resource.

    }
    ?resource ?p ?o.
    
    FILTER (isiri(?o))
    optional {
        ?o rdfs:label ?label
    }

    optional {
        ?o rdfs:comment ?comment

    }

    ?o rdfs:label ?osearch.
    ?osearch <tag:stardog:api:property:textMatch> "da*".
    #FILTER (contains(?osearch,"da"))

}

limit 10000

Performance drops dramatically when using textMatch compared to contains:

contains ~1.1 s; 1600 results
textMatch ~400s; 437 results

The results appear to be what is expected here, but the performance penalty is quite high.

The problem is much less pronounced when the text filtering is issued without other graph patterns, e.g.
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select distinct * {

    ?o rdfs:label ?oSearch.
    ?oSearch <tag:stardog:api:property:textMatch> "da*".
    #FILTER(contains(?oSearch,"da"))
}
limit 100000
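
For anyone who wants to time the two variants of this standalone query side by side, here is a minimal sketch using only the Python standard library. It assumes the server accepts SPARQL 1.1 Protocol POST requests with HTTP basic authentication; the endpoint URL, database name, and credentials passed to `run_comparison` are placeholders, not values from this setup. The measured time includes downloading and parsing the results, so it is only a rough client-side comparison.

```python
import base64
import json
import time
import urllib.parse
import urllib.request

PREFIXES = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"

# The two text filters being compared; the rest of the query is identical.
TEXT_FILTERS = {
    "textMatch": '?oSearch <tag:stardog:api:property:textMatch> "da*".',
    "contains": 'FILTER (contains(?oSearch, "da"))',
}


def build_query(variant, limit=100000):
    """Return the standalone label query using the chosen text filter."""
    return (
        PREFIXES
        + "SELECT DISTINCT * {\n"
        + "    ?o rdfs:label ?oSearch.\n"
        + f"    {TEXT_FILTERS[variant]}\n"
        + "}\n"
        + f"LIMIT {limit}"
    )


def time_query(endpoint, query, user, password):
    """POST the query and return (elapsed seconds, number of result rows)."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={
            "Accept": "application/sparql-results+json",
            "Authorization": f"Basic {token}",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        rows = json.load(response)["results"]["bindings"]
    return time.perf_counter() - start, len(rows)


def run_comparison(endpoint, user, password):
    """Run both variants once and print timing and row count for each."""
    for variant in TEXT_FILTERS:
        seconds, rows = time_query(endpoint, build_query(variant), user, password)
        print(f"{variant}: {seconds:.1f} s; {rows} results")
```

A call such as `run_comparison("http://localhost:5820/bsbm/query", "admin", "admin")` would run it; that URL and those credentials are assumed values and need to be adapted to the actual server.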

Note:
The double use of rdfs:label might seem odd, but since the query is generated, this can easily happen.

jess · April 10, 2019, 5:23pm

Hi Jörg,

Welcome to the forum. Thanks for the detailed report. Can you please share the query plans for the textMatch vs contains queries?

Jess
