Escape quotes in uri

bsteenwi · May 4, 2020, 8:17pm

Hi,

I've tried to execute the following sparql query in stardog studio:

 select ?o ?p WHERE {
     <http://yago-knowledge.org/resource/Leroy_"Twist"_Casey> ?o ?p
}

This gives an error. But when I try:

 select ?o ?p WHERE {
     <http://yago-knowledge.org/resource/Leroy_%22Twist%22_Casey> ?o ?p
}

I get no results... (and I am sure the db has some relations for this uri)

Do I have to enable url encoding somewhere?

lorenz_b · May 5, 2020, 6:37am

Are you sure that the data is in the knowledge base? Anyway, any parser should fail if the raw data being loaded contained

http://yago-knowledge.org/resource/Leroy_"Twist"_Casey

so ideally the " would have been percent-encoded in the source data. I'm wondering whether those lines were silently ignored, but I don't know if that is possible in Stardog.

If you know that some data is there, maybe you can first try to find the resource backwards, i.e. based on some relation and value? Maybe some label via rdfs:label or something similar?
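A small illustration of why the percent-encoded query can come back empty even when the data is there: RDF compares IRIs as plain strings, without percent-decoding, so the %22 form and the raw " form are two different identifiers. The sketch below uses only Python's standard urllib; the variable names are made up for the example, and it says nothing about what Stardog does internally.

```python
from urllib.parse import quote, unquote

BASE = "http://yago-knowledge.org/resource/"
local_name = 'Leroy_"Twist"_Casey'

# The IRI as it appears in the loaded data (raw double quotes).
raw_iri = BASE + local_name

# The IRI as written in the second query (double quotes percent-encoded).
encoded_iri = BASE + quote(local_name)

print(encoded_iri)
# Plain string comparison: the two forms are not the same identifier.
print(raw_iri == encoded_iri)
# Decoding the encoded form gives back the raw form, but a triple store
# is not expected to do this decoding when matching IRIs.
print(unquote(encoded_iri) == raw_iri)
```

So if the store kept the IRI with literal quotes, a query for the %22 variant asks about a resource that was never loaded.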

bsteenwi · May 5, 2020, 7:05am

Yeah, I found these URIs by running:

 select ?s ?p ?o WHERE {
    ?s ?p ?o .
}

And I am quite sure the loading process went smoothly (no errors in the logs when adding the data).
I have attached a simple example ttl file in case someone wants to reproduce this problem.
test.ttl (777 Bytes)

Querying the data using a different query is not really an option for my use case...
It would be better for me to remove all these special characters from the data, but that is a rather ugly solution (the data is loaded from the yago2 benchmark dataset).

lorenz_b · May 5, 2020, 7:26am

well, I tested your test dataset with the Apache Jena toolkit from cli via
riot --output=N-Triples test.ttl
and it fails as expected with

09:23:45 ERROR riot                 :: [line: 20, col: 9 ] Illegal character in IRI (codepoint 0x22, '"'): <Leroy_["]...>

so I'm wondering why the Stardog loader does not fail here. But maybe I'm missing something, so I'd wait until the smarter people here and the Stardog devs can help you. It should not take long, those people are fast and great.

bsteenwi · May 5, 2020, 7:39am

Ok, thanks for testing.
You set the output argument to N-Triples while the example file is in Turtle format, but I guess the error will be the same.

I have loaded the dataset directly using the command line (and a second time using Stardog studio)
I will wait for the Stardog people to help

lorenz_b · May 5, 2020, 8:41am

it's just the output format in the command line after successful parsing

Well, maybe Apache Jena is too strict (which I doubt) or maybe Stardog is too relaxed. Or it does percent encoding for you, I don't know. But at least the parsing of the SPARQL query fails when using quotes in URIs, and thus you can't query it.
I'm interested in the solution as well, always happy to learn.
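The query-side failure is consistent with the SPARQL 1.1 and Turtle grammars, where the IRIREF token between < and > may not contain control characters, space, or any of < > " { } | ^ ` \. Below is a rough checker for that rule. The function name is hypothetical, and it is deliberately stricter than the grammar in one respect: it flags every backslash, although the grammar allows \uXXXX escapes.

```python
import re

# Characters the IRIREF production forbids between < and >:
# U+0000 through U+0020 (controls and space), plus < > " { } | ^ ` \
# Simplification: \uXXXX escapes are legal in the grammar but flagged here.
_ILLEGAL = re.compile(r'[\x00-\x20<>"{}|^`\\]')

def illegal_iri_chars(iri: str) -> list:
    """Return (position, character) pairs that cannot appear inside <...>."""
    return [(m.start(), m.group()) for m in _ILLEGAL.finditer(iri)]

base = "http://yago-knowledge.org/resource/"
print(illegal_iri_chars(base + 'Leroy_"Twist"_Casey'))      # double quotes are flagged
print(illegal_iri_chars(base + "Leroy_'Twist'_Casey"))      # single quotes are allowed
print(illegal_iri_chars(base + "Leroy_%22Twist%22_Casey"))  # percent-encoded form is clean
```

A check like this could be run over the IRIs in a dataset before loading, to find the ones that cannot be written directly in a query.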

zachary.whitley · May 5, 2020, 1:49pm

I'm running 7.2.0 and can successfully load the test file with the double quotes

$> stardog data add test test.ttl

$> stardog query test 'select * { ?s ?p ?o }'

and there it is, quotes and all...

|                           s                            |                      p                       |                    o                    |
+--------------------------------------------------------+----------------------------------------------+-----------------------------------------+
| http://yago-knowledge.org/resource/Leroy_"Twist"_Casey | http://yago-knowledge.org/resource/hasGender | http://yago-knowledge.org/resource/male |
+--------------------------------------------------------+----------------------------------------------+-----------------------------------------+

Stardog definitely does some checks: if you change the double quotes to spaces it complains, but if you change them to single quotes it loads the file as well.

I suspect that it complains about the space not because it's an invalid URL character but because it causes a parse error, and that it's probably not validating IRIs because that would be very resource intensive when loading data.

bsteenwi · May 5, 2020, 2:02pm

Yeah but it is not the loading that bothers me, it is more how I can query this data with the predefined uri:

select ?o ?p WHERE { <http://yago-knowledge.org/resource/Leroy_"Twist"_Casey> ?o ?p }

zachary.whitley · May 5, 2020, 2:36pm

You can try using the IRI function and then filtering it. I’m on my phone but I’ll give it a try as soon as I can.

bsteenwi · May 5, 2020, 2:44pm

Oh wow, thanks, that does indeed work:

select distinct ?p ?o WHERE {
    BIND( IRI('http://yago-knowledge.org/resource/Leroy_"Twist"_Casey') AS ?t )
    ?t ?p ?o .
}
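If the query is generated from code, the workaround can be wrapped in a small helper. This is only a sketch: iri_lookup_query is a made-up name, and it assumes the target store accepts IRI() on such strings, as Stardog did in this thread. It writes the IRI as a double-quoted SPARQL string literal and escapes backslashes, double quotes and line breaks, so IRIs that contain a single quote are covered too.

```python
def iri_lookup_query(iri: str) -> str:
    """Build a SPARQL query that looks up an IRI via IRI() on a string literal."""
    # Escape for a "..." SPARQL literal: backslash first, then the rest.
    literal = (
        iri.replace("\\", "\\\\")
           .replace('"', '\\"')
           .replace("\n", "\\n")
           .replace("\r", "\\r")
    )
    return (
        "SELECT DISTINCT ?p ?o WHERE {\n"
        f'    BIND( IRI("{literal}") AS ?t )\n'
        "    ?t ?p ?o .\n"
        "}"
    )

query = iri_lookup_query('http://yago-knowledge.org/resource/Leroy_"Twist"_Casey')
print(query)
```

The printed query has the same shape as the one above, with the inner quotes written as \" inside the literal.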

lorenz_b · May 6, 2020, 7:10am

Honestly, for me this is more of a workaround, not that I understand why the query parser doesn't fail at this point. That looks like inconsistent behaviour, doesn't it?
I mean, ideally you shouldn't be able to load ill-formed data (ok, ideally there is no ill-formed data at all). I understand that the parsing is expensive, but here we have a case where loading works and querying then doesn't (without a workaround). So one could ask why the query parser isn't as forgiving as the parser used during loading. We don't have a proper round trip here.
Any ideas/comments on this?

bsteenwi · May 6, 2020, 8:39pm

Yeah, I agree with your statement.
There is somehow a mismatch between the query parser and the data parser (which should be handled more appropriately, imo).

system · May 20, 2020, 8:39pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
