Is there such a feature in the JavaScript API?
I got this error after querying all triples:
Cannot create a string from buffer longer than 0xffffff0 characters
It's a JavaScript thing. Strings can't be longer than about 256M characters (that is the 0xffffff0 in the error message).
Is there a way to stream the query results?
Or can I use limit and offset to page through the query repeatedly?
This is my query:
SELECT
  (stardog:functions:localname(?labelUri) as ?label)
  ?definition
  (stardog:functions:localname(?broaderUri) as ?broader)
  (stardog:functions:localname(?narrowerUri) as ?narrower)
  (stardog:functions:localname(?relatedUri) as ?related)
  (stardog:functions:localname(?similarUri) as ?similar)
WHERE {
  {
    ?labelUri skos:definition ?definition .
  } UNION {
    ?labelUri skos:broader ?broaderUri .
  } UNION {
    ?labelUri skos:narrower ?narrowerUri .
  } UNION {
    ?labelUri skos:related ?relatedUri .
  } UNION {
    ?labelUri skosextends:similar ?similarUri .
  }
}
The result is streamed from the server so it's up to the client to avoid buffering if you need that behavior.
What is the "client" here? The JavaScript stardog.js client?
const fs = require('fs-extra')
const _ = require('lodash')
global.constants = require('../constants')
const sparqlQuery = require('../utils/stardog')
const query = `
SELECT
(stardog:functions:localname(?labelUri) as ?label)
?definition
(stardog:functions:localname(?broaderUri) as ?broader)
(stardog:functions:localname(?narrowerUri) as ?narrower)
(stardog:functions:localname(?relatedUri) as ?related)
(stardog:functions:localname(?similarUri) as ?similar)
WHERE {
{
?labelUri skos:definition ?definition .
} UNION {
?labelUri skos:broader ?broaderUri .
} UNION {
?labelUri skos:narrower ?narrowerUri .
} UNION {
?labelUri skos:related ?relatedUri .
} UNION {
?labelUri skosextends:similar ?similarUri .
}
}
`
async function queryResult ({ limit, offset }) {
  const resp = await sparqlQuery(query, {
    reasoning: true,
    limit,
    offset,
  })
  return _.get(resp, 'body.results.bindings', [])
}
if (require.main === module) {
  async function run () {
    const data = {}
    let count = 10000
    let offset = 0
    let limit = count
    while (true) {
      console.log('query', limit)
      const result = await queryResult({ limit, offset })
      if (!result.length) break
      // if (limit > 200000) break
      for (const node of result) {
        const label = node?.label?.value
        if (!data[label]) {
          data[label] = {
            uri: label,
            definition: '',
            broader: [],
            narrower: [],
            related: [],
            similar: [],
          }
        }
        const definition = node?.definition?.value
        if (definition) data[label]['definition'] = definition
        const broader = node?.broader?.value
        if (broader) data[label]['broader'].push(broader)
        const narrower = node?.narrower?.value
        if (narrower) data[label]['narrower'].push(narrower)
        const related = node?.related?.value
        if (related) data[label]['related'].push(related)
        const similar = node?.similar?.value
        if (similar) data[label]['similar'].push(similar)
      }
      limit += count
      offset = limit
    }
    fs.writeFileSync('./test.json', JSON.stringify(data, null, 2))
  }
  run()
}
I have about 17M triples.
I want to query them all and build a static page,
but this never seems to stop, and it gets slower after each iteration.
I don't know if this is the right approach for such a scenario.
Or maybe there's a DB cursor, so I can stream it somehow?
You don't need to use stardog.js for this, but we recommend it. It does support streaming of query results. You just need to pass an onResponseStart handler to the query.execute method of stardog.js. When the response from Stardog begins, that handler will receive (as its only argument) the response stream, and you can then do what you wish with it. Also, if the handler returns false, stardog.js will do no further processing, which is what it sounds like you're going for in this case. (If the handler does not return false, then stardog.js will also try to buffer the response as usual. That is not typically the behavior you want, because the stream can't be read twice, but it can be useful if you just need to know when the response has started for other reasons and otherwise want stardog.js to proceed as usual.)
Here's an example of what this might look like using the latest version of stardog.js (the last argument to query.execute here is the important part):
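// Assumes conn (a stardog.js Connection), fs, and someFilePath are defined elsewhere.
query.execute(
  conn,
  'myDatabaseName',
  'select distinct ?s where { ?s ?p ?o }',
  undefined,
  {
    limit: 10,
  },
  {
    onResponseStart: (response) => {
      // Stream the response body directly to file. No buffering or other processing.
      const stream = fs.createWriteStream(someFilePath);
      response.body.pipe(stream);
      // Returning false tells stardog.js not to buffer or process the response itself.
      return false;
    },
  }
);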
A couple of other comments about the code you just posted. I haven't read it extremely carefully (and this isn't the place to diagnose end-user code generally), but I did notice a couple of things:
JSON.stringify is a synchronous call, and it will always force the JavaScript engine to buffer the entirety of your data into memory, with the result that you lose most of the benefits of streaming (you ultimately have to "stop the world" while all data is buffered into memory). Instead, you probably want to stream results directly to file in the way that I indicated in the example above, or use some kind of streaming JSON stringifier.
I hope this helps.
Jason
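For illustration, here is a minimal sketch of the "streaming JSON stringifier" idea mentioned above. It is not part of stardog.js; writeJsonObject is a hypothetical helper name, and `out` is assumed to be anything with a write(string) method, such as the stream returned by fs.createWriteStream. It serializes one entry at a time, so no single string ever has to hold the whole document. Note that it does not handle backpressure, and the `data` object itself still has to fit in memory.

```javascript
// Hypothetical helper (not a stardog.js API): writes an iterable of
// [key, value] pairs as one JSON object, one entry per write() call,
// instead of calling JSON.stringify on the whole object at once.
function writeJsonObject(entries, out) {
  out.write('{');
  let first = true;
  for (const [key, value] of entries) {
    // Only a single entry is stringified here, so each string stays small.
    const chunk = JSON.stringify(key) + ':' + JSON.stringify(value);
    out.write(first ? chunk : ',' + chunk);
    first = false;
  }
  out.write('}');
}

// Possible usage with the `data` object from the script above:
//   const out = require('fs').createWriteStream('./test.json');
//   writeJsonObject(Object.entries(data), out);
//   out.end();
```

The output is compact JSON rather than the pretty-printed form produced by JSON.stringify(data, null, 2).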
Oh, incrementing the limit is a mistake in my code.
Thanks,
will try onResponseStart~
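For anyone who still wants limit/offset paging, a sketch of the corrected loop might look like the following: the page size stays fixed and only the offset advances. forEachPage is a hypothetical helper, and fetchPage stands in for the queryResult({ limit, offset }) function from the script above. One caveat: per the SPARQL spec, paging with LIMIT and OFFSET is only predictable if the query also has an ORDER BY, and the query above does not have one.

```javascript
// Hypothetical helper: calls fetchPage({ limit, offset }) with a constant
// limit and an advancing offset until the results run out.
// Returns the total number of rows seen.
async function forEachPage(fetchPage, pageSize, onPage) {
  let offset = 0;
  let total = 0;
  while (true) {
    const page = await fetchPage({ limit: pageSize, offset });
    if (page.length === 0) break;
    onPage(page);
    total += page.length;
    // A short page means this was the last one.
    if (page.length < pageSize) break;
    // Advance the offset only; the limit never grows.
    offset += pageSize;
  }
  return total;
}

// Possible usage with the script above:
//   await forEachPage(queryResult, 10000, (rows) => { /* merge rows into data */ });
```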