Stardog WebFunction plugin version 1.0.2 Funkalicious

I've just released a new version 1.0.2 of the WebFunction plugin. Please note the repo name has changed as well as the jar file.

In addition to the GitHub repo, the plugin can also be found at

http://wf.semantalytics.com/ipns/k51qzi5uqu5dmf0b5flbl31mjgj2wfjj8i6lmzw07dksiiv1y4ymipl5i55hzd/stardog/plugin

The main changes with this release are

  • Fixed a bug where memory was being over-allocated. You wouldn't have noticed the problem unless you tried to load large files, as I have been doing with some experimental TensorFlow support. Now that it's fixed, you might be seeing that support soon.

  • Resolve the wasm binary by the sha256 of the plugin jar rather than by version number. To make things more robust against possibly breaking changes, I decided to resolve the WebAssembly to run based on the sha256 of the plugin jar rather than on a version number that identifies the WebAssembly API. This adds work on the back end to re-publish all plugins for each plugin release, but it ensures that every function is tested against a new plugin jar. I've automated the CI/CD pipeline enough that this shouldn't be a problem.

  • Updated the default IPFS gateway from gateway.ipfs.io to wf.semantalytics.com. IPNS resolution is unfortunately very slow, leading to unnecessarily long delays in resolving content. Resolving directly against wf.semantalytics.com should solve that problem.

  • Added the wf:get function. This was added to support downloading custom TensorFlow or other machine learning models.

  • Increased the size of the AWS instance hosting wf.semantalytics.com

  • Enabled SSL on the gateway at wf.semantalytics.com
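
To make the sha256-based resolution from the list above a bit more concrete, here is a minimal Python sketch of the idea: hash the plugin jar, then use the digest as the lookup key for the wasm binary. The plugin's actual lookup scheme isn't described in this post, so GATEWAY_ROOT, the path layout, and the function names below are all hypothetical.

```python
import hashlib
from pathlib import Path

# Assumption: this root and the path layout are illustrative only;
# the plugin's real lookup scheme is not documented in this post.
GATEWAY_ROOT = "http://gateway.example/wasm"


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so a large jar never sits in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wasm_location(jar_path: Path, function_name: str) -> str:
    """Build a hypothetical wasm location keyed by the jar's digest."""
    return f"{GATEWAY_ROOT}/{sha256_of_file(jar_path)}/{function_name}.wasm"
```

Because the key is derived from the jar's bytes, any change to the jar yields a different key, which is why every function has to be re-published for each plugin release.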

"So where are all these functions I keep hearing about?"

I've spent most of my time behind the scenes automating the build and deployment, and it's starting to come together. New functions can now be released with a single command. Before they're released, a snapshot is generated that can be found at http://wf.semantalytics.com/ipns/k51qzi5uqu5dmf0b5flbl31mjgj2wfjj8i6lmzw07dksiiv1y4ymipl5i55hzd/snapshot/ with a folder for the date of the snapshot. If there's ever a problem with a release, you can always point to a snapshot directory to get an old version. If you're concerned, stay tuned: I'm working on a tutorial on using IPFS to keep you running even when things change.

The first plugin functions I made for Stardog were some string comparison functions, so I thought I'd start there. Several of the most common ones can be found at http://wf.semantalytics.com/ipns/k51qzi5uqu5dmf0b5flbl31mjgj2wfjj8i6lmzw07dksiiv1y4ymipl5i55hzd/stardog/function/string/similarity

Example query:

prefix wf: <http://semantalytics.com/2021/03/ns/stardog/webfunction/>
prefix strsim: <http://wf.semantalytics.com/ipns/k51qzi5uqu5dmf0b5flbl31mjgj2wfjj8i6lmzw07dksiiv1y4ymipl5i55hzd/stardog/function/string/similarity/>

select * where {bind(wf:call(strsim:damerauLevenshtein, "stardog", "starship") as ?result)}
+--------+
| result |
+--------+
| "4"    |
+--------+
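
If you want to sanity-check that result outside Stardog, here is a small Python sketch. It implements the restricted Damerau-Levenshtein distance (optimal string alignment); whether the published function uses this restricted variant or the unrestricted one is an assumption on my part. The two variants agree on this pair of strings, since no transposition is involved.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] is the distance between the first i chars of a and first j of b.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("stardog", "starship"))  # prints 4
```

The two strings share the prefix "star", and "dog" and "ship" have no characters in common, so the cheapest edit is three substitutions plus one insertion.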

You can also request the function from IPNS:

prefix wf: <http://semantalytics.com/2021/03/ns/stardog/webfunction/>
prefix strsim: <ipns://wf.semantalytics.com/ipns/k51qzi5uqu5dmf0b5flbl31mjgj2wfjj8i6lmzw07dksiiv1y4ymipl5i55hzd/stardog/function/string/similarity/>

select * where {bind(wf:call(strsim:damerauLevenshtein, "stardog", "starship") as ?result)}

What's coming next

  • Tutorial on using and setting up IPFS.
  • Functions using deep learning models.
  • Property functions and custom service endpoints. Large deep learning models should be more performant when run as property functions or custom service endpoints.
  • I'm working on a prototype to do BLAST genetic sequence alignment via webfunctions. If there are any Bio people out there that are interested please let me know.
  • You should see a lot more functions available: string, math, JSON, NLP, object detection, arrays, etc. Requests get priority.

I've been poking around with an example to answer the question, "So what do I do with these string distance functions anyhow?"

How about computing the edit distance on perceptual hashes (phash) of images possibly stored in Stardog docs? That's kinda cool. (Note: I realize that this is not only an old way of doing it but also not very performant. You want to be building an ANN index. I'll get to that later. This is just an example.)

prefix wf: <http://semantalytics.com/2021/03/ns/stardog/webfunction/>
prefix f: <file:///tmp/>
prefix strsim: <http://wf.semantalytics.com/ipns/k51qzi5uqu5dmf0b5flbl31mjgj2wfjj8i6lmzw07dksiiv1y4ymipl5i55hzd/stardog/function/string/similarity/>

select
	(wf:call(str(f:phash.wasm), wf:get(?img1)) as ?hash_1)
	(wf:call(str(f:phash.wasm), wf:get(?img2)) as ?hash_2)
	(wf:call(strsim:jaroWinkler, wf:call(str(f:phash.wasm), wf:get(?img1)), wf:call(str(f:phash.wasm), wf:get(?img2))) as ?dist)
where {
	select * where {
		values ?img1 {
			<https://kb.rspca.org.au/wp-content/uploads/2018/11/golder-retriever-puppy.jpeg>
			<https://www.thesprucepets.com/thmb/EhX1rvgbEhzPlTQe-fE1o6R44OA=/941x0/filters:no_upscale():max_bytes(150000):strip_icc():format(webp)/most-obedient-dog-breeds-4796922-hero-4440a0ccec0e42c98c5e58821fc9f165.jpg>
			<https://media.nature.com/lw800/magazine-assets/d41586-020-01430-5/d41586-020-01430-5_17977552.jpg>
			<https://cdn.britannica.com/49/161649-050-3F458ECF/Bernese-mountain-dog-grass.jpg>
		}

		values ?img2 {
			<https://kb.rspca.org.au/wp-content/uploads/2018/11/golder-retriever-puppy.jpeg>
		}
	}
}
+----------------+----------------+----------------------+
|     hash_1     |     hash_2     |         dist         |
+----------------+----------------+----------------------+
| "GjkVETs1aeI=" | "GjkVETs1aeI=" | "1.0"                |
| "Bw8FYfnB4qQ=" | "GjkVETs1aeI=" | "0.3888888888888889" |
| "NUzM7LT04eE=" | "GjkVETs1aeI=" | "0.5"                |
| "aczxczgoaFk=" | "GjkVETs1aeI=" | "0.4444444444444444" |
+----------------+----------------+----------------------+

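I haven't released the phash function yet. I want to look over the results first and make sure it's working correctly. Note that jaroWinkler returns a similarity rather than a distance, which is why the identical image pair scores 1.0 in the ?dist column.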
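
For comparison, perceptual hashes are more commonly compared bit by bit with a Hamming distance than with a string metric. Here is a minimal Python sketch of that alternative. It is a different technique from the jaroWinkler call in the query above, and it assumes the values in the table are base64-encoded fixed-length bit strings, which is my reading of the output rather than something the phash function documents.

```python
import base64


def phash_hamming(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two base64-encoded hashes.

    Assumption: both inputs decode to byte strings of equal length.
    """
    raw_a = base64.b64decode(hash_a)
    raw_b = base64.b64decode(hash_b)
    if len(raw_a) != len(raw_b):
        raise ValueError("hashes must decode to the same number of bytes")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return sum(bin(x ^ y).count("1") for x, y in zip(raw_a, raw_b))


# Hashes copied from the result table above.
reference = "GjkVETs1aeI="
for candidate in ["GjkVETs1aeI=", "Bw8FYfnB4qQ=", "NUzM7LT04eE=", "aczxczgoaFk="]:
    print(candidate, phash_hamming(candidate, reference))
```

Here a smaller number means more similar images, with 0 for identical hashes, which is the opposite direction from the similarity score in the table.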