Release of String library

I'm happy to announce the 1.0 release of a string library for Stardog.

The main purpose of this library is to ease data cleanup without having to transition out of SPARQL. The library consists of the entirety of StringUtils from Apache commons-lang3, a couple of functions from commons-text, and CaseFormat from Google Guava.

Functions that normally return an array of strings instead return a single string whose elements are separated by the ASCII unit separator (\u001f). Individual elements can then be retrieved by index with indexofArray(?myStrArray, 1), or bound to results with a property function: (?str ?idx) string:array (?myStrArray)
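To make the encoding concrete, here is a minimal plain-Java sketch of the idea: elements joined into one string on U+001F and then recovered by index. The class and method names (SeparatedArray, encode, decode, elementAt) are hypothetical and are not the library's API, and the zero-based index is an assumption, since the post does not say which base indexofArray uses. It needs Java 9 or later for List.of.

```java
import java.util.List;
import java.util.regex.Pattern;

public class SeparatedArray {
    // U+001F, the separator code point given in the post.
    public static final String SEP = "\u001f";

    // Join the elements into one string, the way an array-returning
    // function is described as packing its result.
    public static String encode(List<String> elements) {
        return String.join(SEP, elements);
    }

    // Split the packed string back into its elements.
    // The -1 limit keeps trailing empty elements instead of dropping them.
    public static String[] decode(String encoded) {
        return encoded.split(Pattern.quote(SEP), -1);
    }

    // Return one element by zero-based index (the index base is an assumption).
    public static String elementAt(String encoded, int index) {
        return decode(encoded)[index];
    }

    public static void main(String[] args) {
        String encoded = encode(List.of("alpha", "beta", "gamma"));
        System.out.println(elementAt(encoded, 1));   // beta
        System.out.println(decode(encoded).length);  // 3
    }
}
```

One consequence of this design is that an element containing U+001F itself cannot be told apart from a boundary, which is presumably why a control character that rarely appears in real text was chosen.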

Currently the library doesn't honor language tags, but that is on the roadmap for a future release.

I understand that the library duplicates several functions from the standard library, but there are some things I would like to explore in a future release, and for that I need those functions available in the library.

There is currently a lack of documentation which I hope to address shortly.

Some use cases I envision for the library:

ETL and cleaning up datasets: everything from basic transformations to stripping certain types of characters. Some of these functions can be replaced by a combination of standard functions, but such combinations can be difficult to construct, which slows down quick data cleaning and makes the intent of a query hard to see at a glance.

Easily generating ASCII histograms for data exploration. I plan on releasing a future library for generating ASCII charts, box plots, etc.

Formatting output for reports: centering text, surrounding it with formatting characters, etc.
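As a rough illustration of the last two use cases, the plain-Java sketch below centers a string within a fixed width and renders one row of an ASCII histogram. It only shows the kind of string operation involved. The names ReportFormatting, center, and bar are hypothetical, and this is not how the library exposes these operations in SPARQL. It needs Java 11 or later for String.repeat.

```java
public class ReportFormatting {
    // Center text within the given width using padChar.
    // When the padding is odd, the extra character goes on the right.
    public static String center(String text, int width, char padChar) {
        int total = width - text.length();
        if (total <= 0) {
            return text; // already as wide as, or wider than, the target
        }
        int left = total / 2;
        int right = total - left;
        String pad = String.valueOf(padChar);
        return pad.repeat(left) + text + pad.repeat(right);
    }

    // Build one histogram row: a label followed by a bar whose length
    // is count scaled so that maxCount maps to maxWidth characters.
    public static String bar(String label, long count, long maxCount, int maxWidth) {
        int length = maxCount == 0
                ? 0
                : (int) Math.round((double) count * maxWidth / maxCount);
        return label + " | " + "#".repeat(length);
    }

    public static void main(String[] args) {
        System.out.println(center("abc", 7, '*'));  // **abc**
        System.out.println(bar("a", 5, 10, 20));    // a | ##########
    }
}
```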

I've registered a new GitHub repo at the following address exclusively for Stardog projects and plugins.

I have also released a new version of the previously released string metrics library that adds a couple of new functions. Phonetic algorithms will be moved to a separate plugin shortly.

Also located in the new repo is a new project called kibbles-lab. Kibbles is what I'm calling Stardog plugins. (If Maven can call plugins mojos, I can call Stardog plugins Kibbles.) Lab is short for both Laboratory and Labrador, to keep with the dog theme. I will start projects and ideas there and eventually graduate them to separate repositories.

https://github.com/semantalytics-stardog/kibbles-lab

I've found these functions to be useful in my daily routine and I hope that others will as well. Feedback is always appreciated.

Thank you to the Stardog team for building such a wonderful tool that has always been exciting and a joy to work with.

--Zachary Whitley
