Stardog support for custom datatypes (CDT) related to units

Hi, I was looking into units for literal values, and I came across several methods such as QUDT / OM / ..., but also CDT:UCUM (Custom Datatypes), which seems really useful for applications that have to deal with a lot of property values with different units. It's based on the UCUM specification and covers a lot of units, both metric and imperial.

There is a playground that demonstrates the functionality, as well as a draft specification and an implementation in Jena.

So my question to the community: how do you practically deal with units in Stardog?

To the developers: is CDT or other approaches something on the to-do list?

Curious to hear your answers!

Mathias

That's a good question, and I'm looking forward to hearing what other people have to say. I've had some questions about datatypes that I've been meaning to ask but haven't. Thanks for sharing the project. I wasn't aware of it, and it looks interesting.

Hi Zachary, I recommend going to that playground and exploring their way of dealing with datatype properties and units. It would be great if Stardog were able to support this :)

I don't think you can define custom datatypes in Stardog, nor can you override operators (+, -, %, *, etc.). I think you can define datatype facets (I think that's the correct terminology; I find it somewhat confusing), although I'm not clear how a datatype facet would work in ICV vs. OWL. Would it be basically the same: a violation in ICV vs. an inconsistency in OWL? I suppose you could write a function to extract the value and the units from literals, which could be used in rules to provide equivalent values in other units. What is the advantage of encoding the units in the literal rather than providing separate URLs for units?
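The idea of a function that extracts the value and the unit from a literal can be sketched in a few lines. This is only an illustration in plain Python, not a Stardog extension: the function name `split_ucum_literal` is hypothetical, and it assumes the lexical form is a number, a space, and a UCUM unit code, as in the example literal used later in this thread.

```python
from decimal import Decimal
from typing import Tuple


def split_ucum_literal(lexical: str) -> Tuple[Decimal, str]:
    """Split a lexical form such as '0.27 W/(m2.K)' into (value, unit)."""
    # Assumption: the number and the unit code are separated by whitespace.
    value_text, _, unit = lexical.strip().partition(" ")
    unit = unit.strip()
    if not unit:
        raise ValueError(f"no unit found in {lexical!r}")
    # Decimal keeps the value exactly as written in the literal.
    return Decimal(value_text), unit


value, unit = split_ucum_literal("0.27 W/(m2.K)")
print(value, unit)
```

A rule or query function built on something like this would still need a conversion table per unit, which is the part a datatype-aware store would standardize.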

By separate URLs for units, do you mean the QUDT / OM approach? The advantage would be that you can make the overall DB structure more concise. Instead of the following in QUDT:

<FoI> ex:thermalTransmittance [
   a qudt11:QuantityValue ;
   qudt11:unit qudtunit11:WattPerSquareMeterKelvin ;
   qudt11:numericValue "0.27"^^xsd:double
] .

You can define the same in one triple using CDT:

<FoI> ex:thermalTransmittance "0.27 W/(m2.K)"^^cdt:ucum .

Another benefit is that the same property can have values in different units, and you can still query them with a simple filter without having to handle the unit conversions yourself (this would be standardized). It just seems less cumbersome.
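To make the "simple filter" point concrete, here is a rough sketch of the work a unit-aware filter would do behind the scenes. It is plain Python rather than SPARQL, the `TO_METRES` table and the `ex:door…` data are made up for illustration, and only a handful of UCUM length codes are covered.

```python
# Hypothetical conversion table: UCUM length code -> factor to metres.
TO_METRES = {
    "m": 1.0,
    "cm": 0.01,
    "mm": 0.001,
    "[in_i]": 0.0254,  # international inch
    "[ft_i]": 0.3048,  # international foot
}


def to_metres(lexical: str) -> float:
    """Normalize a '<number> <unit>' lexical form to metres."""
    value_text, _, unit = lexical.strip().partition(" ")
    return float(value_text) * TO_METRES[unit]


# Made-up values of one property, each recorded in a different unit.
widths = {
    "ex:doorA": "0.9 m",
    "ex:doorB": "85 cm",
    "ex:doorC": "3 [ft_i]",
    "ex:doorD": "30 [in_i]",
}

# The filter: keep everything wider than 0.88 m, whatever the recorded unit.
threshold = to_metres("0.88 m")
wide = sorted(name for name, w in widths.items() if to_metres(w) > threshold)
print(wide)  # ['ex:doorA', 'ex:doorC']
```

With a datatype that understands units, the normalization step would live in the store rather than in every client.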

I know Stardog doesn't support the CDT approach right now, but I think it would make things a lot easier for users!

This question has sent me off on a task to get more familiar with datatypes. A quick question to someone at Stardog: am I correct that you can't write custom datatypes for Stardog? If there is a way, can someone provide a quick example?

I'd also be interested in hearing anyone's thoughts on the UCUM approach to handling units. It seems like an interesting approach, and the frugality of triples is appealing, but there's something about encoding it in a structured literal that makes me uncertain. The CDT specification sounds interesting as well and reminds me of a similar proposal for functions (Web of Functions - SPARQL examples). I was a little confused about the reference to CDT:UCUM until I realized that CDT and UCUM seem to be two different things.

This is correct, and I'm not sure that it's on the roadmap at this time.

Regarding the CDT/UCUM data, there's currently nothing stopping anyone from using those datatypes in Stardog; you just won't be able to perform operations on them. We've dipped our feet into QUDT a little for the geospatial stuff, so that or CDT would certainly be worth looking into.
