SMS: Create hash value from concatenated CSV columns?

Hi folks,

I am using SMS to convert data from a CSV, following the “cars” example in the manual. I need to create an IRI for an interval that has a start date and an end date, in this case the lifespan of a person.

Start date is birthdate: {#birthdate}
End date is deathdate: {#deathdate}

I am trying to create something like:

cdp1:Lifespan_HASH
  rdf:type study:Lifespan ;
  time:hasBeginning cdp:Date_{#birthdate} ;
  time:hasEnd cdp:Date_{#deathdate} .

How can I create the Lifespan hash value, i.e. hash(“birthdate” + “deathdate”)?

I’m trying to avoid pre-processing the data.

Also, is there an SMS syntax reference available? I expect I am going to need it.

Cheers,

Tim

Hi Tim,

We don’t currently expose a facility to execute functions over inputs. There is, however, a special value you can use in the field template to generate a UUID: _UUID_. So you can create the lifespan IRI as cdp1:Lifespan_{_UUID_}. The contents of the IRI shouldn’t matter in themselves; the IRI only serves to identify the resource and its relationships.

We haven’t published a grammar specification for SMS mappings but it looks like you have the hang of it. Basically you use the curly braces as a placeholder for field values. Let us know if you have any specific questions.

Best,
Jess

_UUID_ will work perfectly.

Thanks a bunch!

Tim

Ok. I did not think this through in my quick reply above.

I need to refer to the same IRI in other triples.

So, creating the UUID is fine here:

cdp1:Lifespan_{_UUID_}
  rdf:type study:Lifespan ;
  time:hasBeginning cdiscpilot01:Date_{#birthdate};
  time:hasEnd cdiscpilot01:Date_{#deathdate} .

But then how can I refer to that same UUID to assign a PERSON to that lifespan? Won’t this second _UUID_ create a new, unique ID (not what I need)?

cdiscpilot01:Person_{#uniquePersonId}
  study:hasLifespan cdiscpilot01:Lifespan_{_UUID_}.

That is why I was thinking about concatenation, so that I could recreate the identical hash value in multiple places during the data conversion.

In that case it would make sense to just use the date values in the IRI template, e.g. cdp1:Lifespan_{#birthdate}_{#deathdate}.
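To illustrate why the date-based template solves the cross-reference problem, here is a small Python sketch that imitates placeholder expansion. It is not how Stardog implements SMS: the `expand` helper and the sample row values are hypothetical, and the "fresh UUID on every expansion" behaviour is the assumption raised earlier in this thread, not something confirmed here.

```python
import uuid


def expand(template: str, row: dict) -> str:
    """Hypothetical stand-in for SMS template expansion.

    Fills {#field} placeholders from the row and replaces each
    {_UUID_} with a newly generated UUID (assumed behaviour).
    """
    out = template
    for field, value in row.items():
        out = out.replace("{#" + field + "}", value)
    while "{_UUID_}" in out:
        out = out.replace("{_UUID_}", str(uuid.uuid4()), 1)
    return out


# Made-up sample row; the column names come from the mapping above.
row = {
    "uniquePersonId": "P001",
    "birthdate": "1950-01-31",
    "deathdate": "2010-06-15",
}

uuid_template = "cdp1:Lifespan_{_UUID_}"
date_template = "cdp1:Lifespan_{#birthdate}_{#deathdate}"

# Expanding the UUID template twice yields two unrelated IRIs,
# so the Person triple would not point at the Lifespan resource.
uuid_first = expand(uuid_template, row)
uuid_second = expand(uuid_template, row)

# Expanding the date template twice yields the same IRI,
# so it can be repeated anywhere in the mapping.
date_first = expand(date_template, row)
date_second = expand(date_template, row)

print(uuid_first == uuid_second)
print(date_first == date_second)
print(date_first)  # cdp1:Lifespan_1950-01-31_2010-06-15
```

One thing to keep in mind: with this template, two people who share the same birth and death dates would also share a single Lifespan IRI. If each person should get their own, adding {#uniquePersonId} to the template would be one way to do it.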

Jess

Well that just makes perfect sense! I was overthinking it. :crazy_face:

Case closed! :+1:
