UMLS to Stardog

I am interested in loading some of the UMLS ontologies found on BioPortal into Stardog. I see that there are Python scripts available for converting the UMLS MySQL release to RDF (ncbo/umls2rdf on GitHub, part of the BioPortal project; the scripts connect to the UMLS database and translate the ontologies into RDF/OWL files). Has anyone here tried doing this and loading the results into Stardog? Are there any issues or gotchas?

thanks,
Bonnie

I haven’t tried it, but I wouldn’t anticipate any problems as long as the script produces well-formed RDF and you follow the usual suggestions in the documentation about bulk loading datasets. If you’re feeling really ambitious, you can write some R2RML that does what the Python scripts do and import the data directly into Stardog. Depending on the dataset size, you might require an enterprise or 30-day eval license.

I’d lend you some more assistance, but I’m waiting 3 days for my license to be approved so I can get access to the data. This kind of stuff is really frustrating: licensing on a government-funded dataset. There’s probably some reason for it and I’m totally unjustified in being angry about it, but still. I think you’ll find that working with the data will be much easier with Stardog. Most of the work will involve getting it from whatever convoluted format they’ve got it in into RDF. From there it should be smooth sailing. Let me know if you run into any problems, and I’ll let you know if I come across anything that you might find helpful when I finally get access.

Hi,

They are usually quick with the UMLS license. I’ve had mine for years because I use MetaMap a lot. What is an enterprise license? In truth, I don’t need the entire thing, just some parts, but the Python scripts from BioPortal appear to load everything.

You'll find a comparison between the community and enterprise version here https://www.stardog.com/versions/

You'll be limited to 25M Nodes & Edges. I think that translates to 25M triples. I'm not sure how big the final data set would be so I don't know if you'll run up against this limit. Extracting a subset before loading it into Stardog might be somewhat tricky depending on exactly what you'd like to load.
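If it helps, here is a rough sketch of how you could check where you stand against that limit and carve out a subset before loading. It assumes the converted output is available as N-Triples (one triple per line); if the scripts emit Turtle or another multi-line syntax, a line count like this will not work and you would need a real RDF parser. The sample data, the IRI prefix, and the function names are all made up for illustration, and the limit is just the figure I quoted above, treated as triples.

```python
import io

# Figure quoted in this thread for the community license, read as triples.
LIMIT = 25_000_000


def iter_triples(lines):
    """Yield non-blank, non-comment lines; in N-Triples each one is a triple."""
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            yield stripped


def count_triples(lines):
    """Count the triples in an iterable of N-Triples lines."""
    return sum(1 for _ in iter_triples(lines))


def filter_by_subject_prefix(lines, prefix):
    """Keep only triples whose subject IRI starts with the given prefix."""
    marker = "<" + prefix
    return [t for t in iter_triples(lines) if t.startswith(marker)]


# Hypothetical sample standing in for a converted file.
sample = io.StringIO(
    "# hypothetical sample data\n"
    '<http://example.org/SNOMEDCT/1> <http://www.w3.org/2004/02/skos/core#prefLabel> "A" .\n'
    '<http://example.org/MESH/2> <http://www.w3.org/2004/02/skos/core#prefLabel> "B" .\n'
    "\n"
    '<http://example.org/SNOMEDCT/3> <http://www.w3.org/2004/02/skos/core#prefLabel> "C" .\n'
)
lines = sample.readlines()

total = count_triples(lines)
subset = filter_by_subject_prefix(lines, "http://example.org/SNOMEDCT/")
print(total, len(subset), total <= LIMIT)
```

For a real file you would pass an open file handle instead of the in-memory sample, so the whole dataset never has to sit in memory just to be counted.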

On that note it might be really cool if there was some sort of read only Stardog image with some large but popular datasets that you could spin up under the community license.

Thanks. Is there a link to a page for pricing?

No, but I’m sure someone will reach out to you now that you’ve asked, or you can email sales@stardog.com for that.
