Python utilities to handle NUVA

A Python library available on GitHub makes it possible to retrieve and explore NUVA. It is a work in progress and will be progressively enriched to provide metrics on code systems based on their mapping to NUVA codes.

The functions supported so far are:

get_nuva_version()

Returns the version index of the latest NUVA publication.

get_nuva(version)

Downloads the referenced version into the current directory in RDF/XML format as nuva_ans.rdf, and creates a rebased version, nuva_ivci.rdf.

split_nuva()

From the previously retrieved nuva_ivci.rdf file, creates a split version as a collection of files in RDF/Turtle format:

nuva_core.ttl includes the concepts for vaccines, valences and target diseases, and their labels in English
nuva_lang_XX.ttl includes all translations for language XX
nuva_refcode_YYY.ttl includes the concepts and the NUVA alignments for code system YYY
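
The naming convention above can be captured in a small helper, for instance to check which split files are present before running queries. This is only a sketch: the function name split_file_names is hypothetical and not part of the library, and the language and code system identifiers (here "fr" and "CVX") are supplied by the caller rather than discovered from the data.

```python
from pathlib import Path


def split_file_names(languages, code_systems, directory="."):
    """Return the file paths that split_nuva() is described as producing.

    Hypothetical helper: it only applies the naming convention from the
    text and does not read or write any file.
    """
    base = Path(directory)
    names = [base / "nuva_core.ttl"]
    names += [base / f"nuva_lang_{lang}.ttl" for lang in languages]
    names += [base / f"nuva_refcode_{code}.ttl" for code in code_systems]
    return names


# Example: which of the expected files are missing from the current directory?
missing = [p for p in split_file_names(["fr"], ["CVX"]) if not p.exists()]
```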

refturtle_to_map(code)

Starting from the nuva_refcode_YYY.ttl file for the given code system, creates a simple CSV file nuva_refcode_YYY.csv with the alignments between that code system and NUVA.

map_to_turtle(code)

Assuming that the nuva_refcode_YYY.csv file has been copied to the work file nuva_code_YYY.csv and then edited to improve the alignments, creates a Turtle work file nuva_code_YYY.ttl for further processing.

Note that the refcode file contains the NUVA English labels of vaccines for convenience, but these are neither required in nor processed from the work file.
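
Because the work file is meant to be edited by hand, it can be useful to create it from the reference file only when it does not exist yet. The following is a minimal sketch of that step; prepare_work_file is a hypothetical helper, not a function of the library, and it relies only on the file names described above.

```python
import shutil
from pathlib import Path


def prepare_work_file(code, directory="."):
    """Copy nuva_refcode_<code>.csv to nuva_code_<code>.csv if needed.

    Hypothetical helper, not part of the library. An existing work file
    is left untouched so that manual edits are not overwritten.
    Returns the path of the work file.
    """
    base = Path(directory)
    ref = base / f"nuva_refcode_{code}.csv"
    work = base / f"nuva_code_{code}.csv"
    if not work.exists():
        shutil.copyfile(ref, work)
    return work
```

In a script, a call such as prepare_work_file("CVX") could stand in for an unconditional shutil.copyfile call, so that running the script again does not discard edited alignments.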

query_core(q)

Runs a SPARQL query q against the core graph loaded from nuva_core.ttl.

query_code(q,code)

Runs a SPARQL query q against a graph formed by merging nuva_core.ttl and the work file nuva_code_YYY.ttl, which makes it possible to run checks and measurements on the alignment.

eval_code(code)

Produces the metrics for a code system, given a nuva_code_YYY.csv file for alignments.

By-products are:

nuva_reverse_YYY.csv: file with all NUVA codes matching a given external code
nuva_best_YYY.csv: file with the best possible external code for a given NUVA code
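
One plausible reading of the reverse file is an inversion of the alignment table, grouping NUVA codes by external code. The sketch below illustrates that idea only. The (external code, NUVA code) pair layout is an assumption, the codes are placeholders, and reverse_alignments is a hypothetical name; the library may compute matches differently, for example by also following the NUVA hierarchy.

```python
from collections import defaultdict


def reverse_alignments(pairs):
    """Group NUVA codes by the external code they are aligned with.

    `pairs` is an iterable of (external_code, nuva_code) tuples. This
    layout is an assumption made for the sketch; the columns of the
    actual nuva_code_YYY.csv file may differ.
    """
    reverse = defaultdict(list)
    for external_code, nuva_code in pairs:
        # Keep each NUVA code once per external code, in first-seen order
        if nuva_code not in reverse[external_code]:
            reverse[external_code].append(nuva_code)
    return dict(reverse)


# Placeholder codes, for illustration only
pairs = [("EXT1", "VAC0001"), ("EXT1", "VAC0002"), ("EXT2", "VAC0003")]
print(reverse_alignments(pairs))
```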

An example sequence of calls is included in the file:

# Main program: adapt the work directory to your environment
# Standard-library imports needed by the calls below
import os
import shutil
from pathlib import Path

os.chdir(Path.home() / "Documents" / "NUVA")
get_nuva(get_nuva_version())
split_nuva()
refturtle_to_map("CVX")
shutil.copyfile("nuva_refcode_CVX.csv","nuva_code_CVX.csv")
map_to_turtle("CVX")
 
q = """
    # All vaccines against smallpox
    SELECT ?vcode ?vl WHERE {
        ?dis rdfs:subClassOf nuva:Disease .
        ?dis rdfs:label "Smallpox-Monkeypox"@en .
        ?vac rdfs:subClassOf nuva:Vaccine .
        ?vac rdfs:label ?vl .
        ?vac skos:notation ?vcode .
        ?vac nuvs:containsValence ?val .
        ?val nuvs:prevents ?dis
    }
"""
res = query_core(q)
for row in res:
    print(f"{row.vcode} - {row.vl}")
 
res = eval_code("CVX")
print("Completeness {:.1%} ".format(res['Completeness']))
print("Precision {:.1%} ".format(res['Precision']))