Verbose mode explained

Each converter offers two modes: verbose and regular. The output of a verbose ontology conversion is compatible with the output of a verbose data conversion, and likewise for the regular mode. IMPORTANT: there is no point in running your data conversion in verbose mode unless you also have access to the ontology tables produced in verbose mode (CONCEPT_DIMENSION.csv and MODIFIER_DIMENSION.csv are sufficient).

Background knowledge: i2b2 uses a basecode to join the ontology concepts (metadata) with the observation records. This basecode is generated independently but deterministically by both converters (ontology and data), so the two basecode sets should match. Any mismatch results in an ignored data record, since i2b2 cannot identify the concept.

  • Generating tables in verbose mode allows you to inspect the generated codes and check that they look correct

  • Since i2b2 needs capped-length codes to work, we use a recursive hash to generate the production-ready tables.
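The idea behind the capped-length codes can be pictured with a small sketch. This is purely illustrative: the converters are described as using a recursive hash, while this one-shot hash only demonstrates how an arbitrarily long concept path collapses into a short, deterministic join key (the hash function, cap length, and path format are assumptions, not the actual implementation):

```python
import hashlib

MAX_CODE_LENGTH = 50  # assumption: i2b2 column limits cap basecode length


def basecode(ontology_path: str) -> str:
    """Map an arbitrarily long ontology path to a fixed-length code."""
    digest = hashlib.sha1(ontology_path.encode("utf-8")).hexdigest()
    return digest[:MAX_CODE_LENGTH]


# The same path always yields the same code on both the ontology side
# and the data side, so the two basecode sets match.
print(basecode("/Example/Diagnosis/Code/ICD-10/I21.0"))
```

Because the verbose tables keep the readable path instead of the hash, a human can spot a malformed path that would otherwise silently produce a non-matching code.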

Setting up for verbose mode

In the Docker setup, the Makefile allows you to specify a distinct target directory for verbose and regular modes, and to trigger each mode with a different recipe. If running from the source files, modify the OUTPUT_TABLES_LOCATION variable so it points to the directory where your verbose tables should be written/read, and manually switch the DEBUG config variable to “True” in the i2b2_rdf_config.json file. Then start the conversion normally.
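For reference, the relevant fragment of i2b2_rdf_config.json might look like the following. Only the DEBUG key is confirmed by the text above; the exact key names, value types, and overall structure of the file may differ in your version:

```json
{
  "DEBUG": "True",
  "OUTPUT_TABLES_LOCATION": "/path/to/verbose_tables/"
}
```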

Interpreting the results of a verbose run

If you configured your I/O and config files correctly, the program (in either verbose or regular mode) will print one of the following messages as the last line of execution:

  • Success! All items are consistent with the ontology.

  • Some concepts or modifiers are not in the ontology. Please take a look at the “logs_missing_concepts” and “logs_missing_modifiers” logfiles. If unreadable, change the “DEBUG” variable in the config files to True, and run the “make verbose” command.

In the first case, you are good. In the second case, two logfiles were created; look into them and try to understand why the listed items do not match any ontology item. It could be that the RDF structure is different, that the data contains codes that are not part of the ontology, etc. Based on this, you can either correct your data and convert it again (or fix the CSV in place if the issue is a minor typo), or keep going (but remember that all records featuring codes listed in the logfiles will be ignored by the i2b2 query system).

Turning the verbose tables into production-ready tables

A bash script will do this for you. After running the data conversion in verbose mode, you will find the script postprod.bash in the same folder where your verbose tables were written. Pass the folder containing your production ontology tables as the named argument -outputF; this is also where the production data tables will be written. (Optionally, use the named argument -verboseF to specify the source folder; by default, it is the folder where the script is located.)
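An example invocation might look as follows (both paths are placeholders for your own directories):

```bash
# Read verbose tables from ./verbose_tables and write production data tables
# next to the production ontology tables in ./prod_tables
bash postprod.bash -outputF ./prod_tables -verboseF ./verbose_tables
```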