
Workflow for data harmonization

Kathe Todd-Brown edited this page Mar 3, 2025 · 1 revision

Data harmonization is the process of transforming data from the original collection of machine-readable information into a standardized format. The original data collection may be composed of a machine-readable set of relational data tables and/or graphical (list-based) data, including both primary data and metadata objects. The transformation is done via a set of scripts and a manually created annotation file. The final format is a single data table with the following columns: id(s) - of variable - is type - with entry. Provenance in this stage is maintained by a read script, an annotation file documenting any manual data extractions or assignments, and documentation of who did this work.
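As a rough illustration of the target format, the sketch below reshapes a small wide table into one long table with the four columns named above. It is not the project's actual read script. The column spellings (`id`, `of_variable`, `is_type`, `with_entry`), the `harmonize` function, the annotation layout, and the sample data are all assumptions made up for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation: source column -> (of_variable, is_type).
# In practice this would come from the manually created annotation file.
ANNOTATION: Dict[str, Tuple[str, str]] = {
    "soc_pct": ("soil_organic_carbon", "value"),
    "soc_unit": ("soil_organic_carbon", "unit"),
    "depth_cm": ("layer_depth", "value"),
}


def harmonize(
    rows: List[dict],
    id_column: str,
    annotation: Dict[str, Tuple[str, str]],
) -> List[dict]:
    """Reshape wide rows into long (id, of_variable, is_type, with_entry) records."""
    long_rows = []
    for row in rows:
        for column, (of_variable, is_type) in annotation.items():
            entry = row.get(column)
            # Skip missing or empty cells so the long table holds only real entries.
            if entry is None or entry == "":
                continue
            long_rows.append(
                {
                    "id": row[id_column],
                    "of_variable": of_variable,
                    "is_type": is_type,
                    "with_entry": str(entry),  # entries are stored as text
                }
            )
    return long_rows


# Made-up source table standing in for one original data table.
source_rows = [
    {"sample_id": "S1", "soc_pct": 2.4, "soc_unit": "percent", "depth_cm": 10},
    {"sample_id": "S2", "soc_pct": 1.1, "soc_unit": "percent", "depth_cm": ""},
]

harmonized = harmonize(source_rows, "sample_id", ANNOTATION)
for record in harmonized:
    print(record)
```

Keeping the column-to-variable mapping in a separate annotation structure, rather than hard-coding it in the reshaping logic, mirrors the split between read script and annotation file described above and leaves each manual assignment visible for provenance.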

*This page is a work in progress, transitioning from the "Workflow for new data additions" page.