Common Data Quality Issues During Data Mapping and How to Detect/Avoid Them
1. Common Data Quality Issues During Data Mapping and How to Detect/Avoid Them
Remzi Celebi
remzi.celebi@maastrichtuniversity.nl
Maastricht University
Technical Coordinator, AIDAVA project
20 March 2025
2. Data Quality Issues during Mapping
Data quality issues can occur during mapping to a target data model. These errors can be classified into:
- Syntactic errors, such as incorrect date formatting or mismatched data types.
- Semantic errors, such as mapping a data element to the wrong category.
3. Errors During Data Mapping
| Variable   | Code   | Value | Unit   | Date       |
|------------|--------|-------|--------|------------|
| Glucose    | 2339-0 | 130   | mmol/l | 12.01.2001 |
| Creatinine | 2160-0 | 100.0 | mg/dL  | 07.02.2003 |
| ...        | ...    | ...   | ...    | ...        |

Target model: Observation { code, quantity { value, unit }, date, text }
{
"resourceType": "Observation",
"id": "measurement-1",
"status": "final",
"code": {
"coding": [
{
"system": "http://loinc.org",
"code": "2339-0",
"display": "Glucose [Mass/volume] in Blood"
}
],
"text": "Glucose"
},
"effectiveDateTime": "2001-01-12T00:00:00Z",
"valueQuantity": {
"value": 130.0,
"unit": "mmol/L",
"system": "http://unitsofmeasure.org",
"code": "mmol/L"
}
}
4. Errors During Data Mapping
The date 12.01.2001 could be in the US format (MM.DD.YYYY). It needs to be converted to a standard format (e.g., ISO 8601: YYYY-MM-DDThh:mm:ss+zz:zz) to ensure interoperability.
| Variable   | Code   | Value | Unit   | Date       |
|------------|--------|-------|--------|------------|
| Glucose    | 2339-0 | 130   | mmol/l | 12.01.2001 |
| Creatinine | 2160-0 | 100.0 | mg/dL  | 07.02.2003 |
| ...        | ...    | ...   | ...    | ...        |

Target model: Observation { code, quantity { value, unit }, date, text }
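A date normalization step can make the ambiguity explicit instead of silently guessing. The sketch below (not part of the AIDAVA pipeline; the `day_first` convention must be confirmed with the data provider) converts dotted dates to ISO 8601 and flags dates that parse differently under the two conventions:

```python
from datetime import datetime

def normalize_date(raw: str, day_first: bool = True) -> str:
    """Convert a dotted date such as '12.01.2001' to ISO 8601.

    day_first must be confirmed with the data provider: '12.01.2001'
    is 12 January under DD.MM.YYYY, but 1 December under US MM.DD.YYYY.
    """
    fmt = "%d.%m.%Y" if day_first else "%m.%d.%Y"
    return datetime.strptime(raw, fmt).strftime("%Y-%m-%dT00:00:00+00:00")

def is_ambiguous(raw: str) -> bool:
    """True when both leading fields are <= 12 and differ, so the two
    conventions yield different calendar dates."""
    a, b, _ = raw.split(".")
    return int(a) <= 12 and int(b) <= 12 and a != b

print(normalize_date("12.01.2001"))  # ISO form under the DD.MM.YYYY assumption
print(is_ambiguous("12.01.2001"))    # True: needs provider confirmation
```

Ambiguous dates would then be routed to a curator rather than converted automatically.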
5. Errors During Data Mapping
Either the unit for Creatinine is incorrect, or the value itself is: serum creatinine is typically around 0.6-1.3 mg/dL, so 100.0 mg/dL is far outside any physiological range (it would, however, be plausible in µmol/L). The value should be checked for plausibility.
| Variable   | Code   | Value | Unit   | Date       |
|------------|--------|-------|--------|------------|
| Glucose    | 2339-0 | 130   | mmol/l | 12.01.2001 |
| Creatinine | 2160-0 | 100.0 | mg/dL  | 07.02.2003 |
| ...        | ...    | ...   | ...    | ...        |

Target model: Observation { code, quantity { value, unit }, date, text }
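A plausibility check can be as simple as a per-(analyte, unit) range table. The numeric bounds below are rough illustrations, not clinical reference ranges, and would need validation before use:

```python
# Illustrative plausibility bounds per (analyte, unit) pair; the
# numbers are rough examples and would need clinical validation.
PLAUSIBLE_RANGES = {
    ("Creatinine", "mg/dL"): (0.1, 25.0),
    ("Creatinine", "umol/L"): (10.0, 2000.0),
    ("Glucose", "mg/dL"): (10.0, 1000.0),
    ("Glucose", "mmol/L"): (0.5, 60.0),
}

def check_plausibility(variable: str, value: float, unit: str) -> str:
    bounds = PLAUSIBLE_RANGES.get((variable, unit))
    if bounds is None:
        return "unknown analyte/unit pair"
    low, high = bounds
    if low <= value <= high:
        return "plausible"
    return f"implausible: {value} {unit} outside [{low}, {high}]"

# 100.0 mg/dL creatinine is far above any physiological range,
# suggesting either a wrong value or a wrong unit (e.g. umol/L).
print(check_plausibility("Creatinine", 100.0, "mg/dL"))
```

Flagged values would be reviewed by a curator; the check cannot decide by itself whether the value or the unit is at fault.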
6. Errors During Data Mapping
The identifiers in the Code column can be mapped to SNOMED CT using the Variable column, but these are identifiers for the substances being measured, not for the tests actually performed. Mapping them to observation codes would therefore be incorrect.
| Variable   | Code     | Value | Unit   | Date       |
|------------|----------|-------|--------|------------|
| Glucose    | 67079006 | 130   | mmol/l | 12.01.2001 |
| Creatinine | 15373003 | 2.0   | mg/dL  | 07.02.2003 |
| ...        | ...      | ...   | ...    | ...        |

Target model: Observation { code, quantity { value, unit }, date, text }
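This kind of semantic error can be caught by validating each code against the value set the target property expects. The sketch below uses a tiny in-memory stand-in for a terminology-server lookup; the substance codes come from the slide's table, while the "correct" observable codes are placeholders, not real SNOMED CT identifiers:

```python
# In-memory stand-in for a terminology-server value-set lookup.
SUBSTANCE_CODES = {"67079006", "15373003"}  # Glucose, Creatinine (substance concepts, from the slide)
OBSERVABLE_CODES = {"glucose-measurement-code", "creatinine-measurement-code"}  # placeholders

def validate_observation_code(code: str) -> str:
    if code in SUBSTANCE_CODES:
        return "error: substance concept used where a test/observable code is expected"
    if code in OBSERVABLE_CODES:
        return "ok"
    return "unknown code: check against the target value set"

print(validate_observation_code("67079006"))
```

In practice the membership test would query a terminology service for the concept's semantic tag (substance vs. observable entity / procedure) rather than a hard-coded set.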
7. RDFCraft -- tool to guide the mapping
RDFCraft can be used to semi-automate the mapping process:
- Nodes are created corresponding to schema classes.
- Links are created corresponding to relations.
- This reduces possible errors during mapping.
https://github.com/MaastrichtU-IDS/RDFCraft
9. Standard Mapping Definitions
- Click Save & Map.
- Share standard mappings.
- Multiple formats supported: YARRRML, RML, RDF/TTL.
- Use a mapping engine to execute the mapping.
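As a rough illustration of what a shared mapping definition looks like, here is a minimal YARRRML sketch for turning the measurement table into observation resources. The CSV file name, column names, and the example.org namespace are assumptions for illustration, not AIDAVA's actual mapping:

```yaml
prefixes:
  ex: "http://example.org/"
mappings:
  observation:
    sources:
      - ["measurements.csv~csv"]
    s: ex:observation_$(Code)
    po:
      - [ex:code, $(Code)]
      - [ex:value, $(Value)]
      - [ex:unit, $(Unit)]
      - [ex:date, $(Date)]
```

A YARRRML processor translates this into RML, which a mapping engine then executes against the source data to produce RDF.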
10. Data quality check services in AIDAVA
Data model based checks:
- Missing essential variable: birth date is missing.
- More values than expected: a measurement must have only one quantity.
- Wrong data type for a property: hasValue for a Quantity should take only a string or a double.
- Wrong object type for a property: hasUnit for a Quantity must take only a Unit.
- Invalid code for a type: the diagnosis does not use a valid code.
Medical and common-sense checks:
- Conditional completeness: flag if the patient is prescribed an allergy drug but no allergy is in their record.
- Incompatible information (date and time): flag if the date of birth is later than the date of admission.
- Incompatible information (gender and diagnosis): flag if gender is male and the diagnosis "Benign neoplasm of ovary" is present.
- Incompatible information (age and procedure): flag if age is over 12 and the procedure "TONSILLECTOMY AND ADENOIDECTOMY; UNDER AGE 12" is present.
- Incompatible information (lab measurement and unit): flag if the lab "LDL Cholesterol" is present and the unit is not equal to mg/dL.
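Two of these rules can be sketched as plain record-level checks. The dict-based record layout and key names below are assumptions for illustration, not AIDAVA's actual data model:

```python
from datetime import date

def check_record(record: dict) -> list[str]:
    """Minimal sketch of two quality rules over a patient record."""
    issues = []
    # Missing essential variable: birth date must be present.
    if record.get("birth_date") is None:
        issues.append("missing essential variable: birth date")
    # Incompatible dates: date of birth after date of admission.
    admission = record.get("admission_date")
    if record.get("birth_date") and admission and record["birth_date"] > admission:
        issues.append("incompatible dates: birth date after admission date")
    # Incompatible gender and diagnosis.
    if record.get("gender") == "male" and "Benign neoplasm of ovary" in record.get("diagnoses", []):
        issues.append("incompatible gender and diagnosis")
    return issues

rec = {"birth_date": date(2010, 5, 1), "admission_date": date(2001, 3, 2),
       "gender": "male", "diagnoses": ["Benign neoplasm of ovary"]}
for issue in check_record(rec):
    print(issue)
```

In a real pipeline such rules would be expressed declaratively (e.g., as SHACL shapes over the converted RDF) rather than hand-coded per record.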
11. Key takeaways
Different errors can occur:
- Syntactic errors: mostly caused by incorrect mapping or uncleaned data that does not conform to the target format.
- Semantic errors: occur due to misinterpretation of the source data or the target model.
To minimize errors, use appropriate tools during the mapping process and implement data quality checks after conversion.