Common Data Quality Issues During Data Mapping
and How to Detect/Avoid Them
Remzi Celebi
remzi.celebi@maastrichtuniversity.nl
Maastricht University
Technical Coordinator, AIDAVA project
20 March 2025
Data Quality Issues during Mapping
Data quality issues can occur during mapping to a target data model. These errors can be classified into:
- Syntactic errors, such as incorrect date formatting or mismatched data types.
- Semantic errors, such as mapping a data element to the wrong category.
Errors During Data Mapping
Variable    Code    Value  Unit    Date
Glucose     2339-0  130    mmol/l  12.01.2001
Creatinine  2160-0  100.0  mg/dL   07.02.2003
...         ...     ...
[Diagram: target Observation with fields code, quantity, date, unit, value, text]
{
  "resourceType": "Observation",
  "id": "measurement-1",
  "status": "final",
  "code": {
    "coding": [
      {
        "system": "http://loinc.org",
        "code": "2339-0",
        "display": "Glucose [Mass/volume] in Blood"
      }
    ],
    "text": "Glucose"
  },
  "effectiveDateTime": "2001-01-12T00:00:00Z",
  "valueQuantity": {
    "value": 130.0,
    "unit": "mmol/L",
    "system": "http://unitsofmeasure.org",
    "code": "mmol/L"
  }
}
Errors During Data Mapping
The date 12.01.2001 is ambiguous: it could be in the US format (month first, 1 December 2001) rather than day first (12 January 2001).
It needs to be converted to a standard format (e.g. ISO 8601, YYYY-MM-DDThh:mm:ss+zz:zz) to ensure interoperability.
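A minimal Python sketch of this conversion, assuming the source format is declared explicitly instead of guessed. The names `to_iso8601`, `is_ambiguous`, `DAY_FIRST` and `MONTH_FIRST` are illustrative, and attaching UTC as the time zone is an assumption, since the source table carries no zone information.

```python
from datetime import datetime, timezone

DAY_FIRST = "%d.%m.%Y"    # 12.01.2001 read as 12 January 2001
MONTH_FIRST = "%m.%d.%Y"  # 12.01.2001 read as 1 December 2001 (US order)


def to_iso8601(raw: str, source_format: str) -> str:
    """Parse `raw` with an explicitly declared format and return ISO 8601.

    Assumption: the source carries no time zone, so UTC is attached here.
    """
    parsed = datetime.strptime(raw, source_format)
    return parsed.replace(tzinfo=timezone.utc).isoformat()


def is_ambiguous(raw: str) -> bool:
    """True when day-first and month-first readings both parse and differ."""
    readings = set()
    for fmt in (DAY_FIRST, MONTH_FIRST):
        try:
            readings.add(datetime.strptime(raw, fmt).date())
        except ValueError:
            pass  # this reading is not a valid calendar date
    return len(readings) > 1


print(to_iso8601("12.01.2001", DAY_FIRST))
print(to_iso8601("12.01.2001", MONTH_FIRST))
print(is_ambiguous("12.01.2001"), is_ambiguous("25.01.2001"))
```

Values flagged by `is_ambiguous` cannot be resolved from the value alone; the intended format has to come from the data provider or from other rows in the same column.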
Errors During Data Mapping
Either the unit for Creatinine is incorrect, or the value itself is incorrect.
The value should be checked for plausibility.
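One way to sketch such a plausibility check in Python. The bounds in `PLAUSIBLE_RANGES` are illustrative assumptions chosen only to catch gross errors, not clinical reference ranges, and `check_plausibility` is a hypothetical helper name.

```python
# Illustrative plausibility bounds per (analyte, unit).
# Assumption: these numbers are placeholders for demonstration only and
# must be replaced by limits agreed with clinical experts.
PLAUSIBLE_RANGES = {
    ("creatinine", "mg/dL"): (0.1, 20.0),
    ("creatinine", "umol/L"): (10.0, 1800.0),
}


def check_plausibility(analyte: str, value: float, unit: str) -> str:
    """Return 'ok', 'implausible', or 'unknown' for a measurement."""
    key = (analyte.lower(), unit)
    if key not in PLAUSIBLE_RANGES:
        return "unknown"  # no rule defined for this analyte/unit pair
    low, high = PLAUSIBLE_RANGES[key]
    return "ok" if low <= value <= high else "implausible"


print(check_plausibility("Creatinine", 100.0, "mg/dL"))
print(check_plausibility("Creatinine", 100.0, "umol/L"))
```

With these assumed bounds, 100.0 is rejected as mg/dL but accepted as umol/L, which hints that the unit rather than the number may be what is wrong in the source row.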
Errors During Data Mapping
The identifiers in the Code column can be mapped to SNOMED CT using the Variable column, but these are identifiers for the substances being measured, not for the tests actually performed.
Therefore, mapping them to observation codes would be incorrect.
Variable    Code      Value  Unit    Date
Glucose     67079006  130    mmol/l  12.01.2001
Creatinine  15373003  2.0    mg/dL   07.02.2003
...         ...       ...
[Diagram: target Observation with fields code, quantity, date, unit, value, text]
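A semantic check of this kind could be sketched as follows. The `CODE_CATEGORY` table is a hypothetical local lookup built from the codes on these slides; a real pipeline would query a terminology service instead. The category labels "substance" and "observable" are assumptions made for illustration.

```python
# Hypothetical local lookup: (code system, code) -> semantic category.
# Assumption: in practice this information comes from a terminology server.
CODE_CATEGORY = {
    ("http://snomed.info/sct", "67079006"): "substance",  # Glucose
    ("http://snomed.info/sct", "15373003"): "substance",  # Creatinine
    ("http://loinc.org", "2339-0"): "observable",         # Glucose test
    ("http://loinc.org", "2160-0"): "observable",         # Creatinine test
}


def validate_observation_code(system: str, code: str) -> str:
    """Accept a code for Observation.code only if it denotes a test."""
    category = CODE_CATEGORY.get((system, code))
    if category is None:
        return "unknown code"
    if category != "observable":
        return f"rejected: {category} code used as observation code"
    return "ok"


print(validate_observation_code("http://snomed.info/sct", "67079006"))
print(validate_observation_code("http://loinc.org", "2339-0"))
```

The point of the sketch is that the check runs on the category of the code, so a syntactically valid identifier can still be rejected for being the wrong kind of concept.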
RDFCraft -- tool to guide the mapping
- RDFCraft can be used to semi-automate the mapping process.
- Nodes are created corresponding to schema classes.
- Links are created corresponding to relations.
- This reduces possible errors during mapping.
https://github.com/MaastrichtU-IDS/RDFCraft
Standard Mapping Definitions
Click Save & Map
- Share standard mappings
- Multiple formats supported:
  - YARRRML
  - RML
  - RDF/TTL
- Use a mapping engine to execute the mapping
Data quality check services in AIDAVA
Data model based checks
Missing essential variable
- Birth date is missing.
More values than expected
- A measurement must have only one quantity.
Wrong data type for property
- A hasValue for Quantity should take only string or double.
Wrong object type for property
- A hasUnit for Quantity must take only Unit type.
Not valid code for type
- Diagnosis does not use a valid code.
Medical & common sense checks
Conditional completeness
- Flag if the patient is prescribed an allergy drug and no allergy is in their record.
Incompatible information (date and time error)
- Flag if date of birth > date of admission.
Incompatible information (gender and diagnosis)
- Flag if Gender is equal to Male and Diagnosis Benign neoplasm ovary is present.
Incompatible information (age and procedure)
- Flag if Age is over 12 and Procedure TONSILLECTOMY AND ADENOIDECTOMY; UNDER AGE 12 is present.
Incompatible information (lab measurement and unit)
- Flag if Lab LDL Cholesterol is present and Unit is not equal to mg/dL.
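A few of these checks can be sketched as plain rules over a patient record. This is not the AIDAVA implementation: the record layout (`birth_date`, `admission_date`, `gender`, `diagnoses`, `measurements`) and the function `run_checks` are assumptions made for illustration.

```python
from datetime import date


def run_checks(record: dict) -> list:
    """Return a list of flags for one patient record (hypothetical layout)."""
    flags = []

    # Data model based: missing essential variable.
    birth = record.get("birth_date")
    admission = record.get("admission_date")
    if birth is None:
        flags.append("missing essential variable: birth date")
    # Common sense: date of birth must not be after date of admission.
    elif admission is not None and birth > admission:
        flags.append("incompatible dates: birth after admission")

    # Common sense: gender and diagnosis must be compatible.
    if (record.get("gender") == "male"
            and "Benign neoplasm ovary" in record.get("diagnoses", [])):
        flags.append("incompatible gender and diagnosis")

    for m in record.get("measurements", []):
        quantities = m.get("quantities", [])
        # Data model based: a measurement must have only one quantity.
        if len(quantities) != 1:
            flags.append(f"{m['name']}: expected exactly one quantity")
        # Common sense: LDL Cholesterol must be reported in mg/dL.
        if m["name"] == "LDL Cholesterol":
            if any(q.get("unit") != "mg/dL" for q in quantities):
                flags.append("LDL Cholesterol: unit is not mg/dL")
    return flags


record = {
    "birth_date": date(2020, 5, 1),
    "admission_date": date(2019, 3, 10),
    "gender": "male",
    "diagnoses": ["Benign neoplasm ovary"],
    "measurements": [
        {"name": "LDL Cholesterol",
         "quantities": [{"value": 3.1, "unit": "mmol/L"}]},
    ],
}
for flag in run_checks(record):
    print(flag)
```

Keeping each rule as a separate, named flag makes it easier to report which check failed and to add further rules later.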
Key takeaways
- Different errors can occur:
  - Syntactic errors: mostly caused by incorrect mapping or uncleaned data that does not conform to the target format.
  - Semantic errors: occur due to misinterpretation of the source data and target model.
- To minimize errors, use appropriate tools during the mapping process and implement data quality checks after conversion.
