ºÝºÝߣ

ºÝºÝߣShare a Scribd company logo
CADX150
Healthcare Analytics
Module III
Dr.Jose Reena K
• What is a database?
• A database is essentially a collection of data that has been arranged into a
structure that allows it to be easily read, edited, added or deleted.
• What is a database management system?
• Database Management System is a system designed to manage the
automatic and orderly database. The Database Management System is an
automatic system helping the user to control information, create, update
and maintain the database.
• What do databases do in healthcare?
• Healthcare databases help individual medical organizations understand their
daily activities and their place within the larger healthcare industry. This
allows healthcare professionals to make decisions about how they run their
businesses, the work they do, and the systems they use to manage their
operations. With the rapid advancement of healthcare technology and post-
COVID-19 changes in healthcare operations, it is more vital than ever for
healthcare databases to be organized, well-maintained, and simple to use.
• Why is database management important?
• Healthcare databases have been an essential component of understanding and
improving critical care worldwide. The importance of database technology in healthcare
cannot be overstated. The Healthcare database system generates data that requires
delicate handling and is developed for the purpose of assessing the quality of healthcare,
often for a specific disease or within a specific healthcare delivery system. databases in
healthcare can promote:
• 1. Assessing the quality of healthcare
• Hospitals, health departments, local, state and federal agencies, to name a few, all
contribute information to healthcare databases. Healthcare specialists can gain a
thorough understanding of the quality of current healthcare operations by analyzing
aspects such as; financing available to healthcare facilities, service availability and
affordability, use of innovation, and barriers to seeking care.
• 2. Tracking and Monitoring
• Medical workers can utilize reporting and logging tools to keep track of operations with
the help of healthcare databases. these aids healthcare providers in monitoring and
improving the quality of patient treatment by providing an important tool for tracking
healthcare use. Healthcare providers, for instance, need to know whether people are
getting their vaccination or not.
Healthcare DBMS
• A database can be defined as a collection of related data (Elmasri and
Navathe, 2010).
• Databases are often subcategorized as SQL databases or NoSQL databases.
• In SQL databases, data is recorded in tables and consists of rows and
columns. The related data may be distributed across several tables in a
trade-off between efficient storage and convenience.
• The database management system (DBMS) is a piece of software that
enables the database to serve several functions. For one thing, it allows for
the retrieval of data using the SQL language (for SQL databases). Another
function is to update the data when needed, also using SQL. Additional
functions of a DBMS include protecting and securing the data.
Data engineering with SQL – an example case
• For this chapter, let's pretend you secured a predictive analytics assignment
with a cardiology practice located in the United States. The practice wants
you to predict which patients are at risk of dying within 6 months of their
visit to the clinic. They make their data available to you in the form of a
database that includes six tables.
• For simplicity, we truncate the database to include the information for five
patients only. Our task is to manipulate the data using the SQL language to
consolidate it into a single table so that it can be used for machine learning.
We will first go over the patients in the database and the database structure.
Then, we will introduce basic SQL concepts for engineering and manipulate
the data into a form amenable to machine learning.
Case details – predicting mortality for a cardiology practice
• The cardiology practice you are working with has two physicians on
staff: Dr. Johnson and Dr. Wu. While the practice has many patients,
they are interested in identifying which patients who visit are at high
risk of all-cause mortality within the next 6 months.
• Now that we've reviewed the details of the modeling assignment,
let's take a look at the five patients in the database. The preliminary
data sent to you by the cardiology practice includes information on
five patients, distributed across six tables.
Case details – predicting mortality for a cardiology practice
The following is the information about the patients:
• Patient ID-1: Patient #1 in the database is a 65-year-old male who has congestive heart failure (CHF), a chronic
condition in which the heart is unable to pump blood properly to the rest of the body. He also has hypertension
(high blood pressure), which is a risk factor for CHF. He visited his cardiologist, Dr. Johnson, on 9/1/2016 and
17/1/2016. On his January 9th visit, he was found to have an elevated BP (154/94) and an elevated B-natriuretic
peptide (BNP) lab value of 350. BNP is a marker of CHF severity. He was subsequently placed on lisinopril and
furosemide, which are first-line treatments for CHF and hypertension. Unfortunately, he passed away on May
15th, 2016.
• Patient ID-2: Patient #2 is a 39-year-old female with a history of angina pectoris (cardiovascular-related chest
pain upon exercising) and diabetes mellitus. Diabetes mellitus is a risk factor for myocardial infarction (heart
attack; a late, often fatal manifestation of atherosclerotic heart disease), and angina pectoris can be seen as an
early manifestation of atherosclerotic heart disease. She visited her cardiologist, Dr. Wu, on January 15th, 2016, at
which time she was found to have an elevated blood glucose level of 225, a sign of uncontrolled diabetes. She
was started on metformin for her diabetes, as well as nitroglycerin, aspirin, and metoprolol for her angina.
• Patient ID-3: Patient #3 is a 32-year-old female who sees Dr. Johnson for management of her hypertension.
During her visit on February 1st, 2016 her blood pressure was elevated at 161/100. She was started on
valsartan/hydrochlorothiazide, an anti-hypertensive combination.
• Patient ID: 4: Patient #4 is a 51-year-old male who has severe CHF with
pulmonary hypertension. He saw Dr. Wu on February 27th, 2016. During
that visit, his weight was 211 lbs and his blood pressure was slightly
elevated at 143/84. His BNP level was highly elevated at 1,000. He was
given lisinopril and furosemide for his CHF as well as diltiazem for his
pulmonary hypertension. Unfortunately, he passed away on June 8th, 2016.
• Patient ID-5: The last patient in our database, patient #5, is a 58-year-old
male who presented to Dr. Wu on March 1st, 2016 with a history of CHF
and diabetes mellitus Type 2. During the visit, his glucose was elevated at
318 and BNP was moderately elevated at 400. He was started on lisinopril
and furosemide for his CHF and metformin for his diabetes.
The clinical database
• Now that we've gotten to know the five patients whose information is contained in our database,
we can describe the table structure and fields contained in the database, for six mock tables:
PATIENT, VISIT, MEDICATIONS, LABS, VITALS, and MORT.
• Although every clinical database is different, I've tried to use a structure that is commonly seen in
healthcare. Typically, tables are presented by clinical domains (for an example of a research study
that received tables in such a distributed format, see Basole et al., 2015). For example, there is
often one table that contains demographic and personal information, one table for lab results,
one for medications, and so on, so that is how we constructed the database in this example. They
tend to be tied together by a common identifier, which in our case is the Pid field.
• As we describe the tables, we must keep our end-goal of the data engineering phase in mind–to
combine the relevant information from the six tables into a single table, whose columns include
the target variable (mortality in this case) in addition to predictor variables, which should be
useful for predicting the target variable. This will enable us to make a machine learning model
with popular packages such as Python's scikit-learn. With this in mind, we will highlight selected
fields that will be useful for our assignment.
The PATIENT table
• In our example, the PATIENT table, which we can see in the following
screenshot, contains the demographic and identifying information of
our patients–their names, contact information, birthdays, and
biological sex.
• In this example, there are only five observations and 11 columns; in
real practice, this table would contain all of the patients affiliated with
the healthcare organization. The number of rows in this table might
range from hundreds to hundreds of thousands, while the table could
potentially include dozens of columns containing detailed
demographic information:
• In the database, every unique patient is assigned to an identifier (the
field labeled as Pid), which in our case is simply numbered 1 - 5. The
Pid column allows us to keep track of the patients across different
tables. Also, notice that there is one and only one entry for each
distinct patient ID.
The PATIENT table
• After identifying the indispensable identifer column, the focus should be on which variables to
keep and which to discard. Certainly, age and sex are important demographic predictors of
mortality. If race were in this table, that would be another important demographic variable.
• In the database, every unique patient is assigned to an identifier (the field labeled as Pid), which in
our case is simply numbered 1 - 5. The Pid column allows us to keep track of the patients across
different tables. Also, notice that there is one and only one entry for each distinct patient ID.
• Another notable variable in this table is the zip code. Increasingly, socioeconomic data is being
used in machine learning analyses. The zip code can potentially be tied to publicly available
census data; that data can then be joined to the data in this table on the zip code and could
potentially provide information on the average education level, income, and healthcare coverage
for each patient's zip code. There are even organizations who sell household-level information;
however, with that data comes a great responsibility for privacy protection and data security. For
this example, we will omit the zip code to keep our final table simple.
• Information we'll leave out from our final table includes names, street addresses, and phone
numbers. As long as we have the patient ID, these fields shouldn't have much of a predictive
impact on our target variable.
The VISIT table
• While the PATIENT table contains basic administrative information
about each patient, our assignment is to predict the mortality risk on
the basis of each visit. The VISIT table contains one observation for
each patient visit, along with some clinical information about each visit:
• Notice that the patient ID is no longer the primary identifier of this
table, since Patient #1 had two visits; instead, there is a Visit_id field
that is numbered from 10001 to 10006 in this example, with one
distinct ID per visit.
The VISIT table
• This table also contains Visit_date. Since the cardiology practice
indicated they want to know the mortality risk within 6 months of the
patient visit, we will have to use this field later when we compute the
target variable.
• Two of the fields in this table contain ICD (diagnosis) codes. Actual
tables may contain dozens of codes for each visit. For each coded
field, there is a corresponding name field that contains the name of
the condition that the code represents.
The MEDICATIONS table
• The MEDICATIONS table contains one entry for every medication
being taken by our five patients. In this example, there is no single
column that serves as a primary key for this table. As we can see in
the following screenshot, this table includes information about the
medication name, dose, frequency, route, prescribing physician, and
prescription date. The NDC code of each medication is also included;
we covered NDC codes in Chapter 2, Healthcare Foundations:
The MEDICATIONS table
• Including medications in our final table will not be straightforward. For example, the
information in the tables does not indicate the class of each medication. The NDC
code is present, but the NDC code is even more granular than the medication name
since it includes the route of administration and dosage in making each unique code;
therefore, multiple forms of lisinopril could have different NDC codes. In order to
make a column for each medication, we could potentially separately make a table for
each medication, which contains all of the medications that compose it, and then
merge that information into our table.
• If we choose to include dosage information, that field will require some cleaning.
Notice that Patient #3 is receiving an anti-hypertensive combination drug–the
valsartan component has a dosage of 160 mg, while the hydrochlorothiazide
component has a dosage of 12.5 mg. This could possibly be coded as two separate
drugs, but creating a script that splits combination drugs into two rows is not trivial.
The LABS table
• Laboratory information is an important part of clinical diagnostics, and many laboratory test results make for good predictor
variables (Donze et al., 2013; Sahni et al., 2018). The LABS table includes fields that describe the laboratory test name,
abbreviation, LOINC code, and result:
• There are some different approaches to including lab information in the final table. One way would be to include the raw lab result
as a continuous variable. However, this leads to a problem because the result would be NULL for most labs. We could potentially
navigate around this issue by imputing a value in the normal range when it is missing. Another approach would be to have a binary
variable for a lab test result that is in the abnormal range. This solves the missing data problem, since if the result is missing it
would be zero. However, a BNP value of 1,000 (which indicates severe CHF) would be no different than a BNP value of 350
(which indicates mild CHF) with this method. We will demonstrate both approaches in this chapter.
The LABS table
• Also note that the Lab_value field sometimes contains special
characters, for example in the troponin result. These will need to be
removed and the lab values interpreted accordingly. Culture results
(not included in this example) are completely textual, often naming
specific bacterial strains instead of numbers.
• Again, we repeat that this is a simplified example and that many of
the common labs that would be drawn for these patients (for
example, WBC count, hemoglobin, sodium, potassium, and so on) are
excluded here.
The VITALS table
• Vital signs are important indicators of a patient's health status and can be good predictors in
healthcare machine learning models (Sahni et al., 2018). Vital signs are typically taken at every
patient visit, so they can easily be included in their raw (numerical) form to preserve granularity.
• In the following screenshot of the table, we notice that while height and weight are present, the
body mass index (BMI) is missing. We will demonstrate the calculation of the BMI in Chapter 5,
Computing Foundations – Introduction to Python. Second, Visit #10004 is missing a temperature
reading. This is common in healthcare and may be caused by an oversight in care:
The MORT table
• Finally, we come to the table that contains the target variable. The
MORT table contains just two fields, the patient identifier, and the
date the patient passed away. Patients not listed in this table can be
assumed to be living:
Starting an SQLite session
The database engine we will use to transform our database is SQLite. It should be mentioned that SQL
comes in many variants, and the SQL specific to SQLite has minor differences to that specific to MySQL
or SQL Server databases. However, the underlying principles remain constant across all SQL dialects.
At this time, do the following:
• Navigate to the directory containing the sqlite3.exe program in your shell or command prompt (using
the cd command).
• Type sqlite3 mortality.db and press Enter. You should see a prompt that looks like the following:
sqlite>. This prompt indicates that you are in the SQLite program.
• Throughout the remainder of this chapter, we are going to create some tables and execute some SQLite
commands on them in the SQLite program.
• To exit the session at any time, type .exit and press Enter.
Data engineering, one table at a time with SQL
sqlite> CREATE TABLE PATIENT(Pid VARCHAR(30) NOT NULL,
Fname VARCHAR(30) NOT NULL,Minit CHAR,
Lname VARCHAR(30) NOT NULL,Bdate TEXT NOT NULL,
Street VARCHAR(50),City VARCHAR(30),State VARCHAR(2),
Zip VARCHAR(5),Phone VARCHAR(10) NOT NULL,Sex CHAR,
PRIMARY KEY (Pid)
);
sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex)
VALUES ('1','John','A','Smith','1952-01-01','1206 Fox Hollow
Rd.','Pittsburgh','PA','15213','6789871234','M');
sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex)
VALUES ('2','Candice','P','Jones','1978-02-03','1429 Orlyn Dr.','Los
Angeles','CA','90024','3107381419','F');
sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex)
VALUES ('3','Regina','H','Wilson','1985-04-23','765 Chestnut
Ln.','Albany','NY','12065','5184590206','F');
sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex)
VALUES ('4','Harold','','Lee','1966-11-15','2928 Policy
St.','Providence','RI','02912','6593482691','M');
sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex)
VALUES ('5','Stan','P','Davis','1958-12-30','4271 12th St.','Atlanta','GA','30339','4049814933','M');
Pid VARCHAR(30) NOT NULL,
Visit_id VARCHAR(30) NOT NULL,
Visit_date DATE NOT NULL,
Attending_md VARCHAR(30) NOT NULL,
Pri_dx_icd VARCHAR(20) NOT NULL,
Pri_dx_name VARCHAR(100) NOT NULL,
Sec_dx_icd VARCHAR(20),
Sec_dx_name VARCHAR(100),
PRIMARY KEY (Visit_id)
);
sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd,
Sec_dx_name)
VALUES ('1','10001','2016-01-09','JOHNSON','I50.9','Heart failure, unspecified','I10','Essential (primary)
hypertension');
sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd,
Sec_dx_name)
VALUES ('1','10002','2016-01-17','JOHNSON','I50.9','Heart failure, unspecified','I10','Essential (primary)
hypertension');
sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd,
Sec_dx_name)
VALUES ('2','10003','2016-01-15','WU','I20.9','Angina pectoris, unspecified','E11.9','Type 2 diabetes mellitus
without complications');
sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd,
Sec_dx_name)
VALUES ('3','10004','2016-02-01','JOHNSON','I10','Essential (primary) hypertension','','');
sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd,
Sec_dx_name)
VALUES ('4','10005','2016-02-27','WU','I27.0','Primary pulmonary hypertension','I50.9','Heart failure,
unspecified');
sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd,
Sec_dx_name)
VALUES ('5','10006','2016-03-01','WU','I50.9','Heart failure, unspecified','E11.9','Type 2 diabetes mellitus
sqlite> CREATE TABLE MEDICATIONS(
Pid VARCHAR(30) NOT NULL,
Rx_name VARCHAR(50) NOT NULL,
Rx_dose VARCHAR(20),
Rx_freq VARCHAR(10),
Rx_route VARCHAR(10),
Prescribing_md VARCHAR(30) NOT NULL,
Rx_date DATE NOT NULL,
Rx_ndc VARCHAR(30)
);
• sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route,
Prescribing_md, Rx_date, Rx_ndc)
VALUES ('1', 'LISINOPRIL','5 mg','bid','po','JOHNSON','01/09/2016','68180-
513-01');
sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route,
Prescribing_md, Rx_date, Rx_ndc)
VALUES ('1', 'FUROSEMIDE','20 mg','bid','po','JOHNSON','01/09/2016','50742-
104-01');
sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route,
Prescribing_md, Rx_date, Rx_ndc)
VALUES ('2', 'NITROGLYCERIN','0.4 mg','tid','sl','WU','01/15/2016','59762-
3304-1');
sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route,
Prescribing_md, Rx_date, Rx_ndc)
VALUES ('2', 'METFORMIN','500 mg','bid','po','WU','01/15/2016','65162-175-
10');
sqlite> CREATE TABLE LABS(
Pid VARCHAR(30) NOT NULL,
Lab_name VARCHAR(50),
Lab_abbrev VARCHAR(20),
Lab_loinc VARCHAR(10) NOT NULL,
Lab_value VARCHAR(20) NOT NULL,
Ordering_md VARCHAR(30),
Lab_date DATE NOT NULL
);
• sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc,
Lab_value, Ordering_md, Lab_date)
VALUES ('1','Natriuretic peptide B','BNP','42637-
9','350','JOHNSON','2016-01-09');
sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc,
Lab_value, Ordering_md, Lab_date)
VALUES ('2','Natriuretic peptide B','BNP','42637-
9','100','WU','2016-01-15');
sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc,
Lab_value, Ordering_md, Lab_date)
VALUES ('2','Glucose','GLU','2345-7','225','WU','2016-01-15');
sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc,
Lab_value, Ordering_md, Lab_date)
VALUES ('2','Troponin I','TROP','10839-9','<0.004','WU','2016-01-
15');
sqlite> CREATE TABLE VITALS(
Pid VARCHAR(30) NOT NULL,
Visit_id VARCHAR(30) NOT NULL,
Height_in INT,
Weight_lb FLOAT,
Temp_f FLOAT,
Pulse INT,
Resp_rate INT,
Bp_syst INT,
Bp_diast INT,
SpO2 INT
);
• sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f,
Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2)
VALUES ('1','10001',70,188.4,98.6,95,18,154,94,97);
sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f,
Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2)
VALUES ('1','10002',70,188.4,99.1,85,17,157,96,100);
sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f,
Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2)
VALUES ('2','10003',63,130.2,98.7,82,16,120,81,100);
sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f,
Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2)
VALUES ('3','10004',65,120.0,NULL,100,19,161,100,98);
sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f,
Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2)
VALUES ('4','10005',66,211.4,98.2,95,19,143,84,93);
sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f,
Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2)
VALUES ('5','10006',69,150.0,97.6,77,18,130,86,99);
sqlite> CREATE TABLE MORT(
Pid VARCHAR(30) NOT NULL,
Mortality_date DATE NOT NULL,
PRIMARY KEY (Pid)
);
sqlite> INSERT INTO MORT (Pid,
Mortality_date)
VALUES ('1', '2016-05-15');
sqlite> INSERT INTO MORT (Pid,
Mortality_date)
sqlite> SELECT * FROM
PATIENT;
sqlite> CREATE TABLE MORT_FINAL AS
SELECT Visit_id, Pid, Attending_md, Visit_date, Pri_dx_icd,
Sec_dx_icd
FROM VISIT;

More Related Content

Similar to Health care Analytics-Module 3-CADX150.. (20)

Hospital database management_system_sql
Hospital database management_system_sqlHospital database management_system_sql
Hospital database management_system_sql
SumedhMasal
Ìý
Health IT and OpenMRS
Health IT and OpenMRSHealth IT and OpenMRS
Health IT and OpenMRS
Saptarshi Purkayastha
Ìý
Health informationsystems chapter6
Health informationsystems chapter6Health informationsystems chapter6
Health informationsystems chapter6
Nui Nguyen
Ìý
The Dual Nature of InformaticsInformatics can be used for impr.docx
The Dual Nature of InformaticsInformatics can be used for impr.docxThe Dual Nature of InformaticsInformatics can be used for impr.docx
The Dual Nature of InformaticsInformatics can be used for impr.docx
he45mcurnow
Ìý
Information system in health care, role of information system in health care....
Information system in health care, role of information system in health care....Information system in health care, role of information system in health care....
Information system in health care, role of information system in health care....
akhilshrama041
Ìý
Automated data system CPK TDM chapter.pptx
Automated data system CPK TDM chapter.pptxAutomated data system CPK TDM chapter.pptx
Automated data system CPK TDM chapter.pptx
RakshithShetty82
Ìý
Informatics for librarians: the core of the onion
Informatics for librarians: the core of the onionInformatics for librarians: the core of the onion
Informatics for librarians: the core of the onion
Jacqueline Leskovec
Ìý
Health Care Processes and Decision Making_lecture 1_slides
Health Care Processes and Decision Making_lecture 1_slidesHealth Care Processes and Decision Making_lecture 1_slides
Health Care Processes and Decision Making_lecture 1_slides
CMDLearning
Ìý
Babithas Notes on unit-3 Health/Nursing Informatics Technology
Babithas Notes on unit-3 Health/Nursing Informatics TechnologyBabithas Notes on unit-3 Health/Nursing Informatics Technology
Babithas Notes on unit-3 Health/Nursing Informatics Technology
Babitha Devu
Ìý
hospitals-1.pptx
hospitals-1.pptxhospitals-1.pptx
hospitals-1.pptx
ABDULRAHMAN404686
Ìý
How Modern Cardiologists Are Overcoming HIT Challenges
How Modern Cardiologists Are Overcoming HIT ChallengesHow Modern Cardiologists Are Overcoming HIT Challenges
How Modern Cardiologists Are Overcoming HIT Challenges
Objective Medical Systems
Ìý
Game of documentation, Winter is coming Surviving ICD10
Game of documentation, Winter is coming Surviving ICD10Game of documentation, Winter is coming Surviving ICD10
Game of documentation, Winter is coming Surviving ICD10
Nick van Terheyden
Ìý
Transitioning to an EHR e-health conference St Joe's Toronto Boris Bard proj...
Transitioning to an EHR e-health conference  St Joe's Toronto Boris Bard proj...Transitioning to an EHR e-health conference  St Joe's Toronto Boris Bard proj...
Transitioning to an EHR e-health conference St Joe's Toronto Boris Bard proj...
Boris Bard
Ìý
Classifying Readmissions of Diabetic Patient Encounters
Classifying Readmissions of Diabetic Patient EncountersClassifying Readmissions of Diabetic Patient Encounters
Classifying Readmissions of Diabetic Patient Encounters
Mayur Srinivasan
Ìý
Clinical information systems
Clinical information systemsClinical information systems
Clinical information systems
Mark Wardle
Ìý
Capstone Presentation.pptx
Capstone Presentation.pptxCapstone Presentation.pptx
Capstone Presentation.pptx
ManikjitChohan
Ìý
ch2+ok+Health+information+management+2014+secand+symester.ppt
ch2+ok+Health+information+management+2014+secand+symester.pptch2+ok+Health+information+management+2014+secand+symester.ppt
ch2+ok+Health+information+management+2014+secand+symester.ppt
said Touati
Ìý
Healthcare analytics
Healthcare analytics Healthcare analytics
Healthcare analytics
Arun K
Ìý
Clinical Healthcare Data Analytics
Clinical Healthcare Data AnalyticsClinical Healthcare Data Analytics
Clinical Healthcare Data Analytics
dansouk
Ìý
My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...
My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...
My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...
Samantha Plumb
Ìý
Hospital database management_system_sql
Hospital database management_system_sqlHospital database management_system_sql
Hospital database management_system_sql
SumedhMasal
Ìý
Health informationsystems chapter6
Health informationsystems chapter6Health informationsystems chapter6
Health informationsystems chapter6
Nui Nguyen
Ìý
The Dual Nature of InformaticsInformatics can be used for impr.docx
The Dual Nature of InformaticsInformatics can be used for impr.docxThe Dual Nature of InformaticsInformatics can be used for impr.docx
The Dual Nature of InformaticsInformatics can be used for impr.docx
he45mcurnow
Ìý
Information system in health care, role of information system in health care....
Information system in health care, role of information system in health care....Information system in health care, role of information system in health care....
Information system in health care, role of information system in health care....
akhilshrama041
Ìý
Automated data system CPK TDM chapter.pptx
Automated data system CPK TDM chapter.pptxAutomated data system CPK TDM chapter.pptx
Automated data system CPK TDM chapter.pptx
RakshithShetty82
Ìý
Informatics for librarians: the core of the onion
Informatics for librarians: the core of the onionInformatics for librarians: the core of the onion
Informatics for librarians: the core of the onion
Jacqueline Leskovec
Ìý
Health Care Processes and Decision Making_lecture 1_slides
Health Care Processes and Decision Making_lecture 1_slidesHealth Care Processes and Decision Making_lecture 1_slides
Health Care Processes and Decision Making_lecture 1_slides
CMDLearning
Ìý
Babithas Notes on unit-3 Health/Nursing Informatics Technology
Babithas Notes on unit-3 Health/Nursing Informatics TechnologyBabithas Notes on unit-3 Health/Nursing Informatics Technology
Babithas Notes on unit-3 Health/Nursing Informatics Technology
Babitha Devu
Ìý
How Modern Cardiologists Are Overcoming HIT Challenges
How Modern Cardiologists Are Overcoming HIT ChallengesHow Modern Cardiologists Are Overcoming HIT Challenges
How Modern Cardiologists Are Overcoming HIT Challenges
Objective Medical Systems
Ìý
Game of documentation, Winter is coming Surviving ICD10
Game of documentation, Winter is coming Surviving ICD10Game of documentation, Winter is coming Surviving ICD10
Game of documentation, Winter is coming Surviving ICD10
Nick van Terheyden
Ìý
Transitioning to an EHR e-health conference St Joe's Toronto Boris Bard proj...
Transitioning to an EHR e-health conference  St Joe's Toronto Boris Bard proj...Transitioning to an EHR e-health conference  St Joe's Toronto Boris Bard proj...
Transitioning to an EHR e-health conference St Joe's Toronto Boris Bard proj...
Boris Bard
Ìý
Classifying Readmissions of Diabetic Patient Encounters
Classifying Readmissions of Diabetic Patient EncountersClassifying Readmissions of Diabetic Patient Encounters
Classifying Readmissions of Diabetic Patient Encounters
Mayur Srinivasan
Ìý
Clinical information systems
Clinical information systemsClinical information systems
Clinical information systems
Mark Wardle
Ìý
Capstone Presentation.pptx
Capstone Presentation.pptxCapstone Presentation.pptx
Capstone Presentation.pptx
ManikjitChohan
Ìý
ch2+ok+Health+information+management+2014+secand+symester.ppt
ch2+ok+Health+information+management+2014+secand+symester.pptch2+ok+Health+information+management+2014+secand+symester.ppt
ch2+ok+Health+information+management+2014+secand+symester.ppt
said Touati
Ìý
Healthcare analytics
Healthcare analytics Healthcare analytics
Healthcare analytics
Arun K
Ìý
Clinical Healthcare Data Analytics
Clinical Healthcare Data AnalyticsClinical Healthcare Data Analytics
Clinical Healthcare Data Analytics
dansouk
Ìý
My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...
My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...
My Top 5 articles from 2015-16 about Informatics and Digital Health in Physio...
Samantha Plumb
Ìý

More from ssuserec53e73 (6)

Computer Forensics and investigation module 3
Computer Forensics and investigation module 3Computer Forensics and investigation module 3
Computer Forensics and investigation module 3
ssuserec53e73
Ìý
Module 3 continuation of computer forensic
Module 3 continuation of computer forensicModule 3 continuation of computer forensic
Module 3 continuation of computer forensic
ssuserec53e73
Ìý
data acquisition in computer forensics and
data acquisition in computer forensics anddata acquisition in computer forensics and
data acquisition in computer forensics and
ssuserec53e73
Ìý
DLF-Microprocessor_02_2017-18 SAE2B microprocessor
DLF-Microprocessor_02_2017-18 SAE2B microprocessorDLF-Microprocessor_02_2017-18 SAE2B microprocessor
DLF-Microprocessor_02_2017-18 SAE2B microprocessor
ssuserec53e73
Ìý
Computer Forensics -Introduction and the details
Computer Forensics -Introduction and the detailsComputer Forensics -Introduction and the details
Computer Forensics -Introduction and the details
ssuserec53e73
Ìý
COMPUTER FORENSICS MODULE III of unit 3.ppt
COMPUTER FORENSICS MODULE III of unit 3.pptCOMPUTER FORENSICS MODULE III of unit 3.ppt
COMPUTER FORENSICS MODULE III of unit 3.ppt
ssuserec53e73
Ìý
Computer Forensics and investigation module 3
Computer Forensics and investigation module 3Computer Forensics and investigation module 3
Computer Forensics and investigation module 3
ssuserec53e73
Ìý
Module 3 continuation of computer forensic
Module 3 continuation of computer forensicModule 3 continuation of computer forensic
Module 3 continuation of computer forensic
ssuserec53e73
Ìý
data acquisition in computer forensics and
data acquisition in computer forensics anddata acquisition in computer forensics and
data acquisition in computer forensics and
ssuserec53e73
Ìý
DLF-Microprocessor_02_2017-18 SAE2B microprocessor
DLF-Microprocessor_02_2017-18 SAE2B microprocessorDLF-Microprocessor_02_2017-18 SAE2B microprocessor
DLF-Microprocessor_02_2017-18 SAE2B microprocessor
ssuserec53e73
Ìý
Computer Forensics -Introduction and the details
Computer Forensics -Introduction and the detailsComputer Forensics -Introduction and the details
Computer Forensics -Introduction and the details
ssuserec53e73
Ìý
COMPUTER FORENSICS MODULE III of unit 3.ppt
COMPUTER FORENSICS MODULE III of unit 3.pptCOMPUTER FORENSICS MODULE III of unit 3.ppt
COMPUTER FORENSICS MODULE III of unit 3.ppt
ssuserec53e73
Ìý

Recently uploaded (20)

Adventure Activities Final By H R Gohil Sir
Adventure Activities Final By H R Gohil SirAdventure Activities Final By H R Gohil Sir
Adventure Activities Final By H R Gohil Sir
GUJARATCOMMERCECOLLE
Ìý
Fuel part 1.pptx........................
Fuel part 1.pptx........................Fuel part 1.pptx........................
Fuel part 1.pptx........................
ksbhattadcm
Ìý
Storytelling instructions...............
Storytelling instructions...............Storytelling instructions...............
Storytelling instructions...............
Alexander Benito
Ìý
Useful environment methods in Odoo 18 - Odoo ºÝºÝߣs
Useful environment methods in Odoo 18 - Odoo ºÝºÝߣsUseful environment methods in Odoo 18 - Odoo ºÝºÝߣs
Useful environment methods in Odoo 18 - Odoo ºÝºÝߣs
Celine George
Ìý
Essentials of a Good PMO, presented by Aalok Sonawala
Essentials of a Good PMO, presented by Aalok SonawalaEssentials of a Good PMO, presented by Aalok Sonawala
Essentials of a Good PMO, presented by Aalok Sonawala
Association for Project Management
Ìý
CRITICAL THINKING AND NURSING JUDGEMENT.pptx
CRITICAL THINKING AND NURSING JUDGEMENT.pptxCRITICAL THINKING AND NURSING JUDGEMENT.pptx
CRITICAL THINKING AND NURSING JUDGEMENT.pptx
PoojaSen20
Ìý
DUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAM
DUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAMDUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAM
DUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAM
vlckovar
Ìý
Information Technology for class X CBSE skill Subject
Information Technology for class X CBSE skill SubjectInformation Technology for class X CBSE skill Subject
Information Technology for class X CBSE skill Subject
VEENAKSHI PATHAK
Ìý
QuickBooks Desktop to QuickBooks Online How to Make the Move
QuickBooks Desktop to QuickBooks Online  How to Make the MoveQuickBooks Desktop to QuickBooks Online  How to Make the Move
QuickBooks Desktop to QuickBooks Online How to Make the Move
TechSoup
Ìý
FESTIVAL: SINULOG & THINGYAN-LESSON 4.pptx
FESTIVAL: SINULOG & THINGYAN-LESSON 4.pptxFESTIVAL: SINULOG & THINGYAN-LESSON 4.pptx
FESTIVAL: SINULOG & THINGYAN-LESSON 4.pptx
DanmarieMuli1
Ìý
How to Modify Existing Web Pages in Odoo 18
How to Modify Existing Web Pages in Odoo 18How to Modify Existing Web Pages in Odoo 18
How to Modify Existing Web Pages in Odoo 18
Celine George
Ìý
EDL 290F Week 3 - Mountaintop Views (2025).pdf
EDL 290F Week 3  - Mountaintop Views (2025).pdfEDL 290F Week 3  - Mountaintop Views (2025).pdf
EDL 290F Week 3 - Mountaintop Views (2025).pdf
Liz Walsh-Trevino
Ìý
N.C. DPI's 2023 Language Diversity Briefing
N.C. DPI's 2023 Language Diversity BriefingN.C. DPI's 2023 Language Diversity Briefing
N.C. DPI's 2023 Language Diversity Briefing
Mebane Rash
Ìý
Rass MELAI : an Internet MELA Quiz Finals - El Dorado 2025
Rass MELAI : an Internet MELA Quiz Finals - El Dorado 2025Rass MELAI : an Internet MELA Quiz Finals - El Dorado 2025
Rass MELAI : an Internet MELA Quiz Finals - El Dorado 2025
Conquiztadors- the Quiz Society of Sri Venkateswara College
Ìý
Database population in Odoo 18 - Odoo slides
Database population in Odoo 18 - Odoo slidesDatabase population in Odoo 18 - Odoo slides
Database population in Odoo 18 - Odoo slides
Celine George
Ìý
Computer Application in Business (commerce)
Computer Application in Business (commerce)Computer Application in Business (commerce)
Computer Application in Business (commerce)
Sudar Sudar
Ìý
How to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣs
How to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣsHow to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣs
How to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣs
Celine George
Ìý
A PPT on the First Three chapters of Wings of Fire
A PPT on the First Three chapters of Wings of FireA PPT on the First Three chapters of Wings of Fire
A PPT on the First Three chapters of Wings of Fire
Beena E S
Ìý
SOCIAL CHANGE(a change in the institutional and normative structure of societ...
SOCIAL CHANGE(a change in the institutional and normative structure of societ...SOCIAL CHANGE(a change in the institutional and normative structure of societ...
SOCIAL CHANGE(a change in the institutional and normative structure of societ...
DrNidhiAgarwal
Ìý
Blind spots in AI and Formulation Science, IFPAC 2025.pdf
Blind spots in AI and Formulation Science, IFPAC 2025.pdfBlind spots in AI and Formulation Science, IFPAC 2025.pdf
Blind spots in AI and Formulation Science, IFPAC 2025.pdf
Ajaz Hussain
Ìý
Adventure Activities Final By H R Gohil Sir
Adventure Activities Final By H R Gohil SirAdventure Activities Final By H R Gohil Sir
Adventure Activities Final By H R Gohil Sir
GUJARATCOMMERCECOLLE
Ìý
Fuel part 1.pptx........................
Fuel part 1.pptx........................Fuel part 1.pptx........................
Fuel part 1.pptx........................
ksbhattadcm
Ìý
Storytelling instructions...............
Storytelling instructions...............Storytelling instructions...............
Storytelling instructions...............
Alexander Benito
Ìý
Useful environment methods in Odoo 18 - Odoo ºÝºÝߣs
Useful environment methods in Odoo 18 - Odoo ºÝºÝߣsUseful environment methods in Odoo 18 - Odoo ºÝºÝߣs
Useful environment methods in Odoo 18 - Odoo ºÝºÝߣs
Celine George
Ìý
CRITICAL THINKING AND NURSING JUDGEMENT.pptx
CRITICAL THINKING AND NURSING JUDGEMENT.pptxCRITICAL THINKING AND NURSING JUDGEMENT.pptx
CRITICAL THINKING AND NURSING JUDGEMENT.pptx
PoojaSen20
Ìý
DUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAM
DUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAMDUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAM
DUBLIN PROGRAM DUBLIN PROGRAM DUBLIN PROGRAM
vlckovar
Ìý
Information Technology for class X CBSE skill Subject
Information Technology for class X CBSE skill SubjectInformation Technology for class X CBSE skill Subject
Information Technology for class X CBSE skill Subject
VEENAKSHI PATHAK
Ìý
QuickBooks Desktop to QuickBooks Online How to Make the Move
QuickBooks Desktop to QuickBooks Online  How to Make the MoveQuickBooks Desktop to QuickBooks Online  How to Make the Move
QuickBooks Desktop to QuickBooks Online How to Make the Move
TechSoup
Ìý
FESTIVAL: SINULOG & THINGYAN-LESSON 4.pptx
FESTIVAL: SINULOG & THINGYAN-LESSON 4.pptxFESTIVAL: SINULOG & THINGYAN-LESSON 4.pptx
FESTIVAL: SINULOG & THINGYAN-LESSON 4.pptx
DanmarieMuli1
Ìý
How to Modify Existing Web Pages in Odoo 18
How to Modify Existing Web Pages in Odoo 18How to Modify Existing Web Pages in Odoo 18
How to Modify Existing Web Pages in Odoo 18
Celine George
Ìý
EDL 290F Week 3 - Mountaintop Views (2025).pdf
EDL 290F Week 3  - Mountaintop Views (2025).pdfEDL 290F Week 3  - Mountaintop Views (2025).pdf
EDL 290F Week 3 - Mountaintop Views (2025).pdf
Liz Walsh-Trevino
Ìý
N.C. DPI's 2023 Language Diversity Briefing
N.C. DPI's 2023 Language Diversity BriefingN.C. DPI's 2023 Language Diversity Briefing
N.C. DPI's 2023 Language Diversity Briefing
Mebane Rash
Ìý
Database population in Odoo 18 - Odoo slides
Database population in Odoo 18 - Odoo slidesDatabase population in Odoo 18 - Odoo slides
Database population in Odoo 18 - Odoo slides
Celine George
Ìý
Computer Application in Business (commerce)
Computer Application in Business (commerce)Computer Application in Business (commerce)
Computer Application in Business (commerce)
Sudar Sudar
Ìý
How to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣs
How to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣsHow to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣs
How to Setup WhatsApp in Odoo 17 - Odoo ºÝºÝߣs
Celine George
Ìý
A PPT on the First Three chapters of Wings of Fire
A PPT on the First Three chapters of Wings of FireA PPT on the First Three chapters of Wings of Fire
A PPT on the First Three chapters of Wings of Fire
Beena E S
Ìý
SOCIAL CHANGE(a change in the institutional and normative structure of societ...
SOCIAL CHANGE(a change in the institutional and normative structure of societ...SOCIAL CHANGE(a change in the institutional and normative structure of societ...
SOCIAL CHANGE(a change in the institutional and normative structure of societ...
DrNidhiAgarwal
Ìý
Blind spots in AI and Formulation Science, IFPAC 2025.pdf
Blind spots in AI and Formulation Science, IFPAC 2025.pdfBlind spots in AI and Formulation Science, IFPAC 2025.pdf
Blind spots in AI and Formulation Science, IFPAC 2025.pdf
Ajaz Hussain
Ìý

Health care Analytics-Module 3-CADX150..

  • 2. • What is a database? • A database is essentially a collection of data that has been arranged into a structure that allows it to be easily read, edited, added or deleted. • What is a database management system? • Database Management System is a system designed to manage the automatic and orderly database. The Database Management System is an automatic system helping the user to control information, create, update and maintain the database. • What do databases do in healthcare? • Healthcare databases help individual medical organizations understand their daily activities and their place within the larger healthcare industry. This allows healthcare professionals to make decisions about how they run their businesses, the work they do, and the systems they use to manage their operations. With the rapid advancement of healthcare technology and post- COVID-19 changes in healthcare operations, it is more vital than ever for healthcare databases to be organized, well-maintained, and simple to use.
  • 3. • Why is database management important? • Healthcare databases have been an essential component of understanding and improving critical care worldwide. The importance of database technology in healthcare cannot be overstated. The Healthcare database system generates data that requires delicate handling and is developed for the purpose of assessing the quality of healthcare, often for a specific disease or within a specific healthcare delivery system. databases in healthcare can promote: • 1. Assessing the quality of healthcare • Hospitals, health departments, local, state and federal agencies, to name a few, all contribute information to healthcare databases. Healthcare specialists can gain a thorough understanding of the quality of current healthcare operations by analyzing aspects such as; financing available to healthcare facilities, service availability and affordability, use of innovation, and barriers to seeking care. • 2. Tracking and Monitoring • Medical workers can utilize reporting and logging tools to keep track of operations with the help of healthcare databases. these aids healthcare providers in monitoring and improving the quality of patient treatment by providing an important tool for tracking healthcare use. Healthcare providers, for instance, need to know whether people are getting their vaccination or not.
  • 4. Healthcare DBMS • A database can be defined as a collection of related data (Elmasri and Navathe, 2010). • Databases are often subcategorized as SQL databases or NoSQL databases. • In SQL databases, data is recorded in tables and consists of rows and columns. The related data may be distributed across several tables in a trade-off between efficient storage and convenience. • The database management system (DBMS) is a piece of software that enables the database to serve several functions. For one thing, it allows for the retrieval of data using the SQL language (for SQL databases). Another function is to update the data when needed, also using SQL. Additional functions of a DBMS include protecting and securing the data.
  • 5. Data engineering with SQL – an example case • For this chapter, let's pretend you secured a predictive analytics assignment with a cardiology practice located in the United States. The practice wants you to predict which patients are at risk of dying within 6 months of their visit to the clinic. They make their data available to you in the form of a database that includes six tables. • For simplicity, we truncate the database to include the information for five patients only. Our task is to manipulate the data using the SQL language to consolidate it into a single table so that it can be used for machine learning. We will first go over the patients in the database and the database structure. Then, we will introduce basic SQL concepts for engineering and manipulate the data into a form amenable to machine learning.
  • 6. Case details – predicting mortality for a cardiology practice • The cardiology practice you are working with has two physicians on staff: Dr. Johnson and Dr. Wu. While the practice has many patients, they are interested in identifying which patients who visit are at high risk of all-cause mortality within the next 6 months. • Now that we've reviewed the details of the modeling assignment, let's take a look at the five patients in the database. The preliminary data sent to you by the cardiology practice includes information on five patients, distributed across six tables.
  • 7. Case details – predicting mortality for a cardiology practice The following is the information about the patients: • Patient ID-1: Patient #1 in the database is a 65-year-old male who has congestive heart failure (CHF), a chronic condition in which the heart is unable to pump blood properly to the rest of the body. He also has hypertension (high blood pressure), which is a risk factor for CHF. He visited his cardiologist, Dr. Johnson, on 9/1/2016 and 17/1/2016. On his January 9th visit, he was found to have an elevated BP (154/94) and an elevated B-natriuretic peptide (BNP) lab value of 350. BNP is a marker of CHF severity. He was subsequently placed on lisinopril and furosemide, which are first-line treatments for CHF and hypertension. Unfortunately, he passed away on May 15th, 2016. • Patient ID-2: Patient #2 is a 39-year-old female with a history of angina pectoris (cardiovascular-related chest pain upon exercising) and diabetes mellitus. Diabetes mellitus is a risk factor for myocardial infarction (heart attack; a late, often fatal manifestation of atherosclerotic heart disease), and angina pectoris can be seen as an early manifestation of atherosclerotic heart disease. She visited her cardiologist, Dr. Wu, on January 15th, 2016, at which time she was found to have an elevated blood glucose level of 225, a sign of uncontrolled diabetes. She was started on metformin for her diabetes, as well as nitroglycerin, aspirin, and metoprolol for her angina. • Patient ID-3: Patient #3 is a 32-year-old female who sees Dr. Johnson for management of her hypertension. During her visit on February 1st, 2016 her blood pressure was elevated at 161/100. She was started on valsartan/hydrochlorothiazide, an anti-hypertensive combination.
  • 8. • Patient ID: 4: Patient #4 is a 51-year-old male who has severe CHF with pulmonary hypertension. He saw Dr. Wu on February 27th, 2016. During that visit, his weight was 211 lbs and his blood pressure was slightly elevated at 143/84. His BNP level was highly elevated at 1,000. He was given lisinopril and furosemide for his CHF as well as diltiazem for his pulmonary hypertension. Unfortunately, he passed away on June 8th, 2016. • Patient ID-5: The last patient in our database, patient #5, is a 58-year-old male who presented to Dr. Wu on March 1st, 2016 with a history of CHF and diabetes mellitus Type 2. During the visit, his glucose was elevated at 318 and BNP was moderately elevated at 400. He was started on lisinopril and furosemide for his CHF and metformin for his diabetes.
  • 9. The clinical database • Now that we've gotten to know the five patients whose information is contained in our database, we can describe the table structure and fields contained in the database, for six mock tables: PATIENT, VISIT, MEDICATIONS, LABS, VITALS, and MORT. • Although every clinical database is different, I've tried to use a structure that is commonly seen in healthcare. Typically, tables are presented by clinical domains (for an example of a research study that received tables in such a distributed format, see Basole et al., 2015). For example, there is often one table that contains demographic and personal information, one table for lab results, one for medications, and so on, so that is how we constructed the database in this example. They tend to be tied together by a common identifier, which in our case is the Pid field. • As we describe the tables, we must keep our end-goal of the data engineering phase in mind–to combine the relevant information from the six tables into a single table, whose columns include the target variable (mortality in this case) in addition to predictor variables, which should be useful for predicting the target variable. This will enable us to make a machine learning model with popular packages such as Python's scikit-learn. With this in mind, we will highlight selected fields that will be useful for our assignment.
  • 10. The PATIENT table • In our example, the PATIENT table, which we can see in the following screenshot, contains the demographic and identifying information of our patients–their names, contact information, birthdays, and biological sex. • In this example, there are only five observations and 11 columns; in real practice, this table would contain all of the patients affiliated with the healthcare organization. The number of rows in this table might range from hundreds to hundreds of thousands, while the table could potentially include dozens of columns containing detailed demographic information:
  • 11. • In the database, every unique patient is assigned to an identifier (the field labeled as Pid), which in our case is simply numbered 1 - 5. The Pid column allows us to keep track of the patients across different tables. Also, notice that there is one and only one entry for each distinct patient ID.
  • 12. The PATIENT table • After identifying the indispensable identifer column, the focus should be on which variables to keep and which to discard. Certainly, age and sex are important demographic predictors of mortality. If race were in this table, that would be another important demographic variable. • In the database, every unique patient is assigned to an identifier (the field labeled as Pid), which in our case is simply numbered 1 - 5. The Pid column allows us to keep track of the patients across different tables. Also, notice that there is one and only one entry for each distinct patient ID. • Another notable variable in this table is the zip code. Increasingly, socioeconomic data is being used in machine learning analyses. The zip code can potentially be tied to publicly available census data; that data can then be joined to the data in this table on the zip code and could potentially provide information on the average education level, income, and healthcare coverage for each patient's zip code. There are even organizations who sell household-level information; however, with that data comes a great responsibility for privacy protection and data security. For this example, we will omit the zip code to keep our final table simple. • Information we'll leave out from our final table includes names, street addresses, and phone numbers. As long as we have the patient ID, these fields shouldn't have much of a predictive impact on our target variable.
  • 13. The VISIT table • While the PATIENT table contains basic administrative information about each patient, our assignment is to predict the mortality risk on the basis of each visit. The VISIT table contains one observation for each patient visit, along with some clinical information about each visit: • Notice that the patient ID is no longer the primary identifier of this table, since Patient #1 had two visits; instead, there is a Visit_id field that is numbered from 10001 to 10006 in this example, with one distinct ID per visit.
  • 14. The VISIT table • This table also contains Visit_date. Since the cardiology practice indicated they want to know the mortality risk within 6 months of the patient visit, we will have to use this field later when we compute the target variable. • Two of the fields in this table contain ICD (diagnosis) codes. Actual tables may contain dozens of codes for each visit. For each coded field, there is a corresponding name field that contains the name of the condition that the code represents.
  • 15. The MEDICATIONS table • The MEDICATIONS table contains one entry for every medication being taken by our five patients. In this example, there is no single column that serves as a primary key for this table. As we can see in the following screenshot, this table includes information about the medication name, dose, frequency, route, prescribing physician, and prescription date. The NDC code of each medication is also included; we covered NDC codes in Chapter 2, Healthcare Foundations:
  • 16. The MEDICATIONS table • Including medications in our final table will not be straightforward. For example, the information in the tables does not indicate the class of each medication. The NDC code is present, but the NDC code is even more granular than the medication name since it includes the route of administration and dosage in making each unique code; therefore, multiple forms of lisinopril could have different NDC codes. In order to make a column for each medication, we could potentially separately make a table for each medication, which contains all of the medications that compose it, and then merge that information into our table. • If we choose to include dosage information, that field will require some cleaning. Notice that Patient #3 is receiving an anti-hypertensive combination drug–the valsartan component has a dosage of 160 mg, while the hydrochlorothiazide component has a dosage of 12.5 mg. This could possibly be coded as two separate drugs, but creating a script that splits combination drugs into two rows is not trivial.
  • 17. The LABS table • Laboratory information is an important part of clinical diagnostics, and many laboratory test results make for good predictor variables (Donze et al., 2013; Sahni et al., 2018). The LABS table includes fields that describe the laboratory test name, abbreviation, LOINC code, and result: • There are some different approaches to including lab information in the final table. One way would be to include the raw lab result as a continuous variable. However, this leads to a problem because the result would be NULL for most labs. We could potentially navigate around this issue by imputing a value in the normal range when it is missing. Another approach would be to have a binary variable for a lab test result that is in the abnormal range. This solves the missing data problem, since if the result is missing it would be zero. However, a BNP value of 1,000 (which indicates severe CHF) would be no different than a BNP value of 350 (which indicates mild CHF) with this method. We will demonstrate both approaches in this chapter.
  • 18. The LABS table • Also note that the Lab_value field sometimes contains special characters, for example in the troponin result. These will need to be removed and the lab values interpreted accordingly. Culture results (not included in this example) are completely textual, often naming specific bacterial strains instead of numbers. • Again, we repeat that this is a simplified example and that many of the common labs that would be drawn for these patients (for example, WBC count, hemoglobin, sodium, potassium, and so on) are excluded here.
  • 19. The VITALS table • Vital signs are important indicators of a patient's health status and can be good predictors in healthcare machine learning models (Sahni et al., 2018). Vital signs are typically taken at every patient visit, so they can easily be included in their raw (numerical) form to preserve granularity. • In the following screenshot of the table, we notice that while height and weight are present, the body mass index (BMI) is missing. We will demonstrate the calculation of the BMI in Chapter 5, Computing Foundations – Introduction to Python. Second, Visit #10004 is missing a temperature reading. This is common in healthcare and may be caused by an oversight in care:
  • 20. The MORT table • Finally, we come to the table that contains the target variable. The MORT table contains just two fields, the patient identifier, and the date the patient passed away. Patients not listed in this table can be assumed to be living:
  • 21. Starting an SQLite session The database engine we will use to transform our database is SQLite. It should be mentioned that SQL comes in many variants, and the SQL specific to SQLite has minor differences to that specific to MySQL or SQL Server databases. However, the underlying principles remain constant across all SQL dialects. At this time, do the following: • Navigate to the directory containing the sqlite3.exe program in your shell or command prompt (using the cd command). • Type sqlite3 mortality.db and press Enter. You should see a prompt that looks like the following: sqlite>. This prompt indicates that you are in the SQLite program. • Throughout the remainder of this chapter, we are going to create some tables and execute some SQLite commands on them in the SQLite program. • To exit the session at any time, type .exit and press Enter.
  • 22. Data engineering, one table at a time with SQL sqlite> CREATE TABLE PATIENT(Pid VARCHAR(30) NOT NULL, Fname VARCHAR(30) NOT NULL,Minit CHAR, Lname VARCHAR(30) NOT NULL,Bdate TEXT NOT NULL, Street VARCHAR(50),City VARCHAR(30),State VARCHAR(2), Zip VARCHAR(5),Phone VARCHAR(10) NOT NULL,Sex CHAR, PRIMARY KEY (Pid) );
  • 23. sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex) VALUES ('1','John','A','Smith','1952-01-01','1206 Fox Hollow Rd.','Pittsburgh','PA','15213','6789871234','M'); sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex) VALUES ('2','Candice','P','Jones','1978-02-03','1429 Orlyn Dr.','Los Angeles','CA','90024','3107381419','F'); sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex) VALUES ('3','Regina','H','Wilson','1985-04-23','765 Chestnut Ln.','Albany','NY','12065','5184590206','F'); sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex) VALUES ('4','Harold','','Lee','1966-11-15','2928 Policy St.','Providence','RI','02912','6593482691','M'); sqlite> INSERT INTO PATIENT (Pid, Fname, Minit, Lname, Bdate, Street, City, State, Zip, Phone, Sex) VALUES ('5','Stan','P','Davis','1958-12-30','4271 12th St.','Atlanta','GA','30339','4049814933','M');
  • 24. Pid VARCHAR(30) NOT NULL, Visit_id VARCHAR(30) NOT NULL, Visit_date DATE NOT NULL, Attending_md VARCHAR(30) NOT NULL, Pri_dx_icd VARCHAR(20) NOT NULL, Pri_dx_name VARCHAR(100) NOT NULL, Sec_dx_icd VARCHAR(20), Sec_dx_name VARCHAR(100), PRIMARY KEY (Visit_id) ); sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd, Sec_dx_name) VALUES ('1','10001','2016-01-09','JOHNSON','I50.9','Heart failure, unspecified','I10','Essential (primary) hypertension'); sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd, Sec_dx_name) VALUES ('1','10002','2016-01-17','JOHNSON','I50.9','Heart failure, unspecified','I10','Essential (primary) hypertension'); sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd, Sec_dx_name) VALUES ('2','10003','2016-01-15','WU','I20.9','Angina pectoris, unspecified','E11.9','Type 2 diabetes mellitus without complications'); sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd, Sec_dx_name) VALUES ('3','10004','2016-02-01','JOHNSON','I10','Essential (primary) hypertension','',''); sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd, Sec_dx_name) VALUES ('4','10005','2016-02-27','WU','I27.0','Primary pulmonary hypertension','I50.9','Heart failure, unspecified'); sqlite> INSERT INTO VISIT (Pid, Visit_id, Visit_date, Attending_md, Pri_dx_icd, Pri_dx_name, Sec_dx_icd, Sec_dx_name) VALUES ('5','10006','2016-03-01','WU','I50.9','Heart failure, unspecified','E11.9','Type 2 diabetes mellitus
  • 25. sqlite> CREATE TABLE MEDICATIONS( Pid VARCHAR(30) NOT NULL, Rx_name VARCHAR(50) NOT NULL, Rx_dose VARCHAR(20), Rx_freq VARCHAR(10), Rx_route VARCHAR(10), Prescribing_md VARCHAR(30) NOT NULL, Rx_date DATE NOT NULL, Rx_ndc VARCHAR(30) );
  • 26. • sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route, Prescribing_md, Rx_date, Rx_ndc) VALUES ('1', 'LISINOPRIL','5 mg','bid','po','JOHNSON','01/09/2016','68180- 513-01'); sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route, Prescribing_md, Rx_date, Rx_ndc) VALUES ('1', 'FUROSEMIDE','20 mg','bid','po','JOHNSON','01/09/2016','50742- 104-01'); sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route, Prescribing_md, Rx_date, Rx_ndc) VALUES ('2', 'NITROGLYCERIN','0.4 mg','tid','sl','WU','01/15/2016','59762- 3304-1'); sqlite> INSERT INTO MEDICATIONS (Pid, Rx_name, Rx_dose, Rx_freq, Rx_route, Prescribing_md, Rx_date, Rx_ndc) VALUES ('2', 'METFORMIN','500 mg','bid','po','WU','01/15/2016','65162-175- 10');
  • 27. sqlite> CREATE TABLE LABS( Pid VARCHAR(30) NOT NULL, Lab_name VARCHAR(50), Lab_abbrev VARCHAR(20), Lab_loinc VARCHAR(10) NOT NULL, Lab_value VARCHAR(20) NOT NULL, Ordering_md VARCHAR(30), Lab_date DATE NOT NULL );
  • 28. • sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc, Lab_value, Ordering_md, Lab_date) VALUES ('1','Natriuretic peptide B','BNP','42637- 9','350','JOHNSON','2016-01-09'); sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc, Lab_value, Ordering_md, Lab_date) VALUES ('2','Natriuretic peptide B','BNP','42637- 9','100','WU','2016-01-15'); sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc, Lab_value, Ordering_md, Lab_date) VALUES ('2','Glucose','GLU','2345-7','225','WU','2016-01-15'); sqlite> INSERT INTO LABS (Pid, Lab_name, Lab_abbrev, Lab_loinc, Lab_value, Ordering_md, Lab_date) VALUES ('2','Troponin I','TROP','10839-9','<0.004','WU','2016-01- 15');
  • 29. sqlite> CREATE TABLE VITALS( Pid VARCHAR(30) NOT NULL, Visit_id VARCHAR(30) NOT NULL, Height_in INT, Weight_lb FLOAT, Temp_f FLOAT, Pulse INT, Resp_rate INT, Bp_syst INT, Bp_diast INT, SpO2 INT );
  • 30. • sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f, Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2) VALUES ('1','10001',70,188.4,98.6,95,18,154,94,97); sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f, Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2) VALUES ('1','10002',70,188.4,99.1,85,17,157,96,100); sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f, Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2) VALUES ('2','10003',63,130.2,98.7,82,16,120,81,100); sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f, Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2) VALUES ('3','10004',65,120.0,NULL,100,19,161,100,98); sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f, Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2) VALUES ('4','10005',66,211.4,98.2,95,19,143,84,93); sqlite> INSERT INTO VITALS (Pid, Visit_id, Height_in, Weight_lb, Temp_f, Pulse, Resp_rate, Bp_syst, Bp_diast, SpO2) VALUES ('5','10006',69,150.0,97.6,77,18,130,86,99);
  • 31. sqlite> CREATE TABLE MORT( Pid VARCHAR(30) NOT NULL, Mortality_date DATE NOT NULL, PRIMARY KEY (Pid) ); sqlite> INSERT INTO MORT (Pid, Mortality_date) VALUES ('1', '2016-05-15'); sqlite> INSERT INTO MORT (Pid, Mortality_date)
  • 32. sqlite> SELECT * FROM PATIENT;
  • 33. sqlite> CREATE TABLE MORT_FINAL AS SELECT Visit_id, Pid, Attending_md, Visit_date, Pri_dx_icd, Sec_dx_icd FROM VISIT;