An Illustration from Heart Rate Variability Data
I have attempted to classify different activities from Heart Rate Variability (HRV) data, and I have also compared the models used. The slides are mostly graphics, but I will try to add more text in the future.
1. Models Can Lie
An Illustration from Heart Rate Variability Data
Raju Rimal and Veronika Lindberg
2015/12/09
Norwegian University of Science and Technology
Norwegian University of Life Sciences
2. Overview
1. Background
2. How the data looks
3. Classification with series stack
4. Classification with series averaged over series repetition
5. Classification with series averaged over person-event combination
6. Some Comparison
11. Some Background
1. PCR, PLS and Canonical Powered PLS (CPPLS) are used in
the analysis
2. CPPLS integrates CCA with PLS to select the variables
relevant to the response
3. Cross-validation is performed over the observations of
each a) frequency window, b) series, c) person-event
combination
4. Three variations of the dataset are used:
Transpose of each frequency window, stacked together
The frequencies of each series averaged over time
The frequencies of each person-event combination
averaged over time
5. An LDA model is used for discriminant analysis on the
scores obtained from each latent variable model, with
cross-validation implemented
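The pipeline described above (latent-variable scores, then LDA, with cross-validation over groups of observations) can be sketched as follows. This is a minimal illustration and not the authors' code: it uses NumPy only, takes PCA scores as a stand-in for the PCR step (PLS or CPPLS scores would replace `pca_scores`), and runs on synthetic data. The function names, the equal-prior LDA, and the leave-one-group-out scheme are all assumptions made for the example.

```python
import numpy as np

def pca_scores(X_train, X_test, n_comp):
    """Project both sets onto the first n_comp principal components of X_train."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    loadings = vt[:n_comp].T
    return (X_train - mean) @ loadings, (X_test - mean) @ loadings

def lda_fit_predict(T_train, y_train, T_test):
    """LDA with a pooled covariance matrix and equal class priors (assumed)."""
    classes = np.unique(y_train)
    means = np.array([T_train[y_train == c].mean(axis=0) for c in classes])
    centered = T_train - means[np.searchsorted(classes, y_train)]
    pooled = centered.T @ centered / (len(T_train) - len(classes))
    pooled = pooled + 1e-8 * np.eye(T_train.shape[1])  # keep it invertible
    inv = np.linalg.inv(pooled)
    # Linear discriminant score: x' S^-1 m_c - 0.5 * m_c' S^-1 m_c
    scores = T_test @ inv @ means.T - 0.5 * np.sum(means @ inv * means, axis=1)
    return classes[np.argmax(scores, axis=1)]

def grouped_cv_error(X, y, groups, n_comp):
    """Leave-one-group-out misclassification rate."""
    wrong = 0
    for g in np.unique(groups):
        test = groups == g  # hold out every row of one group at once
        T_tr, T_te = pca_scores(X[~test], X[test], n_comp)
        wrong += np.sum(lda_fit_predict(T_tr, y[~test], T_te) != y[test])
    return wrong / len(y)

# Synthetic stand-in data: 2 activities, 12 groups (e.g. series) of 5 rows each.
rng = np.random.default_rng(2)
y = np.repeat(np.array(["Gym", "Sauna"]), 30)
groups = np.tile(np.repeat(np.arange(6), 5), 2) + np.repeat([0, 6], 30)
X = rng.normal(size=(60, 10)) + np.where(y == "Gym", 0.0, 3.0)[:, None]

error = grouped_cv_error(X, y, groups, n_comp=2)
print(error)
```

Holding out whole groups matters when rows are windows of the same series: splitting at random would put parts of one series in both the training and the test fold.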
14. How the data looks
[Figure: six panels, one per series: Person1/P11 (Gym), Person1/P12 (Gym), Person1/P127 (Sauna), Person1/P129 (Sauna), Person1/P131 (Sauna), Person3/P23 (Sauna). Axis labels: Frequency Window (dB) and Time (s).]
Each block represents a series divided into several windows (rows), each with 128 columns and an overlap of 16. The cells contain the frequency values obtained from a fast Fourier transform.
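The windowing described on this slide can be sketched as below. This is a sketch under assumptions: that 128 is the window length, that 16 is the number of samples shared by consecutive windows, and that a Hann taper is applied before the transform. The function name `windowed_spectrum` and the random input are hypothetical; the slides' references suggest the original analysis was done in R.

```python
import numpy as np

def windowed_spectrum(series, window=128, overlap=16):
    """Split a 1-D series into overlapping windows and return FFT magnitudes in dB.

    Rows are windows (time); columns are the `window` FFT bins.
    """
    series = np.asarray(series, dtype=float)
    step = window - overlap                       # hop between window starts
    starts = range(0, len(series) - window + 1, step)
    frames = np.stack([series[s:s + window] for s in starts])
    frames = frames * np.hanning(window)          # taper each window (assumed)
    magnitude = np.abs(np.fft.fft(frames, axis=1))
    return 20 * np.log10(magnitude + 1e-12)       # small offset avoids log(0)

rng = np.random.default_rng(0)
demo = rng.normal(size=1000)                      # stand-in for one HRV series
spec = windowed_spectrum(demo)
print(spec.shape)
```

Each row of `spec` plays the role of one window in the blocks shown on the slide.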
15. How the data looks
[Figure: same six panels as on slide 14.]
Each person may have taken part in multiple activities, each of which may have replications.
16. How the data looks
[Figure: same six panels as on slide 14.]
Set 1: The windows are stacked row by row to form one big matrix (which may suffer from repeated measurements). Different parts of the same series therefore appear in several rows.
17. How the data looks
[Figure: same six panels as on slide 14.]
Set 2: Each series is averaged over its time points. Each row corresponds to one series.
18. How the data looks
[Figure: same six panels as on slide 14.]
Set 3: A person can have multiple series of the same activity (replication), so the third set is averaged over each person-event combination. In this case each row corresponds to a specific event for a specific person.
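The three dataset variants (Set 1, Set 2, Set 3) can be illustrated with a toy example. The dictionary layout, the tiny matrix sizes, and the choice to build Set 3 by averaging the per-series means are assumptions made for this sketch; the slides do not say how replicates were weighted.

```python
import numpy as np

# Hypothetical toy input: one (windows x frequencies) matrix per series,
# keyed by (person, event, series id).
rng = np.random.default_rng(1)
series = {
    ("Person1", "Gym", "P11"): rng.normal(size=(5, 4)),
    ("Person1", "Gym", "P12"): rng.normal(size=(3, 4)),
    ("Person1", "Sauna", "P127"): rng.normal(size=(4, 4)),
    ("Person3", "Sauna", "P23"): rng.normal(size=(6, 4)),
}

# Set 1: every window of every series becomes its own row.
set1 = np.vstack(list(series.values()))
set1_series = [key[2] for key, m in series.items() for _ in range(len(m))]

# Set 2: one row per series, averaged over its windows (time).
set2 = np.vstack([m.mean(axis=0) for m in series.values()])

# Set 3: one row per person-event combination, averaging the series means.
by_event = {}
for (person, event, _), m in series.items():
    by_event.setdefault((person, event), []).append(m.mean(axis=0))
set3 = np.vstack([np.mean(rows, axis=0) for rows in by_event.values()])

print(set1.shape, set2.shape, set3.shape)
```

Keeping `set1_series` alongside Set 1 records which series each row came from, which is what a grouped cross-validation needs.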
46. Misclassification Errors
[Figure: training and test misclassification error by model (CPLS, PCR, PLS) in three panels: Set 1, Set 2, Set 3. Labels recovered from the plot, in extraction order: component counts 29, 56, 4, 68, 128, 15, 1, 1, 1; values on the error axis 0.0175, 0.0219, 0.0789, 0.3124, 0.3276, 0.3733, 0.6667, 0.875, 0.9315, 0.9521, 0.9932, 1.]
Figure: Training and test misclassification error for all three models. The LDA
models were fitted with the scores obtained from the three models, using the number of components
(shown above each point) needed to reach the minimum cross-validation error.
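The two quantities named in the caption, the misclassification error and the number of components that minimises the cross-validation error, can be written down in a few lines. The helper names and the numbers in the usage lines are hypothetical and are not taken from the figure.

```python
def misclassification_error(y_true, y_pred):
    """Fraction of observations whose predicted class differs from the true one."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)

def best_n_components(cv_errors):
    """cv_errors[k - 1] is the CV error with k components.

    Returns (k, error) for the smallest k that reaches the minimum error.
    """
    best = min(range(len(cv_errors)), key=cv_errors.__getitem__)
    return best + 1, cv_errors[best]

# Hypothetical values, for illustration only.
err = misclassification_error(["Gym", "Sauna", "Gym", "Sauna"],
                              ["Gym", "Gym", "Gym", "Sauna"])
k, cv_min = best_n_components([0.40, 0.25, 0.30, 0.25])
print(err, k, cv_min)
```

Ties are resolved in favour of the smaller model here; that is a design choice for the sketch, not something the slides state.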
47. References
[1] M. Dowle et al. data.table: Extension of Data.frame. R package version 1.9.6. 2015.
url: http://CRAN.R-project.org/package=data.table.
[2] Ulf G. Indahl, Kristian Hovde Liland, and Tormod Næs. Canonical partial least
squares: a unified PLS approach to classification and regression problems. In:
Journal of Chemometrics 23.9 (2009), pp. 495-504.
[3] Uwe Ligges, Tom Short, and Paul Kienzle. signal: Signal Processing. R package
version 0.7-6. 2015. url: http://CRAN.R-project.org/package=signal.
[4] Harald Martens and Magni Martens. Multivariate analysis of quality: an
introduction. John Wiley & Sons, 2001.
[5] Harald Martens and Tormod Naes. Multivariate calibration. John Wiley &
Sons, 1992.
[6] Hadley Wickham. ggplot: An Implementation of the Grammar of Graphics.
R package version 0.4.0. 2006.
[7] Hadley Wickham. reshape2: Flexibly reshape data: a reboot of the reshape
package. R package version 1.2. 2012.
[8] Yihui Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R
package version 1.11. 2015. url: http://CRAN.R-project.org/package=knitr.