???? ??? ???
- ????? ??? ??
???
KAIST
December 1, 2017
1 / 50
What is big data?
2 / 50
??
???? ?? ??? ???? ??
???? ???
??? ???
??? ??? (homogeneity)? ??? ???
???? ???? ??? ??? ??? ??? ???? ??? ?? ?? ??
??? ??. ???? ? ???? ?? ??? ?? ???? ???.
???? ???
??, ??
??? ???(heterogeneity)? ? ??
????? ?? ??? ???? ??? ??? ?? ???? ????? ??
(Population science)
Yu Xie (2013). Population heterogeneity and causal inference. PNAS,
110, 6262-6268.
4 / 50
??
?????? ??? ???? ??? ??? ?? ?? =
???
??
??
???
???
???
5 / 50
??
?? ????? ?? ??
??? ???? ????? ???? ??? ??.
??? = ???
???? ???? ????? ?? ??? ?
???? ??? ??? ??? ??? ?? ?? ??
6 / 50
??
???? ??
???
Sample ???
??
7 / 50
??
?? ?? (Survey Sampling)
Survey: ??
Sampling: ???
Table: ?? ???? ?? ??
?? ???(Survey Methodology) ?? ???(Sampling Statistics)
???, ??? (????) ?? ?? ????? ??
?? ????? ??? ??? ?? ??? ????? ?? ??? ??
?? ??(??? ??)? ??? ?? ??? ???? ??? ????
?? ?? ?? ??
??? ?? , ?? ?? ?? ?? ??, ???, ?? ? ???? ??
8 / 50
??
?? ??? ? ??
9 / 50
??
Sir Francis Galton (1822-1911)
Galton was a polymath who made
important contributions in many fields of
science, including meteorology (the
anti-cyclone and the first popular weather
maps), statistics (regression and
correlation), psychology (synesthesia),
biology (the nature and mechanism of
heredity), and criminology (fingerprints).
He first introduced the use of
questionnaires and surveys for collecting
data on human communities.
10 / 50
??
Karl Pearson (1857 - 1936)
Student of Francis Galton
He has been credited with establishing
the discipline of mathematical statistics,
and contributed significantly to the field of
biometrics, meteorology, theories of social
Darwinism and eugenics
Founding chair of the department of Applied
Statistics in University of London (1911),
the first stat department in the world!
Founding editor of Biometrika
11 / 50
??
?? ?? ???
?? + ?? + (??) = ??
?? = ???
?? = ?? ?? ??
?? = ??
???? ??? ??? ?????.
??? ??? ???? ??? ???? ?? ???? ?? ??
12 / 50
??
???? ??? ?? ??
12th year of King Sejong's reign (AD 1430)
?? ??? ?? ?? ??
?? ??: 172,648? (?? 8?)
??: ??? ?? 57%, ?? 43%
13 / 50
????
???? ?? - ??? ???
15 / 50
????
???? ?? - ????? (Freeconomics)
16 / 50
????
???? ??? vs ????
Table: ??? ???? ??
???? ??? ????
???? ???? ?? ???? ????
?? ???? Y ? ?? ?? ???? X ? ??
??? ?? ???
17 / 50
????
???? ??? vs ????
Table: survey data vs big data

              Survey data       Big data
Bias          Bias = 0          Bias = 0
Variance      Variance = K/n    Variance = 0
18 / 50
????
?? ???? ?? ?? (X? = ???, Y? =????)
Figure: Error versus n (n from 0 to 10,000; Error from 0.02 to 0.10)
????
????? ??(bias)
??: ?? ??? ???? ?? (systematic error)
??? ??
1 ?? ?? (selection bias)
2 ?? ?? (information bias)
?? ??: ??? random sampling ? ?? ??? ?? ?? ????
??? ??? ?? ???? ?? ??? ??
?? ??: ??? ??, ???? ?? ??? ??? ??
20 / 50
????
????
Target population: $U = \{1, \ldots, N\}$.
Parameter: population mean $\bar{Y}_N = N^{-1} \sum_{i=1}^{N} y_i$.
Big data sample: $B \subset U$, with inclusion indicator
$I_i = 1$ if $i \in B$ and $I_i = 0$ otherwise.
Estimator: sample mean $\bar{y}_B = N_B^{-1} \sum_{i=1}^{N} I_i y_i$, where $N_B = \sum_{i=1}^{N} I_i$ is the big
data sample size ($N_B < N$).
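In code, this estimator is just an indicator-weighted mean; a minimal sketch with simulated data (the population values and the 30% inclusion rate are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000
y = rng.normal(size=N)                  # population values y_1, ..., y_N
I = (rng.random(N) < 0.3).astype(int)   # big data inclusion indicators I_i
N_B = int(I.sum())                      # realized big data sample size

Y_bar = y.mean()                  # population mean (unknown in practice)
y_bar_B = (I * y).sum() / N_B     # big data sample mean
print(N_B, round(y_bar_B - Y_bar, 3))
```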
21 / 50
????
Fundamental theorem of estimation error
Formula (Meng, 2016)
$E(\bar{y}_B - \bar{Y}_N)^2 = E(\rho_{I,Y}^2) \times \sigma^2 \times \frac{1 - f_B}{f_B}$

where $\rho_{I,Y}$ is the correlation between $I$ and $Y$, $f_B = N_B/N$, and $\sigma^2$ is the population
variance of $Y$; the big data sampling mechanism, which determines $\rho_{I,Y}$, is generally unknown.
Three components: data quality, problem difficulty, and data quantity
Effective sample size: the size of the simple random sample whose mean
squared error (MSE) equals that of the big data estimator
22 / 50
????
Effective sample size

$n_{\text{eff}} = \frac{f_B}{1 - f_B} \times \frac{1}{E(\rho_{I,Y}^2)}.$

If $\rho_{I,Y} = 0.05$ and $f_B = 1/2$, then $n_{\text{eff}} = 400$.
Example: even a big data source covering 50% of a population of 10 million
(5 million records) is, when $\rho_{I,Y} = 0.05$, no more informative than a
simple random sample of about 400 people.
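The effective sample size is a one-line computation; a small sketch reproducing the slide's worked example:

```python
def effective_sample_size(rho, f_B):
    """Meng's effective sample size: the simple-random-sample size
    with the same MSE as the (possibly biased) big data estimator."""
    return (f_B / (1.0 - f_B)) / rho**2

# Slide's worked example: rho_{I,Y} = 0.05, big data covering half
# the population.
print(round(effective_sample_size(0.05, 0.5)))  # 400
```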
23 / 50
????
Paradox of Big data
???? ??? ?? ?? ???? ???? ????? ???? ??
$CI = \left(\bar{y}_B - 1.96\sqrt{(1 - f_B)S^2/N_B},\ \bar{y}_B + 1.96\sqrt{(1 - f_B)S^2/N_B}\right)$
As $N_B \to \infty$, we have
$\Pr(\bar{Y}_N \in CI) \to 0.$
Paradox: if one ignores the bias and applies the standard method of
estimation, then the bigger the dataset, the more misleading it is for valid
statistical inference.
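The collapse of coverage can be seen in a small simulation; a sketch under an assumed selection mechanism (the exp(0.1·y) selection weights are purely illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50_000
y = rng.normal(size=N)
Y_bar = y.mean()

# Hypothetical biased selection: units with larger y are slightly more
# likely to enter the big data sample (weights proportional to exp(0.1 y)).
logw = 0.1 * y

def ci_covers(n_B):
    """95% CI from a biased big data sample of size n_B; True iff it
    covers the true population mean. Weighted sampling without
    replacement is done via the Gumbel-top-k trick."""
    keys = logw + rng.gumbel(size=N)
    yb = y[np.argsort(keys)[-n_B:]]
    f_B = n_B / N
    half = 1.96 * np.sqrt((1 - f_B) * yb.var(ddof=1) / n_B)
    return bool(yb.mean() - half <= Y_bar <= yb.mean() + half)

for n_B in (500, 5_000, 25_000):
    print(n_B, ci_covers(n_B))
```

As n_B grows the interval shrinks around the biased mean, so coverage of the true mean collapses.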
24 / 50
Salvation
Salvation of Big Data
26 / 50
Salvation
1. ?? ?? ??: Data integration
??? ???: ????? ??? ???
????? ?? ??? ??
??? ???? ?? ??? Y ? ?????? ??.
            I = 1    I = 0
Y = 1       N_B1
Y = 0       N_B0
Total       N_B      N - N_B

where I_i = 1 if unit i belongs to the big data sample and I_i = 0 otherwise.
?? ??: P = P(Y = 1).
27 / 50
Salvation
Now suppose an independent survey sample that also observes $Y$ is
available (covering both $I = 1$ and $I = 0$ units):

            I = 1    I = 0    Total
Y = 1       n_B1     n_C1     n_1
Y = 0       n_B0     n_C0     n_0
Total                         n

How can the two data sources be combined to estimate $P$?
28 / 50
Salvation
??? ???
Note that
P(Y = 1) = P(Y = 1 | I = 1)P(I = 1) + P(Y = 1 | I = 0)P(I = 0).
Three components
1 P(I = 1): Big data proportion (known)
2 P(Y = 1 | I = 1) = NB1/NB: obtained from the big data.
3 P(Y = 1 | I = 0): estimated by nC1/(nC0 + nC1) from the survey data.
Final estimator
$\hat{P} = P_B W_B + \hat{P}_C (1 - W_B) \qquad (1)$
where $W_B = N_B/N$, $P_B = N_{B1}/N_B$, and $\hat{P}_C = n_{C1}/(n_{C0} + n_{C1})$.
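Estimator (1) is straightforward to compute once the table cells are available; all counts below are hypothetical, chosen only to exercise the formula:

```python
N = 100_000                    # population size
N_B1, N_B0 = 42_000, 18_000    # big data cells: Y = 1 and Y = 0 counts
n_C1, n_C0 = 130, 70           # survey cells among I = 0 units

N_B = N_B1 + N_B0
W_B = N_B / N                    # P(I = 1): big data proportion (known)
P_B = N_B1 / N_B                 # P(Y = 1 | I = 1), from the big data
P_C_hat = n_C1 / (n_C1 + n_C0)   # P(Y = 1 | I = 0), from the survey

P_hat = P_B * W_B + P_C_hat * (1 - W_B)   # estimator (1)
print(round(P_hat, 3))  # 0.68
```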
29 / 50
Salvation
Remark
Variance:
$V(\hat{P}) = (1 - W_B)^2\, V(\hat{P}_C) \doteq (1 - W_B)\,\frac{1}{n}\,P_C(1 - P_C).$
If $W_B$ is close to one, then the above variance is very small.
Instead of using $\hat{P}_C = n_{C1}/(n_{C0} + n_{C1})$, we can construct a ratio
estimator of $P_C$ to improve the efficiency. That is, use
$\hat{P}_{C,r} = \frac{1}{1 + \hat{\alpha}_C}, \quad \text{where } \hat{\alpha}_C = \frac{N_{B0}/N_{B1}}{n_{B0}/n_{B1}} \times (n_{C0}/n_{C1}).$
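The ratio-type correction is equally direct; again all counts are hypothetical:

```python
N_B1, N_B0 = 42_000, 18_000    # big data margin: Y = 1 / Y = 0 counts
n_B1, n_B0 = 310, 90           # survey counts among I = 1 units
n_C1, n_C0 = 130, 70           # survey counts among I = 0 units

# The survey's I = 1 cells calibrate the C-cell odds against the
# big data margin N_B0 / N_B1.
alpha_C = (N_B0 / N_B1) / (n_B0 / n_B1) * (n_C0 / n_C1)
P_Cr = 1.0 / (1.0 + alpha_C)
print(round(P_Cr, 4))  # 0.5571
```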
30 / 50
Salvation
2. ?? ??
Accurate (expensive) measurement: $Y$
Cheap measurement: $X$
Measurement costs: $C_X \ll C_Y$.
Big data: only $X$ is observed.
Parameter of interest: $\theta_y = E(Y)$.
31 / 50
Salvation
?? ?? ?? - Calibration study
Idea
Fit the model $E(Y_i \mid X_i) = \beta_0 + \beta_1 X_i$; if $\beta_0, \beta_1$ were known, then
$\hat{\theta}_y = N_B^{-1} \sum_{i \in B} (\beta_0 + \beta_1 x_i)$
would estimate $\theta_y = E(Y)$.
Since $\beta_0, \beta_1$ are unknown, carry out a separate calibration study that
observes $(x_i, y_i)$, obtain estimates $\hat{\beta}_0, \hat{\beta}_1$, and use
$\hat{\theta}_y = N_B^{-1} \sum_{i \in B} (\hat{\beta}_0 + \hat{\beta}_1 x_i).$
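A numerical sketch of the calibration-study idea (the linear model, sample sizes, and coefficients are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: X is cheap and observed on the whole big data
# sample B; Y is expensive. Assume the true relation E(Y | X) = 2 + 3 X.
N_B = 10_000
x_B = rng.uniform(0, 1, size=N_B)        # proxy variable on B

# Calibration study: (x_i, y_i) pairs on a small subsample.
n_cal = 200
x_cal = x_B[:n_cal]
y_cal = 2 + 3 * x_cal + rng.normal(scale=0.5, size=n_cal)

# Fit beta_0, beta_1 by least squares (polyfit returns slope first).
b1, b0 = np.polyfit(x_cal, y_cal, 1)

# Plug-in estimator of theta_y = E(Y): mean of fitted values over B.
theta_hat = np.mean(b0 + b1 * x_B)
print(round(theta_hat, 2))   # close to 2 + 3 * E(X) = 3.5
```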
32 / 50
?? ??
?? ?? - ????? ????
?????? ?? - ????? ??? ????? ???? ??????
?? ??
???? ?? ??? ?? ??? ???
1 ????? ????: ??? ???
2 KT ??? ?????: ?? ?? ????
?? ??: ??? ???? ???, ???? ??? ?? ???? ?? ??
??? ?? ??? ???.
???: 2016?? ??? ??? ???
34 / 50
?? ??
?? ??? ????? ?? ??
35 / 50
?? ??
?? ?? - ??? ??? (??: ??)
Region    Survey estimate    KT data    t-statistic
??        5,953              4,945      5.91
??        390                357        0.40
??        35                 87         -2.01
??        354                1,335      -11.95
??        18                 30         -0.75
??        33                 32         0.03
??        0                  35
??        624                1,216      -5.68
??        228                128        1.54
??        13                 125        -6.67
??        38                 78         -1.54
??        56                 50         0.17
??        44                 111        -2.31
??        61                 83         -0.72
??        44                 83         -1.37
??        2,818              2,009      4.39
36 / 50
?? ??
?? ?? ??
Observed data for each area $i$: $(X_i, \hat{Y}_i)$
$Y_i$: true value for area $i$ (unobserved)
$\hat{Y}_i$: survey estimate of $Y_i$ (subject to sampling error)
$X_i$: big data measurement (subject to non-sampling errors)
??? ??? ?? (??)
37 / 50
?? ??
Figure: ??? ??? ?? ??? ?? ??
38 / 50
?? ??
Area level model (Contd)
The goal is to predict $Y_i$ (the true value) using the observations $\hat{Y}_i$ (the
survey estimate) and $X_i$ (the KT data).
Area level model is a useful tool for combining information from different
sources by making an area level matching.
Area level model consists of two parts:
1 Sampling error model: relationship between ?Yi and Yi.
2 Structural error model: relationship between Yi and Xi.
39 / 50
?? ??
Area level model: Fay-Herriot model approach
Figure: A Directed Acyclic Graph (DAG) for classical area level models.
$X \;\xrightarrow{(2)}\; Y \;\xrightarrow{(1)}\; \hat{Y}$
(1): Sampling error model (known),
(2): Structural error model (known up to $\theta$).
40 / 50
?? ??
Combining two models
Prediction model = sampling error model + structural error model
Bayes formula for prediction model
$p(Y_i \mid \hat{Y}_i, X_i) \propto g(\hat{Y}_i \mid Y_i)\, f(Y_i \mid X_i),$
where $g(\cdot)$ is the sampling error model and $f(\cdot)$ is the structural error
model.
$g(\cdot)$: assumed to be known.
$f(\cdot)$: known up to a parameter $\theta$; for example,
$Y_i = \beta X_i + e_i, \quad e_i \sim (0, \sigma^2 X_i^2)$
may be assumed.
41 / 50
?? ??
Parameter estimation
Obtain the prediction model using Bayes formula
EM algorithm: Update the parameters
$\hat{\theta}^{(t+1)} = \arg\max_{\theta} \sum_i E\{\log f(Y_i \mid X_i; \theta) \mid \hat{Y}_i, X_i; \hat{\theta}^{(t)}\},$
where the conditional expectation is with respect to the prediction model
evaluated at the current parameter $\hat{\theta}^{(t)}$.
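For the structural model $Y_i = \beta X_i + e_i$ with $e_i \sim N(0, \sigma^2 X_i^2)$ and a normal sampling error model with known variances, both EM steps have closed forms; a sketch on simulated data (all constants are invented, not values from the talk):

```python
import numpy as np

def em_fit(y_hat, x, V, n_iter=200):
    """EM for the area model  y_hat_i ~ N(Y_i, V_i)  with V_i known,
    and  Y_i ~ N(beta * x_i, sigma2 * x_i**2).  Sketch only."""
    beta, sigma2 = 1.0, 1.0
    for _ in range(n_iter):
        # E-step: posterior of Y_i given y_hat_i (normal-normal update)
        s2 = sigma2 * x**2
        v = 1.0 / (1.0 / V + 1.0 / s2)
        m = v * (y_hat / V + beta * x / s2)
        # M-step: closed form, since Y_i / x_i ~ N(beta, sigma2)
        beta = np.mean(m / x)
        sigma2 = np.mean(((m - beta * x) ** 2 + v) / x**2)
    return beta, sigma2

# Simulated check: true beta = 2, true sigma = 0.3 (invented values).
rng = np.random.default_rng(0)
n = 2_000
x = rng.uniform(1.0, 2.0, size=n)
V = np.full(n, 0.25)                          # known sampling variances
Y = 2.0 * x + 0.3 * x * rng.normal(size=n)    # structural error model
y_hat = Y + np.sqrt(V) * rng.normal(size=n)   # sampling error model
beta_hat, sigma2_hat = em_fit(y_hat, x, V)
print(round(beta_hat, 2), round(sigma2_hat, 3))
```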
42 / 50
?? ??
Prediction vs Parameter estimation
Figure: EM algorithm
The E-step and M-step alternate over the DAG $X \to Y \to \hat{Y}$ and the
parameter $\theta$: the E-step computes the prediction model for $Y$, and the
M-step updates $\theta$.
43 / 50
?? ??
Prediction (frequentist approach)
Best prediction: expectation from the prediction model at $\theta = \hat{\theta}$:
$\hat{Y}_i^* = E\{Y_i \mid \hat{Y}_i, X_i; \hat{\theta}\}.$
If $f(Y_i \mid X_i)$ is a normal distribution, then
$\hat{Y}_i^* = \alpha_i \hat{Y}_i + (1 - \alpha_i)\, E(Y_i \mid X_i; \hat{\theta})$
for some $\alpha_i$, where
$\alpha_i = \frac{V(Y_i \mid X_i; \hat{\theta})}{V(\hat{Y}_i) + V(Y_i \mid X_i; \hat{\theta})}.$
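The predictor is a shrinkage-weighted average of the two sources; a sketch with hypothetical variances and a hypothetical fitted beta (none of these numbers come from the talk):

```python
def fh_predict(y_hat, x, beta, V_samp, V_struct):
    """Shrinkage predictor under normal sampling and structural models:
    weighted average of the direct survey estimate and the synthetic
    regression estimate beta * x."""
    alpha = V_struct / (V_samp + V_struct)
    return alpha * y_hat + (1 - alpha) * beta * x

# One area: direct estimate 390, covariate 259, beta fitted elsewhere
# (all values hypothetical).
pred = fh_predict(y_hat=390, x=259, beta=1.2, V_samp=900, V_struct=2500)
print(round(pred, 1))  # 369.0
```

When the structural model fits well (small V_struct), the predictor shrinks strongly toward the regression estimate; when the direct estimate is precise (small V_samp), it stays close to the survey value.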
44 / 50
?? ??
?? ?? (??: ?? )
Region    Ŷ_i      X_i      α̂_i      Prediction    Relative MSE (%)
??        5,953    3,589    0.993    5,936         99.6
??        390      259      0.755    358           87.4
??        35       64       0.663    45            82.1
??        354      969      0.978    367           99.0
??        18       22       0.354    21            59.5
??        33       23       0.222    26            47.1
??        0        25       0.000    25
??        624      883      0.958    635           97.9
??        228      93       0.392    146           62.6
??        13       91       0.904    21            95.1
??        38       57       0.604    45            77.7
??        56       36       0.286    42            53.5
??        44       81       0.712    54            84.4
??        60       61       0.524    60            72.4
??        44       60       0.582    51            76.3
??        2,818    1,458    0.953    2,754         97.7
Relative MSE: the MSE of the combined prediction relative to the MSE of the
direct survey estimate.
45 / 50
??
??: 1. ????? ??
????? ?? (????)
?? ?? ??? ??
????, ????, ?? ??
?? ???? ?? ??? ?? (??? ??)
????? ?? (????)
?? ?? (?? ??, ?? ??)
??? ??? ???? ??
47 / 50
??
??: 2. ????? ??
????? ?? - ??? (??)
????? ?? ??? data integration ?? ?? ??
????? ?? ??? calibration study ? ???? ?? ??
????? ??? ??? ??? ??? ?? ??? ??? ?? ? ???
?? ?? ??? ? ??? ???.
48 / 50
??
Take-home message: ????? ?? ?? ???
????? ?? ?? ??? ??? ????.
49 / 50
??
The end
50 / 50
