1. Inferences Based on a Single Sample:
Point and Interval Estimation
Sutikno
sutikno@statistika.itd.ac.id
2. Illustration
1. Suppose we want to know the monthly spending of
first-semester students in 2024.
2. What is the average per capita income of the population of
Indonesia?
3. How many defective items does an industrial process produce
in one month?
4. What is the average daily temperature in the city of Surabaya?
5. What is the average monthly rainfall in Lamongan?
6. .
5. Target Parameter
The unknown population parameter (e.g., mean or
proportion) that we are interested in estimating is called
the target parameter.
6. Target Parameter
Determining the Target Parameter
Parameter | Key Words or Phrase | Type of Data
μ | Mean; average | Quantitative
p | Proportion; percentage; fraction; rate | Qualitative
7. Point Estimator
A point estimator of a population parameter is a rule
or formula that tells us how to use the sample data to
calculate a single number that can be used as an
estimate of the target parameter.
8. Point Estimation
1. Provides a single value
Based on observations from one sample
2. Gives no information about how close the
value is to the unknown population parameter
3. Example: Sample mean $\bar{x} = 3$ is the point
estimate of the unknown population mean
9. Interval Estimator
An interval estimator (or confidence interval) is a
formula that tells us how to use the sample data to
calculate an interval that estimates the target parameter.
10. Interval Estimation
1. Provides a range of values
Based on observations from one sample
2. Gives information about closeness to unknown
population parameter
Stated in terms of probability
Knowing exact closeness requires knowing unknown
population parameter
3. Example: Unknown population mean lies between 50
and 70 with 95% confidence
12. Estimation Process
[Diagram: The population mean, μ, is unknown. A random sample is drawn from the population and gives a sample mean of $\bar{x} = 50$. Conclusion: "I am 95% confident that μ is between 40 and 60."]
13. Key Elements of Interval Estimation
[Diagram: The sample statistic (point estimate) lies at the center of the confidence interval, which runs from the lower confidence limit to the upper confidence limit.]
A confidence interval provides a range of
plausible values for the population parameter.
14. Confidence Interval
According to the Central Limit Theorem, the sampling
distribution of the sample mean is approximately normal
for large samples. Let us calculate the interval estimator:
$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm \frac{1.96\,\sigma}{\sqrt{n}}$
That is, we form an interval from 1.96 standard
deviations below the sample mean to 1.96 standard
deviations above the mean. Prior to drawing the sample,
what are the chances that this interval will enclose μ, the
population mean?
15. Confidence Interval
If sample measurements yield a value of $\bar{x}$ that falls
between the two lines on either side of μ, then the
interval $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ will contain μ.
The area under the normal curve between these two
boundaries is exactly .95. Thus, the probability that a
randomly selected interval will contain μ is equal to .95.
16. Confidence Coefficient
The confidence coefficient is the probability that a
randomly selected confidence interval encloses the
population parameter; that is, the relative frequency
with which similarly constructed intervals enclose the
population parameter when the estimator is used
repeatedly a very large number of times. The confidence
level is the confidence coefficient expressed as a
percentage.
17. 95% Confidence Level
If our confidence level is 95%, then in the long run,
95% of our confidence intervals will contain μ and 5%
will not.
For a confidence coefficient of 95%, the area in the two
tails is .05. To choose a different confidence coefficient
we increase or decrease the area (call it α) assigned
to the tails. If we place α/2 in each tail and $z_{\alpha/2}$
is the z-value, the confidence interval with
confidence coefficient (1 − α) is
$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$.
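The long-run reading of the confidence level can be checked with a small simulation. The sketch below is only an illustration: the population values (μ = 50, σ = 10), the sample size n = 36, the number of trials, and the function name `coverage_rate` are all assumptions chosen for the demonstration, and σ is treated as known.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(mu, sigma, n, confidence=0.95, trials=2000, seed=1):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for 95%
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one random sample from a normal population (assumed for the demo).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate(mu=50, sigma=10, n=36)
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to .95; each individual interval either contains μ or does not, and only the long-run proportion is controlled.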
18. Conditions Required for a Valid Large-Sample
Confidence Interval for μ
1. A random sample is selected from the target
population.
2. The sample size n is large (i.e., n ≥ 30). Due to the
Central Limit Theorem, this condition guarantees
that the sampling distribution of $\bar{x}$ is approximately
normal. Also, for large n, s will be a good estimator
of σ.
19. Large-Sample 100(1 − α)% Confidence Interval for μ
$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
where $z_{\alpha/2}$ is the z-value with an area α/2 to its right and
$\sigma_{\bar{x}} = \sigma/\sqrt{n}$. The parameter σ is the standard deviation of
the sampled population, and n is the sample size.
Note: When σ is unknown and n is large (n ≥ 30), the
confidence interval is approximately equal to
$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$
where s is the sample standard deviation.
20. Thinking Challenge
You're a Q/C inspector for Gallo.
The σ for 2-liter bottles is .05
liters. A random sample of 100
bottles showed $\bar{x}$ = 1.99 liters.
What is the 90% confidence
interval estimate of the true
mean amount in 2-liter bottles?
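One way to work this challenge is to apply the large-sample formula $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ directly. The following is a minimal sketch, not a reference solution; the helper name `z_interval` is an assumption, and the z-value comes from Python's standard-library `NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Large-sample interval x_bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.645 for 90% confidence
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Values taken from the bottle example: sigma = .05, n = 100, x_bar = 1.99.
lower, upper = z_interval(x_bar=1.99, sigma=0.05, n=100, confidence=0.90)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

With these inputs the interval runs from about 1.982 to about 1.998 liters, so 2 liters lies just outside it.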
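23. Small Sample, σ Unknown
Instead of using the standard normal statistic
$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
use the t-statistic
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
in which the sample standard deviation, s, replaces the
population standard deviation, σ.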
24. Student's t-Statistic
The t-statistic has a sampling distribution very much like
that of the z-statistic: mound-shaped, symmetric, with
mean 0.
The primary
difference between
the sampling
distributions of t and
z is that the t-statistic
is more variable than
the z-statistic.
25. Degrees of Freedom
The actual amount of variability in the sampling
distribution of t depends on the sample size n. A
convenient way of expressing this dependence is to say
that the t-statistic has (n − 1) degrees of freedom (df).
28. t-value
If we want the t-value with an area of .025 to its right
and 4 df, we look in the table under the column $t_{.025}$ for
the entry in the row corresponding to 4 df. This entry is
$t_{.025} = 2.776$. The corresponding standard normal z-score
is $z_{.025} = 1.96$.
30. Conditions Required for a
Valid Small-Sample
Confidence Interval for μ
1. A random sample is selected from the target
population.
2. The population has a relative frequency
distribution that is approximately normal.
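31. Estimation Example
Mean (σ Unknown)
A random sample of n = 25 has $\bar{x}$ = 50 and s = 8.
Set up a 95% confidence interval estimate for μ.
$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$
$50 - 2.064\frac{8}{\sqrt{25}} \le \mu \le 50 + 2.064\frac{8}{\sqrt{25}}$
$46.70 \le \mu \le 53.30$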
32. Thinking Challenge
© 2011 Pearson Education, Inc
You're a time study analyst in
manufacturing. You've
recorded the following task
times (min.):
3.6, 4.2, 4.0, 3.5, 3.8, 3.1.
What is the 90% confidence
interval estimate of the
population mean task time?
33. Confidence Interval Solution*
$\bar{x} = 3.7$
$s = .38987$
$n = 6$, $df = n - 1 = 6 - 1 = 5$
$t_{.05} = 2.015$
$3.7 - 2.015\frac{.38987}{\sqrt{6}} \le \mu \le 3.7 + 2.015\frac{.38987}{\sqrt{6}}$
$3.379 \le \mu \le 4.021$
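The small-sample interval can be recomputed straight from the six task times, which is a useful check on the arithmetic. This is a sketch under two assumptions: the critical value $t_{.05} = 2.015$ for 5 df is typed in from the t-table rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Task times (min.) from the time study example.
times = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]

n = len(times)
x_bar = mean(times)      # sample mean
s = stdev(times)         # sample standard deviation (divisor n - 1)
t_05 = 2.015             # table value: area .05 in the upper tail, df = n - 1 = 5

half_width = t_05 * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"x_bar = {x_bar:.2f}, s = {s:.5f}")
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

From the data, s is about .38987 and the 90% interval is roughly 3.379 to 4.021 minutes.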
35. Sampling Distribution of $\hat{p}$
1. The mean of the sampling distribution of $\hat{p}$ is p; that
is, $\hat{p}$ is an unbiased estimator of p.
2. The standard deviation of the sampling distribution
of $\hat{p}$ is $\sqrt{pq/n}$; that is, $\sigma_{\hat{p}} = \sqrt{pq/n}$, where q = 1 − p.
3. For large samples, the sampling distribution of $\hat{p}$ is
approximately normal. A sample size is considered
large if both $n\hat{p} \ge 15$ and $n\hat{q} \ge 15$.
37. Conditions Required for a
Valid Large-Sample
Confidence Interval for p
1. A random sample is selected from the target population.
2. The sample size n is large. (This condition will be
satisfied if both $n\hat{p} \ge 15$ and $n\hat{q} \ge 15$. Note that $n\hat{p}$ and
$n\hat{q}$ are simply the number of successes and number of
failures, respectively, in the sample.)
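38. Estimation Example
Proportion
A random sample of 400 graduates showed 32
went to graduate school. Set up a 95%
confidence interval estimate for p.
$\hat{p} = \frac{32}{400} = 0.08$
$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
$.08 - 1.96\sqrt{\frac{(.08)(.92)}{400}} \le p \le .08 + 1.96\sqrt{\frac{(.08)(.92)}{400}}$
$.053 \le p \le .107$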
39. Thinking Challenge
You're a production manager
for a newspaper. You want to
find the % defective. Of 200
newspapers, 35 had defects.
What is the 90% confidence
interval estimate of the
population proportion
defective?
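A possible way to work this challenge is to check the large-sample condition first and then apply $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$. The sketch below assumes the helper name `proportion_interval` and takes the z-value from the standard-library `NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample interval p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Large-sample condition from the slides: both counts must be at least 15.
    if successes < 15 or n - successes < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


# Newspaper example: 35 defective out of 200, 90% confidence.
lower, upper = proportion_interval(successes=35, n=200, confidence=0.90)
print(f"{lower:.3f} <= p <= {upper:.3f}")
```

Here $\hat{p}$ = 35/200 = .175, and the interval comes out at roughly .131 to .219.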
41. Adjusted (1 − α)100% Confidence Interval for a
Population Proportion, p
$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$
where $\tilde{p} = \frac{x + 2}{n + 4}$ is the adjusted sample proportion of
observations with the characteristic of interest, x is the
number of successes in the sample, and n is the sample
size.
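To see how the adjustment changes the numbers, the sketch below applies the adjusted formula to the earlier sample of 32 graduates out of 400. That choice of data is purely illustrative (the slides do not work this case), and the function name `adjusted_interval` is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_interval(x, n, confidence):
    """Adjusted interval p_tilde +/- z_{alpha/2} * sqrt(p_tilde*(1 - p_tilde)/(n + 4))."""
    p_tilde = (x + 2) / (n + 4)               # adjusted sample proportion
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - half_width, p_tilde + half_width


# Illustrative inputs: x = 32 successes in n = 400 observations, 95% confidence.
lower, upper = adjusted_interval(x=32, n=400, confidence=0.95)
print(f"{lower:.3f} <= p <= {upper:.3f}")
```

For a sample this large the adjusted limits (about .057 and .111) sit close to the unadjusted ones; the difference grows when n is small or the proportion is near 0 or 1.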
43. Sampling Error
In general, we express the reliability associated with a
confidence interval for the population mean μ by
specifying the sampling error within which we want to
estimate μ with 100(1 − α)% confidence. The sampling
error (denoted SE), then, is equal to the half-width of the
confidence interval.
44. Sample Size Determination
for 100(1 − α)% Confidence
Interval for μ
In order to estimate μ with a sampling error (SE) and
with 100(1 − α)% confidence, the required sample size
is found as follows:
$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = SE$
The solution for n is given by the equation
$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$
45. Sample Size Example
What sample size is needed to be 90% confident
the mean is within ±5? A pilot study
suggested that the standard deviation is 45.
$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.645)^2 (45)^2}{(5)^2} = 219.2 \approx 220$
46. Sample Size Determination
for 100(1 − α)% Confidence
Interval for p
In order to estimate p with a sampling error SE and with
100(1 − α)% confidence, the required sample size is
found by solving the following equation for n:
$z_{\alpha/2}\sqrt{\frac{pq}{n}} = SE$
The solution for n can be written as follows:
$n = \frac{(z_{\alpha/2})^2\,pq}{(SE)^2}$
Note: Always round n up to the nearest integer value.
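47. Sample Size Example
What sample size is needed to estimate p
with 90% confidence and an interval width of .03?
The calculation uses p = q = .5.
$SE = \frac{\text{width}}{2} = \frac{.03}{2} = .015$
$n = \frac{(z_{\alpha/2})^2\,pq}{(SE)^2} = \frac{(1.645)^2 (.5)(.5)}{(.015)^2} = 3006.69 \approx 3007$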
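The sample-size formula for a proportion is easy to wrap in a small function. The sketch below is one possible version: the name `required_n_proportion` is an assumption, the default p = .5 mirrors the value used in the example, and the z-value is computed with `NormalDist` (1.6449 rather than the rounded table value 1.645).

```python
from math import ceil
from statistics import NormalDist


def required_n_proportion(se, confidence, p=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p*q/n) <= se, rounded up."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    q = 1 - p
    return ceil(z ** 2 * p * q / se ** 2)


# Example values: total width .03, so SE = .03 / 2 = .015, at 90% confidence.
n_needed = required_n_proportion(se=0.03 / 2, confidence=0.90)
print(n_needed)
```

Using p = .5 maximizes pq, so the resulting n is large enough whatever the true proportion turns out to be.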
48. Thinking Challenge
You work in Human Resources
at Merrill Lynch. You plan to
survey employees to find their
average medical expenses. You
want to be 95% confident that
the sample mean is within ± $50.
A pilot study showed that σ was
about $400. What sample size
do you use?
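This challenge can be worked with $n = (z_{\alpha/2})^2\sigma^2/(SE)^2$, rounding up. The following is a hedged sketch; the function name `required_n_mean` is an assumption, and σ = 400 is the pilot-study figure, itself only an estimate.

```python
from math import ceil
from statistics import NormalDist


def required_n_mean(sigma, se, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= se, rounded up."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95% confidence
    return ceil((z * sigma / se) ** 2)


# Medical expenses example: sigma about $400, SE = $50, 95% confidence.
n_needed = required_n_mean(sigma=400, se=50, confidence=0.95)
print(n_needed)
```

With these inputs the formula gives about 245.9, so a sample of 246 employees would be used.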
51. Finite Population Correction Factor
In some sampling situations, the sample size n may
represent 5% or perhaps 10% of the total number N of
sampling units in the population. When the sample size
is large relative to the number of measurements in the
population (see the next slide), the standard errors of the
estimators of μ and p should be multiplied by a finite
population correction factor.
52. Rule of Thumb for Finite
Population Correction Factor
Use the finite population correction factor when
n/N > .05.
53. Simple Random Sampling with
Finite Population of Size N
Estimation of the Population Mean
Estimated standard error:
$\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N}}$
Approximate 95% confidence interval:
$\bar{x} \pm 2\hat{\sigma}_{\bar{x}}$
54. Simple Random Sampling with
Finite Population of Size N
Estimation of the Population Proportion
Estimated standard error:
$\hat{\sigma}_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\sqrt{\frac{N - n}{N}}$
Approximate 95% confidence interval:
$\hat{p} \pm 2\hat{\sigma}_{\hat{p}}$
55. Finite Population Correction
Factor Example
You want to estimate a population mean, μ, where
$\bar{x}$ = 115, s = 18, N = 700, and n = 60. Find an approximate
95% confidence interval for μ.
Since $\frac{n}{N} = \frac{60}{700} = .086$ is greater than .05, use the finite population correction factor.
56. Finite Population Correction
Factor Example
You want to estimate a population mean, μ, where
$\bar{x}$ = 115, s = 18, N = 700, and n = 60. Find an approximate
95% confidence interval for μ.
$\bar{x} \pm 2\frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N}} = 115 \pm 2\frac{18}{\sqrt{60}}\sqrt{\frac{700 - 60}{700}} = 115 \pm 4.4$
(110.6, 119.4)
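The rule of thumb and the corrected standard error can be combined in one small routine. This is a minimal sketch with an assumed function name, `fpc_mean_interval`; it applies the correction only when n/N > .05 and uses the approximate multiplier 2 from the slides instead of 1.96.

```python
from math import sqrt


def fpc_mean_interval(x_bar, s, n, N):
    """Approximate 95% interval x_bar +/- 2 * estimated standard error."""
    se = s / sqrt(n)
    if n / N > 0.05:
        # Sample is large relative to the population: apply the correction factor.
        se *= sqrt((N - n) / N)
    return x_bar - 2 * se, x_bar + 2 * se


# Example values: x_bar = 115, s = 18, N = 700, n = 60 (n/N is about .086).
lower, upper = fpc_mean_interval(x_bar=115, s=18, n=60, N=700)
print(f"({lower:.1f}, {upper:.1f})")
```

Without the correction the half-width would be about 4.6 instead of 4.4, so the correction narrows the interval slightly.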
58. Simple Random Sample
If n elements are selected from a population in
such a way that every set of n elements in the
population has an equal probability of being
selected, the n elements are said to be a simple
random sample.
59. Stratified Random Sampling
Stratified random sampling is used when the
sampling units (i.e., the units that are sampled)
associated with the population can be physically
separated into two or more groups of sampling
units (called strata) where the within-stratum
response variation is less than the variation
within the entire population.
60. Systematic Sample
Sometimes it is difficult or too costly to select
random samples. For example, it would be easier
to obtain a sample of student opinions at a large
university by systematically selecting every
hundredth name from the student directory. This
type of sample design is called a systematic
sample. Although systematic samples are usually
easier to select than other types of samples, one
difficulty is the possibility of a systematic
sampling bias.
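The difference between a simple random sample and a systematic sample can be sketched in a few lines. Everything here is hypothetical: the directory of 1,000 generated names, the sample size of 10, and the fixed seed are assumptions made only so the example runs on its own.

```python
import random

rng = random.Random(0)   # fixed seed so the example is repeatable

# Hypothetical student directory with 1,000 entries.
directory = [f"student_{i:04d}" for i in range(1, 1001)]

# Simple random sample: every set of 10 names is equally likely.
simple_random = rng.sample(directory, 10)

# Systematic sample: random starting point, then every hundredth name.
step = 100
start = rng.randrange(step)
systematic = directory[start::step]

print(simple_random)
print(systematic)
```

If the directory has a pattern that repeats every hundred entries, the systematic sample inherits it, which is the kind of systematic sampling bias the slide warns about.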
61. Randomized Response Sampling
Randomized response sampling is particularly
useful when the questions of the pollsters are
likely to elicit false answers. One method of
coping with the false responses produced by
sensitive questions is randomized response
sampling. Each person is presented two
questions; one question is the object of the
survey, and the other is an innocuous question to
which the interviewee will give an honest
answer.
62. Key Ideas
Population Parameters, Estimators, and
Standard Errors
Parameter | Estimator | Standard Error of Estimator | Estimated Std Error
Mean, μ | $\bar{x}$ | $\sigma/\sqrt{n}$ | $s/\sqrt{n}$
Proportion, p | $\hat{p}$ | $\sqrt{pq/n}$ | $\sqrt{\hat{p}\hat{q}/n}$
63. Key Ideas
Population Parameters, Estimators, and
Standard Errors
Confidence Interval: An interval that encloses an
unknown population parameter with a certain level of
confidence (1 − α)
Confidence Coefficient: The probability (1 − α) that a
randomly selected confidence interval encloses the true
value of the population parameter.
64. Key Ideas
Key Words for Identifying the Target
Parameter
μ Mean, Average
p Proportion, Fraction, Percentage, Rate, Probability
65. Key Ideas
Sample Survey Designs
1. Simple random sampling
2. Stratified random sampling
3. Systematic sampling
4. Randomized response sampling