Statistical programming in R
Part 2
Topics covered
Point estimation
Confidence interval
Bootstrap
Point estimation
Point estimation
One of the main goals of statistics is to estimate unknown parameters.
To approximate these parameters, we choose an estimator, which is simply any function of randomly sampled observations.
Point estimation
 To illustrate this idea, we will estimate the value of π by uniformly dropping samples on a square containing an inscribed circle. Notice that the value of π can be expressed as a ratio of areas: the area of the circle divided by the area of the square is π/4.
Point estimation
 We can estimate this ratio with our samples. Let m be the number of samples within our circle and n the total number of samples dropped. We define our estimator π̂ as: π̂ = 4m / n
 https://seeing-theory.brown.edu/basic-probability/index.html#section1
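The estimator described above can be sketched in R. This is a minimal illustration, not the code behind the linked visualisation: the function name estimate_pi, the seed, and the choice of the square [-1, 1] x [-1, 1] with a unit circle inside it are all assumptions made for the example. Because the circle covers π/4 of the square, the estimate is 4 times the fraction of points that land inside the circle.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# Hypothetical helper: Monte Carlo estimate of pi from n uniform points
estimate_pi <- function(n) {
  # Drop n points uniformly on the square [-1, 1] x [-1, 1]
  x <- runif(n, min = -1, max = 1)
  y <- runif(n, min = -1, max = 1)
  # m counts the points that land inside the inscribed unit circle
  m <- sum(x^2 + y^2 <= 1)
  # circle area / square area = pi / 4, so pi is estimated by 4 * m / n
  4 * m / n
}

pi_hat <- estimate_pi(100000)
pi_hat
```

Calling estimate_pi with a larger n should, on average, give a value closer to π, although any single run can still miss by chance.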
Point estimation in R
We may also sample from a reference population and see if our sample mean is a good approximation of the population mean.
In this case, we say that the sample mean is a point estimator of the true population mean.
Let us draw a sample of 10 observations from a normal distribution with a mean of 0 and a standard deviation of 1:
sample <- rnorm(10, 0, 1)
mean(sample)
Is this sample mean close to the true population mean?
Now, try with 100, 1,000 and 10,000 observations.
Are we getting closer?
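One possible way to run this comparison in a single step is sketched below. The seed and the variable names sizes and sample_means are assumptions for illustration. Since the standard error of the sample mean is sd / sqrt(n), the error tends to shrink as n grows, but a single run is not guaranteed to improve at every step.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

sizes <- c(10, 100, 1000, 10000)

# One sample mean per sample size, each drawn from N(0, 1)
sample_means <- sapply(sizes, function(n) mean(rnorm(n, mean = 0, sd = 1)))

# Absolute distance of each sample mean from the true mean of 0
data.frame(n = sizes, sample_mean = sample_means, abs_error = abs(sample_means))
```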
Confidence interval
Confidence interval
Another way of estimating a population parameter is to define a range of plausible values instead of a single point estimate.
This range is an interval, and it is associated with a confidence level.
The confidence level is the probability, taken over repeated samples, that an interval constructed this way will contain the true population parameter.
The range itself is known formally as the confidence interval (CI).
 https://seeing-theory.brown.edu/basic-probability/index.html#section1
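One way to see what a 95% confidence level means is to repeat an experiment many times and count how often the interval contains the true mean. The sketch below assumes a standard normal population, 30 observations per experiment, and 2000 repetitions; these numbers and the variable names are illustrative choices, not part of the original slides.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

true_mean <- 0
n_experiments <- 2000  # hypothetical number of repeated experiments

# For each experiment: draw 30 observations, build a 95% CI, and
# record whether the interval contains the true mean
covered <- replicate(n_experiments, {
  ci <- t.test(rnorm(30, mean = true_mean, sd = 1), conf.level = 0.95)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

# Fraction of intervals that contain the true mean
coverage <- mean(covered)
coverage
```

The resulting fraction should be close to 0.95, which is the long-run behaviour the confidence level describes.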
Confidence interval in R
We will make some assumptions about what we might find in an experiment and find the resulting confidence interval, assuming the data come from a normal distribution.
In this example we use a 95% confidence level. Because the sample is small and the population standard deviation is unknown, t.test() bases the interval on Student's t distribution.
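Confidence interval in R
x <- c(9.0, 9.5, 9.6, 10.2, 11.6)
t.test(x)              # prints the full test summary, including the 95% CI
outcome <- t.test(x)   # store the result to access its components
outcome$conf.int       # lower and upper bounds of the 95% CI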
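To show where these bounds come from, the same interval can be computed by hand as mean ± t critical value × standard error. This is a minimal sketch; the variable names std_error, t_crit and manual_ci are assumptions for illustration.

```r
x <- c(9.0, 9.5, 9.6, 10.2, 11.6)

n <- length(x)
std_error <- sd(x) / sqrt(n)       # standard error of the mean
t_crit <- qt(0.975, df = n - 1)    # two-sided 95% critical value, n - 1 degrees of freedom

# Lower and upper bounds of the 95% confidence interval
manual_ci <- mean(x) + c(-1, 1) * t_crit * std_error
manual_ci
```

The two numbers in manual_ci should agree with the bounds reported by t.test(x)$conf.int.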
Bootstrap
Bootstrap
The computational technique known as the bootstrap provides a convenient way to estimate properties of an estimator via resampling.
In this example, we resample with replacement from the empirical distribution function in order to estimate the standard error of the sample mean.
 https://seeing-theory.brown.edu/basic-probability/index.html#section1
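The resampling idea can be sketched as follows. The observed sample here is simulated with rnorm only so that the example is self-contained; the sample size of 50, the 2000 resamples and the variable names are assumptions for illustration. Each resample draws from the observed data with replacement, and the spread of the resampled means is the bootstrap estimate of the standard error.

```r
set.seed(2024)  # arbitrary seed, only for reproducibility

# A single observed sample; in practice this would be real data
observed <- rnorm(50, mean = 10, sd = 1)

# Resample the observed data with replacement B times and
# record the mean of each resample
B <- 2000  # hypothetical number of bootstrap resamples
boot_means <- replicate(B, mean(sample(observed, replace = TRUE)))

boot_se <- sd(boot_means)                            # bootstrap standard error
formula_se <- sd(observed) / sqrt(length(observed))  # textbook standard error
c(bootstrap = boot_se, formula = formula_se)
```

For the sample mean the two values should be close, which makes this a convenient check before applying the same procedure to estimators that have no simple formula.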
Bootstrap in R
 Using rnorm and possibly a loop or two, write a function called my_firstbootstrap(n, m, x, y), where n is the sample size and m is the number of resamples. x and y are the mean and standard deviation of a normal distribution.
 Devise a way to evaluate the bootstrap outcomes as you increase the sample size and the number of resamples.
 Hint: you can compare the mean of the sample means with the true population mean.
Bootstrap in R
 my_firstbootstrap(10,100,10,1)
 my_firstbootstrap(100,100,10,1)
 my_firstbootstrap(1000,100,10,1)
 my_firstbootstrap(10,500,10,1)
 my_firstbootstrap(10,1000,10,1)
 my_firstbootstrap(10,2000,10,1)
 my_firstbootstrap(10,3000,10,1)
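One possible way to evaluate these calls side by side is to record the absolute error against the true mean of 10 for a grid of settings. The sketch below defines my_firstbootstrap with replicate instead of a loop; it is meant to behave like the definition given in the editor's notes, and the grid of n and m values is a hypothetical choice. Note that this function draws fresh samples from the normal distribution each time rather than resampling one observed sample.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

# Mean of m sample means, each computed from n draws of N(x, y)
my_firstbootstrap <- function(n, m, x, y) {
  mean(replicate(m, mean(rnorm(n, mean = x, sd = y))))
}

# Hypothetical grid of settings to compare
settings <- expand.grid(n = c(10, 100, 1000), m = c(100, 1000))

# Absolute error of each outcome against the true mean of 10
settings$abs_error <- mapply(
  function(n, m) abs(my_firstbootstrap(n, m, 10, 1) - 10),
  settings$n, settings$m
)
settings
```

Because the outcome is an average of n × m independent draws, its standard deviation is y / sqrt(n × m), so increasing either n or m tends to reduce the error.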
End of segment. Let's take a break.


Editor's Notes

  1. https://seeing-theory.brown.edu/basic-probability/index.html#section1
  2. https://seeing-theory.brown.edu/basic-probability/index.html#section1
  3. https://seeing-theory.brown.edu/basic-probability/index.html#section1
  4. my_firstbootstrap <- function(n, m, x, y) {
       sampling_means <- c()
       for (i in 1:m) {
         sampling_means <- append(sampling_means, mean(rnorm(n, x, y)))
       }
       return(mean(sampling_means))
     }