This document provides an introduction to R for econometric analysis. It discusses installing R and RStudio, why R is useful for data science and analysis tasks, and basic R concepts such as variables, functions, modules, and packages. It also demonstrates R code examples and covers flow control, objects, operators, sampling, common functions, and basic graphics. The document is intended to give students a quick start in learning the R programming language.
6. Software used in data analysis competitions in 2011.
Sources: http://r4stats.com/articles/popularity/
http://blog.revolutionanalytics.com/2012/08/r-language-popularity-for-data-mining.html
17. Variable
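# R code
# vector
a = c(1, 2, 3, 4)               # numeric vector
b = c("1", "2", "3", "4")       # string (character) vector
c = c(TRUE, FALSE, TRUE, TRUE)  # boolean (logical) vector
# matrix
d = matrix(a, nrow = 2, ncol = 2)  # 2x2 matrix, filled column by column
dim(a) = c(2, 2)                   # alternatively, reshape a itself into a 2x2 matrix
# data.frame
e = data.frame(string = b, boolean = c)  # it can store columns of different types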
[Figure: a numeric vector (1, 2, 3, 4), a 2x2 numeric matrix with columns (1, 2) and (3, 4), and a data.frame pairing a string column with a boolean column]
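The three structures above can be inspected with a few base R functions. The following is a minimal sketch; the variable names (nums, strs, flags, m, df) are chosen for illustration and do not come from the slides.

```r
# Illustrative names; base R only.
nums  = c(1, 2, 3, 4)                  # numeric vector
strs  = c("1", "2", "3", "4")          # character vector
flags = c(TRUE, FALSE, TRUE, TRUE)     # logical vector

m  = matrix(nums, nrow = 2, ncol = 2)  # filled column by column
df = data.frame(string = strs, flag = flags)

class(nums)        # "numeric"
class(strs)        # "character"
class(flags)       # "logical"
dim(m)             # 2 2
m[1, 2]            # 3, because the matrix is filled by column
sapply(df, class)  # one class per column
```

A vector holds a single type, while each column of a data.frame can have its own type.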
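18. Function
A function is like a collection of computations: it takes an input, does some computation, and returns an output.
Example: the function length takes the input [1,2,3,4] and returns 4.
# R code
a = c(1,2,3,4)       # input
result = length(a)   # number of elements in a
result               # 4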
19. Function
Example: the function mean takes the input [1,2,3,4], does some computation, and returns the output 2.5.
A function can be built-in, self-defined, or provided by a package.
# Built-in
data = 1:4
output = mean(data)
data
# Self-defined
MyMean = function(data){
  total = sum(data)
  len = length(data)
  result = total / len
  return(result)
}
data = 1:4
output = MyMean(data)
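A self-defined function can take optional arguments and can be checked against the built-in equivalent. The sketch below is illustrative only: my_mean and its na_rm argument are hypothetical names, not part of the slides.

```r
# Hypothetical variant of MyMean with an optional argument.
my_mean = function(data, na_rm = FALSE) {
  if (na_rm) {
    data = data[!is.na(data)]   # drop missing values before averaging
  }
  sum(data) / length(data)      # the last expression is the return value
}

x = c(1, 2, 3, 4, NA)
my_mean(x)                 # NA, because sum() propagates the missing value
my_mean(x, na_rm = TRUE)   # 2.5
all.equal(my_mean(1:4), mean(1:4))  # TRUE
```

Comparing against mean() is a quick sanity check that the self-defined version behaves as intended.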
20. Module
A module is like a collection of functions, usually kept together in one R script file.
[example] data_preprocess.R
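One common way to use such a script is to load it with source(), which runs the file and defines its functions in the current session. The sketch below writes a small stand-in file to a temporary location so that it is self-contained; the functions center and rescale are hypothetical and are not taken from data_preprocess.R.

```r
# Write a small "module" to a temporary file so the example is self-contained.
# In a real project this would be an existing script in your working directory.
module_path = tempfile(fileext = ".R")
writeLines(c(
  "center = function(x) x - mean(x)",
  "rescale = function(x) (x - min(x)) / (max(x) - min(x))"
), module_path)

source(module_path)   # runs the file, defining center() and rescale()

center(c(1, 2, 3))    # -1 0 1
rescale(c(2, 4, 6))   # 0.0 0.5 1.0
```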
21. Package
A package is like a collection of modules.
You can extend R's built-in functions by installing packages.
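22. Package
How to use packages?
# R code
x = 1:10 # define x
y = sin(3*x) # define y
plot(x,y) # draw with R's built-in plotting function
# want a nicer-looking plot...
install.packages("ggplot2") # download and install ggplot2 from the official repository
# the quotation marks are required
library(ggplot2) # load the package so its functions can be used
qplot(x,y) # use the function qplot provided by ggplot2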
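Installing a package is only needed once, so scripts often check whether it is already available before installing. A minimal sketch, assuming a helper named has_package (a hypothetical name), and using the stats package, which ships with R, so that no network access is needed:

```r
# Hypothetical helper: TRUE if the package can be loaded, FALSE otherwise.
has_package = function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

has_package("stats")   # TRUE: stats ships with R

# Install only when missing (commented out because it needs network access):
# if (!has_package("ggplot2")) install.packages("ggplot2")

# pkg::fun calls a function without attaching the whole package.
stats::median(c(1, 3, 10))   # 3
```

The pkg::fun form also makes it explicit which package a function comes from.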
24. Flow Control
#1 if
if (expression){
statement
}
# R code
data = rnorm(100) # draw 100 samples from the standard normal distribution
mu = mean(data)
mu > 0
if (mu > 0){
  print("mean is greater than 0")
} else {
  print("mean is not greater than 0")
}
If the expression evaluates to TRUE, the statement inside the braces is executed.
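When there are more than two cases, the branches can be chained with else if. The sketch below is illustrative; describe_mean is a hypothetical helper. It also shows ifelse(), the vectorized counterpart of if.

```r
# Hypothetical helper that classifies the sign of a sample mean.
describe_mean = function(mu) {
  if (mu > 0) {
    "positive"
  } else if (mu < 0) {
    "negative"
  } else {
    "zero"
  }
}

describe_mean(0.3)    # "positive"
describe_mean(-1.2)   # "negative"

# if() tests a single value; ifelse() applies the test to every element.
ifelse(c(-1, 0, 2) > 0, "positive", "not positive")
# "not positive" "not positive" "positive"
```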
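25. Flow Control
#2 while
while (expression){
  statement
}
# R code
data = rnorm(100) # draw 100 samples from the standard normal distribution
mu = mean(data)
mu > 0
while (mu > 0){
  print("mean is greater than 0")
  mu = mean(rnorm(100)) # redraw the sample; if mu is never updated, the loop runs forever
}
As long as the expression evaluates to TRUE, the statement is executed repeatedly.
26. Flow Control
#3 for
for (i in 1:3){
  statement(i)
}
# R code
for (i in 1:3){
  data = rnorm(i)
  print(data)
}
i = 1: the statement is executed once
i = 2: it is executed once more
i = 3: it is executed once more
The loop ends.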
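The two loop forms can be compared side by side. In this minimal sketch (variable names are illustrative), the for loop runs over a fixed set of values, while the while loop runs until its condition becomes FALSE, so the body has to update the variable being tested.

```r
# for: accumulate a running total over a fixed set of values.
values = c(2, 4, 6)
total = 0
for (v in values) {
  total = total + v
}
total                  # 12, same as sum(values)

# while: repeat until the condition fails.
x = 100
steps = 0
while (x > 1) {
  x = x / 2            # the update that eventually makes the condition FALSE
  steps = steps + 1
}
steps                  # 7, since 100 / 2^7 is below 1
```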