Introduction to Data Science with R
Sabar Suwarsono, S.Si
@soewarsono
...
Who am I?

Data scientist and R enthusiast

Member of Komunitas R Indonesia

Member of Komunitas GNU/Linux Malang
(KLiM)
Data Science

Data science is an interdisciplinary field, meaning that it is built
from several branches of knowledge.

According to Steven Geringer (Raleigh, 2014), the components of data
science can be illustrated with the following Venn diagram.

Data science covers a broad range of disciplines. According to the
diagram above, three disciplines focus on data science.
Machine Learning
Machine learning is the intersection of mathematics and statistics
with computer science. It is a branch of artificial intelligence (AI)
that aims to give computers the ability to learn.
Many machine learning algorithms are used to analyze data with high
accuracy; the most popular is the neural network. As we know, the
foundation of every algorithm is mathematics. One application of
machine learning is Cortana, better known as the assistant for
Windows 10 users.
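To make "learning from data" concrete, here is a minimal sketch in base R. It uses logistic regression as a simple stand-in (not the neural network mentioned above) and the built-in `mtcars` dataset; the choice of model, dataset, and the variable names `fit`, `new_cars`, and `prob_manual` are assumptions made for illustration only.

```r
# Learn from data: estimate how car weight relates to having a manual gearbox.
# mtcars ships with R; am = 1 means manual, wt is weight in 1000 lbs.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Use the fitted model to predict for cars it has not seen (hypothetical weights).
new_cars <- data.frame(wt = c(2, 4))
prob_manual <- predict(fit, newdata = new_cars, type = "response")

# Predicted probability of a manual gearbox for each hypothetical car.
print(round(prob_manual, 3))
```

The model is never told a rule such as "light cars tend to be manual"; it estimates that relationship from the examples it is given.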
Traditional Software
Traditional software is the intersection of computer science
with SME (Subject Matter Expertise). SME is knowledge of how
a business or institution operates, so that a system can be
developed to support that business or institution.
Traditional software is used by almost every government
institution and business; examples include e-learning,
e-library, online banking, and Point of Sale (PoS) systems.
Traditional Research
Traditional research is the intersection of
mathematics and statistics with SME
(Subject Matter Expertise). It is used in
almost all companies, institutions, and
universities. Most research is conducted
as traditional research.
What is a Data Scientist?

Based on the diagram, data science is the field that encompasses all of
these disciplines.

As the field developed, a person working in it came to be called a
data scientist.

However, the diagram above distinguishes between a data scientist and a
unicorn. In practice, it is very hard to find someone who is an expert
in all of these disciplines.

In the diagram, such a person is labeled a unicorn. A unicorn is
therefore someone who is perfect at data science.
Founded by Ross Ihaka &
Robert Gentleman
High level language
Interactive &
Programming
A Swiss Army knife for
statistical tests and
models, out of the box!
Download R
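As a rough illustration of the "out of the box" claim, the sketch below runs a statistical test and fits a model using only what ships with a fresh R installation. The datasets `sleep` and `mtcars` are built in; the choice of test and model, and the names `tt` and `fit`, are assumptions for the example.

```r
# Welch two-sample t-test: does extra sleep differ between the two drug groups?
tt <- t.test(extra ~ group, data = sleep)
print(tt$p.value)

# Linear model: how does fuel economy (mpg) change with weight (wt)?
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Interactive use: inspect the full model summary at the console.
summary(fit)
```

No packages need to be installed for any of these calls; `t.test()`, `lm()`, and `summary()` are part of the standard distribution.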
Changes in the realm of analytical software
1. Point and click software solutions (e.g. SPSS, SAS)
are limited
2. Software is becoming free in several areas (OS, free
APIs, applications, etc.)
3. Reproducible and transparent research movements
source: http://r4stats.com/articles/popularity/
Advantages of R
- Completely free
- Reproducibility
- The R community is very active and helpful (e.g. Stack Overflow)
- Evolving rapidly
- Several statistical procedures are first (or only) available in R
- Great tools for sharing results (make presentations, posters,
  notebooks, books, articles in R)
- You can do every step of a data analysis project within R, from
  collecting, transforming, and analyzing the data to plotting and
  even sharing the results.
- Version control via GitHub
source: http://blog.revolutionanalytics.com/2016/04/cran-package-growth.html
Disadvantages of R
- Can be difficult to learn
- Can be slow with huge datasets (we are talking about data tables with several million
  records)
- Best used in data science/analysis circles, not a generic language
- Obscure syntax (imo now resolved)
Reasons to learn R: get published
- R has the largest growth in analytical software in
  science
- Learning R can make you the stat/tech guy ->
  everybody will want to work with you -> lots of
  publications at least as a co-author
source: http://r4stats.com/articles/popularity/
Reasons to learn R: you can get a job
source: http://r4stats.com/articles/popularity/
Reasons to learn R: support and popularity
source: http://redmonk.com/sogrady/2015/07/01/language-rankings-6-15/
Why R and not another data science language
(+ = strengths, - = weaknesses)

R
+ Stats and research centric
+ Stunning visualizations
+ Data manipulation
+ Great community support
- Steep learning curve
- Obscure syntax

Python
+ Data manipulation
+ Easier to learn
+ Great community support
+ Generic language
- Stats not cutting edge
- Ecosystem a bit chaotic

Matlab
+ Mathematical capabilities
+ Toolboxes
+ Visualizations
- Cumbersome string data management
- Not open source
- Really expensive

Octave
+ Free Matlab
- Can't run Matlab toolboxes ¯\_(ツ)_/¯

Julia
+ Intuitive syntax (for mathematicians)
+ Lightning fast
- Underdeveloped
- Poor community support
Main features:

Console

Syntax-highlighting editor

Tools for plotting, history,
debugging and workspace
management
Download RStudio
Let's try it out!
Play with and set up RStudio
- use Projects, not setwd(...)
- use script, try to avoid console
- Ctrl+Shift+F10 and Ctrl+Alt+B, not rm(list=ls())
- Tab is your friend!
- learn the handy shortcuts
- do not save and load .RData
- set up the .Rprofile
- use git!
Download: git-scm.com/
Reading: happygitwithr.com
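Two of the tips above can be sketched in code. The snippet below shows what a small project-level `.Rprofile` might contain and how a path can be built relative to the project root instead of calling `setwd(...)`. The option values, the folder layout `data/raw/`, and the file name `survey.csv` are hypothetical and only illustrate the idea.

```r
# --- Possible contents of a project-level .Rprofile (assumed values) ---
options(
  repos = c(CRAN = "https://cloud.r-project.org"),  # default CRAN mirror
  warnPartialMatchDollar = TRUE                     # warn on partial $ matching
)

# --- In a script: build paths relative to the project root ---
# When the project is opened in RStudio, the working directory is the
# project folder, so no setwd(...) call is needed.
data_path <- file.path("data", "raw", "survey.csv")  # hypothetical file
print(data_path)
```

Because the path is relative, the same script keeps working when the project folder is moved or cloned with git onto another machine.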
Tidyverse?
Human thought Machine Language
Source: https://github.com/rstudio-education/arm-workshop-rsc2019
The tidyverse is an
opinionated collection of
R packages designed for
data science.
[Figure: data science activity. Import -> Tidy -> Transform, Visualise, Model (Understand) -> Communicate, all wrapped in Program]
How to install it?
install.packages("tidyverse")
https://tidyverse.org/
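Once the package is installed, a short pipeline gives a feel for the tidy, transform, and visualise steps. This is only a sketch: it assumes the tidyverse is available and uses the built-in `mtcars` data; the names `cars_summary` and `mean_mpg` are made up for the example.

```r
library(tidyverse)

# Transform: average fuel economy per number of cylinders.
cars_summary <- mtcars %>%
  as_tibble(rownames = "model") %>%   # tidy: keep car names as a column
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n()) %>%
  arrange(desc(mean_mpg))

print(cars_summary)

# Visualise: one bar per cylinder group.
p <- ggplot(cars_summary, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Mean miles per gallon")
print(p)
```

Each `%>%` step reads as "then", which is the point of the human-thought-to-code slides above: the code follows the order in which you would describe the analysis.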
Next? Have fun!
R for Data Science
(r4ds.had.co.nz)
Introduction to Statistical Learning
(www-bcf.usc.edu/~gareth/ISL/)
Online books
(bookdown.org)
Online course
(2 months of access at DataCamp >> my.visualstudio.com)
Need help?
install.packages("swirl")
Telegram:
@GNURIndonesia (t.me/GNURIndonesia)
Region Malang (t.me/RIndonesia_Malang)
Web:
https://r-indonesia.id/
GitHub:
www.github.com/indo-r
Indonesian R user community
soewarsono@klim.or.id
Telegram: @soewarsono
GitHub: @soewarsono
Thanks!