First steps with data science
Igor Leroy
@lerrua
About me
Developer since 2007
timeline: ActionScript, PHP, Java, C#, Rails and Python
Pythonista since 2011
Home office +4 years
Backend at SQUAD
github.com/lerrua
twitter.com/lerrua
Summary
What is a data scientist?
Stack!
Where to start?
Data scientist?
"Data scientists are inquisitive: exploring, asking questions, doing
“what if” analysis, questioning existing assumptions and processes"
https://www-01.ibm.com/software/data/infosphere/data-scientist/
First steps with Data Science
Image via Data Science London
we have data analysts
and data engineers
salary
http://www.payscale.com/research/US/Job=Data_Scientist,_IT/Salary
stack
useful Python libs
pandas and SFrame
scikit-learn
SciPy
NumPy
Matplotlib, Bokeh
Jupyter
OpenMining
Airbnb: Caravel
Rodeo IDE
Canopy IDE
Where to start?
pt-BR: http://www.lerrua.com/blog/2016/03/08/primeiros-passos-com-data-science/
en-US: http://www.lerrua.com/blog/2016/03/17/getting-started-with-data-science/
top free courses
Big Data Basics: Hadoop, MapReduce, Hive, Pig & Spark - https://www.udemy.com/big-data-basics-hadoop-mapreduce-hive-pig-spark
Intro to Data Analysis - https://www.udacity.com/courses/ud170
Intro to Data Science - https://www.udacity.com/courses/ud359
Intro to Statistics - https://www.udacity.com/courses/st101
Intro to Machine Learning - https://www.udacity.com/courses/ud120
references
https://becomingadatascientist.wordpress.com/2013/07/26/choosing-a-data-science-technology-stack-w-survey/
https://www.import.io/post/data-scientists-vs-data-analysts-why-the-distinction-matters/
http://blog.udacity.com/2014/12/data-analyst-vs-data-scientist-vs-data-engineer.html
http://www.mastersindatascience.org/careers/data-scientist/
questions?