The Top Mistakes
in a Data Science
Interview
(and ideas on how to avoid
them)
Omri Allouche,
Head of Research, Gong.io
Omri Allouche
Head of Research, Gong.io
Teacher, Bar Ilan University
PhD in Ecology
BSc in Biology and Cognitive Sciences
Gong.io at a Glance
 Help companies improve business
conversations
 Focus on sales teams
 Analyzes conversations and assesses
what works and what doesn't
 Raised over $28m from top investors
 Established leader in the space
 Leading customers: LinkedIn,
Pinterest, ZipRecruiter, Zenefits
[Figure: sales funnel with stages Qualified Leads (100%), Disco (75%), Trial (50%), Close (25%); annotated "25%" and "No one knows why"]
Not
understanding
what Data
Science is
(truth be told, nobody
understands it)
The top mistakes you're making in your Data Science interview - Omri Allouche
What's your Data Science?
 You need to know all of the above
 Ask the Team Leader to explain to you exactly what
the job includes
 What's her definition of Data Science?
 What would be your day-to-day responsibility?
 Will you be:
 writing code that goes to production?
 developing new algorithms?
 in charge of data collection?
 working on your projects alone? with other data
scientists?
Don't run away from your superpowers
Data science is a blend of different skills - you're bound to be better at some
than at others
Emphasize your strengths:
 Software engineering
 Data Analysis
 Research capabilities
 In-depth mathematical understanding
Optimizing the wrong loss function
Image: http://www.travel.ru/wow/rice_terraces.html
Don't get caught in a local minimum
You're not Kobe Bryant, so don't skip college!
 Work in a workplace that has people smarter than
you
 Work in a workplace that has people more
experienced than you
 Don't work as the first Data Scientist in a company
 Your goal is to learn, and become the best Data
Scientist around in a few years
 You must love what you do. Don't waste time in a
company that isn't right for you
Ignoring the
Data
in Data Science
"In God we trust; all others bring data."
"Without data you're just another person with an opinion."
- W. Edwards Deming
Stop running away from Data
 Learn to perform a meaningful analysis using the data alone
No classifier
Just visualizations and basic statistics
 Stop neglecting Unsupervised Learning: it's waaaaay cooler than
Supervised Learning
 Stop worrying that AutoML will replace you in 10 years
 Look for real ways to improve your model - running a grid search for
hyperparameter optimization doesn't count
 Learn to do a proper error analysis: when is your model wrong?
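One possible way to start an error analysis is to slice the errors by some property of the input and see where they concentrate. The sketch below is only an illustration: the `records` data, the `length` field, and the `error_rate_by_segment` helper are all hypothetical and not from the talk.

```python
from collections import defaultdict


def error_rate_by_segment(records, segment_key):
    """Return (segment, error_rate) pairs, worst segment first.

    Each record is a dict with "actual" and "predicted" labels plus
    whatever metadata field is named by segment_key.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for rec in records:
        segment = rec[segment_key]
        totals[segment] += 1
        if rec["predicted"] != rec["actual"]:
            errors[segment] += 1
    rates = {segment: errors[segment] / totals[segment] for segment in totals}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy predictions, tagged by input length (an assumed field).
records = [
    {"length": "short", "actual": 1, "predicted": 1},
    {"length": "short", "actual": 0, "predicted": 0},
    {"length": "short", "actual": 1, "predicted": 1},
    {"length": "long", "actual": 1, "predicted": 0},
    {"length": "long", "actual": 0, "predicted": 0},
    {"length": "long", "actual": 1, "predicted": 0},
]

report = error_rate_by_segment(records, "length")
for segment, rate in report:
    print(f"{segment}: {rate:.0%} wrong")
```

In this toy data all the mistakes fall on the long inputs, which is the kind of finding that suggests what to fix next, something a grid search would not tell you.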
Ignoring the
Science
in Data Science
Embrace the Scientific Method
The Scientific Method is key to success in Data Science projects
You try to study a phenomenon in the world
Your model is your hypothesis
In the scientific method, you're trying to reject your hypothesis using
experiments
In DS, you should look for ways to
find where your model is wrong,
and improve it iteratively
Running kNN with k=1
Neglecting your intuition
Looking for THE answer
 Many give as an answer the approach they took in a project that looks the
most similar
 Even more give as an answer what they saw others do in a similar scenario
 Interviewers want to take you away from these cases, to see what you'd do
when away from known solutions
 Data Science isn't just about learning many algorithms. It's knowing when
to use them, and how to use them creatively
 To be good at it, you need to work deliberately on building your intuition
Ideas for building an intuition
 When reading a paper about a new state-of-the-art method, you should
care equally (if not more) about the Strong Baseline this method beats.
 Teach yourself to say "I don't know, but I would try"
 Force yourself to suggest more than one solution per interview problem
 Pay attention to cases where your intuition misses
 Prepare for interviews with others: brainstorm
ideas
 Learn what the company you're interviewing for
does.
Think how you'd solve those problems on your
own.
Read a bit to learn of common approaches in the
field
and ask yourself: would I enjoy reading about it
and doing it for the next 5 years?
Using too many layers
Overcomplicating things
Image: https://extremetech.com/extreme/179223-the-first-real-time-non-invasive-imaging-of-neurons-forming-a-neural-network
 Senior data scientists know to come up with a strong baseline that will "do
the job" in a fraction of the time and with much lower risk
 DNNs are a tool. Make sure not everything looks like a nail
 Developing rule-based models is a great exercise
 Focus more on what you do once the model is ready
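As a rough illustration of what a strong, cheap baseline might look like, the sketch below compares a majority-class predictor with a one-line hand-written rule on a toy text task. The spam/ham task, the data, and the `keyword_rule` function are all made up for this example and are not from the talk.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda _text: most_common_label


def keyword_rule(text):
    """Hypothetical hand-written rule: call a message spam if it mentions 'free'."""
    return "spam" if "free" in text.lower() else "ham"


def accuracy(predict, examples):
    """Fraction of (text, label) examples that predict() gets right."""
    correct = sum(predict(text) == label for text, label in examples)
    return correct / len(examples)


# Hypothetical toy data, for illustration only.
train = [
    ("Free tickets now", "spam"),
    ("Lunch at noon?", "ham"),
    ("See you tomorrow", "ham"),
]
test = [
    ("Claim your FREE prize", "spam"),
    ("Meeting moved to 3pm", "ham"),
    ("free upgrade inside", "spam"),
    ("Notes from today", "ham"),
]

majority_acc = accuracy(majority_baseline([label for _, label in train]), test)
rule_acc = accuracy(keyword_rule, test)
print(f"majority class: {majority_acc:.0%}, keyword rule: {rule_acc:.0%}")
```

Any more complex model then has to justify itself against these two numbers, and writing the rule by hand forces you to look at the data first.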
You optimize the wrong loss function
Playing in the Sandbox for too long
Image: https://www.pwap.com/daycare/sand-water/sandbox-packages
 You'll define the dataset, the input and output of the ML algorithm, and the loss
function yourself
 You should look for a strong baseline (and be able to predict its complexity
and time requirements)
 Work includes iteration and improvement until results are satisfactory
 "Data scientists spend 80% of the time cleaning the data, and 20% of the
time complaining about cleaning the data"
 Kaggle competitions are great, but:
Learning Data Science through Kaggle competitions is like
learning to play Chess through playing Backgammon
- O. Allouche
Keep a notebook with all the terms you don't understand.
Revisit it, and keep reading a bit about those terms
Take real, ugly data and start solving problems with it
 NLP is awesome for that
 Scrape the web for photos, web pages etc.
 Unsupervised data usually teaches you more
than supervised
Have a study mate
 Discuss potential approaches
https://bestfunnies.com/top-50-funny-pig-pictures/funny-pig-01/
Overselling
 Your F1 score shouldn't be 100%
(You're not expected to know everything)
 Your inner model's confidence should correlate with its performance
(You are expected to know what you don't know)
 The tone of your voice is very telling, and very important:
you don't want to sound too confident
 We're looking for a good POC (Proof of Concept), hoping for
improvement all the time
 The field is moving so fast, you can't rely on your existing knowledge.
This means companies hire you based on what you can learn, not on what
you know
Your CV
Suggestions...
 Your Coursera course / nano-degree isn't the equivalent of a BSc / MSc
 You didn't go to Stanford
 Your CV is not the same as the requirements file of your conda
environment: don't list Python packages
 The problems you solved are much more interesting than the tools you've
used (listing algorithms instead of problems)
Let's connect -
Omri Allouche on
omri.allouche@gong.io
We're also hiring: send me a
message to learn more!
