To be a great Data Scientist, you need to be a good mathematician, a curious analyst, a smart computer scientist and an expert in the problem domain. Furthermore, the field is moving so fast, you have to run at full speed just to stay in place. How should you balance these skills?
When interviewing candidates for Gong.io, we try to evaluate how well the candidate will tackle the large variety of research tasks we face, including Speech Recognition, Video and Audio analysis, NLP and statistical hypothesis testing. In this talk, I'll give an inside peek into our Data Science interview, and will list the top mistakes I see people make when preparing for Data Science interviews, hoping to help you excel in your next interview and next position.
You can view a low-quality recording of the talk at https://www.youtube.com/watch?v=yu0HAudwGEA
Bio:
Omri Allouche heads the Research department at Gong.io, helping sales organizations improve their performance by providing actionable, data-driven insights using machine learning.
He also teaches Applied Data Science at Bar Ilan University, and was the founder and CEO of Page2site (acquired by Algomizer), an algorithms engineer at Elisra, and researcher at IDF's intelligence unit.
Omri holds a Ph.D. in Computational Ecology from the Hebrew University (cum laude). He won several academic awards and scholarships, including the Clore fund, and his research papers have been cited over 2,000 times.
The top mistakes you're making in your Data Science interview - Omri Allouche
1. The Top Mistakes
in a Data Science
Interview
(and ideas on how to avoid
them)
Omri Allouche,
Head of Research, Gong.io
2. Omri Allouche
Head of Research, Gong.io
Teacher, Bar Ilan University
PhD in Ecology
BSc in Biology and Cognitive Sciences
3. Gong.io at a Glance
Help companies improve business
conversations
Focus on sales teams
Analyzes conversations, assesses
what works and what doesn't
Raised over $28m from top investors
Established leader in the space
Leading customers: LinkedIn,
Pinterest, ZipRecruiter, Zenefits
[Figure: sales funnel. Qualified Leads 100%, Disco 75%, Trial 50%, Close 25%. Caption: 25%, no one knows why]
10. What's your Data Science?
You need to know all of the above
Ask the Team Leader to explain to you exactly what
the job includes
What's her definition of Data Science?
What would be your day-to-day responsibilities?
Will you be:
writing code that goes to production?
developing new algorithms?
in charge of data collection?
working on your projects alone? With other data
scientists?
12. Don't run away from your superpowers
Data science is a blend of different skills - you're bound to be better at some
than others
Emphasize your strengths:
Software engineering
Data Analysis
Research capabilities
In-depth mathematical understanding
16. Don't get caught in a local minimum
You're not Kobe Bryant, so don't skip college!
Work in a workplace that has people smarter than
you
Work in a workplace that has people more
experienced than you
Don't work as the first Data Scientist in a company
Your goal is to learn, and become the best Data
Scientist around in a few years
You must love what you do. Don't waste time in a
company that isn't right for you
20. "In God we trust; all others
bring data"
"Without data, you're just
another person with an opinion"
- W. EDWARDS DEMING
22. Stop running away from Data
Learn to perform a meaningful analysis using the data alone
No classifier
Just visualizations and basic statistics
Stop neglecting Unsupervised Learning: it's waaaaay cooler than
Supervised Learning
Stop worrying that AutoML will replace you in 10 years
Look for real ways to improve your model - running a grid search for
hyperparameter optimization doesn't count
Learn to do a proper error analysis: when is your model wrong?
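One way to start answering "when is your model wrong?" is to slice the errors by a property of the input and compare error rates across slices. The sketch below is only an illustration: the record fields (`call_length`, `predicted`, `actual`) and the toy data are hypothetical and are not taken from the talk.

```python
from collections import defaultdict


def error_rate_by_slice(records, slice_key):
    """Return {slice value: (error rate, example count)} for labelled predictions."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for rec in records:
        group = rec[slice_key]
        totals[group] += 1
        if rec["predicted"] != rec["actual"]:
            errors[group] += 1
    return {group: (errors[group] / totals[group], totals[group]) for group in totals}


# Hypothetical toy predictions, invented for illustration only.
records = [
    {"call_length": "short", "predicted": 1, "actual": 1},
    {"call_length": "short", "predicted": 0, "actual": 0},
    {"call_length": "long", "predicted": 1, "actual": 0},
    {"call_length": "long", "predicted": 0, "actual": 1},
    {"call_length": "long", "predicted": 1, "actual": 1},
]

report = error_rate_by_slice(records, "call_length")

# Print the worst slices first: these are the places to look at raw examples.
for group, (rate, count) in sorted(report.items(), key=lambda kv: -kv[1][0]):
    print(f"{group}: error rate {rate:.2f} over {count} examples")
```

A slice with a much higher error rate than the rest is a hint about where to read individual examples next, which usually teaches more than another round of hyperparameter tuning.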
24. Embrace the Scientific Method
The Scientific Method is key to success in Data Science projects
You try to study a phenomenon in the world
Your model is your hypothesis
In the scientific method, you're trying to reject your hypothesis using
experiments
In DS, you should look for ways to
find where your model is wrong,
and improve it iteratively
27. Many answer with the approach they took in the project that looks the
most similar
Even more answer with what they saw others do in a similar scenario
Interviewers want to take you away from these cases, to see what you'd do
when away from known solutions
Data Science isn't just about learning many algorithms; it's knowing when
to use them, and how to use them creatively
To be good at it, you need to work deliberately on building your intuition
29. Ideas for building an intuition
When reading a paper about a new state-of-the-art method, you should
care equally (if not more) about the Strong Baseline this method beats.
Teach yourself to say "I don't know, but I would try"
Force yourself to suggest more than one solution per interview problem
Pay attention to cases where your intuition misses
30. Prepare for interviews with others: brainstorm
about ideas
Learn what the company you're interviewing for
does.
Think about how you'd solve those problems on your
own.
Read a bit to learn of common approaches in the
field
and ask yourself: would I enjoy reading about it
and doing it for the next 5 years?
31. Using too many layers
Overcomplicating things
https://extremetech.com/extreme/179223-the-first-real-time-non-invasive-imaging-of-neurons-forming-a-neural-network
32. Senior data scientists know to come up with a strong baseline that will "do
the job" in a fraction of the time and with much less risk
DNNs are a tool. Make sure not everything looks like a nail
Developing rule-based models is a great exercise
Focus more on what you do once the model is ready
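To make the "strong baseline" and "rule-based model" points concrete, here is a minimal sketch that scores a majority-class baseline and a one-line keyword rule on a tiny text task. Everything in it (the example texts, the labels, and the `keyword_rule` heuristic) is hypothetical and only illustrates the idea; it is not a method from the talk.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Build a predictor that always returns the most frequent training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda _example: most_common_label


def keyword_rule(example):
    """Hypothetical hand-written rule: the word 'great' signals a positive text."""
    return "positive" if "great" in example.lower() else "negative"


def accuracy(predict, examples, labels):
    """Fraction of examples for which the predictor matches the label."""
    correct = sum(predict(x) == y for x, y in zip(examples, labels))
    return correct / len(labels)


# Hypothetical toy data, invented for illustration only.
texts = [
    "Great call",
    "Not interested",
    "great demo, thanks",
    "Please stop calling",
    "Too expensive",
]
labels = ["positive", "negative", "positive", "negative", "negative"]

# For brevity the baseline is fit and scored on the same toy data;
# with real data you would score it on a held-out set.
baseline_acc = accuracy(majority_baseline(labels), texts, labels)
rule_acc = accuracy(keyword_rule, texts, labels)

print(f"majority baseline accuracy: {baseline_acc:.2f}")
print(f"keyword rule accuracy:      {rule_acc:.2f}")
```

Any more complex model, a DNN included, has to beat these numbers by enough to justify its extra cost and risk.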
33. You optimize the wrong loss function
Playing in the Sandbox for too long
Image: https://www.pwap.com/daycare/sand-water/sandbox-packages
34. You'll define the dataset, input and output to the ML algorithm, and loss
function yourself
You should look for a strong baseline (and be able to predict its complexity
and time requirements)
Work includes iteration and improvement, until results are satisfactory
"Data scientists spend 80% of the time cleaning the data, and 20% of the
time complaining about cleaning the data"
Kaggle competitions are great, but:
Learning Data Science through Kaggle competitions is like
learning to play Chess through playing Backgammon
- O. Allouche
35. Keep a notebook with all terms you don't understand.
Revisit it regularly, and read a bit about those terms
Take real, ugly data and start solving problems with it
NLP is awesome for that
Scrape the web for photos, web pages etc.
Unsupervised data usually teaches you more
than supervised
Have a study mate
Discuss potential approaches
38. Your F1 score shouldn't be 100%
(You're not expected to know everything)
Your inner model's confidence should correlate with its performance
(You are expected to know what you don't know)
The tone of your voice is very telling, and very important
You don't want to sound too confident
We're looking for a good POC (Proof of Concept), hoping for
improvement all the time
The field is moving so fast, you can't rely on your existing knowledge.
This means companies hire you based on what you can learn, not on what
you know
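The slide borrows the idea of model calibration as a metaphor for candidates. For an actual model, "confidence should correlate with performance" can be checked by binning predictions by confidence and comparing each bin's mean confidence with its accuracy. The sketch below assumes confidences in [0, 1] and 0/1 correctness flags; the function name and the toy numbers are hypothetical.

```python
def calibration_table(confidences, correct, n_bins=5):
    """Bin predictions by confidence.

    Returns one row per non-empty bin:
    (bin lower edge, bin upper edge, mean confidence, accuracy, example count).
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    table = []
    for idx, items in enumerate(bins):
        if not items:
            continue
        mean_conf = sum(conf for conf, _ in items) / len(items)
        acc = sum(ok for _, ok in items) / len(items)
        table.append((idx / n_bins, (idx + 1) / n_bins, mean_conf, acc, len(items)))
    return table


# Hypothetical toy predictions, invented for illustration only.
confidences = [0.9, 0.9, 0.8, 0.8, 0.3, 0.3, 0.2, 0.2]
correct = [1, 1, 1, 0, 0, 1, 0, 0]

table = calibration_table(confidences, correct, n_bins=2)
for low, high, mean_conf, acc, count in table:
    print(f"[{low:.1f}, {high:.1f}): confidence {mean_conf:.2f}, accuracy {acc:.2f}, n={count}")
```

When mean confidence and accuracy are close in every bin, the model "knows what it doesn't know"; a large gap in the high-confidence bin is the model equivalent of sounding too confident.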
39. Your CV
Suggestions...
Your Coursera course / nano-degree isn't the equivalent of a BSc / MSc
You didn't go to Stanford
Your CV is not the same as the requirements file of your conda
environment: don't list Python packages
The problems you solved are much more interesting than the tools you've
used (listing algorithms instead of problems)
41. Let's connect -
Omri Allouche on
omri.allouche@gong.io
We're also hiring: send me a
message to learn more!