Probability Distributions
Probability distribution
 A probability distribution is a function that gives the
likelihood of occurrence of all possible outcomes of an
experiment.
 Categories:
 Discrete probability distribution
 Continuous probability distribution
 Functions used to describe a probability distribution:
 Probability mass function (discrete)
 Probability density function (continuous)
A random variable is a variable that represents a numerical
outcome of a random experiment. Hence a probability
distribution function gives the probability of all the possible
values that a random variable can take.
A random variable may be discrete or continuous.
Why are probability distributions
significant?
 They show all the possible values for a set of data and how often they
occur.
 They display the spread and shape of the data.
 They help in standardized comparisons and analysis.
 Data that follow a known distribution have known statistical
attributes, for example, for a normal distribution:
Mean = Median = Mode
Probability Distribution Function
 The probability distribution function is also known as the
cumulative distribution function (CDF).
 If there is a random variable, X, and it is evaluated at
a point, x, then the probability distribution function gives
the probability that X will take a value less than or equal to
x. It can be written as
F(x) = P(X ≤ x)

The probability distribution function can be used for both discrete
and continuous variables.
Probability Distribution Function
(Example)
 Let the random variable X represent the number of heads obtained in
two tosses of a coin.
 Sample space: {HH, HT, TH, TT}
 Probability distribution:
 Probability of obtaining at most one head,
P(X ≤ 1) = P(X = 0) + P(X = 1)
= 1/4 + 1/2
= 3/4
No. of heads, x   0     1     2     Sum
PMF, P(X = x)     1/4   1/2   1/4   1
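The two-toss example can be reproduced by enumerating the sample space. The following is a minimal Python sketch, assuming a fair coin; the names `sample_space`, `pmf`, and `cdf` are illustrative and not part of the original slides.

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space of two coin tosses: HH, HT, TH, TT.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

# Build the PMF of X = number of heads, assuming every outcome is equally likely.
pmf = {}
for outcome in sample_space:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, Fraction(0)) + Fraction(1, len(sample_space))


def cdf(x):
    """F(x) = P(X <= x): sum the PMF over all values up to and including x."""
    return sum(p for value, p in pmf.items() if value <= x)


for value in sorted(pmf):
    print(value, pmf[value], cdf(value))
```

Using `Fraction` keeps the probabilities exact, so `cdf(1)` is exactly 3/4 rather than a rounded float.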
Probability distribution of a
discrete random variable
 A discrete random variable can be
defined as a variable that can take
countable, distinct values such as 0, 1, 2, 3, ...
 Probability Mass Function: p(x) = P(X = x)
 Probability Distribution Function: F(x) = P(X ≤ x)
 Examples of discrete probability
distributions:
 Binomial distribution
 Bernoulli distribution
 Poisson distribution
Probability distribution of a discrete random
variable
https://www.youtube.com/watch?v=YXLVjCKVP7U&ab_channel=zedstatistics
Probability Distribution of a
Continuous Random Variable
 A continuous random variable can be
defined as a variable that can take any value
within an interval, that is, uncountably many values.
 The probability that a continuous random
variable will take on an exact value is 0.
 Probability Distribution Function: F(x) = P(X ≤ x)
 Probability Density Function: f(x) = d/dx (F(x))
 Examples of continuous probability
distributions:
 Normal distribution
 Uniform distribution
 Exponential distribution
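The relationship f(x) = d/dx (F(x)) can be checked numerically. The sketch below uses the exponential distribution as an example, with an assumed rate parameter of 2.0; the function names and the step size `h` are illustrative choices.

```python
import math

RATE = 2.0  # assumed rate parameter (lambda) of the exponential distribution


def cdf(x):
    """F(x) = P(X <= x) = 1 - exp(-RATE * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def pdf(x):
    """f(x) = RATE * exp(-RATE * x) for x >= 0, else 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def numerical_derivative(func, x, h=1e-6):
    """Central-difference approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


x = 0.5
# The density and the numerical slope of the CDF should agree closely.
print(pdf(x), numerical_derivative(cdf, x))

# An interval has positive probability; a single point has zero width, so
# its probability F(x) - F(x) is 0.
print(cdf(1.0) - cdf(0.5), cdf(0.5) - cdf(0.5))
```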
Bernoulli Distribution
 A Bernoulli distribution has only two possible outcomes, namely 1
(success) and 0 (failure), and a single trial.
 The random variable X can take the following values:
 1 with the probability of success, p
 0 with the probability of failure, q = 1 - p
 Probability mass function (PMF): P(X = x) = p^x q^(1 - x), for x = 0 or 1
 Expected value or mean = p
 Variance = pq
Bernoulli Distribution
 The probability is p (success) when x = 1 and q (failure) when x = 0.
 Note: p and q need not be equal.
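The mean and variance of a Bernoulli variable can be verified directly from its two outcomes. This is a minimal sketch with an assumed success probability of 0.3; `bernoulli_pmf` is a hypothetical helper name.

```python
def bernoulli_pmf(x, p):
    """Return P(X = x) for a Bernoulli variable with success probability p."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli variable only takes the values 0 and 1")
    return p if x == 1 else 1 - p


p = 0.3  # assumed probability of success
q = 1 - p

# Expected value: sum of x * P(X = x) over both outcomes.
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))

# Variance: sum of (x - mean)^2 * P(X = x) over both outcomes.
variance = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))

print(mean, p)          # the mean equals p
print(variance, p * q)  # the variance equals p * q (up to rounding)
```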
Binomial distribution
 When multiple independent trials of an experiment that yields a
success/failure (Bernoulli trial) are conducted, the number of
successes follows a binomial distribution.
PMF: P(X = x) = C(n, x) p^x q^(n - x)
where n = number of trials
x = number of successes
p = probability of success
q = probability of failure = 1 - p
C(n, x) = n! / (x! (n - x)!)
 Expected value = np
 Variance = npq
Binomial distribution (Example)
A store manager estimates the probability of a customer making a
purchase as 0.30. What is the probability that two of the next three
customers will make a purchase?
Solution:
The above exhibits a binomial distribution as there are three customers (3
trials) with every customer either making a purchase (success) or not
making a purchase (failure).
Probability that two of the next three customers will make a purchase,
P(X = 2) = C(3, 2) (0.30)^2 (0.70)^1 = 3 × 0.09 × 0.70 = 0.189
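The store example can be checked with a few lines of Python. This sketch uses the standard binomial PMF with n = 3 and p = 0.30 from the example; `binomial_pmf` is an illustrative helper name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 3, 0.30  # three customers, purchase probability 0.30

# Probability that exactly two of the three customers make a purchase.
prob_two = binomial_pmf(2, n, p)
print(round(prob_two, 3))

# Mean and variance computed from the PMF, for comparison with n*p and n*p*q.
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))
print(mean, n * p)
print(variance, n * p * (1 - p))
```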
Normal distribution
 In a normal distribution the data
tend to cluster around a central value
with no bias left or right.
 Also called a bell curve as it looks
like a bell.
 Many things approximately follow a normal
distribution, for example heights of people and
marks scored in a test.
Normal distribution
Mean = Median = Mode
About 68% of data lie within one standard deviation of the mean
About 95% of data lie within two standard deviations of the mean
https://www.mathsisfun.com/data/standard-normal-distribution.html
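For a normal distribution, the share of data within k standard deviations of the mean equals erf(k / sqrt(2)), whatever the mean and standard deviation are. The sketch below evaluates this for k = 1, 2, 3; `share_within` is an illustrative name.

```python
import math


def share_within(k):
    """P(|X - mean| <= k * sd) for a normal distribution: erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Roughly 68%, 95%, and 99.7% for one, two, and three standard deviations.
    print(k, round(share_within(k), 4))
```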
Skewness
Negative skew: The long tail is on the negative side of the peak
Positive skew: The long tail is on the positive side of the peak
https://www.mathsisfun.com/data/skewness.html
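The sign of a skewness coefficient tells which side the long tail is on. The sketch below computes the moment-based sample skewness (third central moment divided by the 1.5 power of the second) on small hypothetical data sets; both the data and the choice of this particular coefficient are assumptions for illustration.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


right_tail = [1, 2, 2, 3, 3, 3, 10]      # hypothetical data, long tail to the right
left_tail = [-v for v in right_tail]     # mirrored data, long tail to the left
symmetric = [1, 2, 3, 4, 5]              # no tail on either side

print(skewness(right_tail))  # positive skew
print(skewness(left_tail))   # negative skew
print(skewness(symmetric))   # zero skew
```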
Uniform distribution
 In a uniform distribution there is an equal probability density for all
values of the random variable between a and b.
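For a continuous uniform distribution on [a, b], the density is the constant 1 / (b - a) and the probability of a sub-interval is proportional to its length. A minimal sketch, assuming a = 2 and b = 6 purely for illustration:

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """F(x) = P(X <= x) for the continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


a, b = 2.0, 6.0  # assumed interval endpoints

# The density is the same everywhere inside the interval.
print(uniform_pdf(3.0, a, b), uniform_pdf(5.5, a, b))

# P(3 <= X <= 5) is the interval length 2 divided by the total length 4.
print(uniform_cdf(5.0, a, b) - uniform_cdf(3.0, a, b))
```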
Relationship between two variables
 Covariance and correlation are two statistical measures
that describe the relationship between two variables.
 They both quantify how two variables change together, but
they differ in scale, interpretation, and units.
Covariance
 Covariance measures the direction of the linear relationship between
two variables.
 It tells you whether the variables move in the same direction (positive
covariance) or in opposite directions (negative covariance).
Covariance (Example)
Covariance between temperature and ice cream sales
Cov(X, Y) = 243
 The positive value indicates a positive
relationship between temperature and ice
cream sales.
 However, it does not specify the strength of
the relationship, because its magnitude depends on the units of the variables.
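The sketch below shows why the size of a covariance is hard to interpret. The temperature and sales figures are hypothetical (the data behind the slide's value of 243 are not given), and the sample covariance with an n - 1 divisor is an assumed convention.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviations divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data: daily temperature (Celsius) and ice cream sales.
temps_c = [20, 22, 25, 27, 30]
sales = [200, 220, 260, 280, 330]

cov_c = covariance(temps_c, sales)

# The same temperatures in Fahrenheit: the relationship is unchanged,
# but the covariance is scaled by 9/5.
temps_f = [c * 9 / 5 + 32 for c in temps_c]
cov_f = covariance(temps_f, sales)

print(cov_c, cov_f)
```

Both values are positive, so the direction is the same, but the magnitude changes with the unit of measurement.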
Correlation
 Correlation measures both the strength and direction of the linear
relationship between two variables.
 It lies within a standardized range, from -1 to 1.
 1: perfect positive correlation
 -1: perfect negative correlation
 0: no linear correlation
Perfect Positive Correlation
Correlation (Example)
Correlation = 0.9575
Correlation
 Correlation only captures linear relationships.
 For some nonlinear relationships the correlation is 0, even though the variables are clearly related.
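A short sketch of the Pearson correlation coefficient illustrates both the standardized range and the limitation to linear relationships. The data are hypothetical, and `pearson_r` is an illustrative helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]

r_up = pearson_r(xs, [2 * x + 1 for x in xs])   # straight line, rising
r_down = pearson_r(xs, [-3 * x for x in xs])    # straight line, falling
r_curve = pearson_r(xs, [x ** 2 for x in xs])   # parabola: y depends on x, but not linearly

print(r_up, r_down, r_curve)
```

The parabola gives a correlation of 0 even though y is fully determined by x, which is why a scatter plot is worth checking alongside the coefficient.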
Exploratory Data Analysis (EDA)
Exploratory Data Analysis refers to the critical process of performing initial
investigations on data to discover patterns, spot anomalies, test
hypotheses, and check assumptions with the help of summary statistics
and graphical representations.
Key Objectives of EDA:
 Understand the data structure: Gain insights into the data's size, types,
and completeness.
 Identify patterns: Detect trends, correlations, and groupings.
 Find anomalies: Spot outliers and inconsistencies in the data.
 Generate hypotheses: Form initial ideas for models, statistical testing, or
predictions.
 Refine data: Clean, transform, or filter the data for further analysis.
Steps in EDA
1. Data loading and inspection
2. Univariate analysis
3. Bivariate analysis
4. Multivariate analysis
5. Identifying missing values and outliers
6. Data transformation
7. Feature engineering
8. Hypothesis engineering
Data loading and inspection
Step 1. Load data into the workspace
The df.head() command displays the first few records
Step 2. Data preview and summary
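The two steps above can be sketched with pandas. The CSV content here is hypothetical and is read from an in-memory string so the example is self-contained; with a real file, a path would be passed to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical data set, inlined so the example runs without an external file.
CSV_TEXT = """temperature,sales,weather
20.0,200,sunny
22.0,220,cloudy
25.0,,sunny
27.0,280,sunny
30.0,330,rainy
24.0,250,cloudy
"""

# Step 1: load the data into a DataFrame.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Step 2: preview and summarize.
print(df.head())        # first few records
print(df.shape)         # (rows, columns)
print(df.dtypes)        # data type of each column
print(df.isna().sum())  # count of missing values per column
print(df.describe())    # summary statistics for the numerical columns
```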
Univariate analysis
 Involves analyzing each variable individually to understand its
distribution, central tendency, and spread.
 Numerical variables: histograms, box plots, and summary statistics
(mean, median, standard deviation)
 Categorical variables: bar charts, pie charts
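A minimal univariate sketch using only the Python standard library is shown below. The values are hypothetical, and the text-based bars stand in for the histogram and bar chart that a plotting library would normally draw.

```python
import statistics
from collections import Counter

# Hypothetical numerical and categorical variables.
sales = [200, 220, 250, 260, 280, 330]
weather = ["sunny", "cloudy", "sunny", "sunny", "rainy", "cloudy"]

# Numerical variable: summary statistics.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "stdev": statistics.stdev(sales),  # sample standard deviation
}
print(summary)

# Numerical variable: a crude histogram with bins of width 50.
bins = Counter((value // 50) * 50 for value in sales)
for start in sorted(bins):
    print(f"{start}-{start + 49}", "#" * bins[start])

# Categorical variable: frequency of each category as a text bar chart.
weather_counts = Counter(weather)
for label, count in weather_counts.most_common():
    print(label.ljust(8), "#" * count)
```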

Fundamentals of Data Science Probability Distributions
