Presenter: Muhammad Rizwan Khan Usafzai

• NumPy: NumPy is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.
• Key Features:
  • Array creation and manipulation
  • Mathematical operations on arrays
  • Linear algebra operations
  • Fourier transforms
  • Random number generation
• Applications:
  • Scientific computing
  • Data analysis and manipulation
  • Machine learning

How to install NumPy in Jupyter?
Open a Jupyter notebook and run the following code:
!pip install numpy
import numpy as np
Then run the following code:
n = np.array((1, 2, 3))
print(n)
Type of object:
print(type(n))
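
The key features listed for NumPy (array math, linear algebra, Fourier transforms, random numbers) can be tried in a few lines. This is only a minimal sketch; the sample values, the 5 Hz test signal, and the seed are assumptions chosen for illustration.

```python
import numpy as np

# Array creation and element-wise math
a = np.array([1.0, 2.0, 3.0])
b = np.arange(3)                 # array([0, 1, 2])
total = a + b                    # element-wise sum

# Linear algebra: solve the 2x2 system A @ x = rhs
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
rhs = np.array([3.0, 5.0])
x = np.linalg.solve(A, rhs)

# Fourier transform: a 5 Hz sine sampled at 100 Hz for one second
t = np.arange(100) / 100
signal = np.sin(2 * np.pi * 5 * t)
spectrum = np.fft.rfft(signal)
peak_bin = int(np.argmax(np.abs(spectrum)))  # index of the strongest frequency

# Random number generation with a seeded generator (reproducible)
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=1000)

print(total, x, peak_bin, samples.shape)
```

With one second of data the frequency bins are 1 Hz apart, so the strongest bin index matches the signal frequency.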

• OpenCV (Open Source Computer Vision Library):
• OpenCV is an open-source computer vision and machine learning software library. It provides a wide range of functionality for real-time computer vision, including image and video processing, object detection, face recognition, and more.
• Key Features:
  • Image and video I/O
  • Image processing algorithms
  • Object detection and tracking
  • Machine learning algorithms for computer vision tasks
• Applications:
  • Robotics
  • Augmented reality
  • Surveillance systems
  • Medical image analysis
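
How to install OpenCV in Jupyter?
Open a Jupyter notebook and run the following code:
!pip install opencv-python
import cv2
# Read the image; cv2.imread returns None if the file cannot be loaded
img = cv2.imread("img1.png")
if img is None:
    raise FileNotFoundError("img1.png could not be read")
# Show the image in a window titled "MRK" for up to 10 seconds (10000 ms)
cv2.imshow("MRK", img)
cv2.waitKey(10000)
cv2.destroyAllWindows()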

• Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It provides a MATLAB-like interface and supports a wide variety of plots and graphs.
• Key Features:
  • Line plots, scatter plots, and histograms
  • 2D and 3D plotting
  • Customization of plots
  • Integration with NumPy arrays
• Applications:
  • Data visualization
  • Scientific plotting
  • Statistical analysis

How to install Matplotlib in Jupyter?
Open a Jupyter notebook and run the following code:
!pip install matplotlib
import matplotlib.pyplot as plt  # "as" gives the module an alias (plt)
import numpy as np
# Draw a straight line from (0, 0) to (4, 6)
xpts = np.array([0, 4])
ypts = np.array([0, 6])
plt.plot(xpts, ypts)
plt.show()
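
Beyond the line plot, the scatter plot and histogram named among Matplotlib's key features can be sketched as follows. The random data, seed, and figure size are assumptions made up for illustration, and the "Agg" backend is selected only so the sketch also runs as a plain script without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a Jupyter notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: y depends linearly on x, plus noise
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)

# Two plots side by side on one figure
fig, (ax_scatter, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Scatter plot of y against x
ax_scatter.scatter(x, y, s=10)
ax_scatter.set_title("Scatter plot")
ax_scatter.set_xlabel("x")
ax_scatter.set_ylabel("y")

# Histogram of x with 20 bins; hist returns the counts and bin edges
counts, bin_edges, _ = ax_hist.hist(x, bins=20)
ax_hist.set_title("Histogram")
ax_hist.set_xlabel("x")

fig.tight_layout()
```

In a notebook, finish with plt.show() to display the figure.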

• scikit-image, commonly abbreviated as skimage, is an open-source image processing library for Python.
• It provides a collection of algorithms for image segmentation, feature extraction, image filtering, and other image processing tasks.
• Image Processing
• Integration: It integrates with other scientific Python libraries such as NumPy, SciPy, and Matplotlib, allowing for efficient image manipulation and analysis.
• User-Friendly API
• Community Support: scikit-image benefits from an active community of developers and users.
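
Installing the scikit-image library:
pip install scikit-image
import matplotlib.pyplot as plt
from skimage import io
# Load an image from a file
image = io.imread('example_image.jpg')
# Display the image with Matplotlib (skimage.io.imshow is deprecated in recent scikit-image releases)
plt.imshow(image)
plt.axis('off')
plt.show()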

• Pillow is a fork of the Python Imaging Library (PIL) that adds extensive image processing capabilities to Python. It supports opening, manipulating, and saving many different image file formats.
• Image Manipulation: Pillow offers a wide range of image handling functions such as resizing, cropping, rotating, filtering, and enhancing images.
• Image File Support: It supports various image file formats, including JPEG, PNG, and GIF, making it suitable for handling varied image data.
• Integration: Pillow works with other Python libraries such as NumPy and Matplotlib, enabling easy interoperability with scientific computing and data visualization tools.
• Ease of Use: Pillow provides a simple and intuitive API for working with images, making it accessible to users with varying levels of programming experience.
• Active Maintenance: Pillow is actively maintained and updated, ensuring compatibility with the latest Python versions and continued support for new features and improvements.
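
The manipulation functions named above (cropping, rotating, filtering) and the NumPy interoperability can be tried without any image file. The following is a rough sketch; the image size, color, crop box, and blur radius are assumptions picked for illustration.

```python
import numpy as np
from PIL import Image, ImageFilter

# Create a 200x100 solid-color RGB image in memory (no file needed)
image = Image.new("RGB", (200, 100), color=(30, 144, 255))

# Crop the central region; the box is (left, upper, right, lower)
cropped = image.crop((50, 25, 150, 75))

# Rotate by 90 degrees; expand=True grows the canvas to fit the result
rotated = cropped.rotate(90, expand=True)

# Apply a Gaussian blur filter
blurred = rotated.filter(ImageFilter.GaussianBlur(radius=2))

# Convert to a NumPy array; the shape is (height, width, channels)
pixels = np.asarray(blurred)
print(cropped.size, rotated.size, pixels.shape)
```

Note that Pillow reports sizes as (width, height), while the NumPy array shape is (height, width, channels).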

• Installing the Pillow library:
pip install pillow
from PIL import Image
# Open an image file
original_image = Image.open("example.jpg")
# Display basic information about the image
print("Original Image Format:", original_image.format)
print("Original Image Size:", original_image.size)
# Reduce the size by half
new_size = (original_image.size[0] // 2, original_image.size[1] // 2)
resized_image = original_image.resize(new_size)
# Display the new size
print("Resized Image Size:", resized_image.size)
# Save the resized image with a new name
resized_image.save("resized_example.jpg")
# Close the original and resized images
original_image.close()
resized_image.close()
print("Resized image saved successfully!")

• Pandas is a powerful Python library for data manipulation and analysis. It offers data structures and functions for working efficiently with structured data such as time series, tabular data, and heterogeneous data.
• Data Structures: Pandas provides two main data structures: Series (a 1D labeled array) and DataFrame (a 2D labeled table), which offer powerful data manipulation capabilities.
• Data Handling: It can read and write data in various formats such as CSV, Excel, and SQL databases.
• Data Analysis: Pandas supports data analysis tasks including cleaning, filtering, grouping, merging, and reshaping, making it indispensable for exploratory data analysis.
• Integration: It works with other Python libraries such as NumPy, Matplotlib, and scikit-learn, extending its usefulness in scientific computing and machine learning tasks.
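
The two data structures and a few of the analysis tasks named above (filtering, grouping, merging) can be sketched as follows. All names, ages, cities, and the State lookup table are made-up sample data, not part of the original slides.

```python
import pandas as pd

# Series: a 1D labeled array
ages = pd.Series([25, 30, 35], index=["John", "Alice", "Bob"], name="Age")

# DataFrame: a 2D labeled table
people = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Anna"],
    "City": ["New York", "Chicago", "Chicago", "New York"],
    "Age": [25, 30, 35, 45],
})

# Filtering: keep rows where Age is above 28
over_28 = people[people["Age"] > 28]

# Grouping: mean age per city
mean_age_by_city = people.groupby("City")["Age"].mean()

# Merging: attach a State column from a second (hypothetical) lookup table
cities = pd.DataFrame({"City": ["New York", "Chicago"], "State": ["NY", "IL"]})
merged = people.merge(cities, on="City", how="left")

print(ages["Alice"])
print(over_28)
print(mean_age_by_city)
print(merged)
```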

• Installing the Pandas library:
pip install pandas
Sometimes pip reports that a newer version of itself is available. To upgrade pip, run:
python -m pip install --upgrade pip
import pandas as pd
# Read a CSV file into a DataFrame
df = pd.read_csv("example.csv")
# Display the first few rows of the DataFrame
print("First few rows of the DataFrame:")
print(df.head())
# Display summary information about the DataFrame
print("\nSummary information:")
df.info()  # info() prints its report directly and returns None
# Display basic statistics of numerical columns
print("\nBasic statistics:")
print(df.describe())

• Definition: scikit-learn is a versatile machine learning library for Python. It offers simple and efficient tools for data mining and data analysis, implementing a wide range of machine learning algorithms.
• Machine Learning Algorithms: scikit-learn provides implementations of algorithms for classification, regression, clustering, and dimensionality reduction, along with model selection utilities.
• Model Evaluation: It offers tools for model evaluation, cross-validation, and hyperparameter tuning, facilitating the development of robust and accurate machine learning models.
• Integration: scikit-learn works with other Python libraries such as NumPy, SciPy, and Pandas, enabling easy preprocessing, training, and evaluation of machine learning models.
• Scalability: It is designed to be efficient, which makes it practical for working with large datasets and complex models.
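
The model evaluation tools mentioned above (cross-validation and hyperparameter tuning) can be sketched with cross_val_score and GridSearchCV. This is a minimal illustration on the built-in Iris dataset; the choice of classifier, the 5 folds, and the small parameter grid are assumptions, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

# Load features and labels of the built-in Iris dataset
X, y = load_iris(return_X_y=True)

# Cross-validation: 5 train/test splits, one accuracy score per fold
model = RandomForestClassifier(n_estimators=50, random_state=42)
scores = cross_val_score(model, X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())

# Hyperparameter tuning: try every combination in a small grid,
# scoring each one with 5-fold cross-validation
param_grid = {"n_estimators": [10, 50], "max_depth": [2, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```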

• Installing the scikit-learn library:
pip install scikit-learn
import sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
# Load the Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target variable
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the Random Forest classifier
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
# Train the classifier
rf_classifier.fit(X_train, y_train)
# Predict on the test set
y_pred = rf_classifier.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
# Display classification report
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))

Seaborn is a Python library for creating attractive statistical graphics.
• Statistical Visualization: Seaborn excels at generating plots such as scatter plots, bar charts, and heatmaps for effective data exploration.
• Integration with Pandas: It works directly with Pandas DataFrames, making data visualization straightforward.
• Customization: Users can easily customize plot aesthetics to suit their preferences.
• Statistical Analysis: Seaborn offers tools for visualizing relationships between variables and for showing statistical estimates such as regression fits and confidence intervals.
• Community and Documentation: It is supported by an active community and comprehensive documentation for easy learning.

• Installing the seaborn library:
pip install seaborn
import seaborn as sns
import matplotlib.pyplot as plt
# Load the Iris dataset as a DataFrame (fetched from seaborn's online example data)
iris_df = sns.load_dataset("iris")
# Create a pairplot using Seaborn
sns.pairplot(iris_df, hue='species', palette='Set1')
# Add a title slightly above the grid so it does not overlap the plots
plt.suptitle("Pairplot of Iris Dataset", y=1.02)
# Show the plot
plt.show()

Plotly is a Python library for creating interactive and publication-quality graphs.
• Interactive Visualization: Plotly allows users to explore data interactively by zooming and hovering over data points.
• Online Platform: It offers an online platform for hosting and sharing interactive plots.
• Chart Types: It supports a wide range of chart types, including scatter plots, line plots, and 3D surface plots.
• Integration: It integrates easily with other Python libraries for data manipulation and visualization.
• Customization: It provides extensive options for customizing plot appearance.

• Installing the plotly library:
pip install plotly
import plotly.graph_objects as go
# Sample data
x_values = [1, 2, 3, 4, 5]
y_values = [2, 3, 5, 7, 11]
# Create a line plot
fig = go.Figure(data=go.Scatter(x=x_values, y=y_values, mode='lines'))
# Add title and axis labels
fig.update_layout(title='Simple Line Plot',
                  xaxis_title='X-axis',
                  yaxis_title='Y-axis')
# Show the plot
fig.show()

Data Preprocessing:
Data preprocessing is a critical step in machine learning pipelines.
It is defined as the set of techniques and procedures used to prepare raw data for analysis.
It involves several tasks such as importing and exporting data, cleaning and formatting data, handling missing values, and feature scaling.

Importing and Exporting Data:
Importing data involves loading datasets into the machine learning environment.
This can be done with libraries such as Pandas, using functions like read_csv() for CSV files, read_excel() for Excel files, etc.
import pandas as pd
df = pd.read_csv("ML.csv")
df.shape  # show the number of rows and columns
df.describe()  # compute summary statistics (count, mean, standard deviation, etc.)
Exporting the Data:
import pandas as pd
# Example DataFrame
data = {
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
# Export DataFrame to CSV (index=False leaves out the row index column)
df.to_csv('output.csv', index=False)

• Cleaning and Formatting Data:
• Cleaning data involves identifying and handling anomalies, inconsistencies, and errors in the dataset.
• This may include removing duplicates, correcting data types, dealing with outliers, etc.
• Formatting data involves ensuring that data is in the appropriate format for analysis.
• For example, converting categorical variables into numerical representations, standardizing date formats, etc.
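
Several of the steps just listed (removing duplicates, correcting data types, encoding a categorical column as numbers, parsing dates) can be sketched with Pandas. The table below is made-up sample data, and the Gender_code column name is a hypothetical choice for illustration.

```python
import pandas as pd

# Made-up sample data: one duplicated row, numbers and dates stored as text
raw = pd.DataFrame({
    "Name": ["John", "Alice", "Alice", "Bob"],
    "Age": ["25", "30", "30", "35"],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Joined": ["2021-01-15", "2020-07-01", "2020-07-01", "2019-11-30"],
})

# Remove exact duplicate rows and renumber the index
clean = raw.drop_duplicates().reset_index(drop=True)

# Correct data types: text -> integer, text -> datetime
clean["Age"] = clean["Age"].astype(int)
clean["Joined"] = pd.to_datetime(clean["Joined"], format="%Y-%m-%d")

# Encode the categorical column as integer codes
# (categories are sorted, so Female -> 0 and Male -> 1)
clean["Gender_code"] = clean["Gender"].astype("category").cat.codes

print(clean)
print(clean.dtypes)
```

Integer codes are only one encoding option; for categories without a natural order, one-hot encoding (for example pd.get_dummies) is often preferred.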

import pandas as pd
# Load the dataset
data = {
    'Name': ['John', 'Alice', 'Bob', 'Anna', 'Mike', 'Emily'],
    'Age': [25, 30, None, 35, 40, ''],
    'City': ['New York', 'Los Angeles', 'Chicago', 'San Francisco', '', 'Seattle'],
    'Gender': ['Male', 'Female', 'Male', '', 'Male', 'Female'],
    'Salary': ['$50000', '$60000', '$70000', '$80000', '90000', '$100000']
}
df = pd.DataFrame(data)
# Display the original DataFrame
print("Original DataFrame:")
print(df)
print()
# Clean and format the data
# 1. Convert Age to numeric and fill missing values with the median age
df['Age'] = pd.to_numeric(df['Age'], errors='coerce')
median_age = df['Age'].median()  # Calculate median age
df['Age'] = df['Age'].fillna(median_age)  # Fill missing values with the median
# 2. Remove rows with missing or empty values in the City and Gender columns
has_city = df['City'].notna() & (df['City'] != '')
has_gender = df['Gender'].notna() & (df['Gender'] != '')
df = df[has_city & has_gender].copy()
# 3. Remove dollar signs and commas from Salary, then convert it to numeric
df['Salary'] = df['Salary'].replace('[$,]', '', regex=True).astype(float)
# Display the cleaned and formatted DataFrame
print("Cleaned and Formatted DataFrame:")
print(df)

• Handling Missing Values:
• Missing values are common in datasets and can significantly affect the performance of machine learning models if not handled properly.
• Techniques for handling missing values include:
  • Imputation: Replacing missing values with a calculated or estimated value (e.g., mean, median, mode).
  • Deletion: Removing rows or columns with missing values.
  • Advanced techniques such as predictive modeling to estimate missing values based on other features.
• The cleaning example on the previous slide already shows both median imputation (Age) and row deletion (City and Gender).
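
The two basic strategies for missing values, deletion and imputation, can also be shown side by side. This is a minimal sketch with made-up data; using the median for the numeric column and the mode for the text column is an assumption for illustration, not a rule.

```python
import numpy as np
import pandas as pd

# Made-up sample data with one missing value per column
df = pd.DataFrame({
    "Age": [25.0, np.nan, 35.0, 40.0],
    "City": ["Chicago", "Chicago", None, "Seattle"],
})

# Deletion: drop every row that contains at least one missing value
dropped = df.dropna()

# Imputation: fill the numeric column with its median
# and the text column with its most frequent value (mode)
imputed = df.copy()
imputed["Age"] = imputed["Age"].fillna(imputed["Age"].median())
imputed["City"] = imputed["City"].fillna(imputed["City"].mode()[0])

print(dropped)
print(imputed)
```

Deletion keeps only complete rows (two of four here), while imputation keeps every row at the cost of introducing estimated values.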

• Feature Scaling:
• Feature scaling is the process of standardizing or normalizing the range of independent variables or features in the dataset.
• It is essential for algorithms that are sensitive to the scale of the input features, such as models trained with gradient descent (e.g., linear or logistic regression fitted this way) or distance-based algorithms (e.g., k-nearest neighbors, support vector machines).
• Common techniques for feature scaling include:
  • Min-Max Scaling: Scaling features to a fixed range, usually [0, 1].
  • Standardization (Z-score normalization): Rescaling features so that each has a mean of 0 and a standard deviation of 1. This does not change the shape of the distribution.
  • Robust Scaling: Scaling features using statistics that are robust to outliers, such as the median and interquartile range.
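
The three scaling techniques reduce to short formulas, written out here with plain NumPy on a single feature. The sample values (including the deliberate outlier) are assumptions for illustration; in scikit-learn the corresponding classes are MinMaxScaler, StandardScaler, and RobustScaler.

```python
import numpy as np

# One feature with a deliberate outlier as the last value
values = np.array([10.0, 20.0, 30.0, 40.0, 1000.0])

# Min-Max scaling: (x - min) / (max - min), result lies in [0, 1]
min_max = (values - values.min()) / (values.max() - values.min())

# Standardization: (x - mean) / standard deviation, result has mean 0 and std 1
z_score = (values - values.mean()) / values.std()

# Robust scaling: (x - median) / interquartile range
q1, median, q3 = np.percentile(values, [25, 50, 75])
robust = (values - median) / (q3 - q1)

print("Min-Max:", min_max)
print("Z-score:", z_score)
print("Robust: ", robust)
```

With the outlier present, Min-Max scaling squeezes the four ordinary values close to 0, while robust scaling keeps them evenly spread because the median and interquartile range ignore the extreme value.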

• Feature Scaling:
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler
# Sample dataset with two features
data = np.array([[10, 0.5],
                 [20, 0.7],
                 [30, 0.9]])
# Min-Max Scaling
scaler_minmax = MinMaxScaler()  # Initialize MinMaxScaler
data_minmax = scaler_minmax.fit_transform(data)  # Perform Min-Max Scaling
print("Min-Max Scaled Data:")
print(data_minmax)
print()
# Standardization (Z-score normalization)
scaler_standard = StandardScaler()  # Initialize StandardScaler
data_standard = scaler_standard.fit_transform(data)  # Perform Standardization
print("Standardized Data:")
print(data_standard)

Introduction to Machine Learning by MARK

  • 1. Presenter: Muhammad Rizwan Khan Usafzai 1
  • 2. NumPy: NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. Key Features: Array creation and manipulation Mathematical operations on arrays Linear algebra operations Fourier transforms Random number generation Applications: Scientific computing Data analysis and manipulation Machine learning 2
  • 3. How to install NumPy in Jupyter: open the Jupyter notebook and run the following code:
        !pip install numpy
        import numpy as np
      Then run the following code:
        n = np.array((1, 2, 3))
        print(n)
      Check the type of the object:
        print(type(n))
  • 4. OpenCV (Open Source Computer Vision Library): OpenCV is an open-source computer vision and machine learning software library. It provides a wide range of functionality for real-time computer vision, including image and video processing, object detection, face recognition, and more.
      Key Features:
        Image and video I/O
        Image processing algorithms
        Object detection and tracking
        Machine learning algorithms for computer vision tasks
      Applications:
        Robotics
        Augmented reality
        Surveillance systems
        Medical image analysis
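  • 5. How to install OpenCV in Jupyter: open the Jupyter notebook and run the following code:
        !pip install opencv-python
        import cv2
        img = cv2.imread("img1.png")   # returns None if the file cannot be read
        cv2.imshow("MRK", img)         # show the image in a window titled "MRK"
        cv2.waitKey(10000)             # wait up to 10 seconds for a key press
        cv2.destroyAllWindows()        # close the window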
  • 6. Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It provides a MATLAB-like interface and supports a wide variety of plots and graphs.
      Key Features:
        Line plots, scatter plots, and histograms
        2D and 3D plotting
        Customization of plots
        Integration with NumPy arrays
      Applications:
        Data visualization
        Scientific plotting
        Statistical analysis
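The three plot types named on this slide can be sketched side by side as follows. The data is randomly generated for illustration, and the non-interactive "Agg" backend is an assumption made so the sketch runs as a plain script; in a Jupyter notebook that line can be dropped and `plt.show()` used instead of saving to a buffer.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 50)

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# Line plot with customized color, line style, and legend
axes[0].plot(x, np.sin(x), color="tab:blue", linestyle="--", label="sin(x)")
axes[0].set_title("Line plot")
axes[0].legend()

# Scatter plot of random points
axes[1].scatter(rng.normal(size=100), rng.normal(size=100), s=10)
axes[1].set_title("Scatter plot")

# Histogram of random samples
axes[2].hist(rng.normal(size=500), bins=20)
axes[2].set_title("Histogram")

fig.tight_layout()

# Render the figure to an in-memory PNG instead of a window
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```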
  • 7. How to install Matplotlib in Jupyter: open the Jupyter notebook and run the following code:
        !pip install matplotlib
        import matplotlib.pyplot as plt   # "as" gives the module an alias (a short name)
        import numpy as np
        xpts = np.array([0, 4])
        ypts = np.array([0, 6])
        plt.plot(xpts, ypts)
        plt.show()
  • 8. scikit-image, commonly abbreviated as skimage, is an open-source image processing library for Python.
      Image Processing: It provides a collection of algorithms for image segmentation, feature extraction, image filtering, and other image processing tasks.
      Integration: It integrates with other scientific Python libraries such as NumPy, SciPy, and Matplotlib, allowing for efficient image manipulation and analysis.
      User-Friendly API
      Community Support: skimage benefits from an active community of developers and users.
  • 9. Installing the scikit-image library:
        pip install scikit-image
        import skimage
        from skimage import io
        # Load an image from a file
        image = io.imread('example_image.jpg')
        # Display the image
        io.imshow(image)
        io.show()
  • 10. Pillow is a fork of the Python Imaging Library (PIL) that adds extensive image processing capabilities to Python. It provides support for opening, manipulating, and saving many different image file formats.
      Image Manipulation: Pillow offers a wide range of image handling functionality such as resizing, cropping, rotating, filtering, and enhancing images.
      Image File Support: It supports various image file formats, including JPEG, PNG, and GIF, making it suitable for handling varied image data.
      Integration: Pillow integrates with other Python libraries such as NumPy and Matplotlib, enabling easy interoperability with scientific computing and data visualization tools.
      Ease of Use: Pillow provides a simple and intuitive API for working with images, making it accessible to users with varying levels of programming experience.
      Active Maintenance: Pillow is actively maintained and updated, ensuring compatibility with the latest Python versions and continued support for new features and improvements.
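The manipulation operations named on this slide (resizing, cropping, rotating, filtering, enhancing) might look like the sketch below. To avoid depending on a file on disk, it assumes a solid-color image created in memory; the sizes and color are arbitrary illustration values.

```python
from PIL import Image, ImageEnhance, ImageFilter

# Create a 200x100 solid-color image in memory (illustrative stand-in for a photo)
img = Image.new("RGB", (200, 100), color=(30, 120, 200))

# Resize to half the width and height
resized = img.resize((100, 50))

# Crop the box (left, upper, right, lower)
cropped = img.crop((50, 25, 150, 75))

# Rotate by 90 degrees; expand=True grows the canvas to fit the rotated image
rotated = img.rotate(90, expand=True)

# Filter: apply a Gaussian blur
blurred = img.filter(ImageFilter.GaussianBlur(radius=2))

# Enhance: increase brightness by 50%
brighter = ImageEnhance.Brightness(img).enhance(1.5)

print(resized.size, cropped.size, rotated.size)
```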
  • 11. Installing the Pillow library:
        pip install pillow
        from PIL import Image
        # Open an image file
        original_image = Image.open("example.jpg")
        # Display basic information about the image
        print("Original Image Format:", original_image.format)
        print("Original Image Size:", original_image.size)
        # Resize the image to half its width and height
        new_size = (original_image.size[0] // 2, original_image.size[1] // 2)
        resized_image = original_image.resize(new_size)
        # Display the new size
        print("Resized Image Size:", resized_image.size)
        # Save the resized image with a new name
        resized_image.save("resized_example.jpg")
        # Close the original and resized images
        original_image.close()
        resized_image.close()
        print("Resized image saved successfully!")
  • 12. Pandas is a powerful Python library for data manipulation and analysis. It offers data structures and functions to work efficiently with structured data such as time series, tabular, and heterogeneous data.
      Data Structures: Pandas provides two main data structures: Series (a 1D labeled array) and DataFrame (a 2D labeled data structure), which offer powerful data manipulation capabilities.
      Data Handling: It offers functionality for reading and writing data in various formats such as CSV, Excel, and SQL databases.
      Data Analysis: Pandas supports data analysis tasks including data cleaning, filtering, grouping, merging, and reshaping, making it indispensable for exploratory data analysis.
      Integration: It integrates with other Python libraries such as NumPy, Matplotlib, and scikit-learn, enhancing its capabilities in scientific computing and machine learning tasks.
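As a minimal sketch of the two data structures and the analysis tasks mentioned on this slide (filtering, grouping, merging, reshaping), consider the following. The city names, units, and column names are invented for illustration.

```python
import pandas as pd

# Series: a 1D labeled array
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# DataFrame: a 2D labeled table (illustrative data)
sales = pd.DataFrame({
    "city": ["Lahore", "Karachi", "Lahore", "Karachi"],
    "units": [5, 3, 7, 1],
})

# Filtering: keep rows with more than 2 units
high = sales[sales["units"] > 2]

# Grouping: total units per city
totals = sales.groupby("city")["units"].sum()

# Merging: attach a region to each row through the shared "city" column
regions = pd.DataFrame({"city": ["Lahore", "Karachi"], "region": ["Punjab", "Sindh"]})
merged = sales.merge(regions, on="city", how="left")

# Reshaping: one row per order, one column per city
wide = sales.assign(order=[1, 1, 2, 2]).pivot(index="order", columns="city", values="units")

print(s["b"])
print(totals)
print(merged)
print(wide)
```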
  • 13. Installing the Pandas library:
        pip install pandas
      If pip prompts you to upgrade, run the following command to upgrade pip:
        python -m pip install --upgrade pip
        import pandas as pd
        # Read a CSV file into a DataFrame
        df = pd.read_csv("example.csv")
        # Display the first few rows of the DataFrame
        print("First few rows of the DataFrame:")
        print(df.head())
        # Display summary information about the DataFrame
        print("\nSummary information:")
        df.info()   # info() prints its report directly and returns None
        # Display basic statistics of numerical columns
        print("\nBasic statistics:")
        print(df.describe())
  • 14. Definition: scikit-learn is a versatile machine learning library for Python. It offers simple and efficient tools for data mining and data analysis, implementing a wide range of machine learning algorithms.
      Machine Learning Algorithms: scikit-learn provides implementations of algorithms for classification, regression, clustering, and dimensionality reduction, along with tools for model selection.
      Model Evaluation: It offers tools for model evaluation, cross-validation, and hyperparameter tuning, facilitating the development of robust and accurate machine learning models.
      Integration: scikit-learn integrates with other Python libraries such as NumPy, SciPy, and Pandas, enabling easy preprocessing, training, and evaluation of machine learning models.
      Scalability: It is designed to be scalable and efficient, making it suitable for working with large datasets and complex models.
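The model evaluation tools mentioned on this slide (cross-validation and hyperparameter tuning) can be sketched as below. The choice of a k-nearest neighbors classifier and the candidate values for `n_neighbors` are assumptions made for illustration; the slides themselves use a random forest.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Cross-validation: accuracy on each of 5 folds for one fixed setting
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())

# Hyperparameter tuning: try several values of n_neighbors, each with 5-fold CV
param_grid = {"n_neighbors": [1, 3, 5, 7]}  # illustrative candidate values
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```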
  • 15. Installing the scikit-learn library:
        pip install scikit-learn
        import sklearn
        from sklearn.datasets import load_iris
        from sklearn.model_selection import train_test_split
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import accuracy_score, classification_report
        # Load the Iris dataset
        iris = load_iris()
        X = iris.data    # Features
        y = iris.target  # Target variable
        # Split the dataset into training and testing sets
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        # Initialize the Random Forest classifier
        rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
        # Train the classifier
        rf_classifier.fit(X_train, y_train)
        # Predict on the test set
        y_pred = rf_classifier.predict(X_test)
        # Evaluate the model
        accuracy = accuracy_score(y_test, y_pred)
        print("Accuracy:", accuracy)
        # Display the classification report
        print("\nClassification Report:")
        print(classification_report(y_test, y_pred, target_names=iris.target_names))
  • 16. Seaborn is a Python library for creating attractive statistical graphics.
      Statistical Visualization: Seaborn excels at generating plots such as scatter plots, bar charts, and heatmaps for effective data exploration.
      Integration with Pandas: It works directly with Pandas DataFrames, making data visualization straightforward.
      Customization: Users can easily customize plot aesthetics to suit their preferences.
      Statistical Analysis: Seaborn offers tools for visualizing relationships between variables and conducting statistical analysis.
      Community and Documentation: It is supported by an active community and comprehensive documentation for easy learning.
  • 17. Installing the seaborn library:
        pip install seaborn
        import seaborn as sns
        import matplotlib.pyplot as plt
        # Load the Iris dataset as a DataFrame (seaborn downloads it on first use)
        iris_df = sns.load_dataset("iris")
        # Create a pairplot using Seaborn
        sns.pairplot(iris_df, hue='species', palette='Set1')
        # Add a title
        plt.suptitle("Pairplot of Iris Dataset")
        # Show the plot
        plt.show()
  • 18. Plotly is a Python library for creating interactive and publication-quality graphs.
      Interactive Visualization: Plotly allows users to explore data interactively by zooming and hovering over data points.
      Online Platform: It offers an online platform for hosting and sharing interactive plots.
      Chart Types: It supports a wide range of chart types, including scatter plots, line plots, and 3D surface plots.
      Integration: It integrates easily with other Python libraries for data manipulation and visualization.
      Customization: It provides extensive options for customizing plot appearance.
  • 19. Installing the plotly library:
        pip install plotly
        import plotly.graph_objects as go
        # Sample data
        x_values = [1, 2, 3, 4, 5]
        y_values = [2, 3, 5, 7, 11]
        # Create a line plot
        fig = go.Figure(data=go.Scatter(x=x_values, y=y_values, mode='lines'))
        # Add a title and axis labels
        fig.update_layout(title='Simple Line Plot', xaxis_title='X-axis', yaxis_title='Y-axis')
        # Show the plot
        fig.show()
  • 20. Data Preprocessing: Data preprocessing is a critical step in machine learning pipelines. It is defined as the set of techniques and procedures used to prepare raw data for analysis. It involves several tasks, such as importing and exporting data, cleaning and formatting data, handling missing values, and feature scaling.
      Importing and Exporting Data: Importing data involves loading datasets into the machine learning environment. This can be done with libraries like Pandas in Python, using functions such as read_csv() for CSV files and read_excel() for Excel files.
        import pandas as pd
        df = pd.read_csv("ML.csv")
        df.shape       # number of rows and columns
        df.describe()  # count, mean, standard deviation, min, quartiles, and max
  • 21. Exporting the Data:
        import pandas as pd
        # Example DataFrame
        data = {
            'Name': ['John', 'Alice', 'Bob'],
            'Age': [25, 30, 35],
            'City': ['New York', 'Los Angeles', 'Chicago']
        }
        df = pd.DataFrame(data)
        # Export the DataFrame to CSV without the index column
        df.to_csv('output.csv', index=False)
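To see importing and exporting together, the following sketch writes a DataFrame to CSV text and reads it back. It uses an in-memory `io.StringIO` buffer in place of a file on disk, which is an assumption made so the example leaves no files behind; passing a filename works the same way.

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["John", "Alice"], "Age": [25, 30]})

# Export: write CSV text into an in-memory buffer (stand-in for a file)
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False omits the row-index column
csv_text = buffer.getvalue()
print(csv_text)

# Import: read the same CSV text back into a new DataFrame
restored = pd.read_csv(io.StringIO(csv_text))
print(restored.equals(df))
```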
  • 22. Cleaning and Formatting Data:
      Cleaning data involves identifying and handling anomalies, inconsistencies, and errors in the dataset. This may include removing duplicates, correcting data types, and dealing with outliers.
      Formatting data involves ensuring that data is in the appropriate format for analysis, for example converting categorical variables into numerical representations or standardizing date formats.
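A small sketch of the cleaning and formatting steps named on this slide: removing duplicates, correcting data types, standardizing a date format, and encoding a categorical variable. The records, column names, and the day/month/year input format are all invented for illustration.

```python
import pandas as pd

# Illustrative raw data: one duplicated row, ages stored as text,
# and dates written as day/month/year
raw = pd.DataFrame({
    "name": ["Ali", "Sara", "Ali", "Omar"],
    "age": ["25", "31", "25", "40"],
    "gender": ["Male", "Female", "Male", "Male"],
    "joined": ["15/01/2024", "02/03/2024", "15/01/2024", "30/12/2023"],
})

# Cleaning: remove exact duplicate rows
clean = raw.drop_duplicates().reset_index(drop=True)

# Cleaning: correct the data type of the age column
clean["age"] = clean["age"].astype(int)

# Formatting: standardize dates to ISO year-month-day strings
parsed = pd.to_datetime(clean["joined"], format="%d/%m/%Y")
clean["joined"] = parsed.dt.strftime("%Y-%m-%d")

# Formatting: convert the categorical column to numeric indicator columns
encoded = pd.get_dummies(clean, columns=["gender"])

print(clean)
print(encoded.columns.tolist())
```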
  • 23. Cleaning and formatting example:
        import pandas as pd
        # Build the example dataset
        data = {
            'Name': ['John', 'Alice', 'Bob', 'Anna', 'Mike', 'Emily'],
            'Age': [25, 30, None, 35, 40, ''],
            'City': ['New York', 'Los Angeles', 'Chicago', 'San Francisco', '', 'Seattle'],
            'Gender': ['Male', 'Female', 'Male', '', 'Male', 'Female'],
            'Salary': ['$50000', '$60000', '$70000', '$80000', '90000', '$100000']
        }
        df = pd.DataFrame(data)
        # Display the original DataFrame
        print("Original DataFrame:")
        print(df)
        print()
        # Clean and format the data
        # 1. Convert Age to numeric and fill missing values with the median age
        df['Age'] = pd.to_numeric(df['Age'], errors='coerce')
        median_age = df['Age'].median()
        df['Age'] = df['Age'].fillna(median_age)
        # 2. Remove rows with missing or empty values in the City and Gender columns
        has_city = df['City'].notna() & (df['City'] != '')
        has_gender = df['Gender'].notna() & (df['Gender'] != '')
        df = df[has_city & has_gender].copy()
        # 3. Remove dollar signs and commas from Salary, then convert to numeric
        df['Salary'] = df['Salary'].replace('[$,]', '', regex=True).astype(float)
        # Display the cleaned and formatted DataFrame
        print("Cleaned and Formatted DataFrame:")
        print(df)
  • 24. Handling Missing Values: Missing values are common in datasets and can significantly affect the performance of machine learning models if not handled properly. Techniques for handling missing values include:
      Imputation: Replacing missing values with a calculated or estimated value (e.g., mean, median, mode).
      Deletion: Removing rows or columns with missing values.
      Advanced techniques: Using predictive modeling to estimate missing values based on other features.
      The example is the same as the previous one.
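The imputation and deletion techniques listed on this slide can be sketched with Pandas as follows. The small table and its column names are invented for illustration; predictive-model imputation is not covered here.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value in each column
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 40.0],
    "city": ["Lahore", "Karachi", None, "Lahore"],
    "score": [np.nan, 7.0, 8.0, 9.0],
})

# Count missing values per column
missing_counts = df.isna().sum()
print(missing_counts)

# Imputation: median for age, mean for score, mode (most frequent) for city
imputed = df.copy()
imputed["age"] = imputed["age"].fillna(imputed["age"].median())
imputed["score"] = imputed["score"].fillna(imputed["score"].mean())
imputed["city"] = imputed["city"].fillna(imputed["city"].mode()[0])
print(imputed)

# Deletion: drop every row that has any missing value
complete_rows = df.dropna()

# Deletion restricted to one column: drop rows only where age is missing
rows_with_age = df.dropna(subset=["age"])
print(len(complete_rows), len(rows_with_age))
```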
  • 25. Feature Scaling: Feature scaling is the process of standardizing or normalizing the range of independent variables (features) in the dataset. It is essential for algorithms that are sensitive to the scale of the input features, such as gradient descent-based algorithms (e.g., linear regression, logistic regression) or distance-based algorithms (e.g., k-nearest neighbors, support vector machines). Common techniques for feature scaling include:
      Min-Max Scaling: Scaling features to a fixed range, usually [0, 1].
      Standardization (Z-score normalization): Rescaling each feature to have a mean of 0 and a standard deviation of 1. This shifts and rescales the values but does not change the shape of their distribution.
      Robust Scaling: Scaling features using statistics that are robust to outliers, such as the median and interquartile range.
  • 26. Feature Scaling:
        import numpy as np
        from sklearn.preprocessing import MinMaxScaler, StandardScaler
        # Sample dataset with two features
        data = np.array([[10, 0.5], [20, 0.7], [30, 0.9]])
        # Min-Max Scaling
        scaler_minmax = MinMaxScaler()                        # Initialize MinMaxScaler
        data_minmax = scaler_minmax.fit_transform(data)       # Perform Min-Max Scaling
        print("Min-Max Scaled Data:")
        print(data_minmax)
        print()
        # Standardization (Z-score normalization)
        scaler_standard = StandardScaler()                    # Initialize StandardScaler
        data_standard = scaler_standard.fit_transform(data)   # Perform Standardization
        print("Standardized Data:")
        print(data_standard)
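The slide's code covers Min-Max scaling and standardization but not robust scaling. The sketch below computes robust scaling by hand with NumPy, using the formula (x - median) / IQR, and compares it with Min-Max scaling on a feature that contains one outlier. The sample values are invented for illustration.

```python
import numpy as np

# Illustrative feature with one large outlier (100)
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Min-Max scaling: (x - min) / (max - min)
minmax = (x - x.min()) / (x.max() - x.min())

# Robust scaling: (x - median) / IQR, where IQR = 75th - 25th percentile
median = np.median(x)
q1, q3 = np.percentile(x, [25, 75])
robust = (x - median) / (q3 - q1)

print("Min-Max:", minmax)
print("Robust: ", robust)
```

With Min-Max scaling the outlier squeezes the four ordinary values into a narrow band near 0, while robust scaling keeps them evenly spread around 0. scikit-learn also provides `RobustScaler` in `sklearn.preprocessing`, which by default centers on the median and scales by the interquartile range.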