Introduction to PyCaret and installation
What is PyCaret?
• PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows.
• PyCaret can replace hundreds of lines of code with only a few, so you spend less time coding and more time on analysis.
• PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, and a few more.
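As a rough illustration of the "few lines" claim, the sketch below runs a complete classification experiment. The function name `quick_classification` and its arguments are hypothetical; it assumes PyCaret is installed and that `data` is a pandas DataFrame containing a target column.

```python
def quick_classification(data, target, seed=123):
    """Run a minimal PyCaret classification experiment.

    data:   pandas DataFrame containing the features and the target column.
    target: name of the target column.
    Returns the best model and the comparison table.
    """
    # Imported inside the function because PyCaret loads many heavy dependencies.
    from pycaret.classification import compare_models, pull, setup

    # setup() infers data types, splits off a holdout set, and builds the
    # preprocessing pipeline.
    setup(data=data, target=target, session_id=seed, verbose=False)

    # compare_models() cross-validates the available estimators and returns
    # the best one according to the default metric.
    best_model = compare_models()

    # pull() returns the most recent score grid as a DataFrame.
    leaderboard = pull()
    return best_model, leaderboard
```

A call such as `quick_classification(df, "label")` would stand in for the manual loop of splitting, preprocessing, fitting, and scoring each estimator by hand.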
PyCaret is ideal for:
• Experienced Data Scientists who want to increase productivity.
• Citizen Data Scientists who prefer a low-code machine learning solution.
• Data Science Professionals who want to build rapid prototypes.
• Data Science and Machine Learning students and enthusiasts.
Preprocessing (setup)
Data Preparation
• Missing values
• Data Types
• One-Hot Encoding
• Ordinal Encoding
• Cardinal Encoding
• Handle Unknown Levels
• Target Imbalance
• Remove outliers
Scale and Transform
• Normalize
• Feature Transform
• Target Transform
Feature Engineering
• Feature interaction
• Polynomial Features
• Group Features
• Bin Numeric Features
• Combine Rare Levels
• Create Clusters
Feature Selection
• Feature Selection
• Remove Multicollinearity
• Principal Component Analysis
• Ignore Low Variance
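Most of the preprocessing steps listed above are switched on through keyword arguments of `setup()`. The sketch below shows a handful of them for a classification experiment. The helper names `build_setup_kwargs` and `run_setup` are hypothetical, the threshold values are arbitrary, and parameter names can differ between PyCaret versions, so check them against the installed release.

```python
# Keyword arguments for setup(); comments name the matching slide item.
PREPROCESSING_OPTIONS = {
    "session_id": 123,                    # fixed seed for reproducibility
    "normalize": True,                    # Normalize
    "normalize_method": "zscore",
    "transformation": True,               # Feature Transform
    "remove_outliers": True,              # Remove outliers
    "fix_imbalance": True,                # Target Imbalance
    "polynomial_features": True,          # Polynomial Features
    "polynomial_degree": 2,
    "remove_multicollinearity": True,     # Remove Multicollinearity
    "multicollinearity_threshold": 0.9,
    "pca": True,                          # Principal Component Analysis
}


def build_setup_kwargs(**overrides):
    """Return the default options with any caller overrides applied."""
    return {**PREPROCESSING_OPTIONS, **overrides}


def run_setup(data, target, **overrides):
    """Initialize a classification experiment with the options above.

    data is assumed to be a pandas DataFrame that contains the target column.
    """
    # Imported lazily because PyCaret loads many heavy dependencies.
    from pycaret.classification import setup

    return setup(data=data, target=target, **build_setup_kwargs(**overrides))
```

For example, `run_setup(df, "label", pca=False)` would keep every default except principal component analysis.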
Model training
PyCaret can train and cross-validate multiple models in a single call and outputs a table comparing the performance of each model on several metrics.
• Creating models: create_model('dt', fold=n, ...)
• Comparing models: compare_models(n_select=n, sort='Accuracy', ...)
• Tuning hyperparameters: tune_model(dt, custom_grid=grid, ...), where custom_grid is optional
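The three training functions above might be combined as in the sketch below. It assumes `setup()` from `pycaret.classification` has already been called in the same session. The function name `train_and_tune` and the grid `DT_GRID` are hypothetical, and the grid values are arbitrary.

```python
# Hypothetical search space for a decision tree; the keys are scikit-learn
# DecisionTreeClassifier parameters.
DT_GRID = {
    "max_depth": [2, 4, 6, 8],
    "min_samples_leaf": [1, 5, 10],
}


def train_and_tune(n_top=3, folds=5):
    """Compare models, then train and tune a decision tree.

    Assumes setup() from pycaret.classification has already been run.
    """
    from pycaret.classification import compare_models, create_model, tune_model

    # Rank the available estimators by accuracy and keep the best n_top.
    # With n_select > 1 the result is a list of trained models.
    top_models = compare_models(n_select=n_top, sort="Accuracy")

    # Train a single decision tree (model ID 'dt') with k-fold cross-validation.
    dt = create_model("dt", fold=folds)

    # Search the custom grid; without custom_grid PyCaret uses its own
    # predefined search space for the model.
    tuned_dt = tune_model(dt, custom_grid=DT_GRID)
    return top_models, dt, tuned_dt
```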
List of models (Regression)
List of models (Classification)
List of models (Clustering)
List of models (Anomaly Detection)
Analysis and interpretability
my_model = create_model(model_name)
• plot_model(my_model)
• interpret_model(my_model)
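A minimal sketch of the two analysis calls for a classification model follows. The function name `analyze` and the tuple `DEFAULT_PLOTS` are hypothetical. It assumes `setup()` has been run, that `model` came from `create_model()`, and that the optional `shap` package is installed for `interpret_model()`.

```python
# A few of the plot names accepted by plot_model() for classification.
DEFAULT_PLOTS = ("auc", "confusion_matrix", "feature")


def analyze(model, plots=DEFAULT_PLOTS):
    """Draw diagnostic plots and a SHAP-based interpretation for a model."""
    from pycaret.classification import interpret_model, plot_model

    # One diagnostic plot per requested name, evaluated on the holdout set.
    for plot_name in plots:
        plot_model(model, plot=plot_name)

    # SHAP summary plot; interpret_model() supports tree-based models, so
    # other estimators are expected to raise an error here.
    interpret_model(model)
```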
Finalize, Predict, Save and Deploy model
my_model = create_model(model_name)
• finalize_model(my_model)
• predict_model(my_model)
• save_model(my_model, 'my_model')
• deploy_model(my_model, model_name, authentication, platform)
• Finalize: trains the given estimator on the entire dataset, including the holdout set.
• Predict: makes predictions on the test (holdout) set or, when new data is passed through the data argument, on unseen data.
• Save: saves the transformation pipeline and trained model object to the current working directory as a pickle file for later use (reload it with load_model).
• Deploy: deploys the transformation pipeline and trained model to the cloud.
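The finalize, save, load, and predict steps could be chained as in the sketch below. The function name `finalize_save_predict` and the file name `my_model` are hypothetical, and `new_data` is assumed to be a pandas DataFrame with the same feature columns used in `setup()`. Deployment is left out because it needs cloud credentials.

```python
def finalize_save_predict(model, new_data, name="my_model"):
    """Refit on all data, persist the pipeline, and score unseen rows."""
    from pycaret.classification import (
        finalize_model,
        load_model,
        predict_model,
        save_model,
    )

    # Refit the estimator on the full dataset, holdout set included.
    final_model = finalize_model(model)

    # Write the preprocessing pipeline and model to '<name>.pkl' in the
    # current working directory.
    save_model(final_model, name)

    # Reload the pipeline, as a separate scoring script would.
    restored = load_model(name)

    # Passing data= scores the new rows instead of the holdout set.
    return predict_model(restored, data=new_data)
```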
Workflow
• PyCaret offers both supervised and unsupervised workflows.
• Supervised: Classification, Regression
• Unsupervised: Clustering, Anomaly detection
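The unsupervised modules follow the same pattern as the supervised ones, minus the target column. The sketch below fits a k-means clustering model and an isolation forest on the same data. The function name `cluster_and_flag` is hypothetical, and `data` is assumed to be a pandas DataFrame of features.

```python
def cluster_and_flag(data, n_clusters=4, seed=123):
    """Return the data with cluster labels and with anomaly flags."""
    from pycaret import anomaly, clustering

    # Clustering: no target is passed to setup().
    clustering.setup(data=data, session_id=seed, verbose=False)
    kmeans = clustering.create_model("kmeans", num_clusters=n_clusters)
    # assign_model() returns the data with an added 'Cluster' column.
    clustered = clustering.assign_model(kmeans)

    # Anomaly detection on the same data with an isolation forest.
    anomaly.setup(data=data, session_id=seed, verbose=False)
    iforest = anomaly.create_model("iforest")
    # assign_model() adds 'Anomaly' and 'Anomaly_Score' columns.
    flagged = anomaly.assign_model(iforest)

    return clustered, flagged
```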
Installation
• The most efficient way to install PyCaret is inside a virtual environment. Here are the steps:
1. Install Anaconda: https://www.anaconda.com/products/distribution
2. Create a conda environment: conda create --name yourenvname python=3.8
3. Activate the conda environment: conda activate yourenvname
4. Install PyCaret 3.0: pip install "pycaret[full]"
5. Create a notebook kernel: python -m ipykernel install --user --name yourenvname --display-name "display-name"
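After the steps above, one way to confirm that the package landed in the active environment is to query its installed version from Python. This is a small sketch using only the standard library; the function name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package="pycaret"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("PyCaret is not installed in this environment.")
    else:
        print(f"PyCaret {version} is installed.")
```

Running it from the new notebook kernel also confirms that the kernel points at the right environment.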
Important Links
• Tutorials: New to PyCaret? Check out our official notebooks!
• Example Notebooks: Example notebooks created by the community.
• Official Blog: Tutorials and articles by contributors.
• Documentation: The detailed API docs of PyCaret.
• Video Tutorials: Our video tutorials from various events.
• Cheat sheet: Cheat sheet for all functions across modules.
• Discussions: Have questions? Engage with the community and contributors.
• Changelog: Changes and version history.
• Roadmap: PyCaret's software and community development plan.
PyCaret Time Series Module
• Time Series Quickstart: Get started with time series analysis.
• Time Series Notebooks: New to time series? Check out our official (detailed) notebooks!
• Time Series Video Tutorials: Our video tutorials from various events.
• Time Series FAQs: Have questions? Check out the FAQs.
• Time Series API Interface: The detailed API interface for the Time Series module.
• Time Series Features and Roadmap: PyCaret's software and community development plan.
PyCaret's new time series module is now available with the main pycaret installation. Staying true to the simplicity of PyCaret, it is consistent with the existing API and fully loaded with functionality.
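To illustrate the consistency with the existing API, the sketch below runs a forecasting experiment with the same setup, compare, predict pattern. The function name `forecast` is hypothetical, and `series` is assumed to be a pandas Series with a datetime or period index.

```python
def forecast(series, horizon=12, seed=123):
    """Pick the best forecaster by cross-validation and predict ahead."""
    from pycaret.time_series import compare_models, predict_model, setup

    # fh is the forecast horizon: the number of future periods to predict.
    setup(data=series, fh=horizon, session_id=seed, verbose=False)

    # Cross-validate the available forecasters and keep the best one.
    best_model = compare_models()

    # Returns a DataFrame of predictions for the next `horizon` periods.
    return predict_model(best_model, fh=horizon)
```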
Practical example in Python
Now let's look at some practical examples in Python!
https://github.com/PJalgotrader/platforms-and-tools/tree/main/PyCaret

PyCaret_PedramJahangiryTUTORIALPYTHON.pdf
