By Varun Garg
AWS Clarify: Statistical Bias AWS Hyperparameter Tuning AWS Feature Store
Demonstrating My Projects With AWS
Sagemaker Clarify, Sagemaker Tuner & AWS
Feature Store
Varun Garg, Ph.D.
Algorithm Engineer at Magna
Ph.D. in Data Fusion
Department of Electrical and Computer Engineering
University of Massachusetts Lowell, Lowell, MA, USA
February 19, 2024
Garg.Kumar.Varun@Gmail.com
Demonstrating My Projects With AWS Sagemaker Clarify, Sagemaker Tuner & AWS Feature Store
Proposal Summary
This presentation covers a few tasks I carried out on the AWS SageMaker service.
AWS SageMaker is a fully managed machine learning service offered by Amazon
Web Services.
SageMaker offers capabilities for data pre-processing, model training,
hyper-parameter tuning, and model deployment.
SageMaker supports a wide range of built-in algorithms and frameworks,
including TensorFlow and PyTorch.
In this presentation, I will cover:
Bias calculation with SageMaker Clarify, using different bias metrics
Hyper-parameter tuning using the AWS SageMaker hyper-parameter tuner
AWS Feature Store, where data scientists can share features with
other team members by writing AWS Athena queries that interface with
AWS Glue tables.
Outline of the Presentation
1 AWS Clarify: Statistical Bias
2 AWS Hyperparameter Tuning
3 AWS Feature Store
AWS Clarify Statistical Bias
Amazon SageMaker Clarify gives data scientists better visibility into their
data and models by computing pre-training bias metrics (data bias, such as
class imbalance) and post-training bias metrics (model bias).
Detecting Statistical Bias: FlowChart
Amazon SageMaker Clarify can compute useful measurements at different stages
of a data science project, such as during data preparation, after model
training, and after model deployment.
In the following slides we demonstrate the usage of SageMaker Clarify.
The metrics were computed on a Kaggle dataset [1].
Figure: Usage of SageMaker Clarify in the data preparation, model training,
and deployment phases of an ML pipeline. Figure credits: AWS documentation [2]
Configuring AWS Clarify
We use SageMaker Clarify to calculate pre-training bias metrics such as class
imbalance and KL divergence.
The configuration defines several variables for SageMaker Clarify, such as the
input data, the output path of the report, and the name of the column that
holds the labels.
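As a rough illustration of the configuration described above, the sketch below assembles the kind of analysis configuration that a Clarify processing job reads. The key names follow the Clarify analysis configuration format as I understand it and should be checked against the current AWS documentation. The helper `build_clarify_config`, the column names, the facet, and the S3 paths are hypothetical placeholders, not the values used in this project.

```python
import json


def build_clarify_config(headers, label, facet, methods):
    """Assemble a Clarify-style analysis configuration as a plain dict."""
    if label not in headers:
        raise ValueError(f"label column {label!r} is not in headers")
    if facet not in headers:
        raise ValueError(f"facet column {facet!r} is not in headers")
    return {
        "dataset_type": "text/csv",
        "headers": list(headers),
        "label": label,
        "facet": [{"name_or_index": facet}],
        "methods": {"pre_training_bias": {"methods": list(methods)}},
    }


# Placeholder S3 locations for the input data and the bias report.
job_settings = {
    "s3_data_input_path": "s3://example-bucket/reviews/train.csv",
    "s3_output_path": "s3://example-bucket/clarify-report/",
}

# Hypothetical column names; the real dataset columns may differ.
analysis_config = build_clarify_config(
    headers=["sentiment", "review_body", "product_category"],
    label="sentiment",
    facet="product_category",
    methods=["CI", "KL", "LP"],
)
print(json.dumps({**job_settings, "analysis_config": analysis_config}, indent=2))
```

The three method codes stand for class imbalance, KL divergence, and LP norm, which are the pre-training metrics named in this presentation.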
AWS Clarify: Launching Task
In the SageMaker Clarify configuration we define the metrics to be calculated,
such as class imbalance, KL divergence, and LP norm.
Figure: AWS executing the SageMaker Clarify task
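To make the three metrics concrete, here is a minimal sketch of how they can be computed by hand for two facets, a and d. The formulas follow the usual definitions of these pre-training metrics; the function names are my own, and the counts and label distributions are made-up numbers, not results from the Kaggle dataset.

```python
import math


def class_imbalance(n_a, n_d):
    """CI = (n_a - n_d) / (n_a + n_d), ranging from -1 to 1."""
    return (n_a - n_d) / (n_a + n_d)


def kl_divergence(p_a, p_d):
    """KL(P_a || P_d) = sum over labels of P_a * log(P_a / P_d).

    Assumes P_d is non-zero wherever P_a is non-zero.
    """
    return sum(a * math.log(a / d) for a, d in zip(p_a, p_d) if a > 0)


def lp_norm(p_a, p_d, p=2):
    """L_p distance between the two label distributions."""
    return sum(abs(a - d) ** p for a, d in zip(p_a, p_d)) ** (1 / p)


# Made-up counts: 900 rows in facet a, 100 rows in facet d.
ci = class_imbalance(900, 100)

# Made-up label distributions (negative, positive) for each facet.
p_a, p_d = [0.2, 0.8], [0.5, 0.5]
kl = kl_divergence(p_a, p_d)
lp = lp_norm(p_a, p_d)
print(f"CI={ci:.3f}  KL={kl:.3f}  LP={lp:.3f}")
```

A class imbalance close to 1 or -1 means one facet dominates the dataset, while KL divergence and LP norm grow as the label distributions of the two facets move apart.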
Results: Bias Report
The bias report table lists the metrics calculated for the dataset column
called dresses, such as class imbalance, KL divergence, and LP norm.
Since the dataset is unbalanced, the class imbalance score is high.
AWS SageMaker Hyperparameter Tuning
The SageMaker tuner helps data scientists identify optimum hyper-parameters.
It offers several search strategies for tuning ML models, such as random
search.
Hyper-parameter Tuning Step
Figure: AWS Hyper-parameter tuning Flowchart. Figure Credits AWS [3]
Hyper-parameter Tuning: Steps
As per the AWS documentation, a hyper-parameter tuning job contains the
following components:
Tuning job settings
Training job definitions
Tuning job configuration
The following are a few types of tuning job settings:
Warm start: the results from a previous tuning job are used to improve
the performance of a new tuning job.
Early stopping: training is stopped when the performance of the model
has not improved after multiple consecutive epochs.
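The early-stopping idea described above can be sketched as a simple patience rule. This is a generic illustration of that description, not SageMaker's internal stopping logic; the function name `should_stop`, the `patience` parameter, and the accuracy values are hypothetical.

```python
def should_stop(history, patience=3):
    """Return True when the last `patience` epochs did not beat the earlier best."""
    if len(history) <= patience:
        return False
    best_before = max(history[:-patience])
    return max(history[-patience:]) <= best_before


# Made-up validation accuracies, one per epoch.
accuracies = [0.61, 0.68, 0.72, 0.71, 0.72, 0.70]
for epoch in range(1, len(accuracies) + 1):
    if should_stop(accuracies[:epoch], patience=3):
        print(f"stopping after epoch {epoch}")
        break
```

A larger `patience` tolerates longer plateaus at the cost of more training time per job.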
In the following slides we demonstrate a tuning job for parameters such as
learning rate and batch size, using a random search strategy.
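A minimal sketch of random search over these two hyper-parameters is shown here, assuming a log-uniform learning rate and a categorical batch size. The ranges, the helper names, and especially `toy_objective` are hypothetical stand-ins: in a real tuning job each configuration would be scored by a full training run, not by a closed-form function.

```python
import math
import random


def sample_config(rng, lr_range=(1e-5, 1e-3), batch_sizes=(64, 128, 256)):
    """Draw one configuration: log-uniform learning rate, categorical batch size."""
    low, high = lr_range
    lr = 10 ** rng.uniform(math.log10(low), math.log10(high))
    return {"learning_rate": lr, "batch_size": rng.choice(batch_sizes)}


def toy_objective(config):
    """Stand-in for a training job's validation score (not a real model)."""
    lr_penalty = abs(math.log10(config["learning_rate"]) + 4.5)
    batch_penalty = abs(config["batch_size"] - 128) / 256
    return 1.0 - 0.1 * lr_penalty - 0.1 * batch_penalty


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and return them best-first."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    return sorted(trials, key=toy_objective, reverse=True)


results = random_search(n_trials=8)
best = results[0]
print(best, round(toy_objective(best), 4))
```

Sampling the learning rate on a log scale spreads trials evenly across orders of magnitude, which is a common choice when the useful range spans several decades.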
Training job definitions: Define Hyper-parameters
Figure: Defining the hyper-parameters (learning rate, batch size) to be tuned,
with their corresponding ranges
Figure: Defining the metric used to evaluate the performance of the model. This
metric is used to select the optimum hyper-parameter values
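SageMaker estimators accept metric definitions as name and regular-expression pairs that are matched against the training logs. The sketch below shows how such a regex pulls values out of log lines; the metric name, the regex, and the log format are hypothetical, since the actual definitions from the slide are not available here.

```python
import re

# Hypothetical metric definition in the Name/Regex form used by SageMaker estimators.
metric_definitions = [
    {"Name": "validation:accuracy", "Regex": r"val_acc: ([0-9.]+)"},
]

# Hypothetical training log lines; the real format depends on the training script.
log_lines = [
    "epoch 1 train_loss: 0.69 val_acc: 0.71",
    "epoch 2 train_loss: 0.52 val_acc: 0.78",
]


def extract_metric(definition, lines):
    """Return every value the definition's regex captures from the log lines."""
    pattern = re.compile(definition["Regex"])
    values = []
    for line in lines:
        match = pattern.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


print(extract_metric(metric_definitions[0], log_lines))
```

The training script therefore has to print the metric in a format the regex can match, otherwise the tuner has no objective value to compare.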
Training job definitions: Defining PyTorch Estimator
Figure: Creating a PyTorch estimator and passing input arguments such as the
instance type and the metric definitions defined earlier
Tuning job configuration: Defining Sagemaker Tuner
Figure: Creating an AWS SageMaker tuner object and providing arguments such as
the tuning settings, the tuning metric, and the number of parallel jobs
RESULTS: AWS Sagemaker Tuner
Figure: Executing the AWS SageMaker tuner object
Figure: Tuning results sorted in descending order of validation accuracy, with
the batch size and learning rate of each AWS SageMaker tuner run. A learning
rate of 0.000021 and a batch size of 128 were found to be the optimum parameters
Feature Store
SageMaker Feature Store enables data science teams to efficiently reuse ML
features across teams and models. This helps deliver features smoothly for
large-scale predictive modeling.
AWS Feature Store: Overview
Feature engineering is an important part of ML pipeline development. AWS
feature store allows features to be shared and managed across multiple users.
This makes it more efficient to develop new models and to maintain or
troubleshoot existing data pipelines.
AWS feature store allows users to track metadata such as:
Data sources used by the features
Models using the features
Transformations used to calculate the features
AWS feature store lets users avoid reinventing features from scratch and
helps with troubleshooting existing models.
In the following slides we demonstrate the steps to create a new feature
store. AWS Athena queries are used to store and retrieve features.
Feature Store: FlowChart
Figure: AWS Feature Store Pipeline showing the use of feature store by multiple
users for different applications and different data sources. Figure Credits AWS
Documentation [4]
AWS SageMaker: Initializing a Feature Group
Figure: Creating a new feature store service client in AWS SageMaker by using
boto3
Figure: Initializing a new feature group in the AWS feature store
AWS Sagemaker Feature Group: Adding Features
Figure: Defining the features to be added to the store by providing each
feature's name and data type
Figure: Creating an AWS feature group object and providing input arguments such
as the feature definitions and the AWS SageMaker session
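The feature definitions mentioned above pair each feature name with a type. A minimal sketch, assuming the three Feature Store type names `String`, `Integral`, and `Fractional` and the `FeatureName`/`FeatureType` dictionary form, is given here. The helper names and the example record are hypothetical and do not come from the slides.

```python
from datetime import datetime, timezone


def feature_type(value):
    """Map a Python value to a Feature Store type name."""
    if isinstance(value, bool):
        raise TypeError("encode booleans as 0/1 integers first")
    if isinstance(value, int):
        return "Integral"
    if isinstance(value, float):
        return "Fractional"
    return "String"


def build_feature_definitions(record):
    """Build FeatureName/FeatureType pairs from one example record."""
    return [
        {"FeatureName": name, "FeatureType": feature_type(value)}
        for name, value in record.items()
    ]


# Hypothetical record; a feature group also needs a record identifier
# and an event-time feature, here `review_id` and `event_time`.
example_record = {
    "review_id": "r-0001",
    "sentiment": 1,
    "rating": 4.0,
    "event_time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
}
feature_definitions = build_feature_definitions(example_record)
for definition in feature_definitions:
    print(definition)
```

Deriving the definitions from a sample record keeps the names and types in one place, so the feature group schema and the ingested data are less likely to drift apart.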
Output: AWS Feature Store: Storing Features
Figure: Storing the features of the AWS feature group in an AWS Glue table,
queried through AWS Athena. AWS Athena interfaces with the AWS Glue table so
that the features can be accessed by other team members
Output: AWS Feature Store: Reading Stored Features
Figure: Reading the features of the AWS feature group from the AWS Glue table
using an AWS Athena query
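Reading features back comes down to running a SQL statement against the Glue table behind the feature group. The sketch below only builds such a statement as a string; the helper `build_athena_query`, the table name, and the column names are hypothetical, and running the query through Athena is left out.

```python
def build_athena_query(table_name, columns, limit=10):
    """Return a SELECT statement for a feature group's Glue table."""
    # Allow only simple identifiers so the statement cannot be altered by input.
    for name in [table_name, *columns]:
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected identifier: {name!r}")
    column_list = ", ".join(f'"{column}"' for column in columns)
    return f'SELECT {column_list} FROM "{table_name}" LIMIT {int(limit)}'


# Hypothetical table and column names.
query = build_athena_query(
    table_name="reviews_feature_group_1700000000",
    columns=["review_id", "sentiment", "rating"],
    limit=5,
)
print(query)
```

In the SageMaker Python SDK, a string like this would typically be passed to the Athena query object obtained from the feature group, together with an S3 location for the query results; the exact calls should be checked against the SDK documentation.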
Summary
In this presentation we covered different functionalities within AWS SageMaker.
We discussed pre-training bias calculation using SageMaker Clarify. We
computed class imbalance and KL divergence using the Women's Clothing
Reviews dataset.
In the second section of the presentation we discussed how to perform
hyper-parameter tuning using AWS SageMaker. We tuned parameters such as
learning rate and batch size for a PyTorch model.
In the third section of the presentation we discussed AWS feature store.
We discussed its benefits and demonstrated how to set up a new AWS feature
store and how to insert features into it and retrieve them from it.
References I
[1] A. F. Agarap, Women's e-commerce clothing reviews.
https://www.kaggle.com/datasets/nicapotato/womens-ecommerce-clothing-reviews.
Accessed: 2024-01-10.
[2] N. M. Kado and K. Wadia, What is Amazon SageMaker?
https://community.aws/concepts/what-is-sagemaker#sagemaker-clarify.
Accessed: 2024-01-10.
[3] D. Mbaya, Amazon SageMaker automatic model tuning now supports grid search.
https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-grid-search/.
Accessed: 2024-01-10.
References II
[4] B. Lindsey, M. Pasappulatti, and M. Roy, Extend model lineage to include ML
features using Amazon SageMaker Feature Store.
https://aws.amazon.com/blogs/machine-learning/extend-model-lineage-to-include-ml-features-using-amazon-sagemaker
Accessed: 2024-01-15.