This presentation summarizes Varun Garg's work with AWS SageMaker Clarify, the SageMaker Hyperparameter Tuner, and AWS Feature Store. It covers calculating statistical bias with Clarify, hyperparameter tuning with the Tuner, and sharing features between teams through the Feature Store. The presentation outlines the tools, demonstrates their use through code snippets and results, and discusses how each supports machine learning tasks such as bias detection, hyperparameter optimization, and feature reuse.
AWS projects: related AWS services such as Feature Store and Clarify
1. Demonstrating My Projects With AWS SageMaker Clarify, SageMaker Tuner & AWS Feature Store
AWS Clarify: Statistical Bias | AWS Hyperparameter Tuning | AWS Feature Store
Varun Garg, Ph.D.
Algorithm Engineer at Magna
Ph.D. in Data Fusion
Department of Electrical and Computer Engineering
University of Massachusetts Lowell, Lowell, MA, USA
February 19, 2024
Garg.Kumar.Varun@Gmail.com
2. Proposal Summary
This presentation covers several tasks I carried out on the AWS SageMaker service.
AWS SageMaker is a fully managed machine learning service offered by Amazon
Web Services.
SageMaker offers capabilities for data pre-processing, model training,
hyperparameter tuning, and model deployment.
SageMaker provides a wide range of built-in algorithms and frameworks,
including TensorFlow and PyTorch.
In this presentation, I will cover:
Bias calculation with SageMaker Clarify, using different bias metrics
Hyperparameter tuning using the AWS SageMaker Hyperparameter Tuner
AWS Feature Store, where data scientists can share features with
other team members by writing AWS Athena queries against an
AWS Glue table.
3. Outline of the Presentation
1 AWS Clarify: Statistical Bias
2 AWS Hyperparameter Tuning
3 AWS Feature Store
4. AWS Clarify: Statistical Bias
Amazon SageMaker Clarify gives data scientists better visibility into bias by
letting them compute pre-training bias metrics (data bias, such as class
imbalance) and post-training bias metrics (model bias).
5. Detecting Statistical Bias: Flowchart
Amazon SageMaker Clarify can compute useful measurements at different stages of
a data science project, such as during data preparation and after model
deployment.
In the following slides we demonstrate the use of SageMaker Clarify.
The metrics were computed on a Kaggle dataset [1].
Figure: Use of SageMaker Clarify in the data preparation, model training, and
deployment phases of an ML pipeline. Figure credit: AWS documentation [2].
6. Configuring AWS Clarify
We use SageMaker Clarify to calculate pre-training bias metrics such as class
imbalance and KL divergence.
The code snippet on this slide defines several variables that configure AWS
Clarify, such as the input data, the output path of the report, and the name of
the data column that holds the labels.
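A minimal sketch of the kind of configuration described above, not the snippet from the slide. It assembles a plain dictionary in the shape of a Clarify analysis configuration for pre-training bias. The column names (`review_text`, `class_name`, `sentiment`) and the choice of positive label are hypothetical; the S3 input and output paths are set on the processing job itself and are left out here.

```python
import json


def build_pretraining_bias_config(headers, label, positive_label_values,
                                  facet_name, methods):
    """Assemble a Clarify-style analysis configuration as a plain dict."""
    if label not in headers:
        raise ValueError(f"label column {label!r} is not in headers")
    if facet_name not in headers:
        raise ValueError(f"facet column {facet_name!r} is not in headers")
    return {
        "dataset_type": "text/csv",
        "headers": list(headers),
        # Name of the column that holds the labels
        "label": label,
        # Label values treated as the positive outcome
        "label_values_or_threshold": list(positive_label_values),
        # Column whose groups are compared for bias
        "facet": [{"name_or_index": facet_name}],
        "methods": {
            "pre_training_bias": {"methods": list(methods)},
            "report": {"name": "report", "title": "Analysis Report"},
        },
    }


# Hypothetical column names, for illustration only
config = build_pretraining_bias_config(
    headers=["review_text", "class_name", "sentiment"],
    label="sentiment",
    positive_label_values=[1],
    facet_name="class_name",
    methods=["CI", "KL", "LP"],
)
print(json.dumps(config, indent=2))
```

Keeping the configuration as data makes it easy to review which label, facet, and metrics a bias job will use before it is launched.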
7. AWS Clarify: Launching the Task
In the AWS Clarify configuration we define the metrics to calculate, such as
class imbalance, KL divergence, and Lp-norm.
Figure: Executing the AWS Clarify task on AWS
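To make the three metrics named above concrete, here is a small pure-Python sketch of their usual definitions: class imbalance CI = (n_a - n_d) / (n_a + n_d), KL divergence between the label distributions of the two facet groups, and the Lp-norm (p = 2) of their difference. The toy rows and the choice of "Dresses" as the smaller facet group are made up for illustration and are not the values from the report.

```python
import math
from collections import Counter


def class_imbalance(facet_values, disadvantaged_value):
    """CI = (n_a - n_d) / (n_a + n_d); values near 1 mean facet d is rare."""
    n_d = sum(1 for v in facet_values if v == disadvantaged_value)
    n_a = len(facet_values) - n_d
    return (n_a - n_d) / (n_a + n_d)


def label_distribution(labels, label_space):
    """Fraction of rows carrying each label, in the order of label_space."""
    counts = Counter(labels)
    return [counts[y] / len(labels) for y in label_space]


def kl_divergence(p_a, p_d):
    """KL(P_a || P_d); assumes P_d > 0 wherever P_a > 0."""
    return sum(pa * math.log(pa / pd) for pa, pd in zip(p_a, p_d) if pa > 0)


def lp_norm(p_a, p_d, p=2):
    """Lp distance between the two label distributions."""
    return sum(abs(pa - pd) ** p for pa, pd in zip(p_a, p_d)) ** (1.0 / p)


# Made-up rows of (facet value, label)
rows = [("Dresses", 1), ("Dresses", 0)] + [("Other", 1)] * 6 + [("Other", 0)] * 2
facets = [facet for facet, _ in rows]
labels_d = [y for facet, y in rows if facet == "Dresses"]
labels_a = [y for facet, y in rows if facet != "Dresses"]

p_d = label_distribution(labels_d, [0, 1])
p_a = label_distribution(labels_a, [0, 1])
ci = class_imbalance(facets, "Dresses")
kl = kl_divergence(p_a, p_d)
lp = lp_norm(p_a, p_d)
print(f"CI={ci:.3f} KL={kl:.3f} LP={lp:.3f}")
```

A CI close to 1 means one facet group is heavily under-represented, which is the kind of imbalance a bias report would flag.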
8. Results: Bias Report
The following table shows the metrics calculated for the dataset column called
dresses, such as class imbalance, KL divergence, and Lp-norm.
Since the dataset is imbalanced, the class imbalance score is high.
9. SageMaker Hyperparameter Tuning
The SageMaker tuner lets data scientists identify optimal
hyperparameters. It offers state-of-the-art search strategies for tuning ML
models.
10. Hyperparameter Tuning Step
Figure: AWS hyperparameter tuning flowchart. Figure credit: AWS [3].
11. Hyperparameter Tuning: Steps
According to the AWS documentation, a hyperparameter tuning job contains the
following components:
Tuning job settings
Training job definitions
Tuning job configuration
The following are a few types of tuning job settings:
Warm start: the results of a previous tuning job are used to improve the
performance of a new tuning job.
Early stopping: a common setting that stops training when the performance
of the model has not improved after multiple consecutive epochs.
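The early-stopping idea described above can be sketched as a simple patience rule. This is only an illustration of the description on this slide; the criterion SageMaker applies internally may differ, and the `patience` and `min_delta` parameters are assumptions introduced here.

```python
def should_stop_early(metric_history, patience=3, min_delta=0.0):
    """Return True when the last `patience` epochs did not improve on the
    best value seen before them (higher metric values are better)."""
    if len(metric_history) <= patience:
        return False  # not enough history to judge
    best_before = max(metric_history[:-patience])
    recent_best = max(metric_history[-patience:])
    return recent_best <= best_before + min_delta


# Hypothetical validation accuracy per epoch
stalled = [0.70, 0.75, 0.74, 0.75, 0.73]
improving = [0.70, 0.75, 0.74, 0.76, 0.73]
print(should_stop_early(stalled), should_stop_early(improving))
```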
In the following slides we demonstrate a tuning job for hyperparameters such
as learning rate and batch size. A random search strategy was used.
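As a rough local illustration of random search over these two hyperparameters, the sketch below samples a learning rate uniformly from a range and a batch size from a list, scores each trial, and returns the trials sorted best first. The objective function is a hypothetical stand-in for validation accuracy, and the ranges are assumptions, not the ones used in the tuning job.

```python
import random


def random_search(objective, n_trials, lr_range, batch_sizes, seed=0):
    """Sample hyperparameters at random and return trials sorted best first."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(*lr_range),
            "batch_size": rng.choice(batch_sizes),
        }
        trials.append({**params, "score": objective(**params)})
    return sorted(trials, key=lambda trial: trial["score"], reverse=True)


def toy_objective(learning_rate, batch_size):
    """Hypothetical stand-in for validation accuracy (higher is better)."""
    penalty = 0.01 if batch_size == 256 else 0.0
    return 1.0 - abs(learning_rate - 3e-5) * 1e4 - penalty


results = random_search(toy_objective, n_trials=8,
                        lr_range=(1e-5, 5e-5), batch_sizes=[128, 256])
best = results[0]
print(best["learning_rate"], best["batch_size"], best["score"])
```

Random search treats every trial independently, so the trials can run in parallel, which is why the number of parallel jobs is part of a tuning job's configuration.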
12. Training Job Definitions: Defining Hyperparameters
Figure: Defining the hyperparameters to be tuned (learning rate, batch size) with
their corresponding ranges
Figure: Defining the metric used to evaluate the performance of the model. This
metric is used to select the optimal hyperparameter values
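In SageMaker, a metric definition pairs a metric name with a regular expression that is matched against the training logs. The sketch below shows that idea locally: the metric names, the regular expressions, and the log line format are hypothetical and would need to match whatever the training script actually prints.

```python
import re

# Each definition pairs a metric name with a regex whose first group
# captures the numeric value from a log line (hypothetical log format).
metric_definitions = [
    {"Name": "validation:loss", "Regex": r"val_loss: ([0-9.]+)"},
    {"Name": "validation:accuracy", "Regex": r"val_acc: ([0-9.]+)"},
]


def extract_metrics(log_line, definitions):
    """Return {metric name: value} for every definition that matches."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], log_line)
        if match:
            found[definition["Name"]] = float(match.group(1))
    return found


log_line = "epoch 3 val_loss: 0.4312 val_acc: 0.8457"
metrics = extract_metrics(log_line, metric_definitions)
print(metrics)
```

Testing the regular expressions against a sample log line before launching a tuning job is a cheap way to catch a metric that would otherwise never be reported.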
13. Training Job Definitions: Defining the PyTorch Estimator
Figure: Creating a PyTorch estimator and passing input arguments such as the
instance type and the metric definitions defined earlier
14. Tuning Job Configuration: Defining the SageMaker Tuner
Figure: Creating an AWS SageMaker tuner object and providing arguments such as
the tuning settings, the tuning metric, and the number of parallel jobs
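The arguments listed in the caption map onto the tuning job configuration that the SageMaker API accepts. As a hedged sketch, the helper below builds such a configuration as a plain dictionary in the shape used by the low-level `create_hyper_parameter_tuning_job` request. The metric name, hyperparameter names, ranges, and job counts are assumptions for illustration, not the values used in the project.

```python
def build_tuning_job_config(metric_name, lr_min, lr_max, batch_sizes,
                            max_jobs, max_parallel_jobs):
    """Build a random-search tuning configuration as a plain dict."""
    if max_parallel_jobs > max_jobs:
        raise ValueError("max_parallel_jobs cannot exceed max_jobs")
    return {
        "Strategy": "Random",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": metric_name,
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel_jobs,
        },
        "ParameterRanges": {
            # The API expects range bounds and categorical values as strings
            "ContinuousParameterRanges": [{
                "Name": "learning_rate",
                "MinValue": str(lr_min),
                "MaxValue": str(lr_max),
                "ScalingType": "Linear",
            }],
            "CategoricalParameterRanges": [{
                "Name": "batch_size",
                "Values": [str(size) for size in batch_sizes],
            }],
        },
    }


# Hypothetical values, for illustration only
tuning_config = build_tuning_job_config(
    metric_name="validation:accuracy",
    lr_min=0.00001, lr_max=0.00005,
    batch_sizes=[128, 256],
    max_jobs=4, max_parallel_jobs=2,
)
print(tuning_config["ResourceLimits"])
```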
15. Results: AWS SageMaker Tuner
Figure: Executing the AWS SageMaker tuner object
Figure: Tuning results sorted in descending order of validation accuracy, with the
batch size and learning rate of each training job. A learning rate of
0.000021 and a batch size of 128 were found to be the optimal hyperparameters
16. Feature Store
SageMaker Feature Store enables data science teams to efficiently reuse ML
features for various teams and models. This functionality helps with the smooth
delivery of features for large-scale predictive modeling.
17. AWS Feature Store: Overview
Feature engineering is an important part of ML pipeline development. AWS
Feature Store allows features to be shared and managed across multiple users.
This supports efficient development of new models and helps with maintaining or
troubleshooting existing data pipelines.
AWS Feature Store allows users to track metadata such as:
Data sources using the features
Models using the features
Transformations used to calculate the features
AWS Feature Store lets users avoid reinventing features from scratch and helps
with troubleshooting existing models.
In the following slides we demonstrate the steps to create a new feature
store and use AWS Athena queries to store and retrieve features.
18. Feature Store: Flowchart
Figure: AWS Feature Store Pipeline showing the use of feature store by multiple
users for different applications and different data sources. Figure Credits AWS
Documentation [4]
19. AWS SageMaker: Initializing a Feature Group
Figure: Creating a new feature store service in AWS SageMaker using the boto3
client
Figure: Initializing a new feature group in AWS Feature Store
20. AWS SageMaker Feature Group: Adding Features
Figure: Defining the features to be added to the store by providing each feature's
name and data type
Figure: Creating an AWS feature group object and providing input arguments
such as the feature definitions and the AWS SageMaker session
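Feature definitions pair a feature name with one of the Feature Store types (`Integral`, `Fractional`, `String`), and a feature group also needs a record identifier and an event-time feature. The sketch below derives such definitions from a sample record and assembles them into a request-shaped dictionary. The feature group name and all column names are hypothetical.

```python
def feature_type_for(value):
    """Map a Python value to a Feature Store feature type."""
    # bool is a subclass of int in Python, so booleans map to Integral too
    if isinstance(value, int):
        return "Integral"
    if isinstance(value, float):
        return "Fractional"
    if isinstance(value, str):
        return "String"
    raise TypeError(f"unsupported feature value type: {type(value).__name__}")


def build_feature_group_request(name, sample_record, record_id, event_time):
    """Build a create-feature-group style request from one sample record."""
    for required in (record_id, event_time):
        if required not in sample_record:
            raise ValueError(f"required feature {required!r} is missing")
    definitions = [
        {"FeatureName": feature, "FeatureType": feature_type_for(value)}
        for feature, value in sample_record.items()
    ]
    return {
        "FeatureGroupName": name,
        "RecordIdentifierFeatureName": record_id,
        "EventTimeFeatureName": event_time,
        "FeatureDefinitions": definitions,
    }


# Hypothetical record, for illustration only
sample = {
    "review_id": 101,
    "sentiment": 1,
    "review_length": 54.0,
    "class_name": "Dresses",
    "event_time": "2024-01-10T00:00:00Z",
}
request = build_feature_group_request(
    "reviews-feature-group", sample, "review_id", "event_time")
print(request["FeatureDefinitions"])
```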
21. Output: AWS Feature Store: Storing Features
Figure: Storing the features of the AWS feature group in an AWS Glue table using an
AWS Athena query. AWS Athena is used to interface with the AWS Glue table so that
the features can be accessed by other team members
22. Output: AWS Feature Store: Reading Stored Features
Figure: Reading the features of the AWS feature group from the AWS
Glue table using an AWS Athena query.
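A query of the kind described in the caption can be sketched as a small helper that builds the SQL string. The table and column names below are hypothetical; in practice the table name comes from the feature group's offline store, and the string would be handed to Athena (for example through the query helper that the SageMaker Python SDK exposes on a feature group).

```python
def build_feature_query(table_name, columns, limit=None):
    """Build a SELECT statement for a feature group's Glue table."""
    if not columns:
        raise ValueError("at least one column is required")
    # Allow only letters, digits, and underscores in identifiers
    for identifier in [table_name, *columns]:
        if not identifier.replace("_", "").isalnum():
            raise ValueError(f"invalid identifier: {identifier!r}")
    column_list = ", ".join(f'"{column}"' for column in columns)
    query = f'SELECT {column_list} FROM "{table_name}"'
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


# Hypothetical table and column names
query = build_feature_query(
    "reviews_feature_group_1700000000", ["review_id", "sentiment"], limit=5)
print(query)
```

Restricting identifiers to a safe character set keeps a shared helper like this from passing arbitrary text into a query.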
23. Summary
In this presentation we covered different functionalities within AWS SageMaker.
We discussed pre-training bias calculation using SageMaker Clarify. We
computed class imbalance and KL divergence using the Women's Clothing
Reviews dataset.
In the second section of the presentation we discussed how to perform
hyperparameter tuning using AWS SageMaker. We tuned hyperparameters such as
learning rate and batch size on a PyTorch model.
In the third section of the presentation we discussed AWS Feature Store.
We discussed its benefits and demonstrated how to set up a new AWS feature
store and how to insert features into it or retrieve features from it.
24. References I
[1] A. F. Agarap, Women's e-commerce clothing reviews. https://www.kaggle.com/datasets/nicapotato/womens-ecommerce-clothing-reviews. Accessed: 2024-01-10.
[2] N. M. Kado and K. Wadia, What is Amazon SageMaker? https://community.aws/concepts/what-is-sagemaker#sagemaker-clarify. Accessed: 2024-01-10.
[3] D. Mbaya, Amazon SageMaker automatic model tuning now supports grid search. https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-grid-search/. Accessed: 2024-01-10.
25. References II
[4] B. Lindsey, M. Pasappulatti, and M. Roy, Extend model lineage to include ML features using Amazon SageMaker Feature Store. https://aws.amazon.com/blogs/machine-learning/extend-model-lineage-to-include-ml-features-using-amazon-sagemaker Accessed: 2024-01-15.