This document discusses how data analytics can be used for fraud detection and for analyzing customer feedback in ecommerce. It outlines common types of ecommerce fraud committed by buyers and sellers. It then describes how machine learning can be used to identify fraudulent buyers based on labeled transaction data and generated features. Customer feedback is also discussed, highlighting metrics like net promoter score and how natural language processing and bag of words models can analyze sentiment and pain points from reviews.
Data analytics in fraud detection and customer feedback
3. Identifying Fraud Buyers
Preempting a fraudulent transaction is key to the success of an
ecommerce business
Several operational measures help detect fraud, such as:
Two-factor authentication for credit card fraud
Address parsing for COD/RTO fraud
In spite of all this, machine learning can prove extremely
4. Labelled Data Generation
Labelled Data is food for supervised learning problems.
Generally human raters are employed to generate a
labelled data set.
Platforms such as Amazon Mechanical Turk are used in
this case.
5. Feature Generation
For each human-rated transaction, we generate features
that might be good predictors of whether that
transaction is fraudulent or not
Some examples of features are:
Buyer rating
Number of credit cards used by the buyer
Number of previous fraudulent purchases by the buyer
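A minimal sketch of how the three example features could be computed from a buyer's transaction history. The `Transaction` record, its field names, and the `buyer_features` helper are hypothetical and chosen only for illustration; a real pipeline would read these values from the order database.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical record layout, assumed for this sketch.
    buyer_id: str
    card_id: str
    buyer_rating: float
    was_fraud: bool  # label assigned by a human rater


def buyer_features(history, buyer_id):
    """Build the three example features for one buyer from past transactions."""
    past = [t for t in history if t.buyer_id == buyer_id]
    if not past:
        # New buyer: no rating yet, no cards seen, no known fraud.
        return {"buyer_rating": None, "num_cards": 0, "num_prev_fraud": 0}
    return {
        "buyer_rating": past[-1].buyer_rating,           # most recent rating
        "num_cards": len({t.card_id for t in past}),     # distinct cards used
        "num_prev_fraud": sum(t.was_fraud for t in past),
    }


history = [
    Transaction("b1", "c1", 4.5, False),
    Transaction("b1", "c2", 4.5, True),
    Transaction("b1", "c2", 4.0, False),
    Transaction("b2", "c9", 3.0, False),
]
features_b1 = buyer_features(history, "b1")
print(features_b1)
```

Each labelled transaction would then be paired with the feature dictionary of its buyer to form one training row.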
6. Machine Learning
Once we have labelled data and features, we can use
classification techniques such as Logistic Regression and
Random Forests to detect fraudulent users.
Issues:
Imbalanced Datasets
Evaluation Metric: Depends on application
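The two issues listed here are linked: when fraud is rare, plain accuracy can look excellent for a model that catches nothing. The sketch below illustrates this with made-up labels (1000 transactions, 10 of them fraudulent); the numbers and the two "models" are invented purely for illustration, not taken from any real data set.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Illustrative imbalanced data: 10 fraud cases out of 1000 transactions.
y_true = [1] * 10 + [0] * 990

# "Model" A predicts every transaction as legitimate.
always_legit = [0] * 1000

# "Model" B catches 8 of the 10 frauds but raises 20 false alarms.
flags_some = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

report_a = evaluate(y_true, always_legit)
report_b = evaluate(y_true, flags_some)
print(report_a)
print(report_b)
```

Model A scores higher on accuracy yet has zero recall, which is why the choice between precision, recall, or a combination of the two depends on the cost of a missed fraud versus a false alarm in the application.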
7. Human in the loop approach
Machine learning does not eliminate human reviewers,
but it reduces their workload.
[Figure: fraud probability scale from 0 to 1. Below 0.3: definitely legitimate. Between 0.3 and 0.7: human evaluation. Above 0.7: definitely fraud.]
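The 0.3 and 0.7 cut-offs on the fraud probability scale can be sketched as a simple routing rule. The function name `route` and the handling of the exact boundary values (sent to human review here) are assumptions; the slide does not say which side the boundaries fall on.

```python
LOW_THRESHOLD = 0.3   # below this: treated as definitely legitimate
HIGH_THRESHOLD = 0.7  # above this: treated as definitely fraud


def route(fraud_probability):
    """Decide what to do with a transaction given the model's fraud probability."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    if fraud_probability < LOW_THRESHOLD:
        return "legitimate"
    if fraud_probability > HIGH_THRESHOLD:
        return "fraud"
    # Uncertain band, including the boundaries (an assumption of this sketch).
    return "human_review"


for p in (0.1, 0.5, 0.9):
    print(p, route(p))
```

Only transactions in the middle band reach a human rater, which is how the model reduces manual work without removing it.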
8. Customer Feedback
Customer service is an integral part of the customer
experience for ecommerce companies. Good customer
service contributes to the brand value of the company.
It serves two purposes:
Address customer grievances
Serve as feedback loop for the product
9. Metrics on Customer Feedback
Explicit
Net Promoter Score
Promoters : People rating the product 9 and 10
Detractors: People rating the product 6 or below
Passives: People rating the product 7 or 8
An NPS above 0 is considered decent, and one above 30-40 is
considered great
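The slide defines the three groups but not the score itself. NPS is conventionally computed as the percentage of promoters minus the percentage of detractors, giving a value between -100 and 100. A minimal sketch, assuming ratings on the usual 0 to 10 scale; the sample ratings are invented for illustration.

```python
def net_promoter_score(ratings):
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100.0 * (promoters - detractors) / len(ratings)


# Made-up survey responses: 5 promoters, 2 passives, 3 detractors.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
nps = net_promoter_score(ratings)
print(nps)
```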
10. Data Analytics on Feedbacks
While giving feedback, a customer often writes a lot of free text in
the feedback form.
Natural Language Processing (NLP) can be used to
identify the sentiment of reviews and understand the
frequent pain points of the customers
11. Bag of Words Model
This model can be used to classify reviews as positive or
negative using supervised classification techniques
Again, the first step is to generate labelled data using human raters
Remove English stop words from the reviews
Keep only the adjectives, which might correspond to positive or
negative words
Construct a feature indicating whether a particular adjective appears in the
review or not
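The feature construction steps above can be sketched as follows. This is a simplified illustration: the stop word set and the adjective list are small hand-written stand-ins (assumptions), whereas a real pipeline would use a full stop word list and a part-of-speech tagger to find adjectives. The sample reviews are invented.

```python
import re

# Tiny illustrative stop word set (assumption, not a complete list).
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "but", "it", "very", "this"}

# Hypothetical adjective vocabulary standing in for part-of-speech filtering.
ADJECTIVES = ["good", "great", "fast", "bad", "slow", "broken"]


def tokenize(review):
    """Lowercase the review and split it into word tokens."""
    return re.findall(r"[a-z']+", review.lower())


def adjective_features(review):
    """Binary bag-of-words vector: 1 if the adjective appears in the review."""
    tokens = {t for t in tokenize(review) if t not in STOP_WORDS}
    return [1 if adj in tokens else 0 for adj in ADJECTIVES]


reviews = [
    "The delivery was fast and the product is great",
    "Very slow delivery and it was broken",
]
vectors = [adjective_features(r) for r in reviews]
for review, vector in zip(reviews, vectors):
    print(vector, review)
```

Each vector, paired with the human-assigned positive or negative label for that review, forms one training example for a supervised classifier.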