Image analysis for fraudulent advertisements
Jithendranath J V
Awful
Awesome Advertisement

12/4/2013
Image Analyzer for Creative Tester

User Issue / Yahoo! Challenge
• Creative Tester receives approximately one million creatives per day to be tested for malicious content. Of these, 2% to 5% are in the "Windows mimic" category. These need to be detected and banned as early as possible, with minimal human intervention.
• Need to validate brand safety and ensure quality impressions for advertisers.

Roadmap Theme & Goal
• The Trust and Safety team, in collaboration with Sciences, came up with an Image Analyzer module that can detect malicious advertisements, such as Windows mimics or fake brands with phony downloads, and tag them appropriately to be banned.

Value Proposition/Positioning
• To reduce the manual effort in recognizing and banning malicious advertisements that can be visually identified as fraudulent.

IONIX / CT Ecosystem

[Diagram: IONIX / CT ecosystem. Labeled components: Cqueuer (RMX Apps); IONIX; Creative Tester (CT); Downloaders (Chrome, Firefox, IE); Classifiers; Domain Lookup Service; Virus Checker (ClamAV / Trend Micro); Flash Checker; Image Analyzer; Media Trust (3rd party); Media Guard Manual Audit Queue; TRF_PROD DB. Labeled flows: Primary/Secondary Creative/Click_URL Review; Creative feed based on advertiser's profile; Min-bar / Technical Tags; Creatives/LineItems get banned with Min-Bar; Creatives Banned.]

IA Internals - Modeler

IA Internals - Classifier

Performance - Precision and Recall

Threshold    Precision   Recall
 3.64068     0.81818     0.00402
 2.45085     0.81818     0.06827
 1.85615     0.81818     0.13253
 1.24538     0.8125      0.19679
 0.91759     0.8125      0.26104
 0.29167     0.76522     0.3253
 0.06556     0.74638     0.38956
-0.18092     0.72327     0.45382
-0.52049     0.67895     0.51807
-0.78885     0.65198     0.58233
-1.18095     0.55102     0.64659
-1.75574     0.41163     0.71084
-2.07244     0.34281     0.7751
-2.52684     0.27941     0.83936
-3.13143     0.22636     0.90361
-4.45694     0.1649      1

[Charts: PrecisionVsRecall; Precision, Recall, and Threshold plotted across the 16 operating points]

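The precision/recall figures on this slide can be read as operating points: each threshold on the classifier score trades precision against recall. The sketch below is illustrative only. It assumes an image is flagged when its score is at or above the threshold, and the helper names (`precision_recall`, `best_recall_at_precision`) are hypothetical, not part of the Image Analyzer. The operating points are copied from the slide.

```python
from typing import List, Optional, Sequence, Tuple

def precision_recall(scores: Sequence[float], labels: Sequence[bool],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when an image is flagged if score >= threshold (assumed rule)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    # With nothing flagged, precision is undefined; 1.0 is used here by convention.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# (threshold, precision, recall) rows taken from the slide.
OPERATING_POINTS: List[Tuple[float, float, float]] = [
    (3.64068, 0.81818, 0.00402), (2.45085, 0.81818, 0.06827),
    (1.85615, 0.81818, 0.13253), (1.24538, 0.8125, 0.19679),
    (0.91759, 0.8125, 0.26104), (0.29167, 0.76522, 0.3253),
    (0.06556, 0.74638, 0.38956), (-0.18092, 0.72327, 0.45382),
    (-0.52049, 0.67895, 0.51807), (-0.78885, 0.65198, 0.58233),
    (-1.18095, 0.55102, 0.64659), (-1.75574, 0.41163, 0.71084),
    (-2.07244, 0.34281, 0.7751), (-2.52684, 0.27941, 0.83936),
    (-3.13143, 0.22636, 0.90361), (-4.45694, 0.1649, 1.0),
]

def best_recall_at_precision(points: Sequence[Tuple[float, float, float]],
                             min_precision: float) -> Optional[Tuple[float, float, float]]:
    """Pick the row with the highest recall among rows meeting the precision floor."""
    eligible = [p for p in points if p[1] >= min_precision]
    return max(eligible, key=lambda p: p[2]) if eligible else None
```

For example, requiring precision of at least 0.8 selects the row with threshold 0.91759, which catches roughly a quarter of the positives; relaxing the floor to 0.7 moves the threshold to -0.18092 and raises recall to about 0.45.
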
IA Integration with CT

[Diagram: Creative Tester calls the Image Analyzer servlet over HTTP. Image Analyzer stages shown: Feature Extractor, K Means, Histogram, Classifier.]

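The stages named on this slide (Feature Extractor, K Means, Histogram, Classifier) match a bag-of-visual-words pipeline. The following is a minimal sketch of the histogram and scoring steps only, under stated assumptions: local descriptors have already been extracted, the centroids were learned beforehand by k-means, and the classifier is a linear scorer. All function names, weights, and values are hypothetical and are not taken from the Image Analyzer itself.

```python
from typing import List, Sequence

Vector = Sequence[float]

def nearest_centroid(descriptor: Vector, centroids: Sequence[Vector]) -> int:
    """Index of the centroid closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda i: sq_dist(descriptor, centroids[i]))

def bow_histogram(descriptors: Sequence[Vector], centroids: Sequence[Vector]) -> List[float]:
    """Normalized count of how many descriptors fall on each visual word (centroid)."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest_centroid(d, centroids)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(centroids)

def linear_score(histogram: Vector, weights: Vector, bias: float) -> float:
    """Signed score of a linear classifier: dot(weights, histogram) + bias."""
    return sum(h * w for h, w in zip(histogram, weights)) + bias

# Toy data: two visual words in a 2-D descriptor space (illustrative values only).
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 1.0], [9.0, 9.0], [0.0, 2.0], [11.0, 10.0]]
histogram = bow_histogram(descriptors, centroids)
score = linear_score(histogram, weights=[2.0, -1.0], bias=0.25)
```

The histogram has a fixed length equal to the vocabulary size, regardless of how many descriptors an image yields, which is what lets a single classifier handle creatives of different sizes.
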
IA - API Example

Request:

{
  "requests":[
    {
      "imgid":"1",
      "imgurl":"http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
      "run_wnddlg":true
    }
  ]
}

Response:

{
  "responses":[
    {
      "imgurl":"http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
      "imgid":"1",
      "classifiers":[
        {
          "classifier":"wnddlg",
          "status":true,
          "result":true,
          "conf":0.40216639639794
        }
      ]
    }
  ]
}

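A caller such as Creative Tester would need to build the request body and read the classifier verdicts out of the response. The sketch below covers only that JSON handling and makes no HTTP call, since the slides do not give the endpoint. The field names come from the API example; the helper names, the `min_conf` parameter, and the reading of `status` as "the classifier ran successfully" are assumptions.

```python
import json
from typing import List, Tuple

def build_request(imgid: str, imgurl: str, run_wnddlg: bool = True) -> str:
    """Serialize a single-image request in the shape shown on the slide."""
    return json.dumps({"requests": [
        {"imgid": imgid, "imgurl": imgurl, "run_wnddlg": run_wnddlg}
    ]})

def flagged_classifiers(response_text: str,
                        min_conf: float = 0.0) -> List[Tuple[str, str, float]]:
    """Return (imgid, classifier, conf) for every positive classifier result.

    Assumes status == true means the classifier ran and result == true means
    the image matched; min_conf is a hypothetical caller-side filter.
    """
    flagged = []
    for resp in json.loads(response_text).get("responses", []):
        for c in resp.get("classifiers", []):
            if c.get("status") and c.get("result") and c.get("conf", 0.0) >= min_conf:
                flagged.append((resp.get("imgid"), c.get("classifier"), c.get("conf")))
    return flagged

# Response body from the slide, with the missing quote after "status" restored.
SAMPLE_RESPONSE = """
{"responses":[{"imgurl":"http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
"imgid":"1",
"classifiers":[{"classifier":"wnddlg","status":true,"result":true,"conf":0.40216639639794}]}]}
"""

hits = flagged_classifiers(SAMPLE_RESPONSE)
```

Keeping the confidence filter on the caller side leaves the ban-or-review decision with the consuming system instead of hard-coding it in the analyzer.
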
Sample - Classified images

Yahoo! Confidential & Proprietary.

What Does Success Look Like
• Who are the customers?
  - RMX and APT creative serving systems.
  - Moneyball (going forward).

• Success metrics
  - Reducing the manual effort needed to identify Windows-mimic advertisements. This would be measured by the confidence score generated by the system, which would eventually help automate the whole process.
  - Reduction in customer complaints.

• Key business stakeholders who have validated or will validate success
  - Serving systems
  - Business teams
  - Manual review teams

Competitive Landscape
• 3rd party ad verification companies, etc.
• What differentiates our product/solution?
  - Avoiding the need to expose and send out demand inventory.
  - Flexibility to keep improving the algorithms for higher precision/recall.
  - Quick turnaround time for validation.
  - Building highly targeted models (for example, fake Facebook or fake Adobe).

Image analysis for malicious advertisement detection


Editor's Notes

  • #5: Cqueuer -> IONIX -> can send to DLS or to CT -> a set of tests such as downloader, virus checker, flash checker, etc., a classifier for landing page detection, and also 3rd party validations. The results are tagged and sent back. The business logic on what action to take is handled by Cqueuer.
  • #6: Feature extraction: In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction. When the input data to an algorithm is too large to be processed and is suspected to be highly redundant (e.g. the same measurement in both feet and meters), the input data is transformed into a reduced representation set of features (also called a feature vector). A feature can be the result of applying a local neighborhood operation to the image. A neighborhood operation means visiting every point and applying a function at that point based on its neighbors:
      Visit each point p in the image data and do {
        N = a neighborhood or region of the image data around the point p
        result(p) = f(N)
      }
    Edge detection: Edge detection is the name for a set of mathematical methods that aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities.
    Corner detection: An interest point or corner can be defined as the intersection of two edges. A corner can also be defined as a point for which there are two dominant and different edge directions in a local neighborhood of the point.
    Blob detection: Informally, a blob is a region of a digital image in which some properties are constant or vary within a prescribed range of values; all the points in a blob can be considered in some sense to be similar to each other.
    SIFT (scale-invariant feature transform): The SIFT algorithm computes features that are scale invariant, which means the image can still be recognized when it is rotated, scaled, or viewed from a different viewpoint.
    CBOW (Contextual Bag of Words): In computer vision, the bag-of-words model (BoW model) can be applied to image classification by treating image features as words. In document classification, a bag of words is a sparse vector of occurrence counts of words; that is, a sparse histogram over the vocabulary. In computer vision, a bag of visual words is a sparse vector of occurrence counts of a vocabulary of local image features.
    SVM: A support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks.
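The neighborhood operation described in note #6 (visit each point p, gather a neighborhood N, store f(N)) can be written out directly. This is a minimal sketch for a grayscale image held as a list of rows. The choice of f as local contrast (maximum minus minimum), used here as a crude edge indicator, is an illustrative assumption and not the feature extractor used by the Image Analyzer.

```python
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]

def neighborhood_op(image: Image, f: Callable[[List[float]], float],
                    radius: int = 1) -> List[List[float]]:
    """Apply f to the (2*radius+1)-square neighborhood of every pixel.

    Neighborhoods are clipped at the image border instead of padded.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # N = region of the image around point p = (x, y)
            neighborhood = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            result[y][x] = f(neighborhood)  # result(p) = f(N)
    return result

def local_contrast(neighborhood: List[float]) -> float:
    """Brightness range inside the neighborhood; large values suggest an edge."""
    return max(neighborhood) - min(neighborhood)

# A dark region next to a bright region: the response peaks along the boundary.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
edges = neighborhood_op(image, local_contrast)
```

Swapping in a different f (a mean for smoothing, a weighted sum for a gradient filter) changes the feature without changing the traversal.
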