API Scraping.
How to protect your API against something that is not
necessarily an attack
Artem Demchenkov
Illustration © Caterina Carraro/Billie
Artem Demchenkov
 CTO & Co-Founder at Billie
 ardemchenkov@gmail.com
 https://www.linkedin.com/in/artem-demchenkov-76b69934/
Agenda
 API scraping may not look like an attack
 Why should we worry then?
 KYCB
 How to recognize that you're being scraped
 How to stop API scraping. Is it even possible?
 Bonus
Let's begin with numbers
Some numbers
 46% of web traffic comes from scraping bots (Distil Networks, 2016)
 2% of online revenue is lost due to web scraping (Distil Networks, 2016)
 Global e-commerce sales are 3.53 trillion dollars (Statista, 2019)
 2% of it is >70 billion dollars
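The last figure can be checked with a couple of lines of arithmetic. This is only a sketch that reuses the two numbers quoted above; the variable names are illustrative, and applying the 2% loss rate to global e-commerce sales is the slide's own simplification.

```python
# Figures as quoted on the slide (Statista 2019, Distil Networks 2016).
GLOBAL_ECOMMERCE_SALES_USD = 3_530_000_000_000  # 3.53 trillion dollars
LOSS_PERCENT = 2  # share of online revenue lost to web scraping

# Integer arithmetic keeps the result exact.
estimated_loss_usd = GLOBAL_ECOMMERCE_SALES_USD * LOSS_PERCENT // 100

print(f"{estimated_loss_usd / 1e9:.1f} billion dollars")  # prints "70.6 billion dollars"
```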
API Attacks
 DoS/DDoS
 Injections
 Data Exposure
 Authentication Hijacking
 Man in the Middle
 Unencrypted Communication
 Application Abuse
 Parameter Tampering
All of the above are unexpected API usage
API Scraping is different
API Scraping
 It is pretty ordinary, expected user behavior
 It doesn't try to hijack or break anything
 Content is the main point of interest
 That's why it is not always easy to identify that you are being scraped
Ordinary user behavior flow
[Diagram: USER, UI, API, DATA SOURCE]
Scraping flow
[Diagram: USER, UI, API, DATA SOURCE]
Scraping is not allowed
Google Maps Terms of Service
3.2.3 Restrictions Against Misusing the Services.
(a) No Scraping. Customer will not export, extract, or otherwise scrape Google Maps
Content for use outside the Services. For example, Customer will not: (i) pre-fetch, index,
store, reshare, or rehost Google Maps Content outside the services; (ii) bulk download
Google Maps tiles, Street View images, geocodes, directions, distance matrix results, roads
information, places information, elevation values, and time zone details; (iii) copy and save
business names, addresses, or user reviews; or (iv) use Google Maps Content with
text-to-speech services.
But not everyone is Google, right?
Scraping flow
[Diagram: USER, UI, API, DATA SOURCE, BOT]
Scraping communication flow
[Diagram: USER, UI, API, EXTERNAL API, BOT]
What makes API Scraping dangerous for API Owners
 Unpredictable expenses
 Income losses
 Loss of intellectual property ownership
 Loss of competitive advantage
 Risk of being reverse engineered
KYCB
Know Your Customer's Behavior
How to recognize that your API is being scraped?
 Logging
 Thresholds
 Monitoring alerts
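As a rough illustration of how logging and thresholds fit together, the sketch below counts requests per client in a batch of log records and reports the clients above a limit. The record shape (`client_id`, `path`), the function name, and the threshold value are all assumptions made for this example; a real setup would more likely express this as an alert rule in the monitoring system.

```python
from collections import Counter


def clients_over_threshold(log_records, max_requests):
    """Return {client_id: request_count} for clients exceeding max_requests."""
    counts = Counter(record["client_id"] for record in log_records)
    return {client: n for client, n in counts.items() if n > max_requests}


# Hypothetical log records for one monitoring interval.
records = [
    {"client_id": "a", "path": "/companies/1"},
    {"client_id": "b", "path": "/companies/1"},
    {"client_id": "b", "path": "/companies/2"},
    {"client_id": "b", "path": "/companies/3"},
]

# With a threshold of 2 requests per interval, only client "b" is reported.
suspects = clients_over_threshold(records, max_requests=2)
```

The threshold itself has to come from knowing what ordinary customers do, which is the point of KYCB.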
How to set up a proper Data Monitoring
Smart Data-driven Alerts with Prometheus and Grafana
https://youtu.be/GbwyF6xZwwc
How to prevent API Scraping
 HTTP request limits: per user, per time period
 Basic firewall rules: headers, content size, IPs
 More extended rules: geography, pattern detection
 IP databases: https://www.abuseipdb.com/
 Real-time ML-based bot detection: specific services (e.g. Cloudflare)
 Data thresholds: number of registrations, number of API calls...
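The first item, request limits per user and per time period, could look roughly like the sliding-window limiter below. It is a minimal in-memory sketch, not the setup described in the talk: the class name, the parameters, and the injectable clock are assumptions for illustration, and a production limiter would usually live in a gateway or a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per user within the last window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id):
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; the caller might answer with HTTP 429
        hits.append(now)
        return True
```

Each user gets an independent window, so one heavy client does not consume the budget of the others.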
They say:
no one can block API scraping completely
We say:
it is a race, and the one who stops first loses
Bonus Track
Scraping communication flow
[Diagram: USER, UI, API, DATA SOURCE, BOT, HUMAN]
This one leaves traces
Prior to building a bot,
a human needs to explore your API
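One way such traces might be looked for is sketched below. It assumes, purely for illustration, that manual exploration shows up as an unusually high share of 4xx responses from a single client; the talk does not say which traces to look for, and the record fields, function name, and cut-off values here are hypothetical.

```python
from collections import defaultdict


def exploration_suspects(log_records, min_requests=5, min_error_ratio=0.5):
    """Return sorted client ids whose share of 4xx responses is unusually high."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in log_records:
        client = record["client_id"]
        totals[client] += 1
        if 400 <= record["status"] < 500:
            errors[client] += 1
    return sorted(
        client
        for client, total in totals.items()
        # Ignore clients with too few requests to judge.
        if total >= min_requests and errors[client] / total >= min_error_ratio
    )


# Hypothetical access-log sample: "x" keeps hitting errors while probing,
# "y" behaves normally, "z" has too few requests to judge.
sample = (
    [{"client_id": "x", "status": s} for s in (404, 400, 200, 404, 403, 200)]
    + [{"client_id": "y", "status": 200} for _ in range(6)]
    + [{"client_id": "z", "status": 404} for _ in range(2)]
)
flagged = exploration_suspects(sample)
```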
Thank you
 ardemchenkov@gmail.com
 https://www.linkedin.com/in/artem-demchenkov-76b69934/
Check it out. There's more interesting content
in our Engineering Corner on Medium
https://medium.com/billie-finanzratgeber
