Preventing Web Scraping
When we have a full proxy between the Internet and our LAN we can do almost anything, even protect our servers ;-) This is what a WAF does: it protects against web application vulnerabilities, web scraping and DoS attacks. This time I want to write about web scraping, a technique for automatically downloading a whole website in order to extract data from it. Typical uses are tracking competitors' prices, harvesting email addresses and directory listings to obtain leads and marketing information, searching competitors' websites for images, financial information or other product data, and copying a website for use in phishing attacks.
There are many tools for extracting data from websites, either to clone them or to analyse them, from simple ones like cURL or Wget to more advanced ones like HTTrack. For instance, two summers ago I used the Social-Engineer Toolkit (SET) in a talk called “Innovation, yes but with Security” to build a PoC of a phishing attack in which I copied the Gmail and elpais.com websites.
Although few companies are worried about this threat so far, they are becoming more and more aware of the need to protect their public data for competitive reasons. Next, we are going to see some web scraping mitigation techniques to protect our websites.
|Bot detection configuration in BIG-IP ASM|
Session Anomaly detection
This method detects clients that open a large number of new sessions. One check counts the rate of new sessions per second and another detects a spike in the number of new sessions. The method can also use the IP reputation database to detect malicious IP addresses, which is another indicator for triggering a violation.
|Session Anomaly detection configuration in BIG-IP ASM|
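To make the two checks more concrete, here is a minimal sketch in Python of how a session anomaly detector could work. It is not how BIG-IP ASM implements it; the class name, the thresholds (`max_rate`, `window`, `spike_factor`, `min_sessions`) and the `bad_ips` set standing in for an IP reputation database are all assumptions made up for illustration.

```python
from collections import defaultdict, deque


class SessionAnomalyDetector:
    """Illustrative sketch only; names and thresholds are hypothetical."""

    def __init__(self, max_rate=2.0, window=10.0, spike_factor=3.0,
                 min_sessions=10, bad_ips=None):
        self.max_rate = max_rate          # allowed new sessions per second
        self.window = window              # length of one window in seconds
        self.spike_factor = spike_factor  # growth versus the previous window
        self.min_sessions = min_sessions  # ignore spikes below this count
        self.bad_ips = set(bad_ips or ())  # stand-in for a reputation database
        self._sessions = defaultdict(deque)  # client IP -> session timestamps

    def new_session(self, ip, now):
        """Record a new session and return the list of triggered checks."""
        times = self._sessions[ip]
        times.append(now)
        # Keep only the current window and the one before it.
        while times and times[0] <= now - 2 * self.window:
            times.popleft()

        current = sum(1 for t in times if t > now - self.window)
        previous = len(times) - current

        reasons = []
        # Check 1: new sessions per second over the current window.
        if current / self.window > self.max_rate:
            reasons.append("rate")
        # Check 2: sudden spike compared with the previous window.
        if (current >= self.min_sessions
                and current > self.spike_factor * max(previous, 1)):
            reasons.append("spike")
        # Extra indicator: the client IP has a bad reputation.
        if ip in self.bad_ips:
            reasons.append("reputation")
        return reasons


if __name__ == "__main__":
    detector = SessionAnomalyDetector(bad_ips={"203.0.113.9"})
    # A client opening 30 sessions in about 3 seconds.
    for i in range(30):
        result = detector.new_session("198.51.100.7", 100.0 + i * 0.1)
    print(result)
```

A real WAF would feed this from its session tracking and would block, challenge or just log the client depending on the policy. The sketch only shows how a per-second rate and a comparison against the previous window can be derived from the same list of timestamps.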
|Fingerprinting configuration in BIG-IP ASM|
Web scraping was a concept unknown to me a year ago, but preventing it is possible today, and it is already a reality for many organizations that are worried about their public information.
Regards, my friends. Drop me a line with the first thing that comes to mind!