Scraping

AKA Web Scraping, Price Scraping, Data Aggregation, Harvesting, Mining, Mirroring, Scraper Bots


Scraping 101

Scraping refers to the use of automated tools to collect large amounts of data from a target application in order to reuse that data elsewhere.

Scraping can range from benign to malicious, depending on the source, objective, and frequency of the requests. For example, a search engine bot that respects the crawl rates defined in the site’s robots.txt will likely be viewed as acceptable, whereas daily price scraping by a competitor is likely unwanted.
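As a rough illustration of what "respecting robots.txt" means in practice, the sketch below uses Python's standard urllib.robotparser module to check whether a bot may fetch a URL and how long it is asked to wait between requests. The robots.txt content, the bot name ExampleBot, and the example.com URLs are assumptions made up for this example. Crawl-delay is a widely used but non-standard directive, so not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /search
"""


def crawl_policy(robots_txt, user_agent, url):
    """Return (allowed, crawl_delay_seconds) for a user agent and URL."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)


# A well-behaved bot would skip /search and pause between other requests.
search_policy = crawl_policy(ROBOTS_TXT, "ExampleBot", "https://example.com/search?q=fares")
about_policy = crawl_policy(ROBOTS_TXT, "ExampleBot", "https://example.com/about")
print(search_policy, about_policy)
```

A compliant bot performs this check before every request. Unwanted scrapers simply ignore the file, which is why robots.txt is a convention rather than a control.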

A top 5 US airline was losing money

Scrapers were increasing the airline’s infrastructure costs and affecting the airline’s ability to manage revenue, so the security team sought out Shape.

Case Study: International Airline Fights Fare Scrapers

Key points

  1. Travel aggregators used bots to discover and publicize non-compliant ticketing options
  2. Scraping accounted for 25% of traffic on the main search URL
  3. Unwanted scrapers evaded all existing security solutions before Shape

25%
Unwanted scraping accounted for 25% of all search traffic on a single URL.

The 3 Steps of Scraping

1. Write Attack Script

Using automated tools, off-the-shelf scripts, or even scraping-as-a-service providers, attackers can easily create scripts to discover and scrape website content including prices, promotions, articles, and metadata.
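To show how little effort such a script takes, here is a minimal sketch that pulls prices out of a page using only Python's standard library. The HTML snippet, the routes, and the price CSS class are hypothetical and stand in for a real page. The parsing runs on an inline string, so no site is contacted.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real markup will differ from site to site.
SAMPLE_HTML = (
    '<ul>'
    '<li>LAX-JFK <span class="price">$129</span></li>'
    '<li>SFO-SEA <span class="price">$89</span></li>'
    '</ul>'
)


class PriceExtractor(HTMLParser):
    """Collect the text of every <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


def extract_prices(html):
    """Return the price strings found in the given HTML text."""
    parser = PriceExtractor()
    parser.feed(html)
    parser.close()
    return parser.prices


print(extract_prices(SAMPLE_HTML))
```

About thirty lines are enough to extract structured data from a page, so the barrier to entry for scraping is low.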

How Attackers Simulate Users

Gartner VP & Distinguished Analyst Avivah Litan outlines the methods scrapers employ to resemble genuine users.

2. Collect Data

Scraping campaigns can range from brazen to stealthy, depending on the attacker’s skill set and aims. Execution of the scraping script may be distributed among hundreds or thousands of servers in order to blend in with the traffic patterns of the enterprise’s entire user population.
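The arithmetic behind distribution is simple, and the sketch below works through it. All figures are hypothetical: a campaign of 100,000 requests per hour, a per-IP limit of 600 requests per hour, and either 1 or 2,000 source servers. The function names are illustrative and do not come from any particular product.

```python
def per_source_rate(total_requests_per_hour, source_count):
    """Average requests per hour issued by each source server."""
    return total_requests_per_hour / source_count


def exceeds_ip_limit(total_requests_per_hour, source_count, limit_per_hour):
    """True if each source, on average, would trip a per-IP rate limit."""
    return per_source_rate(total_requests_per_hour, source_count) > limit_per_hour


# Hypothetical campaign: same total volume, different number of sources.
TOTAL = 100_000   # requests per hour across the whole campaign (assumed)
LIMIT = 600       # per-IP requests per hour allowed by the site (assumed)

print(per_source_rate(TOTAL, 1), exceeds_ip_limit(TOTAL, 1, LIMIT))
print(per_source_rate(TOTAL, 2_000), exceeds_ip_limit(TOTAL, 2_000, LIMIT))
```

Under these assumptions a single server is easy to spot. Spread across 2,000 servers, the same campaign averages 50 requests per hour per source, well under the limit, so simple per-IP rate limiting alone is unlikely to catch it.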

Your marketing team may be the first to experience the symptoms of scraping attacks, including fallen search rankings and poorer conversion rates.

3. Monetize

The extracted data may be sold, used for price-comparison sites, or even used to create imitation sites for fraudulent purposes.

Even if the scraper is a partner, enterprises may prefer that the party retrieve data from a specified API, rather than consume expensive resources by requesting data directly from web servers.

Latest Research

65%

Scrapers make up, on average, 65% of traffic on main URLs for social networking sites.

83%

Scrapers make up, on average, 83% of traffic on search applications in the travel sector.

eBook

OWASP, a global non-profit dedicated to improving software security, highlighted the Top 20 most critical automated threats to web applications, including scraping (OAT-011).

Manage Scrapers Without Having to Manage a Solution

Try Shape’s fully managed service

 
