Scraping refers to obtaining data from a website without its owner's endorsement. Scraping software collects and filters large amounts of data from public online sources (like websites). From a legal viewpoint, it makes little difference whether the data is extracted by scraping software or by people employed to copy it from the website by hand. Web scraping is common these days, and organizations do it for numerous reasons. Businesses may use scraped data to study the market or examine their competitors.
HIQ IS A SMALL DATA ANALYTICS COMPANY THAT EMPLOYS AUTOMATED BOTS TO SCRAPE
DATA FROM PUBLIC LINKEDIN PROFILES. THE NINTH CIRCUIT AFFIRMED
THE DISTRICT COURT'S PRELIMINARY INJUNCTION, WHICH PREVENTS LINKEDIN
FROM DENYING THE PLAINTIFF, HIQ LABS, ACCESS TO PUBLICLY AVAILABLE LINKEDIN MEMBER PROFILES.
LinkedIn tried to stop hiQ by invoking the CFAA (Computer Fraud and Abuse Act), a federal cybersecurity and anti-hacking law that prohibits accessing a computer system without authorization or in excess of authorization.
hiQ scraped publicly available profile data from LinkedIn. LinkedIn did not approve of this, and in response it sent hiQ a cease-and-desist letter on May 23, 2017. The letter stated that by scraping, hiQ was violating LinkedIn's user agreement as well as California and federal law. LinkedIn also stated that it would block hiQ's attempts to scrape its website.
About two weeks later, hiQ sued LinkedIn in the Northern District of California, seeking a preliminary injunction and asking the court for a declaratory judgment that scraping LinkedIn's data was legal. hiQ won at the district level: the court issued an injunction ordering LinkedIn to give hiQ access to its content again. LinkedIn then appealed, and the matter reached the Ninth Circuit.
Meanwhile, scraping has taken on a new dimension. Mark Zuckerberg faced two days of questioning over the charge that Facebook failed to protect its users' data from harvesting by third parties such as Cambridge Analytica.
What are the legal issues that arise with web scraping?
Claims in scraping cases may include copyright infringement, breach of contract, and violation of the CFAA.
Copyright (Most Prominent Issue with Scraping)
Copyright gives creators the exclusive right to copy, distribute, and exhibit their work. It protects creative expression but not facts, information, or data. Scraping involves copying some content: copying the way a website is designed or presents its material could give rise to a copyright claim, but copying the information itself is unlikely to constitute infringement under United States law.
Breach of Contract
Computer Fraud and Abuse Act Against Web Scraping
The CFAA prohibits obtaining information from a computer system without authorization. Violating the CFAA carries both criminal and civil penalties, and the statute has become central in scraping disputes. In its 2016 decision in Facebook, Inc. v. Power Ventures, Inc., the Ninth Circuit sided with Facebook on its CFAA claim, finding that Power Ventures' scraping, which Facebook's terms of use prohibited, also violated the CFAA. The circumstances of the Facebook and LinkedIn cases are distinct, however. Power Ventures extracted data from private Facebook profiles (with the users' permission), while hiQ's scraping was restricted to public LinkedIn profiles.
A Central Question Then Arose:
Who has the right to access the website content?
LINKEDIN ISSUED A CEASE-AND-DESIST LETTER TO HIQ, RAISING THE QUESTION OF WHETHER HIQ'S FURTHER ACCESS WAS "WITHOUT AUTHORIZATION" UNDER THE CFAA. THE NINTH CIRCUIT INDICATED THAT IT WAS NOT. THE CFAA CONCERNS DATA THAT IS NOT PUBLICLY EXPOSED, WHILE LINKEDIN'S PUBLIC PROFILES ARE ACCESSIBLE TO ANYONE.
LinkedIn claimed that hiQ had violated the terms of its user agreement. In response, the Ninth Circuit noted that by sending the cease-and-desist letter, LinkedIn had itself stripped hiQ of its status as a "user". Although LinkedIn did not claim any right over public profile content, it argued that it had a responsibility to safeguard its users' privacy and therefore sought to block hiQ's scraping. The court gave this argument little weight, since there is very little expectation of privacy in public profile content.
Though the court discussed several claims, the case was essentially about the CFAA. In the end, the court did not go so far as to declare that a website owner has no recourse against wholesale copying of its public content.
The United States Court of Appeals for the Ninth Circuit ruled in favor of hiQ, a data analytics company that scrapes LinkedIn public profiles. The ruling has many implications. The litigation may not be over and could wind up before the United States Supreme Court.
Read broadly, however, the ruling suggests that online data and facts that are public, and that are neither owned by the publisher nor protected by passwords, can be collected by third parties. The court also expressed concern about giving companies the right to choose who can collect and use data that they do not own, that they make publicly available to all viewers, and that they themselves collect and use. Doing so would risk creating information monopolies that would disserve the public interest.
To help organizations extract large volumes of data from the web without hassle, I offer
affordable data extraction services. If you need help with your web scraping
projects, get in touch with me and I will be pleased to help.