How Google Locates and Identifies Malware
At SecTOR, Google security researcher details how the search giant identifies malware and why the company doesn't remove all malware pages from its search index.
TORONTO -- Google knows a thing or two about malware on the Web. Google comes across malware on a regular basis and has made a number of efforts to help secure Web users against potential malware risk.
In a session at the SecTOR security conference in Toronto, Google security researcher Fabrice Jaubert detailed how the search engine giant identifies malware and what it does to help protect the safety and security of Web users.
At a high level, Google differentiates among what it considers to be malware, phishing and spam sites. Jaubert noted that a malware site is one that hosts malicious software that could harm a user's computer. Google flags malware sites for users with a warning.
The warning that Google shows to search users has evolved in recent years. Jaubert explained that early on, Google displayed a warning page about potential malware on a given page, along with a button that let users click through to the page.
According to Jaubert, 95 percent of users were still clicking through to the page with the malware on it, even though Google had provided a warning. So Google shifted tactics. Now the company provides the URL of the malware site as plain text, which requires a user to copy and paste the address if he or she still wants to proceed to the malware site.
Jaubert also noted that only some types of malware sites are ever removed from the search index. Google does remove phishing sites from its index, as well as those sites that it considers to be 'spammy.' A spammy website is one that typically doesn't contain any real content and is just a mechanism for distributing spam via a link farm.
In terms of the types of sites that potentially have malware on them, Jaubert told the audience that the distribution is widespread.
"There really is no safe harbor on the Web," Jaubert said. "Your browsing habits are not a factor in keeping you safe."
The geography of malware is also widespread, though the U.S. is a haven for malware distribution sites. According to Jaubert, the U.S. hosts over 25 percent of malware distribution.
"The problem is global, but it starts here in our backyard," Jaubert said.
From Google's perspective, malware often takes the form of a drive-by download, where malicious software is downloaded onto a user's computer without his or her permission. Jaubert noted that for a drive-by download to work, there typically has to be a vulnerability in the Web browser or in one of its add-ons. He added that it's often very difficult for users to keep browser plug-ins, such as those for Flash and PDF, up to date.
In addition to a vulnerability on the user side, there needs to be a legitimate website that is serving the malware or redirecting users to a malware distribution site. Examples of how malware redirection can occur include the insertion of an iframe or a script tag that pulls content from the distribution site.
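To make the redirection mechanism concrete, here is a minimal Python sketch that lists the iframe and script tags on a page whose source lives on a different host. It is only an illustration, not Google's detection method: the function name, class name and example URLs are all hypothetical, and many legitimate pages embed third-party content, so the output is a list of candidates to inspect rather than a verdict.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ExternalEmbedFinder(HTMLParser):
    """Collects iframe/script sources served from a host other than the page's own."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname
        self.external_sources = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("iframe", "script"):
            return
        src = dict(attrs).get("src")
        if not src:
            return  # inline script or empty frame: nothing is pulled from elsewhere
        absolute = urljoin(self.page_url, src)  # resolve relative and //host/ URLs
        host = urlparse(absolute).hostname
        if host and host != self.page_host:
            self.external_sources.append((tag, absolute))


def find_external_embeds(page_url, html):
    """Return (tag, url) pairs for embeds that load from another host."""
    finder = ExternalEmbedFinder(page_url)
    finder.feed(html)
    return finder.external_sources


if __name__ == "__main__":
    # Hypothetical landing page with one local script and one hidden external iframe.
    sample = (
        '<html><body><script src="/js/site.js"></script>'
        '<iframe src="http://distribution.example/load" width="0" height="0"></iframe>'
        "</body></html>"
    )
    for tag, url in find_external_embeds("http://shop.example/index.html", sample):
        print(tag, url)
```

The local script is ignored because it resolves to the page's own host; only the iframe pointing at the other host is reported.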
Jaubert noted that often there is a central distribution site for malware that feeds many different landing pages.
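The pattern of one distribution site feeding many landing pages suggests a simple way to look for such hubs: count how many distinct pages embed content from each host. The Python sketch below does only that, under assumptions of our own. The function name and the sample data are hypothetical, and a high count is not evidence of malware by itself, since ad networks and analytics services appear on many pages too.

```python
from collections import Counter
from urllib.parse import urlparse


def rank_shared_hosts(embeds_by_page):
    """Count, for each embedded host, how many landing pages pull content from it.

    embeds_by_page maps a landing page URL to the list of URLs it embeds.
    Returns (host, page_count) pairs, most widely shared host first.
    """
    counts = Counter()
    for embedded_urls in embeds_by_page.values():
        # Use a set so a host embedded twice on one page is counted once.
        hosts = {urlparse(url).hostname for url in embedded_urls}
        hosts.discard(None)
        counts.update(hosts)
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical crawl results.
    pages = {
        "http://shop.example/": ["http://distribution.example/load"],
        "http://blog.example/post": [
            "http://distribution.example/load",
            "http://widgets.example/w.js",
        ],
        "http://forum.example/t/1": ["http://distribution.example/x.js"],
    }
    for host, page_count in rank_shared_hosts(pages):
        print(host, page_count)
```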
The detection pipeline that Google uses to identify malware includes the use of virtual machines running unpatched Windows operating systems with a vulnerable Internet Explorer (IE) browser and out-of-date plug-ins. Jaubert noted that Google has also tested for malware with Firefox, but typically new malware is first found for IE.
The Google virtual machines then monitor network traffic and simulate responses to determine whether malicious activity is occurring. Google's detection pipeline scans millions of websites to help identify sites that could be hosting malware.
The results of the Google detection pipeline feed a number of Google tools that help users and website owners reduce their malware risks. The Safe Browsing API exposes Google's list of potentially malicious sites, and the technology is integrated into the Apple Safari, Mozilla Firefox and Google Chrome Web browsers. For webmasters, the Google Webmasters service provides information about potential malware on their sites.
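As a rough illustration of how a browser might consult a list of potentially malicious sites before loading a page, the Python sketch below checks a URL against a locally held set of host and path entries. It is a simplified stand-in and does not reproduce the actual Safe Browsing protocol, its URL canonicalization or its list format; the function names, the entry format and the example URLs are all assumptions made for this example.

```python
from urllib.parse import urlparse


def candidate_expressions(url):
    """Build host/path combinations under which a URL might appear on a list.

    Simplified assumption: try the full host and its parent domains (down to
    two labels), each combined with "/", every directory prefix, and the path.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    labels = host.split(".")
    hosts = [".".join(labels[i:]) for i in range(len(labels) - 1)] or [host]

    path = parsed.path or "/"
    segments = [s for s in path.split("/") if s]
    paths = ["/"]
    current = ""
    for segment in segments[:-1]:
        current += "/" + segment
        paths.append(current + "/")
    if path != "/":
        paths.append(path)

    return [h + p for h in hosts for p in paths]


def is_listed(url, blocklist):
    """Return True if any candidate expression for the URL is on the list."""
    return any(expr in blocklist for expr in candidate_expressions(url))


if __name__ == "__main__":
    # Hypothetical local list; a real client would obtain and update its data
    # from the list provider.
    blocklist = {"distribution.example/load/"}
    for url in (
        "http://cdn.distribution.example/load/payload.exe",
        "http://shop.example/index.html",
    ):
        print(url, "listed" if is_listed(url, blocklist) else "not listed")
```

Matching on parent domains and directory prefixes lets a single list entry cover every page beneath a distribution path, at the cost of being coarser than an exact URL match.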