Google search and search engine spam
1/21/2011 09:00:00 AMJanuary brought a spate of stories about Google’s search quality. Reading through some of these recent articles, you might ask whether our search quality has gotten worse. The short answer is that according to the evaluation metrics that we’ve refined over more than a decade, Google’s search quality is better than it has ever been in terms of relevance, freshness and comprehensiveness. Today, English-language spam in Google’s results is less than half what it was five years ago, and spam in most other languages is even lower than in English. However, we have seen a slight uptick of spam in recent months, and while we’ve already made progress, we have new efforts underway to continue to improve our search quality.Just as a reminder, webspam is junk you see in search results when websites try to cheat their way into higher positions in search results or otherwise violate search engine quality guidelines. A decade ago, the spam situation was so bad that search engines would regularly return off-topic webspam for many different searches. For the most part, Google has successfully beaten back that type of “pure webspam”—even while some spammers resort to sneakier or even illegal tactics such as hacking websites.
As we’ve increased both our size and freshness in recent months, we’ve naturally indexed a lot of good content and some spam as well. To respond to that challenge, we recently launched a redesigned document-level classifier that makes it harder for spammy on-page content to rank highly. The new classifier is better at detecting spam on individual web pages, e.g., repeated spammy words—the sort of phrases you tend to see in junky, automated, self-promoting blog comments. We’ve also radically improved our ability to detect hacked sites, which were a major source of spam in 2010. And we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content. We’ll continue to explore ways to reduce spam, including new ways for users to give more explicit feedback about spammy and low-quality sites.
Google is doing more to get rid of spam and webspam. This must be very good news for everyone - except of course the spammers themselves!