Explosive Leak Reveals Google’s Closely Guarded Search Algorithm Secrets

Technology May 29, 2024

A massive leak of internal documents purportedly exposing the inner workings of Google’s search algorithm has sent shockwaves through the tech industry and search engine optimization (SEO) community.

The Verge reports that the leaked documents, spanning 2,500 pages, provide an unprecedented glimpse into how Google’s search algorithm ranks websites, a process that has long been shrouded in mystery. The leak was shared with Rand Fishkin, an SEO expert with over a decade of experience, by a source who hoped to counter what they believed were “lies” spread by Google employees about the search algorithm’s functionality.

While the leaked information is highly technical and may be more accessible to developers and SEO professionals, it offers valuable insights into the data Google collects from webpages, sites, and searchers. Although the documents do not definitively prove that Google uses the mentioned data and signals for search rankings, they provide indirect clues about what the company deems important, according to SEO expert Mike King.

The leak touches on various aspects of Google’s search algorithm, including the handling of sensitive topics like elections, the treatment of small websites, and the types of data collected and utilized. Notably, some of the information appears to contradict public statements made by Google representatives, as highlighted by both Fishkin and King.

“‘Lied’ is harsh, but it’s the only accurate word to use here,” King wrote in his analysis of the documents. “While I don’t necessarily fault Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the marketing, tech, and journalism worlds who have presented reproducible discoveries.”

Google has not publicly disputed the legitimacy of the leaked documents, despite multiple requests for comment from The Verge. However, a Google employee did reach out to Fishkin, asking him to modify some language in his post regarding the characterization of an event.

The search giant’s secretive algorithm has given rise to an entire industry of marketers who closely follow Google’s public guidance to optimize websites for millions of companies worldwide. The widespread use of these tactics has led to a perception that Google Search results are deteriorating, cluttered with low-quality content that website operators feel compelled to produce to maintain visibility.

The leaked documents raise questions about the accuracy of Google’s public statements regarding how Search works. For example, while Google representatives have repeatedly indicated that Chrome data is not used for ranking pages, the documents specifically mention Chrome in sections discussing how websites appear in Search results.

Another point of contention is the role of E-E-A-T (experience, expertise, authoritativeness, and trustworthiness) in ranking. Although Google representatives have previously stated that E-E-A-T is not a ranking factor, the documents suggest that Google collects author data from pages and has a field indicating whether an entity on the page is the author.