Website's Designer | Web Design & Development Studio

Many advanced SEO concepts are shaped by the academic research and patents that determine how search engines rank and present content. One man who played a huge role in uncovering these technical foundations was Bill Slawski.

To say Slawski was an SEO expert is an understatement. Bill dedicated ~~years~~ decades to analyzing search-related patents and their impact on SEO practices, all of which he documented on his blog, SEO by the Sea.

Many of his articles may seem dated as Google keeps growing and changing, but they are still valuable for understanding the fundamentals and core structure on which modern Google is built (and, since most modern search engines mimic Google, on which they are built too).

Grab a cup of coffee and set aside 45 minutes. Let’s go and geek out a bit!

1. Information Gain

What is the basis for Information Gain in SEO?

In the big picture, the concept of Information Gain originates from the field of information theory, particularly from research by Claude Shannon. Oh, yes, everyone’s favorite Claude Chadchin from CS101.

In SEO, this idea has evolved into how search engines determine the value of unique content.

The main goal is to avoid duplicate content. The secondary goal is to give unique, high-quality content more prominence.

By using a metric similar to Information Gain, Google makes sure that users like you and me are exposed to the most informative and unique content possible.
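To make the idea concrete, here is a toy sketch of a novelty score: the share of a page's terms that a reader has not already seen in other pages. This is not Google's metric and not Shannon's formal definition of information gain. The function name `novelty_score` and the whitespace tokenization are assumptions made purely for illustration.

```python
def novelty_score(candidate: str, already_seen: list[str]) -> float:
    """Return the fraction of distinct terms in `candidate` that do not
    appear in any of the `already_seen` documents (0.0 to 1.0).

    Hypothetical illustration only: real systems use far richer signals.
    """
    # Collect every term the reader has already been exposed to.
    seen_terms: set[str] = set()
    for doc in already_seen:
        seen_terms.update(doc.lower().split())

    candidate_terms = set(candidate.lower().split())
    if not candidate_terms:
        return 0.0

    # Terms that add something the reader has not seen yet.
    new_terms = candidate_terms - seen_terms
    return len(new_terms) / len(candidate_terms)


seen = ["pagerank counts links", "links are votes"]
rehash = "links are votes"
fresh = "dwell time reflects satisfaction"
# The rehashed page scores lower than the page with new material.
assert novelty_score(rehash, seen) < novelty_score(fresh, seen)
```

A page that only repeats what the other results already say scores near zero here, which is the intuition behind rewarding original content.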

What Bill Wrote About It

Bill Slawski played a big role in bringing different patents related to Information Gain to light through his detailed blog posts on SEO by the Sea. He emphasized how Google might be using such a system to reward content with high information gain and punish content that merely rehashes what’s already available. His insights allowed SEOs to focus on creating more original and in-depth content to satisfy the needs of search engines for fresh information.

Here are two articles from him that address duplicate content directly:

On a more advanced level, understanding concept research can offer value (even if it was “just a phase” in the evolution of Google):

2. PageRank

The origin of PageRank:

PageRank is one of the most famous SEO concepts, rooted in the academic research of Google’s founders Larry Page and Sergey Brin. The foundational paper is titled “The Anatomy of a Large-Scale Hypertextual Web Search Engine” (1998). This document introduced the world to the concept of PageRank, which calculates a webpage’s importance based on the quantity and quality of backlinks it receives.

The corresponding patent, “Method for Node Ranking in a Linked Database” (US Patent No. 6,285,999), explains how PageRank treats links as votes of confidence. A page linked to by many high-quality pages will rank higher.
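The "links as votes" idea can be sketched as a simple power iteration. This is a minimal textbook version, not Google's production system. The damping factor of 0.85 follows the value suggested in the 1998 paper, while the function name, the fixed iteration count, and the handling of pages without outgoing links are assumptions for illustration.

```python
def pagerank(links: dict[str, list[str]],
             damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Compute PageRank scores for a small link graph.

    `links` maps each page to the pages it links to.
    """
    # Include pages that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over every page.
                share = damping * ranks[page] / n
                for other in pages:
                    new_ranks[other] += share
        ranks = new_ranks
    return ranks


graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
# "c" receives two votes, "a" receives one strong vote, "b" receives none.
assert scores["c"] > scores["a"] > scores["b"]
```

Note that "a" ends up close to "c" even though it has a single inbound link, because that link comes from the highest-ranked page. That is the quality side of "quantity and quality of backlinks".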

Bill on PageRank

Slawski was one of the first SEO experts to break down the PageRank patent in an understandable way. He explained how PageRank works, what SEOs should focus on regarding backlinks, and how updates to PageRank (like Google’s switch to trust-based metrics and other algorithms) influenced rankings over time.

Bill’s Ten-Part Behemoth PageRank MegaThread

Beyond this MegaGuide, Slawski’s posts on SEO by the Sea often elaborated on subtle changes in PageRank that came through new patents or research, giving SEOs deeper insight into the continual evolution of the algorithm.

3. Long Clicks and Short Clicks

The basis for Long Clicks and Short Clicks:

Long clicks and short clicks are based on the concept of user engagement metrics, which search engines like Google use to gauge user satisfaction with a search result. These metrics are closely tied to patents like “Modifying search result ranking based on implicit user feedback” (US Patent No. 8661029B1), which describes how Google tracks user behavior to determine whether a particular search result was useful.

When a user clicks on a result and spends a significant amount of time on the page before returning to the search engine (a long click), it signals that the content was relevant. A short click or quick return to the search page (often called pogo-sticking) signals dissatisfaction with the result.
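As a rough sketch, the distinction can be expressed as a dwell-time threshold. The 30-second cutoff, the constant name, and the function names below are hypothetical; the source does not state a specific threshold, and real systems likely weigh many more signals.

```python
# Hypothetical cutoff: the text above does not give a specific value.
LONG_CLICK_SECONDS = 30.0


def classify_click(dwell_seconds: float) -> str:
    """Label a click 'long' or 'short' based on time spent on the page
    before the user returned to the search results."""
    return "long" if dwell_seconds >= LONG_CLICK_SECONDS else "short"


def long_click_rate(dwell_times: list[float]) -> float:
    """Share of clicks on a result that were long clicks (0.0 to 1.0)."""
    if not dwell_times:
        return 0.0
    long_clicks = sum(1 for t in dwell_times if classify_click(t) == "long")
    return long_clicks / len(dwell_times)


# Two visitors bounced back quickly (pogo-sticking), two stayed and read.
visits = [5.0, 45.0, 120.0, 10.0]
assert long_click_rate(visits) == 0.5
```

A result with a consistently low long-click rate would, under this simplified view, look less satisfying than a competing result for the same query.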

The Long and Short of It

Slawski consistently reviewed patents that highlighted the importance of user behavior in search engine rankings. In his analysis of these patents, he made it clear that Google was evolving toward an algorithm that took user experience into account rather than just relying on on-page factors or backlinks.

Slawski’s deep dive into these interaction-based signals gave SEOs early warning to focus on providing real value to users and improving the usability and satisfaction of their websites, beyond just traditional ranking factors.

4. Navboost and Navigational Queries

The patent behind Navboost:

Navboost relates to search queries where users intend to navigate to a specific website, such as “Facebook login” or “YouTube homepage.” These are known as navigational queries, and search engines like Google provide a boost to the most relevant pages for these queries.

Google often pushes the most relevant page for navigational queries to the top of the results, allowing users to quickly get to their intended destination.

Bill’s take on navigational queries

Slawski brought attention to patents in this area, explaining how brand websites and key pages benefit from such boosts in search rankings. His work helped SEOs understand the value of optimizing for branded and navigational keywords, and how to leverage this concept to dominate branded search results.

5. TF-IDF

Research behind TF-IDF:

TF-IDF (term frequency-inverse document frequency) is a mathematical model used to measure the importance of a term in a document relative to a corpus (collection of documents). It has been widely used in information retrieval and search engines. The inverse document frequency component was introduced by Karen Spärck Jones in 1972, and the combined weighting was popularized by Gerard Salton’s work in the 1970s and 1980s. It plays a key role in determining how search engines evaluate the relevance of a document based on keyword frequency.

TF-IDF is an information retrieval concept that forms the basis for understanding keyword relevance in a document and is part of how search engines rank pages based on content relevance.
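Here is a minimal sketch of the classic weighting: term frequency within the document multiplied by the logarithm of the inverse document frequency. Many variants exist (smoothing, sublinear term frequency, normalization), so treat this as one common textbook form rather than what any particular search engine runs. The function name and the pre-tokenized input are assumptions.

```python
import math
from collections import Counter


def tf_idf(term: str, doc_tokens: list[str], corpus: list[list[str]]) -> float:
    """Score how important `term` is to one document relative to a corpus.

    `corpus` is a list of tokenized documents (lists of words).
    """
    if not doc_tokens:
        return 0.0

    # Term frequency: how often the term appears in this document.
    tf = Counter(doc_tokens)[term] / len(doc_tokens)

    # Document frequency: how many documents contain the term at all.
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0

    # Rare terms get a high IDF; terms found everywhere get zero.
    idf = math.log(len(corpus) / df)
    return tf * idf


docs = [
    ["the", "cat"],
    ["the", "dog"],
    ["the", "patent", "patent"],
]
# "the" appears in every document, so it carries no weight.
assert tf_idf("the", docs[0], docs) == 0.0
# "patent" is rare in the corpus and frequent in its document.
assert tf_idf("patent", docs[2], docs) > tf_idf("cat", docs[0], docs)
```

The practical takeaway is that repeating a word everyone else also uses adds little, while terms that distinguish a page from the rest of the corpus carry the weight.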

Bill Slawski’s articles on TF-IDF

Bill explored patents where Google employed variations of TF-IDF in its ranking algorithms.

He highlighted the evolution of search engine technology from basic keyword matching to more sophisticated models like TF-IDF and helped SEOs understand how keyword density, usage, and distribution influence search rankings.

6. BM25

The basis of BM25:

BM25 is an improvement on the TF-IDF model and belongs to the BM (“Best Matching”) family of ranking functions, first implemented in the Okapi information retrieval system. Developed during the 1980s and 1990s by Stephen Robertson, Karen Spärck Jones, and others, BM25 refines the way term frequency and document length are factored into search rankings, making the model more adaptable to natural language.

Unlike simple TF-IDF, BM25 applies a diminishing returns function to the frequency of a term and normalizes for document length, ensuring that longer documents aren’t unfairly favored.
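Both effects are visible in a short sketch of the scoring function. The parameter values `k1=1.5` and `b=0.75` are commonly used defaults, not values taken from any Google patent, and the IDF form with `1 +` inside the logarithm is one widely used variant that keeps the weight non-negative. The function name and the pre-tokenized input are assumptions for illustration.

```python
import math
from collections import Counter


def bm25_score(query_terms: list[str],
               doc_tokens: list[str],
               corpus: list[list[str]],
               k1: float = 1.5,
               b: float = 0.75) -> float:
    """Score one tokenized document against a query using BM25.

    Assumes `corpus` contains at least one non-empty document.
    """
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    term_counts = Counter(doc_tokens)

    # Length normalization: documents longer than average are penalized,
    # shorter ones get a boost. `b` controls how strong the effect is.
    length_norm = 1.0 - b + b * len(doc_tokens) / avg_len

    score = 0.0
    for term in query_terms:
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        freq = term_counts[term]
        # Diminishing returns: as `freq` grows, this ratio approaches
        # (k1 + 1) instead of growing without bound.
        score += idf * freq * (k1 + 1.0) / (freq + k1 * length_norm)
    return score


corpus = [
    ["seo"] * 1 + ["filler"] * 9,
    ["seo"] * 5 + ["filler"] * 5,
    ["other"] * 10,
]
once = bm25_score(["seo"], corpus[0], corpus)
five_times = bm25_score(["seo"], corpus[1], corpus)
# Five mentions beat one mention, but by far less than a factor of five.
assert once < five_times < 5 * once
```

This is why stuffing a keyword into a page quickly stops paying off under a BM25-style model, and why padding a page with extra text dilutes each mention.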

Articles on SEO by the Sea regarding relevance

Slawski was one of the few SEO professionals to delve into the nuances of patents similar to BM25. By analyzing Google’s patents and drawing connections to academic papers like Robertson’s work, he showed how modern search engines were improving their relevance algorithms beyond basic term frequency.

His coverage helped marketers understand the shift from simple keyword optimization to more nuanced strategies, where document length, structure, and natural language usage began playing a more prominent role.

Learning Advanced SEO Concepts with Bill Slawski