Yoast SEO’s Indexables framework acts as an abstraction layer that manages SEO-related metadata connected to posts and other resources.
A page-centric view of the web
Much of what Yoast SEO does involves storing, organizing, and analyzing information related to pages. Each page is associated with a unique canonical URL.
This reflects how search engines generally interpret the web. They build a map of pages based on URLs and store information about them. Yoast SEO follows a similar approach. Once that map exists, it becomes easier to check, update, and manage data related to each page.
Although the concept sounds simple, the word “page” is harder to pin down in a WordPress environment.
For instance, WordPress posts are not stored with a direct URL in the database. Whenever the system needs a page’s URL, it must generate it dynamically according to the site’s permalink settings. This process requires additional computational effort.
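To make the cost concrete, here is a minimal sketch of what building a URL from a permalink structure involves. It is written in Python for illustration and is not WordPress's or Yoast SEO's actual code. The function name build_permalink and the shape of the post dictionary are assumptions; the structure tags (%year%, %monthnum%, %day%, %postname%, %post_id%) are the ones WordPress documents for permalink settings.

```python
from datetime import date


def build_permalink(home_url: str, structure: str, post: dict) -> str:
    """Expand a permalink structure into a full URL for one post.

    `post` is a hypothetical dict with "id", "slug" and "published" keys.
    """
    published: date = post["published"]
    replacements = {
        "%year%": f"{published.year:04d}",
        "%monthnum%": f"{published.month:02d}",
        "%day%": f"{published.day:02d}",
        "%postname%": post["slug"],
        "%post_id%": str(post["id"]),
    }
    path = structure
    for tag, value in replacements.items():
        path = path.replace(tag, value)
    # Join without doubling the slash between the site root and the path.
    return home_url.rstrip("/") + "/" + path.lstrip("/")


post = {"id": 42, "slug": "example-page", "published": date(2018, 10, 20)}
print(build_permalink("https://www.example.com", "/%year%/%monthnum%/%postname%/", post))
# prints: https://www.example.com/2018/10/example-page/
```

Even in this simplified form, the URL depends on both site-wide settings and per-post data, so it has to be recomputed whenever it is needed unless the result is stored somewhere.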
The challenge isn’t limited to performance. Another issue is that the definition of a page can vary depending on the context.
What counts as a page?
Beyond typical webpages, websites can also contain archive views (such as posts written by a specific author), alternative formats like RSS feeds, taxonomy pages for tags or categories, error templates such as 404 pages, paginated results, and other specialized content types. From a search engine’s perspective, all of these can be considered pages.
From an SEO standpoint, each scenario requires different handling, rules, and conditions. Even a simple blog post may involve dozens of fields that need evaluation. These can include crawling and indexing directives, content analysis scores, keywords, presentation settings, media data, and more. All these elements and their relationships must be assessed to determine which SEO metadata should appear on the page.
For example, identifying the correct canonical URL for a page can require multiple queries and evaluations.
On larger websites, this amount of processing, storage, and logic can affect performance. This is particularly relevant in WordPress, where the database structure is not specifically optimized for this type of workload.
Additionally, some pages within a website should not be evaluated for SEO purposes. Certain content types may exist only within the administrative interface and are never accessible through public URLs. Since search engines cannot index these pages, storing or processing SEO data for them is unnecessary.
Being able to distinguish between indexable and non-indexable content is essential for efficient metadata management.
What is an indexable?
An indexable is any resource that could theoretically be indexed by a search engine through a specific URL. This includes many types of content beyond standard webpages, such as category archives, author pages, paginated archive views, media files, and more.
Examples include:
- https://www.example.com/example-page/ — a standard webpage
- https://www.example.com/example-category/page/2/ — a paginated category archive
- https://www.example.com/2018/10/20/ — a date archive
- https://www.example.com/author/laura/ — an author archive
- https://www.example.com/colors/red/ — a custom taxonomy archive
Non-public pages and pages that return errors are intentionally excluded.
Yoast SEO’s Indexables table in WordPress
Yoast SEO stores and manages indexables using a dedicated database table in WordPress. This table contains SEO-related information for every indexable resource the system recognizes. Because this information is stored centrally, determining the correct SEO metadata for a page becomes much faster.
The table is populated quietly in the background and is kept in sync with WordPress’s native metadata fields and workflows.
The table also updates automatically. When the system encounters an indexable it hasn’t processed before, it creates a new entry so that the information is available for future requests. In addition, the plugin offers a reindexing feature in the admin tools, which proactively builds the indexables table based on data from the site’s database.
With this table in place, Yoast SEO maintains an SEO-focused overview of the website that concentrates on pages and the metadata associated with them.
Indexing
The indexables table is created and maintained through two main processes:
- Various optimization features in the Yoast SEO interface may prompt users to run an indexing process, which prepares the data needed for certain tools and settings.
- When the system encounters an indexable that hasn’t been processed yet, it automatically generates the necessary record on demand (lazy generation).
Together, these processes ensure that the indexables table remains an accurate and complete representation of the website.
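The two processes above can be sketched as a get-or-create store with an optional bulk pass. This is an illustrative Python model only, not the plugin's implementation: the names Indexable, IndexableStore, get, and reindex are hypothetical, and an in-memory dictionary stands in for the database table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Indexable:
    # Hypothetical fields; the real table stores many more.
    permalink: str
    title: str


class IndexableStore:
    """In-memory stand-in for the indexables table."""

    def __init__(self, builder: Callable[[str], Indexable]) -> None:
        self._rows: Dict[str, Indexable] = {}
        self._builder = builder
        self.builds = 0  # counts how often the expensive builder ran

    def get(self, permalink: str) -> Indexable:
        """Lazy generation: build the record only on first access."""
        row = self._rows.get(permalink)
        if row is None:
            row = self._builder(permalink)
            self._rows[permalink] = row
            self.builds += 1
        return row

    def reindex(self, permalinks: Iterable[str]) -> None:
        """Proactive indexing: make sure every known URL has a record."""
        for permalink in permalinks:
            self.get(permalink)


store = IndexableStore(lambda url: Indexable(permalink=url, title="Example"))
store.get("https://www.example.com/example-page/")
store.get("https://www.example.com/example-page/")  # served from the store
print(store.builds)  # prints: 1
```

The design point is that the expensive work happens once per resource, whether it is triggered by a visit or by a bulk indexing run.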
Which indexables does Yoast SEO store?
Yoast SEO stores indexables for several types of resources, including:
- All public posts and taxonomies
- The homepage
- Author archives for authors who have published public posts
The system also stores patterns that represent templates or content types where creating individual indexables for every variation would be unnecessary. Examples include:
- Post type archives, taxonomy archives, and date archives
- Error pages
- Internal search result pages
A post or taxonomy is considered public when the public argument for its post type or taxonomy is set to true in register_post_type or register_taxonomy.
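As a rough illustration of that rule, the Python sketch below filters a set of registered post types by their public flag. The dictionary is a hand-written stand-in for the arguments passed to register_post_type, and the function name public_post_types is hypothetical. The default of False for a missing flag mirrors the documented default of register_post_type.

```python
from typing import Dict, List

# Hand-written stand-in for registered post types and their arguments.
REGISTERED_POST_TYPES: Dict[str, dict] = {
    "post": {"public": True},
    "page": {"public": True},
    "attachment": {"public": True},
    "revision": {"public": False},
    "nav_menu_item": {"public": False},
    "internal_note": {},  # hypothetical type registered without a public flag
}


def public_post_types(registered: Dict[str, dict]) -> List[str]:
    """Return the names of post types whose public flag is true."""
    return sorted(
        name for name, args in registered.items() if args.get("public", False)
    )


print(public_post_types(REGISTERED_POST_TYPES))
# prints: ['attachment', 'page', 'post']
```

Only the types returned here would qualify for individual indexables under the rule described above.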
Use cases
Once the system has a complete understanding of all public pages on a website, the indexables database can power several features and tools. For example:
- When generating metadata for a page’s <head> section, Yoast SEO can retrieve all required fields through a single database request.
- When building an XML sitemap, the system can quickly determine which indexables should be included or excluded.
- Other software and systems can integrate with this framework, extending or modifying its functionality as needed.
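The first two use cases can be sketched in a few lines once each page is a single record. The Python below is an assumption-laden illustration, not Yoast SEO's output logic: the field names (permalink, title, canonical, is_robots_noindex) and the functions head_tags and sitemap_urls are hypothetical.

```python
from dataclasses import dataclass
from html import escape
from typing import Iterable, List


@dataclass(frozen=True)
class Indexable:
    # Hypothetical subset of the fields stored per page.
    permalink: str
    title: str
    canonical: str
    is_robots_noindex: bool = False


def head_tags(row: Indexable) -> List[str]:
    """Render head metadata from one already-fetched record."""
    robots = "noindex, follow" if row.is_robots_noindex else "index, follow"
    return [
        f"<title>{escape(row.title)}</title>",
        f'<link rel="canonical" href="{escape(row.canonical, quote=True)}" />',
        f'<meta name="robots" content="{robots}" />',
    ]


def sitemap_urls(rows: Iterable[Indexable]) -> List[str]:
    """Keep only the records that should be listed in an XML sitemap."""
    return [row.permalink for row in rows if not row.is_robots_noindex]


rows = [
    Indexable("https://www.example.com/example-page/", "Example & more",
              "https://www.example.com/example-page/"),
    Indexable("https://www.example.com/thank-you/", "Thanks",
              "https://www.example.com/thank-you/", is_robots_noindex=True),
]
print("\n".join(head_tags(rows[0])))
print(sitemap_urls(rows))
```

Because every value needed for the head section lives on one record, rendering needs no further lookups, and the sitemap decision reduces to a filter over the same records.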
FAQs
What does “indexables” mean in SEO?
In general SEO usage, indexable pages are the URLs on a website that search engines are allowed to crawl and include in their search index. Only indexable pages can appear in search engine results.
What is an indexables functional specification?
An indexables functional specification is a technical document that explains how a system manages which URLs should be indexable. It outlines rules, logic, and processes used to determine whether a page should be included in search engine indexes.
Why is an indexables system important for SEO?
An indexables system helps control which pages are visible to search engines. This ensures that only valuable, high-quality pages are indexed while duplicate, low-value, or unnecessary pages are excluded.
How do websites decide which pages should be indexable?
Websites typically decide this based on factors such as page value, duplicate content, canonical tags, robots directives, and whether the page provides meaningful information for users.
What role does the noindex directive play in indexables management?
The noindex directive tells search engines not to include a page in their search index. It is commonly used for pages like login screens, thank-you pages, or filtered search results.
How do canonical tags relate to indexables?
Canonical tags help search engines understand which version of a page should be considered the primary version. This prevents multiple similar pages from being indexed separately.
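Both signals can be read from a page's HTML. The sketch below uses Python's standard html.parser module to look for a robots meta tag and a canonical link. The class and function names are hypothetical, and the check is deliberately simplified: it ignores HTTP headers such as X-Robots-Tag and crawler-specific meta names.

```python
from html.parser import HTMLParser
from typing import Optional, Tuple


class IndexabilityParser(HTMLParser):
    """Collects the robots noindex signal and the canonical URL."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            directives = [part.strip().lower() for part in content.split(",")]
            # "none" is shorthand for "noindex, nofollow".
            if "noindex" in directives or "none" in directives:
                self.noindex = True
        elif tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")


def inspect_head(html: str) -> Tuple[bool, Optional[str]]:
    """Return (noindex, canonical_url) for an HTML document."""
    parser = IndexabilityParser()
    parser.feed(html)
    parser.close()
    return parser.noindex, parser.canonical


sample = (
    '<head><meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://www.example.com/example-page/"></head>'
)
print(inspect_head(sample))
```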
What types of pages are usually excluded from indexing?
Pages like admin panels, duplicate pages, internal search results, staging environments, or temporary pages are often excluded from indexing to maintain a clean and efficient search index.
Can pagination affect indexable pages?
Yes, paginated pages can be indexable if they provide unique value or content. However, they must be handled carefully with proper linking and canonical tags to avoid unnecessary duplication.
How does crawl budget relate to indexable pages?
Crawl budget refers to the number of pages a search engine crawler will visit on a site within a certain period. Managing indexable pages helps ensure that crawlers spend time on the most important content instead of low-value pages.
How do internal links influence indexability?
Internal links help search engines discover pages and understand site structure. Pages that are well linked internally are more likely to be crawled and considered for indexing.
What tools can help identify indexable pages on a website?
SEO tools and website auditing tools can analyze which pages are indexable, detect noindex tags, check canonical settings, and identify pages that may be blocked from search engines.
What problems can occur if too many pages are indexable?
If too many low-quality or duplicate pages are indexable, search engines may struggle to identify the most important content. This can dilute ranking potential and reduce overall search visibility.
