The short answer

Google is pruning pages from its index more aggressively this year, and most of what's getting removed isn't penalized. It's just not earning its slot. Thin content, stale pages, orphaned URLs, and AI-drafted filler are the clearest targets. The same patterns also keep pages out of AI citations entirely, so the stakes are higher than they used to be.

If you've been watching "Discovered - currently not indexed" or "Crawled - currently not indexed" climb in your Google Search Console reports, that's the signal. It's not always a single page that triggers it. It's usually a pattern across the site that tells Google your content isn't worth the index space.

This isn't a penalty conversation. It's a curation conversation. Google is choosing to index less, and that changes how you have to think about what you publish.

What "Discovered - currently not indexed" and "Crawled - currently not indexed" actually mean

Both statuses show up in the GSC Page Indexing report, and both mean Google decided your page wasn't worth a slot in the index. The difference is how far Google got before making that call.

Discovered - currently not indexed. Google knows the URL exists but hasn't crawled it yet. This usually means Google decided the page wasn't worth spending crawl budget on, and it's often a signal about your site's overall trust, not just the individual page.

Crawled - currently not indexed. Google crawled the URL and chose not to index it. This usually means Google read the page and decided it didn't add anything worth including in search results. The page is on the radar; it's just not making the cut.

Either way, the underlying message is the same: this page isn't earning a place. The question worth asking isn't "how do I force Google to index it?" It's "why didn't this page earn the slot in the first place?"

Why Google is pruning more aggressively right now

A few honest observations from the field this year.

The index is getting more curated. Google has been clear for a while that indexing is a quality decision, not a default. What's changed is how visibly that's playing out in client GSC reports. The volume of pages stuck in "Crawled - currently not indexed" has climbed across nearly every site I audit.

AI-generated content at scale is part of the pressure. The web has gotten noisier in the last two years. Google's response has been tighter, not looser. If your blog looks like 30 generic AI-drafted posts, you're in the cohort Google is trying to filter out, not the cohort it's trying to reward.

The "130-day" pattern is real enough to act on. Pages that haven't been crawled or meaningfully updated in roughly four months are more likely to drop out. This isn't a published Google rule, but it's consistent enough across audits that it's worth treating as a working assumption.

AI retrieval is raising the bar for what "indexable" means. A page Google deprioritizes in its index is also a page ChatGPT, Perplexity, and Google's AI Mode can't confidently pull from. The floor for being visible anywhere has moved up. A page that's borderline for indexing in 2026 is functionally invisible in AI search.

The real reasons pages get deindexed

The diagnosis usually comes back to one of these patterns. Most sites I audit have more than one in play.

Thin or duplicate content. Pages that don't clearly answer a real question, or that repeat what other pages on the site already say. This is the most common cause and the easiest to spot.

Stale content. Pages that haven't been touched in a year or more and aren't earning impressions. Freshness signals matter, but they have to reflect real content work, not just a republish date bump.

Orphan pages. Pages with no internal links pointing to them. Google reads internal linking as a vote of importance from the site itself. A page nobody on the site links to is a page Google reads as low-priority, and the orphan pages article covers why this matters more than most teams realize.
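Finding orphans doesn't require a crawler subscription. A minimal sketch, assuming requests and beautifulsoup4 are installed and using a placeholder domain: crawl internal pages starting from the homepage, collect every internal href, then diff the sitemap against what actually got linked.

```python
# Orphan candidates = sitemap URLs that no crawled page links to.
import re
from urllib.parse import urljoin, urldefrag, urlparse
import requests
from bs4 import BeautifulSoup

SITE = "https://www.example.com"  # placeholder domain
MAX_PAGES = 500                   # crawl budget for the check itself

def internal_links(html, base_url):
    """Yield absolute same-domain links found in one page's HTML."""
    soup = BeautifulSoup(html, "html.parser")
    for a in soup.find_all("a", href=True):
        url = urldefrag(urljoin(base_url, a["href"])).url
        if urlparse(url).netloc == urlparse(SITE).netloc:
            yield url

seen, linked, queue = set(), set(), [SITE]
while queue and len(seen) < MAX_PAGES:
    page = queue.pop()
    if page in seen:
        continue
    seen.add(page)
    try:
        html = requests.get(page, timeout=10).text
    except requests.RequestException:
        continue
    for url in internal_links(html, page):
        linked.add(url)
        queue.append(url)

# Quick-and-dirty sitemap fetch; use a real XML parse for nested sitemaps.
sitemap = requests.get(SITE + "/sitemap.xml", timeout=10).text
for url in sorted(set(re.findall(r"<loc>(.*?)</loc>", sitemap)) - linked):
    print("ORPHAN CANDIDATE:", url)
```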

Cannibalization. Multiple pages competing for the same query, none strong enough to break through. This is one of the most common underlying causes of deindexing I see in client audits, and it's the one teams are usually most surprised by. The keyword cannibalization article walks through how to diagnose and fix it.

Technical issues. Accidental noindex tags, broken canonicals, robots.txt blocks, server errors, redirect chains. Less common than quality issues, but worth ruling out before assuming the problem is content.
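Ruling these out takes minutes per URL. A minimal sketch, again assuming requests and beautifulsoup4, that surfaces the usual suspects for one page (the URL is a placeholder):

```python
# Check the common technical causes for one URL before blaming content:
# HTTP status, redirect hops, X-Robots-Tag, meta robots, and canonical.
import requests
from bs4 import BeautifulSoup

def inspect(url):
    resp = requests.get(url, timeout=10, allow_redirects=True)
    hops = [r.url for r in resp.history] + [resp.url]
    print("Status:        ", resp.status_code)
    print("Redirect hops: ", " -> ".join(hops) if len(hops) > 1 else "none")
    print("X-Robots-Tag:  ", resp.headers.get("X-Robots-Tag", "not set"))

    soup = BeautifulSoup(resp.text, "html.parser")
    robots = soup.find("meta", attrs={"name": "robots"})
    print("Meta robots:   ", robots.get("content") if robots else "not set")
    canonical = soup.find("link", rel="canonical")
    print("Canonical:     ", canonical.get("href") if canonical else "not set")

inspect("https://www.example.com/some-page")  # placeholder URL
```

A robots.txt block won't show up here (the page will still fetch for you); check that separately in the GSC report.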

Low site-level authority. A domain with a lot of weak pages earns less indexing trust across the board. This is the version of deindexing that catches teams off guard, because it's not about any one page. It's about the cumulative signal the site is sending.

How to check if a page has been deindexed

Three checks, in order:

1. Use the GSC URL Inspection tool. Paste the URL in. It tells you exactly why the page isn't indexed and what status Google has assigned. (For checking URLs in bulk, see the sketch below.)

2. Use the site: operator in Google. Search site:yourdomain.com/page-slug. If nothing comes up, the page isn't in the index.

3. Check the Pages report in GSC. The "Why pages aren't indexed" breakdown shows you the volume by reason. That's where you'll see whether you have a one-page problem or a site-wide pattern.

If the volume in "Crawled - currently not indexed" or "Discovered - currently not indexed" is climbing month over month, that's the signal you have a pattern, not an incident.
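For more than a handful of URLs, the URL Inspection API beats pasting them into GSC one at a time. A minimal sketch using google-api-python-client, assuming a service account that's been granted access to the verified GSC property; the key file, property, and URLs are all placeholders. The API is quota-limited per day, so this is for spot-checking a list, not sweeping an entire site.

```python
# Bulk-check index status via the Search Console URL Inspection API.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder key file
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"])
service = build("searchconsole", "v1", credentials=creds)

SITE = "https://www.example.com/"                    # verified GSC property
urls = ["https://www.example.com/blog/some-post"]    # placeholders

for url in urls:
    result = service.urlInspection().index().inspect(
        body={"inspectionUrl": url, "siteUrl": SITE}).execute()
    status = result["inspectionResult"]["indexStatusResult"]
    print(url)
    print("  coverage:   ", status.get("coverageState"))
    print("  last crawl: ", status.get("lastCrawlTime", "never"))
```

Log the coverageState values over time and the month-over-month pattern becomes obvious long before it's a crisis.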

The rule I always give clients

Every page on your site is either earning its index slot or putting the rest of your site at risk. There is no neutral page anymore.

How to prevent it (the stronger play than reacting)

Recovery is possible, but prevention is the better game. The fix sequence I run on every audit looks like this.

Audit what you have. Every URL has to earn its place. Pages that don't get internal links, don't earn impressions, and don't serve a clear user need shouldn't be live. Sentimentality about old blog posts is one of the more common causes of indexing trouble.
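The "earn its place" test is measurable. A minimal sketch that diffs the sitemap against a year of Search Analytics data to find pages with zero impressions, assuming the same service-account setup as the inspection sketch above; the key file and domain are placeholders.

```python
# Pages that are live but earned zero impressions in 12 months.
import re
from datetime import date, timedelta
import requests
from google.oauth2 import service_account
from googleapiclient.discovery import build

SITE = "https://www.example.com/"  # verified GSC property (placeholder)
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder key file
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"])
service = build("searchconsole", "v1", credentials=creds)

body = {
    "startDate": str(date.today() - timedelta(days=365)),
    "endDate": str(date.today()),
    "dimensions": ["page"],
    "rowLimit": 25000,
}
rows = (service.searchanalytics().query(siteUrl=SITE, body=body)
        .execute().get("rows", []))
earning = {row["keys"][0] for row in rows}  # pages with any impressions

sitemap = requests.get(SITE + "sitemap.xml", timeout=10).text
for url in sorted(set(re.findall(r"<loc>(.*?)</loc>", sitemap)) - earning):
    print("ZERO IMPRESSIONS:", url)
```

Cross-reference the output with internal links and user need before deleting anything; zero impressions is a symptom to investigate, not a verdict.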

Consolidate redundancy. Multiple pages on the same topic usually want to be one page. This is where cannibalization cleanup does the most work. The keyword cannibalization article walks through the framework I use.

Fix the technical basics. No accidental noindex tags. No broken canonicals. No orphan pages quietly draining trust from the rest of the site. The orphan pages article covers the orphan side specifically.

Redirect with intent. When you remove a page, redirect to the closest relevant match, not the homepage. Don't redirect informational content to commercial pages. The 301 redirects article has the full picture on doing this without losing equity.
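It's worth verifying redirects after you ship them, too. A minimal sketch, assuming requests is installed, that flags chains, non-301 first hops, wrong destinations, and anything that dumps users on the homepage; the URL map is a placeholder.

```python
# Sanity-check a redirect map: old URL -> intended destination.
import requests
from urllib.parse import urlparse

redirects = {  # placeholders
    "https://www.example.com/old-guide": "https://www.example.com/new-guide",
}

for old, intended in redirects.items():
    resp = requests.get(old, timeout=10, allow_redirects=True)
    hops, final = len(resp.history), resp.url
    if hops > 1:
        print(f"CHAIN ({hops} hops):  {old} -> {final}")
    if urlparse(final).path in ("", "/"):
        print(f"HOMEPAGE DUMP:       {old} -> {final}")
    if final.rstrip("/") != intended.rstrip("/"):
        print(f"WRONG DESTINATION:   {old} -> {final} (wanted {intended})")
    if resp.history and resp.history[0].status_code != 301:
        print(f"NOT A 301:           {old} starts with "
              f"{resp.history[0].status_code}")
```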

Write for real expertise. AI drafts without editorial oversight are one of the clearer signals Google is using to prune. If your content reads like it could've been written by anyone with a prompt, it probably isn't earning its index slot.

Update what deserves to stay. Republishing dates without real content updates is a habit Google sees through. Real freshness means real changes: new examples, updated data, sharpened arguments. If a page isn't worth updating meaningfully, it might not be worth keeping.

What to do if you're already seeing pages deindexed

If the pattern is already in your reports, three notes on what to actually do.

Don't mass-request reindexing. Submitting hundreds of URLs through GSC doesn't address why the pages weren't worth indexing in the first place. Google will recrawl them and make the same call.

Fix the underlying issue first, then request indexing. If the page was deindexed because of thin content, rewrite it. If it was deindexed because of cannibalization, consolidate the cluster. If it was a technical issue, fix the tag or the canonical, then request reindexing.

Accept that some pages shouldn't come back. A cleaner, smaller index of strong pages is a healthier outcome than restoring a pile of weak ones. Some of what's been pruned is exactly what should've been pruned. The instinct to "save" every page is usually the instinct that got the site into trouble in the first place.

Where I land

The sites that are fine right now are the sites that already treated indexing as a privilege, not a default. The ones getting surprised are usually the ones that published at volume without a strategy. The fix is the same whether you're worried about Google rankings or AI citations: fewer pages, better pages, clearer topical ownership. None of this is new SEO advice. It's just enforced more strictly than it used to be.

Related reading

What Is Keyword Cannibalization?
What Are Orphan Pages?
What Is a 301 Redirect?

Frequently asked questions

What does "Discovered - currently not indexed" mean?

It means Google knows the URL exists but hasn't crawled it yet. Usually it's a signal that Google decided the page wasn't worth spending crawl budget on, often based on the site's overall trust signals rather than the individual page.

What's the difference between "Discovered" and "Crawled - currently not indexed"?

"Discovered" means Google knows about the URL but hasn't crawled it. "Crawled" means Google read the page and chose not to index it. Both result in the same outcome: the page isn't in Google's index. The Crawled status is usually a clearer signal that the specific page didn't earn its slot.

Why is Google deindexing pages in 2026?

Google is being more selective about what it indexes, mostly in response to the volume of low-quality and AI-drafted content on the open web. Thin pages, duplicate content, stale pages, orphan pages, and cannibalized clusters are the most common targets. It's a curation decision, not a penalty.

How do I get a deindexed page back into Google?

Fix the underlying reason it was deindexed first. If the page is thin, rewrite it. If it's part of a cannibalized cluster, consolidate. If there's a technical issue like an accidental noindex tag, fix it. Then use the URL Inspection tool in GSC to request reindexing. Skipping the fix and just resubmitting won't work.

Does deleting old blog posts help with deindexing?

It can. A smaller, stronger set of pages usually performs better than a sprawling library of weak ones. Pages that aren't earning impressions, don't get internal links, and don't serve a clear user need are often hurting the site's overall indexing trust. Deleting or redirecting them to stronger destinations is one of the more underrated fixes.
