The rise of internal indexes
Contrary to popular belief, ChatGPT is no longer (or not always) a simple layer on top of Bing or Google when it looks for information beyond what it learned during training. To save compute—and therefore money—the conversational platform seems to increasingly favor its internal cache over real-time searches on traditional engines.
Some experts are starting to argue that ChatGPT’s fan-out queries, now absurdly long and unusable on a classic search engine, prove that the platform relies heavily on its own internal search system. These ultra long-tail queries, padded with synonyms, appear designed to perform semantic search inside a vector database: ChatGPT’s vectorized cache.
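To make that hypothesis concrete, here is a minimal sketch of what such a lookup could look like. Everything in it is an assumption: the `embed` function is a toy stand-in for a real text-embedding model, and `cached_pages` plays the role of ChatGPT’s hypothetical vectorized cache.

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in for a real text-embedding model: hash each word
    into a fixed-size vector, then L2-normalize."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Hypothetical vectorized cache: page snapshots stored as vectors,
# with no notion of whether the source URL is still alive.
cached_pages = {
    "https://en.wikipedia.org/wiki/Example_page": embed(
        "renowned expert consultant authority in industrial widgets"
    ),
}

def semantic_search(fan_out_query: str, top_k: int = 3):
    """Rank cached snapshots by cosine similarity to the query."""
    q = embed(fan_out_query)
    scores = {url: float(q @ v) for url, v in cached_pages.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]

# An ultra long-tail, synonym-stuffed query: useless on Google,
# perfectly reasonable against a vector index.
query = ("best top leading renowned expert specialist consultant "
         "authority industrial widgets widget manufacturing")
print(semantic_search(query))
```

Seen this way, the synonym stuffing makes sense: extra synonyms would wreck a keyword query, but they only broaden a cosine match.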
It’s also worth noting that I don’t see the slightest trace of these nonsensical queries (which would be easy to spot) in my Google Search Console, which supports the idea that they are no longer used to retrieve Google search results.
And if that’s true, it creates a new problem.
Where a traditional engine cleans up dead links to reflect the state of the Web (a page that no longer exists is, by definition, inaccessible to users), an AI retains concepts, indifferent to HTTP status codes. When it needs to fetch information, it doesn’t check whether a source is available; it checks the semantic relevance of a memory. This weakness turns ephemeral data into durable “truths.”
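The asymmetry is easy to express in code. This is a deliberately simplified sketch, not anyone’s actual implementation: the first function stands in for a traditional engine’s dead-link cleanup, the second for a cache that answers from memory.

```python
import requests

def classic_engine_lookup(index: dict[str, str], url: str) -> str | None:
    """Conceptually what a traditional engine does: a dead page is
    evicted from the index, so it can never be served again."""
    try:
        resp = requests.head(url, allow_redirects=True, timeout=5)
        if resp.status_code >= 400:
            index.pop(url, None)  # reflect the current state of the Web
            return None
    except requests.RequestException:
        return None
    return index.get(url)

def semantic_cache_lookup(cache: dict[str, str], url: str) -> str | None:
    # The cache answers from memory: no HTTP request is made, so a 404
    # on the live Web changes nothing here. Deleted and live pages are
    # indistinguishable.
    return cache.get(url)
```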
The mechanics of Wikipedia-based injection
A current black-hat GEO technique exploits this persistence through Wikipedia. The attacker creates a promotional page (for a brand, a service, their own name), forces the model to ingest it immediately (asking ChatGPT to summarize the newly created Wikipedia page is enough), then lets moderators delete it (which inevitably happens, given how famously strict Wikipedia’s editorial rules are).
For a human, it becomes a 404. For the AI, it remains validated data, because it comes from an authoritative source (it’s hard to find anything more authoritative than Wikipedia), stored and reusable for answering any user on the platform. The attacker then places links to the dead URL on third-party sites known to be regularly crawled by AI bots, which keeps reactivating the model’s memory. The user is left with a paradox: an unreachable source cited with confidence by the AI.
What I’m describing isn’t the product of a diabolical imagination. I’ve seen it with my own eyes—and I see it more and more. I ask ChatGPT to name the best experts in a specific field, and the AI gives me names while displaying, as a clickable link, the Wikipedia page it used. I click the link and land on a page that doesn’t exist.
I open a new conversation and ask the AI engine to summarize the content of the URL. ChatGPT summarizes in detail the content of the URL… which doesn’t exist!
[Screenshot: a rare case of a Wikipedia page that no longer exists, yet is heavily cited by ChatGPT]
How is such a feat possible?
My inner investigator woke up, so I dusted off my Wikipedia account to try to understand the phenomenon. The page (to take just one example among many) had been created last September and was deleted by moderators two days later.
Two days online is more than enough for OpenAI’s bots to ingest it—and enough for them to remember it three months later, to the point of showing it in responses.
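Anyone can verify this kind of timeline themselves: Wikipedia’s deletion log is public and queryable through the MediaWiki API. A small sketch (the page title below is a placeholder, not the real entry):

```python
import requests

# Wikipedia's deletion log is public: the MediaWiki API exposes it
# via list=logevents. The page title here is a placeholder.
API = "https://en.wikipedia.org/w/api.php"
params = {
    "action": "query",
    "list": "logevents",
    "letype": "delete",
    "letitle": "Some_promotional_page",
    "lelimit": "10",
    "format": "json",
}
# Wikimedia asks API clients to identify themselves with a User-Agent.
headers = {"User-Agent": "deletion-log-check/0.1 (contact@example.com)"}

resp = requests.get(API, params=params, headers=headers, timeout=10)
for event in resp.json()["query"]["logevents"]:
    # Each entry records who deleted the page, when, and the stated reason.
    print(event["timestamp"], event["user"], event.get("comment", ""))
```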
Digging deeper, I discovered that the now-nonexistent Wikipedia page was linked from a page on a site that clearly belongs to the people who created the promotional entry. On that page, the site proudly claims it is referenced on Wikipedia, supposedly proving its legitimacy in its industry. And in case ChatGPT can no longer retrieve the Wikipedia page, the site provides a detailed summary (no doubt to keep the model’s memory alive).
It’s diabolical—and likely temporary—but in late 2025, the reality is that this technique works perfectly well: the brand is mentioned almost every time you ask ChatGPT a question related to its business, with a nice clickable “Wikipedia” link next to its name as a bonus.
From ranking to memory imprinting
SEO is morphing into memory imprinting: the goal is no longer to be ranked, but to be remembered. This persistence makes the Right to be Forgotten technically ineffective, because you can’t easily remove data dissolved into billions of parameters. The cache becomes a parallel Web—a graveyard of data that can be reactivated on demand—creating a critical risk for e‑reputation management and disinformation, which an army of malicious actors could exploit.
Toward a decoupled architecture of truth
User experience is now splitting into two realities. On one side, the living Web: moderated and verifiable. On the other, a synthetic Web mixing live data and stale memories. Future strategies will have to do more than monitor what gets published: they will need to audit what gets remembered. Because the next major security flaw may well be the injection of harmful ghost memories—served as established facts by ChatGPT long after the evidence has vanished.
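What would such an audit look like in practice? A modest starting point is to regularly collect the sources an assistant cites about your brand and flag the ones that no longer resolve. A minimal sketch; the list of cited URLs is assumed to come from your own logging of the model’s answers, and the examples are hypothetical.

```python
import requests

def audit_ghost_citations(cited_urls: list[str]) -> list[str]:
    """Return the cited sources that no longer resolve on the live Web."""
    ghosts = []
    for url in cited_urls:
        try:
            resp = requests.head(url, allow_redirects=True, timeout=5)
            if resp.status_code >= 400:
                ghosts.append(url)
        except requests.RequestException:
            ghosts.append(url)
    return ghosts

# Hypothetical URLs harvested from the assistant's answers about a brand.
citations = [
    "https://en.wikipedia.org/wiki/Some_deleted_promotional_page",
    "https://example.com/a-page-that-still-exists",
]
print(audit_ghost_citations(citations))
```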