The Discovery Gap: Why ChatGPT Knows Your Startup But Won’t Recommend It
TL;DR
I tested 112 Product Hunt startups with 2,240 queries across ChatGPT and Perplexity. The results challenge conventional wisdom about “Generative Engine Optimization” (GEO):
- The Discovery Gap: ChatGPT recognizes 99% of products directly but recommends only 3% organically (30:1 ratio)
- GEO doesn’t work (yet): No statistically significant correlation between GEO scores and ChatGPT discovery (r = -0.10)
- Traditional SEO wins: Backlinks (r=+0.32) and Reddit presence (r=+0.40) are the strongest predictors
Full paper: arXiv:2601.00912
Code & Data: GitHub
The Problem: ChatGPT Visibility for Startups
Every startup founder is asking the same question: “How do I get ChatGPT to recommend my product?”
This is a reasonable concern. As ChatGPT becomes a go-to tool for product discovery, being invisible to it means being invisible to a growing segment of potential customers.
The emerging field of Generative Engine Optimization (GEO) promises to solve this. The concept, introduced by researchers at IIT Delhi, suggests that optimizing content with citations, statistics, and authoritative language can improve visibility in AI-generated responses.
But does it actually work for ChatGPT?
I decided to find out.
The Experiment
Dataset: 112 Product Hunt Startups
I collected data on 112 products from Product Hunt’s December 2024 – January 2025 leaderboard. These represent the “best case” for new startups: recently launched, actively marketed, and getting meaningful traction.
For each product, I gathered:
- Product metadata: Name, tagline, category, URL
- SEO metrics: Referring domains, organic traffic, domain authority
- Social signals: Product Hunt upvotes, Reddit mentions
- GEO scores: Citation density, statistic usage, authoritative language
Query Design: 2,240 Tests
Each product was tested with 10 queries on each of the two LLMs, for 20 queries per product (112 × 20 = 2,240 in total):
Direct Queries (3 per product)
"What is [ProductName]?"
"Tell me about [ProductName]"
"Have you heard of [ProductName]?"
Discovery Queries (7 per product)
"What are the best [Category] tools launched in 2025?"
"Recommend some new [Category] products"
"What [Category] startups should I check out?"
"I'm looking for a [Category] solution. What are my options?"
The distinction matters. Direct queries test recognition—does ChatGPT know your product exists? Discovery queries test recommendation—will it actually suggest your product to users?
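The two query types can be generated from templates. The following is a minimal sketch, not the study's actual code: the template strings are copied from the lists above (only four of the seven discovery templates appear in this post, so only those four are included), while `build_queries`, `ExampleApp`, and the `note-taking` category are hypothetical names used for illustration.

```python
# Templates copied from the post. Only four of the seven discovery
# templates are listed there, so only those four appear here.
DIRECT_TEMPLATES = [
    "What is {name}?",
    "Tell me about {name}",
    "Have you heard of {name}?",
]
DISCOVERY_TEMPLATES = [
    "What are the best {category} tools launched in 2025?",
    "Recommend some new {category} products",
    "What {category} startups should I check out?",
    "I'm looking for a {category} solution. What are my options?",
]


def build_queries(name, category):
    """Return (query_type, query_text) pairs for one product.

    Direct queries name the product; discovery queries name only
    its category, so the product must be surfaced unprompted.
    """
    direct = [("direct", t.format(name=name)) for t in DIRECT_TEMPLATES]
    discovery = [
        ("discovery", t.format(category=category)) for t in DISCOVERY_TEMPLATES
    ]
    return direct + discovery


# "ExampleApp" and "note-taking" are made-up placeholders.
for kind, text in build_queries("ExampleApp", "note-taking"):
    print(f"{kind:9s} {text}")
```

Note that the discovery queries never contain the product name, which is what makes them a test of recommendation rather than recognition.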
LLMs Tested
- ChatGPT (GPT-4): The dominant LLM, tested without web search
- Perplexity: Web-search-augmented LLM for comparison
The Results
Finding #1: ChatGPT’s Discovery Gap is Massive
| Metric | ChatGPT | Perplexity |
|---|---|---|
| Direct Recognition | 99.4% | 94.3% |
| Organic Discovery | 3.3% | 8.3% |
| Visibility Gap | 30:1 | 11:1 |
ChatGPT knows almost every startup exists. When asked directly, it can provide accurate descriptions, features, and use cases.
But when users ask for recommendations—the queries that actually drive customer acquisition—these same startups almost never appear.
This is the Discovery Gap: the massive divide between ChatGPT’s knowledge and its recommendations.
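The gap ratios in the table follow directly from the two rates. As a quick check, assuming the ratio is simply recognition rate divided by discovery rate (the function name `visibility_gap` is mine, not from the paper):

```python
def visibility_gap(recognition_pct, discovery_pct):
    """Ratio of direct recognition to organic discovery."""
    return recognition_pct / discovery_pct


# Rates taken from the results table above.
rates = {"ChatGPT": (99.4, 3.3), "Perplexity": (94.3, 8.3)}

for model, (recognition, discovery) in rates.items():
    print(f"{model}: {visibility_gap(recognition, discovery):.1f}:1")
# ChatGPT: 30.1:1
# Perplexity: 11.4:1
```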

Finding #2: GEO Optimization Shows No Effect on ChatGPT
To measure GEO optimization, I adapted the scoring methodology from Aggarwal et al. (2024) — the IIT Delhi researchers who introduced the concept of Generative Engine Optimization in their seminal paper “GEO: Generative Engine Optimization”.
Their framework measures optimization across multiple dimensions:
- Citation density — References to authoritative sources
- Statistical content — Use of numbers and data points
- Authoritative language — Confident, expert-sounding phrasing
- Expert quotations — Inclusion of expert opinions
- Fluency optimization — Clear, well-structured content
Using this established GEO scoring framework, I calculated scores for each product and compared them to ChatGPT discovery rates.
The correlation?
r = -0.10 (not statistically significant)
Products with high GEO scores were no more likely to be recommended by ChatGPT than products with low scores. The optimization tactics that the GEO literature promotes showed no measurable association with discovery.
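For readers who want to run the same kind of comparison on their own data, here is a minimal sketch of a Pearson correlation in plain Python. It assumes the reported r is a Pearson coefficient, which this post does not state explicitly, and the sample values are invented purely to show the call; they are not from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Made-up illustration values, NOT data from the study:
# one GEO score and one discovery rate per product.
geo_scores = [0.2, 0.5, 0.8, 0.4, 0.9]
discovery_rates = [0.05, 0.00, 0.03, 0.10, 0.02]
print(f"r = {pearson_r(geo_scores, discovery_rates):+.2f}")
```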
Finding #3: Traditional SEO Signals Still Matter
If GEO doesn’t work, what does?
| Predictor | Correlation | p-value |
|---|---|---|
| Reddit Mentions | +0.40 | <0.01 |
| Referring Domains | +0.32 | <0.001 |
| Product Hunt Upvotes | +0.23 | <0.05 |
| GEO Score | -0.10 | n.s. |
The strongest predictors are the same factors that have driven SEO for decades:
- Reddit presence (r = +0.40): Products with genuine community discussions got recommended more often
- Backlinks (r = +0.32): More referring domains = more ChatGPT visibility
- Social proof (r = +0.23): Product Hunt engagement correlated with discovery
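The significance pattern in the table can be sanity-checked with the standard t-statistic for a correlation coefficient, t = r·sqrt((n − 2) / (1 − r²)). The sketch below assumes every correlation was computed over all n = 112 products (the post does not say whether any were excluded) and uses roughly 1.98 as the two-sided 5% critical value for about 110 degrees of freedom.

```python
import math


def t_statistic(r, n):
    """t-statistic for testing whether a correlation r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


N = 112        # assumption: all 112 products enter every correlation
T_CRIT = 1.98  # approximate two-sided 5% critical value, ~110 d.f.

# Correlations as reported in the table above.
reported = {
    "Reddit Mentions": 0.40,
    "Referring Domains": 0.32,
    "Product Hunt Upvotes": 0.23,
    "GEO Score": -0.10,
}

for name, r in reported.items():
    t = t_statistic(r, N)
    verdict = "significant" if abs(t) > T_CRIT else "n.s."
    print(f"{name:22s} r={r:+.2f}  t={t:+.2f}  {verdict}")
```

Under that assumption, the three SEO and social predictors clear the threshold and the GEO score does not, which matches the table.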

Finding #4: Perplexity’s Web Search Provides an Edge
Perplexity, with its real-time web search, achieved 2.5x better discovery rates than ChatGPT (8.3% vs 3.3%).
This suggests that web access meaningfully improves an LLM’s ability to surface new products. ChatGPT, limited to its training data, struggles more with recent launches.
Why Doesn’t GEO Work for ChatGPT?
Based on my analysis, I have three hypotheses:
1. The Training Data Problem
ChatGPT is trained on web data up to a knowledge cutoff. For products launched after that cutoff, no amount of GEO optimization will help—the content simply isn’t in the training set.
2. The Authority Gap
GEO techniques optimize the content of your pages. But ChatGPT appears to weight external signals (backlinks, mentions, authority) more heavily when deciding what to recommend.
A perfectly optimized landing page with zero backlinks may still be invisible.
3. The Recommendation vs Recognition Split
ChatGPT’s knowledge retrieval and recommendation behavior appear to work differently. Knowing about a product doesn’t mean recommending it, and the 30:1 gap is consistent with this.
Practical Implications for Founders
If you’re a startup founder thinking about ChatGPT visibility, here’s my takeaway:
1. Don’t Panic About GEO (Yet)
The GEO hype cycle is in full swing, but my data show no link between GEO scores and ChatGPT recommendations for new products. Save your optimization energy for proven strategies.
2. Focus on Traditional SEO
Referring domains (backlinks) showed one of the strongest correlations with ChatGPT discovery (r = +0.32), second only to Reddit mentions. The boring work of building genuine web presence still matters.
3. Build Community Presence
Reddit mentions were the single strongest predictor (r = +0.40). Authentic community engagement drives AI visibility.
4. Track the Right Metrics
If you want to measure LLM visibility, track organic discovery queries—not just direct recognition. The 30:1 gap between them is where the opportunity lies.
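One simple way to track this yourself is to save the responses to a set of discovery queries and count how often your product is named. The sketch below is an assumption about how a mention could be detected (a case-insensitive whole-name match); the post does not describe the study's own detection rule, and `ExampleApp` and the sample responses are hypothetical.

```python
import re


def mentions_product(response, product_name):
    """True if the product name appears as a whole token, ignoring case."""
    # Lookarounds keep "ExampleApp" from matching inside "ExampleApps".
    pattern = r"(?<!\w)" + re.escape(product_name) + r"(?!\w)"
    return re.search(pattern, response, flags=re.IGNORECASE) is not None


def mention_rate(responses, product_name):
    """Fraction of responses that mention the product at least once."""
    if not responses:
        return 0.0
    hits = sum(mentions_product(r, product_name) for r in responses)
    return hits / len(responses)


# Hypothetical saved responses to discovery queries.
discovery_responses = [
    "Popular options include Notion, Obsidian, and ExampleApp.",
    "You could try Notion or Evernote.",
    "Consider Obsidian for local-first notes.",
    "Evernote and OneNote are common choices.",
]
print(f"discovery rate: {mention_rate(discovery_responses, 'ExampleApp'):.0%}")
# discovery rate: 25%
```

Running the same function over responses to direct queries gives the recognition rate, so the two numbers can be compared on equal footing.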
5. Watch This Space
LLM capabilities are evolving rapidly. GEO may become relevant as models improve. But for now, the fundamentals win.
Limitations & Future Work
This study has limitations:
- Dataset: 112 products from Product Hunt may not generalize to all markets
- Timing: LLM capabilities change rapidly; these results reflect a specific snapshot
- Correlation vs causation: These are correlational findings, not causal claims
I’ve open-sourced all code and data for replication. If you run this experiment and find different results, I’d love to hear about it.
