Methodology

How LLMVerse measures AI visibility.

We’d rather show our work than wave our hands. Below: how the daily pipeline runs, what we measure, and where the numbers come from.

Prompt generation

We auto-generate the prompts. You don’t fill in a spreadsheet.

When you add a brand, we generate up to 100 prompts (and up to 1,000 across the brand’s lifetime) from three sources:

  • Your sitemap — we crawl your top URLs to understand product categories, use cases, and audience signals.
  • Your industry vertical — we maintain a library of common buyer questions for ~30 verticals (D2C, SaaS, fintech, edtech, etc.).
  • Live AI suggest data — we sample what users actually type into ChatGPT, Perplexity, and Gemini auto-complete in your category.
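As an illustration only (not the production pipeline), here is a minimal sketch of how candidates from the three sources could be merged under the two caps. The round-robin interleaving, the whitespace and case normalisation, and every name in the block (select_prompts, INITIAL_CAP, LIFETIME_CAP) are assumptions made for the example.

```python
from itertools import zip_longest

INITIAL_CAP = 100     # prompts generated when a brand is added
LIFETIME_CAP = 1000   # total prompts across the brand's lifetime


def select_prompts(sitemap, vertical, suggest, already_generated=0,
                   batch_cap=INITIAL_CAP):
    """Merge candidate prompts from three sources, dedupe, and apply caps."""
    budget = max(0, min(batch_cap, LIFETIME_CAP - already_generated))
    if budget == 0:
        return []

    seen = set()
    picked = []
    # Assumption: take one prompt from each source in turn so that no
    # single source dominates the batch.
    for group in zip_longest(sitemap, vertical, suggest):
        for prompt in group:
            if prompt is None:
                continue  # this source ran out of candidates
            key = " ".join(prompt.lower().split())  # normalise case/spacing
            if key in seen:
                continue
            seen.add(key)
            picked.append(prompt)
            if len(picked) == budget:
                return picked
    return picked
```

Passing already_generated=998 leaves room for only two more prompts, which is how the lifetime cap would bite in this sketch.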

We don’t publish the prompt library; that’s how the system stays useful. You can see your own brand’s prompts in the dashboard, edit them, or add your own.

Engine querying

Official APIs where available. Structured search where not.

We query 7 engines daily: ChatGPT (OpenAI API), Gemini (Google AI Studio API), Claude (Anthropic API), Perplexity (Perplexity API), Grok (xAI API), DeepSeek (DeepSeek API), and Google AI Overviews (via SerpAPI structured search).

We do not scrape user accounts, automate browser sessions on logged-in UIs, or use credential-shared workspaces. Every request is an API call billed to our account: six engines through their official APIs, and Google AI Overviews through SerpAPI.

Queries originate from servers in India, which affects responses for region-aware engines like Google AI Overviews.

Scan frequency

Daily at 2 AM IST. Hourly on Business+.

Free and Pro brands scan once a day at 2 AM IST. Reports are ready by 7 AM IST for the morning standup. Business+ brands can opt into hourly scans for time-sensitive launches.

You can also trigger a manual scan from the dashboard at any time — useful when you publish a new piece of content and want to see whether AI picks it up before the next cycle.
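For readers working out when the next scan lands in their own timezone, the cadence above can be sketched in a few lines. This is a hedged example, not the scheduler itself: the function names are hypothetical, and running hourly scans at the top of each IST hour is an assumption.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed UTC+05:30 offset with no daylight saving time.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def next_daily_scan(now: datetime) -> datetime:
    """Return the next 2 AM IST strictly after `now` (timezone-aware)."""
    local = now.astimezone(IST)
    scan = local.replace(hour=2, minute=0, second=0, microsecond=0)
    if scan <= local:
        scan += timedelta(days=1)  # today's slot has passed; use tomorrow's
    return scan


def next_hourly_scan(now: datetime) -> datetime:
    """Return the start of the next IST hour (assumed hourly slot)."""
    local = now.astimezone(IST)
    return local.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("Next daily scan: ", next_daily_scan(now).isoformat())
    print("Next hourly scan:", next_hourly_scan(now).isoformat())
```

Pass a timezone-aware datetime; a naive one would be interpreted in the machine's local timezone by astimezone.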

Mention detection

Exact-match + fuzzy match + an LLM tie-breaker.

For each response we run three passes:

  1. Exact match on your brand name and registered aliases.
  2. Fuzzy match (Levenshtein distance ≤ 2) to catch typos and minor variants like “Mashreq Neo” vs “Mashreq-Neo”.
  3. LLM disambiguation (Claude Haiku 4.5) for ambiguous cases — common-word brand names, partial matches, references inside negations.
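
The three passes can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the production detector: case-insensitive exact matching, comparing same-length windows that start at a word boundary, skipping fuzzy matching for very short names, and routing every fuzzy hit to the third pass are all choices made for the example. The disambiguate hook is a hypothetical stand-in for the LLM step and makes no real API call.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


@dataclass(frozen=True)
class Match:
    kind: str  # "exact", "fuzzy", or "none"
    text: str  # matched span of the response ("" when kind is "none")


def detect_mention(
    response: str,
    names: list[str],
    max_distance: int = 2,
    disambiguate: Optional[Callable[[str, str], bool]] = None,
) -> Match:
    lowered = response.lower()

    # Pass 1: exact match on the brand name or any registered alias.
    for name in names:
        start = lowered.find(name.lower())
        if start != -1:
            return Match("exact", response[start:start + len(name)])

    # Pass 2: fuzzy match against windows that begin at a word start.
    for name in names:
        if len(name) <= 2 * max_distance:
            continue  # assumption: very short names are too noisy to fuzzy-match
        for word in re.finditer(r"\b\w", response):
            window = response[word.start():word.start() + len(name)]
            if levenshtein(window.lower(), name.lower()) > max_distance:
                continue
            # Pass 3 (hypothetical hook): a caller-supplied check, standing
            # in for the LLM disambiguation step, can veto the candidate.
            if disambiguate is None or disambiguate(response, window):
                return Match("fuzzy", window)

    return Match("none", "")
```

With the example from the list, detect_mention("I'd suggest Mashreq-Neo for this", ["Mashreq Neo"]) returns a fuzzy match on "Mashreq-Neo", because the hyphen is a single substitution away from the space.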

We’re building an internal labeled eval set to publish precision/recall benchmarks openly. Until then, you can flag false positives from any dashboard row and the system learns.

Sentiment scoring

Per-mention, with rationale stored alongside.

Each detected mention is scored positive / neutral / negative by Claude Haiku 4.5, with a one-sentence rationale stored next to the score. You can audit any sentiment call from the dashboard — click a mention to see the full response, the surrounding context, and the model’s reasoning.

Sentiment is scored on the specific mention, not the overall response — so a long ChatGPT answer that says “Brand A is great, Brand B is overpriced” gets two separate scores, not an averaged one.
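
To make the per-mention idea concrete, here is a small sketch of what one stored record could look like. The field names and the tally helper are hypothetical, and the two records are hand-written for illustration rather than real model output.

```python
from collections import Counter
from dataclasses import dataclass

LABELS = ("positive", "neutral", "negative")


@dataclass(frozen=True)
class MentionScore:
    brand: str
    label: str      # one of LABELS
    rationale: str  # one-sentence explanation stored next to the score

    def __post_init__(self):
        if self.label not in LABELS:
            raise ValueError(f"unknown sentiment label: {self.label!r}")


def tally(scores, brand):
    """Count labels for one brand; scores are never averaged across brands."""
    return Counter(s.label for s in scores if s.brand == brand)


# One response, two mentions, two independent scores (illustrative data).
response = "Brand A is great, Brand B is overpriced"
scores = [
    MentionScore("Brand A", "positive", "The answer calls Brand A great."),
    MentionScore("Brand B", "negative", "The answer calls Brand B overpriced."),
]
```

Keeping one record per mention means a negative remark about a competitor in the same answer never dilutes your own score.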

Why this matters

You should be able to challenge every number.

AEO is new. The metrics are not yet standardised. Anyone who tells you they have it all figured out is overselling.

We’re publishing our methodology because (a) you should be able to question any number on your dashboard, and (b) we’d rather get corrected fast than be wrong slowly.

Found something we got wrong, or have a method suggestion? Email hello@llmverse.io.

See the methodology in action.

Run a free audit on your domain. The report links back to this page from every metric.