AI & SEO 13 min

How to Measure Brand Share of Voice in AI Models — From Manual Tests to Automated Monitoring

A marketing director discovers that ChatGPT recommends a competitor, even though his own company holds a TOP 3 position in Google. Traditional SEO tools register nothing. I show how to build a methodology for measuring AI Share of Voice: from a manual baseline audit to automated monitoring with the Perplexity API and AnswerLyzer.

Martin, marketing director at a B2B SaaS firm, received a sales report: three prospects last month said the exact same thing — "I checked ChatGPT and it recommended a different company." Martin opened Google Analytics: organic traffic normal, TOP 3 positions for main keywords, CTR unchanged. Traditional SEO tools saw no problem.

The problem was elsewhere. When customers stop asking Google and start asking ChatGPT, search rankings no longer tell you what those customers see. Nobody notices that a competitor appears in AI responses instead of your brand until sales start to suffer.

This article shows how to measure that visibility: methodically, repeatably, with results you can act on.

What Is Share of Voice in AI Models

In traditional SEO, Share of Voice (SoV) measures the percentage of search visibility for a brand across a tracked keyword set. A brand appearing in the TOP 10 for 60 out of 100 tracked phrases has an SoV of 60%.

In the context of AI models, the definition is different — there are no ranked result lists. Models generate narrative responses: they cite, recommend, describe, and compare.

AI Share of Voice is the percentage of AI responses in which a brand is mentioned as recommended, cited as a source, or described as a leading option — across a representative set of industry queries.

Example: 100 queries in the marketing automation space sent to ChatGPT. Your brand appears in 28 responses → ChatGPT SoV = 28%. The same 100 queries in Perplexity → 41 mentions → Perplexity SoV = 41%. In Gemini → 14 mentions → Gemini SoV = 14%. Aggregated SoV (average across 3 models) = (28 + 41 + 14) ÷ 3 ≈ 28%.

Why it matters: AI users don't see a list of ten results — they see one or two recommended options. The winner-takes-most effect is stronger than in traditional search. A brand ranked #1 in Google can have an AI SoV of 0%.

Why Traditional SEO Tools Miss This

Ahrefs, Semrush and Search Console measure:

  • URL position for a given keyword
  • CTR and clicks from Google
  • SERP visibility (organic result list)

None of them measure:

  • Whether ChatGPT recommends you for an industry query
  • How Perplexity describes you relative to competitors
  • Whether Gemini cites your articles as a knowledge source
  • How your SoV shifts after a model update

A company can lose dozens of potential customers per month to more AI-visible competitors — and not know it. In B2B with long sales cycles, that's a 3–6 month lag before revenue impact becomes visible.

Step 1: Building the Test Question Set

The foundation of SoV measurement is a set of 50–100 questions your prospects actually ask AI models. Not keywords — complete questions in natural language.

Three question categories:

Discovery (buyer searching for a category):

  • "What tools do you recommend for B2B marketing automation?"
  • "How do I choose an agency to implement AI in a small company?"
  • "What is GEO and is it worth investing in?"

Consideration (buyer comparing options):

  • "How much does AI process automation cost for a 20-person company?"
  • "Compare Make and n8n for workflow automation: which should I choose?"
  • "Which agencies offer GEO services in Europe?"

Authority (buyer seeking an expert; tests citations of your content):

  • "How do I build an AI content marketing pipeline?"
  • "How do I measure brand Share of Voice in AI models?"
  • "How do I automate meeting notes with AI?"

How to build this set: interview 3–5 salespeople ("what do clients ask before buying?"), review support conversation history, and review your own AI query history. The set must reflect the customer's language — not your marketing team's jargon.
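One way to keep such a set organised is to tag every question with its category, so you can see at a glance whether all three categories are covered. The sketch below is only an illustration: the `Question` class and the `count_by_category` helper are hypothetical names, not part of any tool mentioned in this article, and the sample questions are taken from the lists above.

```python
from collections import Counter
from dataclasses import dataclass

# The three categories described above.
CATEGORIES = ("discovery", "consideration", "authority")


@dataclass(frozen=True)
class Question:
    """One natural-language question a prospect might ask an AI model."""
    text: str
    category: str

    def __post_init__(self):
        # Reject typos early so category counts stay trustworthy.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def count_by_category(questions):
    """Return how many questions fall into each category (zero if none)."""
    counts = Counter(q.category for q in questions)
    return {category: counts.get(category, 0) for category in CATEGORIES}


question_set = [
    Question("What tools do you recommend for B2B marketing automation?", "discovery"),
    Question("Which agencies offer GEO services in Europe?", "consideration"),
    Question("How do I measure brand Share of Voice in AI models?", "authority"),
]

print(count_by_category(question_set))
```

A category with a count of zero is a hint that the set does not yet cover that stage of the buyer's research.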

/// AI SHARE OF VOICE MEASUREMENT PIPELINE

From question set to measurable SoV

01 Question set: 60–100 questions in 3 categories (discovery, consideration, authority)
02 3 AI models: ChatGPT, Perplexity, Gemini; each question is sent to each model separately
03 Response analysis: brand mention (Yes/No), sentiment, cited sources and competitors
04 SoV score + trend: % per model and aggregated, chart of changes over time, alerts on drops

Key figures: 60–100 test questions · 3 AI platforms · 2 weeks recommended interval · <1h automated test time

Step 2: Testing Models and Collecting Data

Level 1 — Manual Testing (good starting point)

Ask each question to ChatGPT, Gemini and Perplexity. For each response, record:

  • Is the brand mentioned? (Yes/No)
  • How is it described? (recommended / neutral mention / negative comparison)
  • Which competitor brands appeared?
  • Does the response link to your content?

Time: ~3–4 hours for 60 questions × 3 models. A solid baseline for a first audit.
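If you log the manual results in a structured form from the start, the baseline can later be compared with automated runs. The following is a minimal sketch of such a log, assuming one row per model response; the `Observation` class, the `to_csv` helper and the extra "none" mention type (for responses that do not mention the brand) are illustrative assumptions, not a prescribed format.

```python
import csv
import io
from dataclasses import asdict, dataclass, field

# "none" is an added label for responses where the brand is absent.
MENTION_TYPES = ("recommended", "neutral mention", "negative comparison", "none")


@dataclass
class Observation:
    model: str                      # e.g. "ChatGPT", "Gemini", "Perplexity"
    question: str
    brand_mentioned: bool
    mention_type: str               # one of MENTION_TYPES
    competitors: list = field(default_factory=list)
    links_to_own_content: bool = False


def to_csv(observations):
    """Serialise observations to CSV text, one row per model response."""
    buffer = io.StringIO()
    fieldnames = ["model", "question", "brand_mentioned", "mention_type",
                  "competitors", "links_to_own_content"]
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for obs in observations:
        row = asdict(obs)
        # Flatten the competitor list into a single cell.
        row["competitors"] = "; ".join(obs.competitors)
        writer.writerow(row)
    return buffer.getvalue()


log = [
    Observation("ChatGPT", "Which agencies offer GEO services in Europe?",
                True, "recommended", ["Competitor A", "Competitor B"], True),
    Observation("Gemini", "Which agencies offer GEO services in Europe?",
                False, "none", ["Competitor A"]),
]

print(to_csv(log))
```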

Level 2 — Semi-Automated with Perplexity API

Perplexity offers an API compatible with OpenAI syntax. Every response includes a list of cited sources — you can programmatically check whether your domain appears in the citations.

sov_perplexity.py
import requests

PERPLEXITY_API_KEY = "pplx-..."


def check_brand_mention(question, brand_name, domain):
    """Ask Perplexity one question and check the answer for the brand and domain."""
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        headers={"Authorization": "Bearer " + PERPLEXITY_API_KEY},
        json={
            "model": "sonar",
            "messages": [{"role": "user", "content": question}],
            "return_citations": True,
        },
        timeout=60,
    )
    response.raise_for_status()  # fail loudly on auth or rate-limit errors
    data = response.json()
    answer = data["choices"][0]["message"]["content"]
    citations = data.get("citations", [])  # cited sources, if present
    return {
        "question": question,
        "brand_mentioned": brand_name.lower() in answer.lower(),
        "domain_cited": any(domain in c for c in citations),
        "citations_count": len(citations),
    }


questions = [
    "Best agencies for AI automation implementation in Europe?",
    "Who offers GEO services for B2B companies?",
    "How to automate meeting notes for a startup?",
]

results = [check_brand_mention(q, "Wiszniewsky", "wiszniewsky.pl") for q in questions]

# SoV = share of answers that mention the brand, as a percentage
sov = sum(1 for r in results if r["brand_mentioned"]) / len(results) * 100
print("Perplexity SoV:", round(sov, 1), "%")

Setup time: 2–3 hours one-off. API cost: ~$0.005–0.008 per query, so a full set of 100 questions costs roughly $0.50–0.80.
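The plain substring check in the script can over-count: a short brand name may match inside an unrelated longer word. A possible refinement, sketched here under the assumption that you maintain your own list of brand aliases, is a word-boundary match. `mentions_brand` is a hypothetical helper and "Acme" is a made-up brand used only for illustration.

```python
import re


def mentions_brand(answer, aliases):
    """Return True if any alias appears in the answer as a whole word or phrase."""
    for alias in aliases:
        # The lookarounds stop "Acme" from matching inside "Acmeology".
        pattern = r"(?<!\w)" + re.escape(alias) + r"(?!\w)"
        if re.search(pattern, answer, flags=re.IGNORECASE):
            return True
    return False


print(mentions_brand("I would recommend Acme for this.", ["Acme"]))
print(mentions_brand("Acmeology is an unrelated term.", ["Acme"]))
```

The helper could replace the `brand_name.lower() in answer.lower()` expression; whether the stricter match suits you depends on how your brand name is typically written.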

Level 3 — Automated Monitoring with AnswerLyzer

For regular measurements (every 2 weeks or more frequently) manual tests don't scale. I built AnswerLyzer for exactly this purpose — a platform that:

  • Manages question sets per brand and industry
  • Automatically queries Gemini, GPT-4o and Perplexity on configured schedules
  • Uses LLM-as-a-Judge architecture: a separate model evaluates each response for sentiment and mention type (recommendation / neutral mention / negative comparison)
  • Generates trend reports: SoV over time, competitor comparison, alerts when SoV drops by more than 5 percentage points
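The alert rule in the last point is simple enough to sketch. The snippet below is not AnswerLyzer's implementation, only an illustration of the idea: compare two SoV snapshots and flag every model where the drop exceeds 5 percentage points. The function name `sov_drop_alerts` and the "current" figures are assumptions made up for the example.

```python
def sov_drop_alerts(previous, current, threshold_pp=5.0):
    """Compare two snapshots (model -> SoV in %) and list drops above the threshold."""
    alerts = []
    for model, old_sov in previous.items():
        if model not in current:
            continue  # model was not measured this time
        drop = old_sov - current[model]
        if drop > threshold_pp:
            alerts.append((model, round(drop, 1)))
    return alerts


previous = {"ChatGPT": 28.0, "Perplexity": 41.0, "Gemini": 14.0}
current = {"ChatGPT": 21.0, "Perplexity": 40.0, "Gemini": 9.0}

print(sov_drop_alerts(previous, current))
```

With these numbers only ChatGPT is flagged: Gemini fell by exactly 5 points, which is not more than the threshold.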

/// AI SHARE OF VOICE: MONITORING DASHBOARD (example)

SoV per AI model, set of 100 industry questions

| Brand | ChatGPT | Perplexity | Gemini |
| --- | --- | --- | --- |
| Your brand | 28% | 41% | 14% |
| Competitor A | 35% | 22% | 31% |
| Competitor B | 18% | 15% | 12% |

Sentiment of "Your brand" mentions, all models combined

| Sentiment | Share |
| --- | --- |
| Positive (recommendation) | 62% |
| Neutral (mention) | 31% |
| Negative | 7% |

Step 3: Calculating and Interpreting Results

SoV formula: number of responses mentioning the brand ÷ total number of queries × 100

Measure per model (ChatGPT SoV, Perplexity SoV, Gemini SoV) and aggregated. The aggregate gives overall visibility — per-model scores show where the gaps are.
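As a worked example of the formula, the sketch below computes per-model SoV and the aggregate as a simple average across models. It uses the example figures from this article (28, 41 and 14 mentions out of 100 queries); the function names are illustrative, and averaging with equal weights is an assumption that only holds if every model received the same question set.

```python
def share_of_voice(mentions, total_queries):
    """SoV in percent: responses mentioning the brand / total queries * 100."""
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    return mentions * 100 / total_queries


def aggregate_sov(per_model):
    """Unweighted average of per-model SoV values."""
    return sum(per_model.values()) / len(per_model)


mention_counts = {"ChatGPT": 28, "Perplexity": 41, "Gemini": 14}
per_model = {model: share_of_voice(count, 100) for model, count in mention_counts.items()}
aggregated = aggregate_sov(per_model)

print(per_model)
print(round(aggregated, 1))
```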

| AI SoV | Interpretation | Priority action |
| --- | --- | --- |
| 0–5% | Practically invisible | Content and Schema.org audit, build external source presence |
| 5–15% | Marginal visibility | Strengthen expert content, increase industry citations |
| 15–30% | Moderate visibility | Format optimisation (atomic paragraphs), topic expansion |
| 30–50% | Good niche visibility | Expand into new subtopics, monitor competitor shifts |
| 50%+ | Topical dominance | Defend position, expand to new categories |
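The interpretation bands can be encoded as a small lookup, for example to label each row of a report automatically. This is a sketch only: the band edges in the table overlap (5%, 15%, 30%), so the code assumes that a boundary value belongs to the higher band, and `interpret_sov` is a hypothetical helper name.

```python
# (lower bound in %, interpretation, priority action), highest band first.
BANDS = [
    (50.0, "Topical dominance", "Defend position, expand to new categories"),
    (30.0, "Good niche visibility", "Expand into new subtopics, monitor competitor shifts"),
    (15.0, "Moderate visibility", "Format optimisation (atomic paragraphs), topic expansion"),
    (5.0, "Marginal visibility", "Strengthen expert content, increase industry citations"),
    (0.0, "Practically invisible", "Content and Schema.org audit, build external source presence"),
]


def interpret_sov(sov):
    """Return (interpretation, priority action) for an SoV value in percent."""
    if not 0 <= sov <= 100:
        raise ValueError("SoV must be between 0 and 100")
    for lower_bound, label, action in BANDS:
        # Assumption: a value exactly on a boundary falls into the higher band.
        if sov >= lower_bound:
            return label, action


label, action = interpret_sov(28)
print(label, "->", action)
```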

Sentiment — not just whether, but how

Presence alone isn't enough. Models can describe a brand positively ("I recommend"), neutrally ("there's also option X"), or negatively ("less well-known than Y"). Negative SoV is worse than no mention — it directly steers prospects toward competitors. Identifying negative mentions is the first priority of any audit.
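A sentiment breakdown can be computed from the same per-response records used for SoV. The sketch below assumes each record carries a `mention_type` label (assigned manually or by a judging model); the record layout, the function names and the sample data are illustrative assumptions.

```python
from collections import Counter


def sentiment_breakdown(records):
    """Percentage of each mention type among responses that mention the brand."""
    counts = Counter(r["mention_type"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(count * 100 / total, 1) for label, count in counts.items()}


def negative_mentions(records):
    """Records to review first: responses that compare the brand unfavourably."""
    return [r for r in records if r["mention_type"] == "negative comparison"]


records = [
    {"model": "ChatGPT", "question": "Q1", "mention_type": "recommended"},
    {"model": "ChatGPT", "question": "Q2", "mention_type": "recommended"},
    {"model": "Perplexity", "question": "Q1", "mention_type": "recommended"},
    {"model": "Perplexity", "question": "Q3", "mention_type": "neutral mention"},
    {"model": "Gemini", "question": "Q2", "mention_type": "negative comparison"},
]

print(sentiment_breakdown(records))
print(negative_mentions(records))
```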

Step 4: Competitive Intelligence

SoV has the greatest value in comparison — measure yourself and 3–5 competitors in parallel for the same question set.

Alert pattern: your SoV drops while a competitor's SoV rises on the same questions. That's a signal of active GEO optimisation on their side. Diagnostic questions:

  • What new content did the competitor publish in the last 4–8 weeks?
  • Where did they gain new citations and backlinks?
  • Did they update structured data or Schema.org?
  • Did they appear in external industry rankings or roundups?
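The alert pattern can be checked mechanically once you have two snapshots for the same question set. The sketch below flags competitors whose SoV rose in a period when yours fell; `competitor_gains` is a hypothetical helper, and the "current" figures are invented for illustration.

```python
def competitor_gains(previous, current, own_brand):
    """Return (competitor, gain in pp) pairs if own SoV fell, largest gain first."""
    own_change = current[own_brand] - previous[own_brand]
    if own_change >= 0:
        return []  # own SoV is stable or rising, so the pattern does not apply
    gains = []
    for brand, old_sov in previous.items():
        if brand == own_brand or brand not in current:
            continue
        change = current[brand] - old_sov
        if change > 0:
            gains.append((brand, change))
    return sorted(gains, key=lambda item: item[1], reverse=True)


previous = {"Your brand": 28.0, "Competitor A": 35.0, "Competitor B": 18.0}
current = {"Your brand": 22.0, "Competitor A": 42.0, "Competitor B": 17.0}

print(competitor_gains(previous, current, "Your brand"))
```

A non-empty result is a prompt to work through the diagnostic questions for the listed competitors, not proof of what they did.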

How Often to Measure

| Business type | Recommended cadence | Rationale |
| --- | --- | --- |
| B2B SaaS / tech agency | Every 2 weeks | Long sales cycle: lag between SoV change and pipeline impact is 2–3 months |
| Service agency / consultant | Monthly | Moderate risk; GEO increasingly important for client acquisition |
| B2C e-commerce | Every 2 weeks | AI product recommendations growing rapidly among B2C users |
| Local business | Quarterly | Local queries in AI still a niche channel, but the trend is growing |
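If measurements are scheduled by a script, the cadence table can be turned into a simple lookup. This is a sketch under stated assumptions: "monthly" is approximated as 30 days and "quarterly" as 91 days, and the keys and the `next_measurement` function are hypothetical names.

```python
from datetime import date, timedelta

# Days between measurements; 30 and 91 are approximations of a month and a quarter.
CADENCE_DAYS = {
    "b2b_saas": 14,
    "service_agency": 30,
    "b2c_ecommerce": 14,
    "local_business": 91,
}


def next_measurement(last_run, business_type):
    """Return the date of the next scheduled measurement."""
    return last_run + timedelta(days=CADENCE_DAYS[business_type])


print(next_measurement(date(2025, 1, 6), "b2b_saas"))
```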

Additional triggers (measure immediately outside the schedule):

  • After a major model update (GPT-5, Gemini 2.0, new Perplexity release)
  • After a competitor publishes a significant article or case study
  • After your own publication, to verify whether new content is being cited
  • When sales signals a shift in lead quality or new competitors appear in client conversations

Who Should Prioritise AI SoV — and When Not to Bother

Highest priority:

  • B2B companies whose clients actively use AI for research (SaaS, tech agencies, professional services, consultants)
  • Brands in categories with rapidly growing AI query volumes (marketing, automation, fintech, legal tech)
  • Companies investing in GEO that need a measurable KPI for effectiveness

Lower priority (for now):

  • Local businesses with "near me" queries: local queries in AI are still a minority
  • Price-driven e-commerce: customers compare prices in dedicated comparison tools
  • Brands with dominant branded traffic: when 80%+ of traffic is brand-name searches, behavioural shifts are slower

---

I build AI Share of Voice monitoring dashboards and LLM visibility audits. If you want to know how ChatGPT and Perplexity describe your brand relative to competitors, get in touch. We start with a baseline audit and a first measurement.

/// AUTHOR
Paweł Wiszniewski – AI & Web Engineer

Senior Full-Stack Engineer & AI Architect

8+ years building AI systems, automations, and scalable web applications that reduce costs and improve operational efficiency.
