Entity Schema Guide: Building a Knowledge Graph Presence for Your Brand

AI engines understand the world through entities — named things with properties and relationships. This guide explains how to build a strong entity presence using schema markup and off-site records.

AI engines don’t just index pages — they build models of entities: named things (brands, products, people, concepts) with properties and relationships. When an AI engine answers a question about your brand, it draws from its entity model, not just from the text of individual pages. Building a strong entity presence is how you ensure that model is accurate and complete.

What entities are and why they matter for AI

An entity is any uniquely identifiable thing — a company, a product, a person, a technology, a place. AI engines learn about entities through:

  • On-site structured data — Organization, Product, and SoftwareApplication schema on your own pages
  • Knowledge graph records — Wikidata, Wikipedia, Google’s Knowledge Graph
  • Third-party mentions — how authoritative sources describe and categorize your brand
  • Cross-references — how often and how consistently these sources agree on the same facts

Entity building matters because AI engines generate responses from their internal entity models, not just from retrieved text. When ChatGPT answers “what does [your brand] do?”, it draws on its parametric knowledge of your brand: the facts it absorbed during training from the sources above. If that model is incomplete, outdated, or conflicting, the answer can be hallucinated no matter how well optimized your website is.

The four layers of entity schema

Layer 1: On-site entity declaration (Organization schema)

Your homepage Organization schema is your canonical entity declaration. Every other layer builds on it.

The critical properties for entity disambiguation:

name — your exact brand name, matching how you appear everywhere else. Inconsistency between your schema name and your Wikidata record name creates a split entity problem — AI engines may create two separate entity records rather than unifying them.

description — write for entity understanding, not marketing. Include your category, primary function, and audience:

"description": "LLM Metrix is an AI visibility monitoring platform that tracks brand mentions, citations, and share-of-voice across major AI search engines including ChatGPT, Perplexity, Gemini, and Claude."

sameAs — the most important property most implementations skip. This is an array of URLs pointing to your entity records on authoritative platforms. It’s the mechanism by which AI engines unify your web entity with your knowledge graph entity:

"sameAs": [
  "https://www.linkedin.com/company/llmmetrix",
  "https://twitter.com/llmmetrix",
  "https://www.wikidata.org/wiki/Q[your-q-id]",
  "https://www.crunchbase.com/organization/llmmetrix"
]

knowsAbout — optional but valuable for topic authority. An array of topics your organization has expertise in, helping AI engines understand the query types for which your brand is relevant.
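As a minimal sketch, the four properties above can be assembled into one Organization declaration and serialized as JSON-LD. The helper name `build_organization_schema` and the `knowsAbout` topics are illustrative assumptions, not part of any library; the Wikidata URL keeps the placeholder Q-ID from the example above and must be replaced with your own.

```python
import json


def build_organization_schema(name, url, description, same_as, knows_about=None):
    """Return an Organization JSON-LD dict (hypothetical helper for illustration)."""
    if not same_as:
        raise ValueError("sameAs should list at least one external profile URL")
    schema = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        # Drop duplicate URLs while keeping the original order.
        "sameAs": list(dict.fromkeys(same_as)),
    }
    if knows_about:
        schema["knowsAbout"] = list(knows_about)
    return schema


organization = build_organization_schema(
    name="LLM Metrix",
    url="https://llmmetrix.com",
    description=(
        "LLM Metrix is an AI visibility monitoring platform that tracks brand "
        "mentions, citations, and share-of-voice across major AI search engines "
        "including ChatGPT, Perplexity, Gemini, and Claude."
    ),
    same_as=[
        "https://www.linkedin.com/company/llmmetrix",
        "https://twitter.com/llmmetrix",
        "https://www.wikidata.org/wiki/Q[your-q-id]",  # placeholder, not a real ID
        "https://www.crunchbase.com/organization/llmmetrix",
    ],
    # Assumed topics, chosen only to show the shape of the array.
    knows_about=["AI visibility monitoring", "brand citations in AI search"],
)

# The printed JSON goes inside a <script type="application/ld+json"> tag.
print(json.dumps(organization, indent=2))
```

Generating the block from one function makes it easier to keep `name` and `description` identical wherever the schema is emitted, which is the consistency this layer depends on.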

Layer 2: Product entity declarations

Each core product or service needs its own entity schema. For software products:

{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "LLM Metrix",
  "applicationCategory": "BusinessApplication",
  "operatingSystem": "Web",
  "description": "AI visibility monitoring platform for tracking brand citations across AI search engines.",
  "offers": {
    "@type": "Offer",
    "price": "49",
    "priceCurrency": "USD"
  },
  "creator": {
    "@type": "Organization",
    "name": "LLM Metrix",
    "url": "https://llmmetrix.com"
  }
}

The creator property linking back to your Organization schema creates a formal entity relationship — your product is a thing made by your organization, not a separate unrelated entity.
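One way to confirm that relationship before publishing is a small check that the product's `creator` matches the Organization declaration. This is a sketch under assumptions: `creator_links_to` is a hypothetical helper, and matching on type, name, and URL is one reasonable rule rather than a schema.org requirement.

```python
def creator_links_to(product, organization):
    """Return True if product["creator"] identifies the given Organization."""
    creator = product.get("creator")
    if not isinstance(creator, dict):
        return False  # missing, or a bare string instead of a nested entity

    def clean(url):
        # Lenient comparison: ignore case and a trailing slash.
        return (url or "").rstrip("/").lower()

    return (
        creator.get("@type") == "Organization"
        and creator.get("name") == organization.get("name")
        and clean(creator.get("url")) == clean(organization.get("url"))
    )


organization = {
    "@type": "Organization",
    "name": "LLM Metrix",
    "url": "https://llmmetrix.com",
}
product = {
    "@type": "SoftwareApplication",
    "name": "LLM Metrix",
    "creator": {
        "@type": "Organization",
        "name": "LLM Metrix",
        "url": "https://llmmetrix.com/",
    },
}
orphan = {"@type": "SoftwareApplication", "name": "LLM Metrix"}  # no creator

print(creator_links_to(product, organization))
print(creator_links_to(orphan, organization))
```

The first call prints `True` and the second prints `False`, flagging product schema that has no link back to the organization.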

Layer 3: Knowledge graph records

Wikidata is an open knowledge graph that many AI engines draw on as a foundational reference, either directly or through their training data. A Wikidata entry for your brand:

  • Creates a stable, persistent entity ID (a “Q number”) that AI engines can reference
  • Provides a structured set of properties (instance of, inception, headquarters location, official website, etc.)
  • Connects your brand to the broader knowledge graph (your industry, your competitors, your category)

If your brand doesn’t have a Wikidata entry and you’ve been around for more than 2–3 years with a meaningful online presence, creating one is one of the highest-leverage entity actions available. Keep the entry conservative: include only verifiable facts, each with a source.
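Once a record exists, its structured properties can be read back as JSON. The sketch below assumes Wikidata's `Special:EntityData` endpoint and its standard claim layout, and it runs on an inline sample rather than a live request. The ID `Q0` and the sample values are hypothetical placeholders; `P856` is Wikidata's "official website" property.

```python
def entity_data_url(qid):
    """URL that serves a Wikidata item as JSON."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def claim_values(entity_json, qid, prop):
    """Return the plain values stored under one property of one item."""
    claims = entity_json["entities"][qid].get("claims", {}).get(prop, [])
    values = []
    for claim in claims:
        snak = claim.get("mainsnak", {})
        # Skip "novalue" / "somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"])
    return values


QID = "Q0"  # hypothetical placeholder, not a real item

# Trimmed sample in the shape the endpoint returns (values are made up).
sample = {
    "entities": {
        QID: {
            "claims": {
                "P856": [
                    {
                        "mainsnak": {
                            "snaktype": "value",
                            "property": "P856",
                            "datavalue": {
                                "value": "https://llmmetrix.com",
                                "type": "string",
                            },
                        }
                    }
                ]
            }
        }
    }
}

print(entity_data_url(QID))
print(claim_values(sample, QID, "P856"))
```

In practice the sample would be replaced by the JSON fetched from `entity_data_url(...)` for your real Q-ID.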

Wikipedia notability criteria are harder to meet, but if your brand qualifies, a Wikipedia article provides the most powerful single entity signal for AI knowledge.

Layer 4: Third-party entity corroboration

AI engines weight facts that appear consistently across multiple independent authoritative sources. For each key fact about your brand (what you do, what category you’re in, pricing tier, team size), you want that fact to appear consistently on:

  • Your Wikidata record
  • Your Crunchbase profile
  • Your LinkedIn company page
  • Your G2 or Capterra profile
  • Major press coverage or review articles

When all these sources state the same facts in the same wording, AI engines have high confidence in those facts and are more likely to include them accurately in generated answers.
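A consistency audit across these profiles can be sketched as a comparison of hand-collected facts. Everything here is illustrative: the `find_conflicts` helper, the fact names, and the sample values are assumptions, and the data would have to be gathered from each profile manually or through whatever export each platform offers.

```python
from collections import defaultdict


def find_conflicts(profiles):
    """Map each fact to {value: [sources]} wherever sources disagree."""
    seen = defaultdict(lambda: defaultdict(list))
    for source, facts in profiles.items():
        for fact, value in facts.items():
            # Normalize case and whitespace so trivial differences don't count.
            normalized = " ".join(str(value).lower().split())
            seen[fact][normalized].append(source)
    return {
        fact: dict(values) for fact, values in seen.items() if len(values) > 1
    }


# Hypothetical facts copied by hand from each profile.
profiles = {
    "website": {"category": "AI visibility monitoring platform", "employees": 50},
    "crunchbase": {"category": "AI visibility monitoring platform", "employees": 10},
    "linkedin": {"category": "AI Visibility Monitoring Platform", "employees": 50},
}

for fact, values in find_conflicts(profiles).items():
    print(fact, values)
```

With this sample, only `employees` is reported: Crunchbase says 10 while the website and LinkedIn say 50. The category strings differ only in capitalization, so they are treated as agreeing.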

Common entity schema problems and how to fix them

Split entity problem. AI engines create two separate entity records for the same brand when sources use inconsistent names or descriptions. Fix: standardize your brand name and description across every profile and schema implementation.

Missing sameAs connections. Your on-site Organization schema and your Wikidata record are not connected, so AI engines may not realize they describe the same entity. Fix: add the Wikidata URL to your Organization schema sameAs array, and add your website URL to your Wikidata record’s “official website” property (P856).
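The two directions of that link can be checked together. This is a rough sketch: `link_status` is a hypothetical helper, `Q0` is a placeholder ID, and the list of Wikidata website values is assumed to have been read from the record beforehand.

```python
def normalize(url):
    # Lenient comparison: ignore surrounding spaces, case, and a trailing slash.
    return url.strip().rstrip("/").lower()


def link_status(org_schema, qid, wikidata_websites):
    """Report whether the schema and the Wikidata record point at each other."""
    wikidata_url = f"https://www.wikidata.org/wiki/{qid}"
    same_as = {normalize(u) for u in org_schema.get("sameAs", [])}
    sites = {normalize(u) for u in wikidata_websites}
    return {
        "schema_points_to_wikidata": normalize(wikidata_url) in same_as,
        "wikidata_points_to_site": normalize(org_schema.get("url", "")) in sites,
    }


org_schema = {
    "@type": "Organization",
    "name": "LLM Metrix",
    "url": "https://llmmetrix.com",
    # The Wikidata URL is missing from sameAs in this example.
    "sameAs": ["https://www.linkedin.com/company/llmmetrix"],
}

status = link_status(org_schema, "Q0", ["https://llmmetrix.com/"])
print(status)
```

Here the Wikidata side already lists the website, but the schema does not point back, so `schema_points_to_wikidata` comes out `False` and identifies which half of the fix is still outstanding.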

Stale entity facts. Your Crunchbase profile still lists 10 employees when you have 50. Your Wikidata record shows an incorrect founding date. AI engines will sometimes synthesize answers from these stale facts rather than from your current website. Fix: audit all major profiles quarterly and update any outdated properties.

No product-to-organization relationship. Product schema exists but doesn’t link back to Organization schema via the creator property. AI engines may treat your product as an entity independent of your brand. Fix: add the creator relationship.

Measuring entity strength

A rough proxy for entity model accuracy: ask ChatGPT, Claude, and Perplexity to describe your brand in a paragraph without accessing your website. Compare the response to ground truth:

  • Is the category correct?
  • Is the core function described accurately?
  • Is the audience correct?
  • Are the founding date and size approximately right?
  • Are any features or pricing details incorrect?

The answers reveal where your entity model is weak. Gaps usually trace back to one of the four layers above — missing schema, an empty third-party profile, or conflicting facts across sources.
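To make this check repeatable, the five questions can be recorded as a pass/fail scorecard per engine. The sketch assumes a person reads each AI-generated description and marks the results by hand. The check names, the helper functions, and the sample results are all hypothetical.

```python
from collections import Counter

CHECKS = [
    "category",
    "core_function",
    "audience",
    "founding_and_size",
    "features_and_pricing",
]


def accuracy_by_engine(reviews):
    """Share of checks passed for each engine (True means the answer was right)."""
    scores = {}
    for engine, results in reviews.items():
        missing = [check for check in CHECKS if check not in results]
        if missing:
            raise ValueError(f"{engine} is missing checks: {missing}")
        scores[engine] = sum(results[check] for check in CHECKS) / len(CHECKS)
    return scores


def weakest_checks(reviews):
    """Checks failed by at least one engine, most frequent failures first."""
    failures = Counter(
        check
        for results in reviews.values()
        for check in CHECKS
        if not results[check]
    )
    return failures.most_common()


# Made-up review results, marked by hand after reading each answer.
reviews = {
    "ChatGPT": dict.fromkeys(CHECKS, True) | {"founding_and_size": False},
    "Claude": dict.fromkeys(CHECKS, True)
    | {"audience": False, "founding_and_size": False},
    "Perplexity": dict.fromkeys(CHECKS, True),
}

print(accuracy_by_engine(reviews))
print(weakest_checks(reviews))
```

In this sample, `founding_and_size` fails on two engines, which would suggest auditing the profiles that carry founding date and headcount first.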

Use LLM Metrix’s hallucination checker to monitor AI-generated brand descriptions systematically over time, and track whether entity improvements reduce the frequency and severity of inaccuracies.
