Definition

Inference

The real-time process by which a trained AI model generates a response to a user's prompt. It is distinct from training, which is completed in advance. At inference time, brand visibility depends on three things: what the model recalls from its training data, what live retrieval returns, and how the system prompt is configured.

Inference is the process by which a trained AI model generates a response given an input: the live computation that turns a user’s prompt into an answer. Unlike training, which is carried out in advance on large datasets, inference runs in real time every time a user submits a query.

Why inference matters for brand visibility

Brand visibility in AI responses is determined at inference time by a combination of:

  1. What the model learned during training — associations, facts, and patterns baked in before inference starts
  2. What the retrieval system fetches at query time (for RAG-powered engines) — documents retrieved live from the web or a proprietary index
  3. How the system prompt configures the model’s behavior — which sources to prefer, which topics to avoid, how to structure responses

Brands can influence #1 over time through content and earned media. They can influence #2 through technical SEO and content quality. They have limited control over #3.
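The three inputs above can be pictured as the pieces an engine assembles before the model generates anything. The following is a minimal sketch only: the `InferenceRequest` class, the `build_model_input` function, and the bracketed section labels are hypothetical names chosen for illustration, not the format of any particular AI product. Training-time knowledge (#1) does not appear in the text at all, because it lives in the model's weights.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InferenceRequest:
    """Hypothetical container for what a RAG-style engine sends to the model."""
    system_prompt: str                                        # item 3: operator-controlled configuration
    retrieved_docs: List[str] = field(default_factory=list)   # item 2: documents fetched at query time
    user_query: str = ""                                      # the user's prompt


def build_model_input(req: InferenceRequest) -> str:
    """Concatenate system prompt, retrieved documents, and user query in order.

    The "[system]" / "[retrieved N]" / "[user]" labels are an assumed layout
    for illustration; real engines use their own formats.
    """
    parts = [f"[system]\n{req.system_prompt}"]
    for i, doc in enumerate(req.retrieved_docs, start=1):
        parts.append(f"[retrieved {i}]\n{doc}")
    parts.append(f"[user]\n{req.user_query}")
    return "\n\n".join(parts)
```

In this picture, a brand's content can only enter through the retrieved documents or through what the model already learned in training; the system prompt section is written by the product operator.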

Inference variability

Running the same query twice will often produce different responses. The generation step samples each token from a probability distribution rather than always picking the most likely one, and the temperature setting controls how much randomness that sampling allows. This is why AI visibility monitoring requires multiple runs of the same query, and why impression rate is measured as a percentage (not a binary yes/no).
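To make the mechanism concrete, here is a small sketch of temperature-scaled sampling and of an impression rate computed over repeated runs. It is an illustration under stated assumptions: the function names are hypothetical, the brand match is a plain case-insensitive substring check, and impression rate is taken to mean the share of runs in which the brand is mentioned. Production monitoring tools may define and detect mentions differently.

```python
import math
import random
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float) -> List[float]:
    """Turn raw model scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits: Sequence[float], temperature: float, rng: random.Random) -> int:
    """Draw one candidate index at random, weighted by its probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def impression_rate(responses: Sequence[str], brand: str) -> float:
    """Percentage of responses that mention the brand (assumed definition)."""
    if not responses:
        return 0.0
    hits = sum(brand.lower() in r.lower() for r in responses)
    return 100.0 * hits / len(responses)
```

For example, `impression_rate(["Acme is great", "Try Other", "acme works", "none"], "Acme")` returns `50.0`: the brand appears in two of four runs, which a single yes/no check on one run would not capture.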

Inference cost and speed tradeoffs

Larger, more capable models require more compute at inference time, so they are slower and more expensive to run. This is why AI products serving millions of users often run smaller distilled models rather than the full-size foundation model. The model behind a free tier may therefore have less brand-specific knowledge than the premium-tier model, so a brand's visibility can differ between product tiers.
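A rough sense of the scaling can be had from a common rule of thumb for dense transformer models: generating one token costs on the order of 2 floating-point operations per model parameter. The sketch below applies that approximation; it ignores attention cost over long contexts, batching, and hardware effects, and the two parameter counts are hypothetical values picked for illustration, not the sizes of any real product's models.

```python
def approx_flops_per_token(num_parameters: int) -> int:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per parameter per token."""
    return 2 * num_parameters


def relative_cost(large_params: int, small_params: int) -> float:
    """How many times more compute per token the larger model needs."""
    return approx_flops_per_token(large_params) / approx_flops_per_token(small_params)


# Hypothetical model sizes, for illustration only.
SMALL_MODEL_PARAMS = 7_000_000_000
LARGE_MODEL_PARAMS = 70_000_000_000

ratio = relative_cost(LARGE_MODEL_PARAMS, SMALL_MODEL_PARAMS)
```

Under this approximation, per-token compute grows linearly with parameter count, so the hypothetical larger model here needs about ten times the compute per generated token.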
