How often do LLMs update their training?

Frontier models ship new versions every few months, but their training cutoffs trail real time by anywhere from a few months to over a year. A brand launched last week will not appear in any model's training data. It can still appear in answers from Perplexity, browsing-enabled ChatGPT, and Google AI Overviews because those tools query the live web at answer time.

Training cutoff is not refresh rate

An LLM's training cutoff is the date past which it has read nothing. Anthropic, OpenAI, and Google publish these dates in model cards. As of early 2026, the publicly disclosed cutoffs across the major labs sit anywhere from late 2023 to early 2025, depending on the model and the version. GPT-4o was widely reported to have a late-2023 base cutoff with later refreshes; newer Claude and Gemini versions disclose more recent dates. The exact numbers shift with each version bump, but the shape is the same: months behind today, sometimes a year or more.
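To make that lag concrete, here is a small Python sketch that turns a disclosed cutoff into a knowledge gap in months. The model names and dates below are placeholders, not disclosures; check each lab's model card for real figures.

```python
from datetime import date

def knowledge_lag_months(cutoff: date, today: date) -> int:
    """Whole months between a model's training cutoff and a given day."""
    return (today.year - cutoff.year) * 12 + (today.month - cutoff.month)

# Placeholder cutoffs only -- substitute the dates from the model cards.
cutoffs = {
    "model-a (late-2023 cutoff)": date(2023, 10, 1),
    "model-b (early-2025 cutoff)": date(2025, 1, 1),
}
for name, cutoff in cutoffs.items():
    lag = knowledge_lag_months(cutoff, today=date(2026, 2, 1))
    print(f"{name}: ~{lag} months behind")
```

Run against an early-2026 date, the late-2023 cutoff comes out more than two years stale, which is the "sometimes a year or more" end of the range.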

That cutoff freezes everything the model knows from training. New product launches, rebrands, M&A, pricing changes, or a fresh round of reviews after the cutoff are invisible to the base model. If someone asks Claude or default ChatGPT about a startup that launched last month, the model has two options: hallucinate, or say it does not know.

How often new versions actually ship

The major labs ship model updates on a rough quarterly-to-half-year cadence.

Each version retrains on a fresher corpus. So even without browsing, a brand that gets indexed widely (Wikipedia, large news sites, Reddit threads, GitHub) will eventually be absorbed into the next training cycle. The lag is real, but it is not permanent.

Live browsing changes the math

Three of the five major engines query the live web at answer time: Perplexity, ChatGPT with browsing enabled, and Google AI Overviews. Default Claude and default ChatGPT answer from training data alone.

For a brand, that means the live-browsing engines can pick up new content within days of indexing. The training-only engines are a different game.
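As a sketch of what a live-browsing query looks like in script form, the snippet below assumes Perplexity's OpenAI-compatible chat completions endpoint and the openai Python SDK; the base URL and model name are assumptions to confirm against Perplexity's current API docs.

```python
import os
from openai import OpenAI

# Assumption: Perplexity exposes an OpenAI-compatible chat endpoint at this
# base URL, and "sonar" is a valid online model name. Verify both in the docs.
client = OpenAI(
    api_key=os.environ["PERPLEXITY_API_KEY"],
    base_url="https://api.perplexity.ai",
)

resp = client.chat.completions.create(
    model="sonar",  # placeholder model name
    messages=[{"role": "user", "content": "What is <brand>?"}],
)
# A live-browsing engine can answer this even if <brand> launched last week.
print(resp.choices[0].message.content)
```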

What this means for a new brand

Concrete example. A B2B tool launches its public site on April 1. By April 8, it has been crawled by Google, listed in two niche directories, and reviewed by one industry blog.

If you ask Perplexity "what is <brand>" on April 9, it can find and summarise the site. If you ask default Claude or default ChatGPT (no browsing), the answer will be "I do not know about that company", because the next training run has not happened yet.

The brand might not enter training data until late 2026 or 2027, depending on indexing depth and how the labs filter sources. That gap is exactly why GEO has two tracks: feed live-browsing engines now, and seed the long-form sources (Wikipedia mentions, large news, comparison pages on established sites) that the next training cycle will absorb.

How to read training-cutoff disclosures

Every major lab publishes the cutoff in the model card or system prompt. You can also ask the model directly: "What is your training cutoff?" Most will give you a date or a quarter. Treat that as the edge of what the base model can know without browsing. Anything more recent has to come through a tool call.
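If you want to script that check rather than type it into a chat window, a minimal sketch with the openai Python SDK looks like this; the model name is a placeholder, and models sometimes misreport their own cutoff, so treat the answer as approximate.

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder: use whichever model you are testing
    messages=[{"role": "user", "content": "What is your training cutoff?"}],
)
# Cross-check the self-reported date against the published model card.
print(resp.choices[0].message.content)
```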

The practical takeaway: do not measure GEO progress by whether default Claude or default ChatGPT mentions you this week. Measure by whether Perplexity, browsing ChatGPT, and Google AI Overviews cite you, and by whether the durable sources that feed the next training cycle now reference your brand.
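One lightweight way to track that over time is to log, per engine, whether each week's answer names the brand. Everything below (function name, file layout, engine labels) is a hypothetical sketch, with answer texts gathered however you query each engine.

```python
import csv
from datetime import date

def record_mentions(brand: str, answers: dict[str, str],
                    path: str = "geo_log.csv") -> None:
    """Append one row per engine recording whether the answer names the brand.

    `answers` maps an engine label to the answer text collected from it.
    Crude substring matching; swap in whatever detection you trust.
    """
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for engine, text in answers.items():
            mentioned = brand.lower() in text.lower()
            writer.writerow([date.today().isoformat(), engine, brand, mentioned])

# Hypothetical usage with answers gathered from each engine:
record_mentions("<brand>", {
    "perplexity": "answer text collected from Perplexity",
    "chatgpt-browsing": "answer text collected from browsing ChatGPT",
    "google-ai-overviews": "answer text collected from AI Overviews",
})
```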

