LLMs vs Agents

An interactive visual guide to understanding how large language models predict text, how they work under the hood, and how agents use them to take action in the world.


Token Prediction

At their core, LLMs are next-token predictors. Given a sequence of text, they calculate the probability of what comes next — one token at a time.

Key insight: An LLM doesn't "understand" language the way humans do. It's a sophisticated probability machine that has learned statistical patterns from vast amounts of text. Each token is chosen based on the probability distribution over all possible next tokens — sometimes picking the most likely one, sometimes sampling from lower-probability options for creativity.
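The sampling idea above can be sketched in a few lines. This is a toy illustration, not any real model's decoding code: the scores are made up, and real models compute them over vocabularies of tens of thousands of tokens.

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Pick the next token from a probability distribution.

    `logits` maps candidate tokens to raw model scores; a softmax with
    a temperature turns them into probabilities. Lower temperature
    concentrates mass on the top token; higher temperature spreads it out,
    which is the "sampling lower-probability options for creativity" knob.
    """
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_s = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - max_s) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Sample one token in proportion to its probability
    token = random.choices(list(probs), weights=list(probs.values()))[0]
    return token, probs

# Toy scores for the prompt "The cat sat on the ..."
logits = {"mat": 4.0, "chair": 2.5, "roof": 2.0, "moon": 0.5}
token, probs = sample_next_token(logits, temperature=0.7)
```

Generating a full sentence is just this step in a loop: append the sampled token to the prompt and sample again.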

What is an LLM?

A Large Language Model is a neural network trained on massive amounts of text to understand and generate human language.

The Simple View: Text In, Text Out

Input (Prompt)

"Explain quantum computing in simple terms"

LLM

Output (Completion)

"Quantum computing uses quantum bits (qubits) that can be 0 and 1 at the same time..."

Under the Hood: The Transformer Architecture

1. Input Text: the raw text prompt enters the model
2. Tokenizer: text is split into tokens (subwords)
3. Embeddings: tokens become numerical vectors
4. Attention: tokens attend to each other to build context
5. Feed Forward: neural network layers process the representations
6. Output: a probability distribution over all possible next tokens
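The stages above can be sketched end to end in heavily simplified form. Every component here is a toy stand-in: real tokenizers use byte-pair encoding, embeddings and attention weights are learned, and the feed-forward and output layers are large matrix multiplications.

```python
def run_transformer_sketch(text, vocab):
    # 1-2. Tokenizer: split text into tokens (real models use subword BPE)
    tokens = text.lower().split()
    # 3. Embeddings: map each token to a vector (here, a toy one-hot vector)
    ids = [vocab.index(t) if t in vocab else len(vocab) - 1 for t in tokens]
    embeddings = [[1.0 if i == tid else 0.0 for i in range(len(vocab))]
                  for tid in ids]
    # 4. Attention: each position mixes in information from every other
    #    position (a real model learns the mixing weights; we just average)
    n = len(embeddings)
    mixed = [sum(e[d] for e in embeddings) / n for d in range(len(vocab))]
    # 5. Feed forward + output head: score every vocab token as "next token"
    scores = mixed
    # 6. Output: normalize scores into a probability distribution
    total = sum(scores) or 1.0
    return {tok: s / total for tok, s in zip(vocab, scores)}

vocab = ["the", "cat", "sat", "on", "mat", "<unk>"]
probs = run_transformer_sketch("the cat sat on the", vocab)
```

The output is a distribution over the whole vocabulary, exactly the object the Token Prediction section samples from.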

Training Data Scale

Trillions of tokens of training data

Modern LLMs are trained on trillions of tokens from diverse sources across the internet, distilling patterns from an enormous breadth of human knowledge into their parameters.

Web pages
Books
Code
Conversations
Articles
Knowledge bases


Model Parameters

Billions to Trillions

Parameters are the learned numerical values that encode everything the model knows. During training, these values are adjusted millions of times to minimize prediction errors. The more parameters, the more nuanced patterns the model can capture — but also the more compute and data required.
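To make "billions to trillions" concrete, here is a back-of-envelope estimate of the memory needed just to hold the weights. The model sizes are hypothetical round numbers, and the calculation deliberately ignores activations, optimizer state, and caches.

```python
def weight_memory_gb(n_params, bytes_per_param=2):
    """Approximate memory for the weights alone.

    2 bytes/param assumes 16-bit (fp16/bf16) storage; 32-bit training
    checkpoints would double this, and quantized weights would shrink it.
    """
    return n_params * bytes_per_param / 1e9

# Hypothetical round sizes, not any specific real model
for n in (7e9, 70e9, 1e12):
    print(f"{n / 1e9:>6.0f}B params ≈ {weight_memory_gb(n):>7.1f} GB (fp16)")
```

This is why larger parameter counts demand more compute: even loading the weights requires proportionally more memory before a single token is generated.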

What is an Agent?

An AI Agent uses an LLM as its reasoning engine, augmented with tools, memory, and the ability to take autonomous action in the world.

The LLM Brain + Tools

LLM Brain

"I need to search the web for this..."

Code Execution

Run code and scripts

Web Search

Search the internet

File Access

Read and write files

API Calls

Connect to services

Database

Query and store data

Calculator

Precise computation
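One common way to wire tools like these to the LLM brain is a registry the agent dispatches into by name. Everything below is an illustrative sketch under that assumption — the decorator, tool names, and stub bodies are not any particular framework's API.

```python
TOOLS = {}

def tool(name, description):
    """Register a function so the agent can look it up by name."""
    def register(fn):
        TOOLS[name] = {"fn": fn, "description": description}
        return fn
    return register

@tool("calculator", "Precise computation on an arithmetic expression")
def calculator(expression: str) -> str:
    # eval() is unsafe on untrusted input; a real agent would use a
    # sandboxed expression parser instead
    return str(eval(expression, {"__builtins__": {}}, {}))

@tool("web_search", "Search the internet for a query")
def web_search(query: str) -> str:
    return f"[stub] top results for {query!r}"  # would call a search API

def dispatch(name: str, argument: str) -> str:
    """Run the tool the LLM asked for, returning its result as text."""
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name]["fn"](argument)
```

The tool descriptions matter as much as the functions: they are what the LLM reads when deciding which tool to call.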

The Agent Loop

Agents operate in a continuous loop — observing, reasoning, acting, and evaluating until the task is complete.

1

Observe

Perceive the environment, read inputs, check tool results

2

Think

Reason about the task, plan next steps using the LLM

3

Act

Execute a tool, generate a response, or take an action

4

Loop

Evaluate the result and decide whether to continue or stop

The loop continues until the task is complete
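The four steps can be written directly as a loop. `llm_decide` here is a hypothetical stand-in for a real model call, and the scripted `fake_llm` exists only to make the sketch runnable.

```python
def run_agent(task, llm_decide, tools, max_steps=10):
    """Observe → Think → Act → Loop, until the model says it is done.

    `llm_decide(task, history)` stands in for an LLM call; it must return
    either ("final", answer) or ("tool", tool_name, argument).
    """
    history = []                                 # 1. Observe: tool results so far
    for _ in range(max_steps):
        decision = llm_decide(task, history)     # 2. Think: plan the next step
        if decision[0] == "final":
            return decision[1]                   # 4. Loop: task complete, stop
        _, name, arg = decision
        result = tools[name](arg)                # 3. Act: execute the tool
        history.append((name, arg, result))      # feed the observation back in
    return "gave up after max_steps"

# Tiny scripted "LLM": search first, then answer from the result
def fake_llm(task, history):
    if not history:
        return ("tool", "search", task)
    return ("final", f"Based on {history[-1][2]!r}: 22°C")

answer = run_agent("weather in Tokyo",
                   fake_llm,
                   {"search": lambda q: "Tokyo is 22°C, partly cloudy"})
```

The `max_steps` bound is a common safety valve: without it, a confused model could loop forever.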

Memory Systems

Unlike a raw LLM, agents can maintain both short-term context and long-term persistent memory.

Short-term Memory

The context window — what the agent can "see" right now

User: What's the weather in Tokyo?
Agent: Let me search for that...
Tool result: Tokyo is 22°C, partly cloudy
Window size: ~128K tokens

Long-term Memory

Persistent storage — facts and preferences recalled across sessions

User prefers Celsius (2 weeks ago)
User lives in Japan (1 month ago)
Favorite format: concise (3 months ago)
Previous trip: Kyoto (6 months ago)

Stored in vector databases, files, or external systems
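The two tiers can be sketched as a bounded message list (short-term) plus a persistent key-value store (long-term). Using a JSON file for persistence is just an illustrative choice standing in for the vector databases and external systems mentioned above; the class and method names are hypothetical.

```python
import json
from collections import deque
from pathlib import Path

class AgentMemory:
    def __init__(self, store_path="memory.json", max_messages=10):
        # Short-term: only the most recent messages, bounded like a
        # context window — older ones silently fall out
        self.short_term = deque(maxlen=max_messages)
        # Long-term: facts that survive across sessions, kept on disk
        self.store_path = Path(store_path)
        self.long_term = (json.loads(self.store_path.read_text())
                          if self.store_path.exists() else {})

    def add_message(self, role, text):
        self.short_term.append({"role": role, "text": text})

    def remember(self, key, value):
        """Persist a fact (e.g. 'unit_preference' -> 'celsius')."""
        self.long_term[key] = value
        self.store_path.write_text(json.dumps(self.long_term))

    def context(self):
        """What the LLM sees: stored facts plus the recent transcript."""
        facts = [f"{k}: {v}" for k, v in self.long_term.items()]
        msgs = [f"{m['role']}: {m['text']}" for m in self.short_term]
        return "\n".join(facts + msgs)
```

A fresh `AgentMemory` in a new session starts with an empty transcript but reloads every remembered fact — exactly the split the two panels above describe.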

LLM vs Agent: Side by Side

Capability                     Raw LLM    AI Agent
Can use external tools         No         Yes
Has persistent memory          No         Yes
Can take real-world actions    No         Yes
Plans multi-step tasks         No         Yes
Operates autonomously          No         Yes
Generates text                 Yes        Yes
Understands context            Yes        Yes
Reasons about problems         Yes        Yes

In summary: An LLM is a powerful text prediction engine. An Agent wraps that engine with tools, memory, and a decision loop — transforming it from a passive text generator into an autonomous system that can reason, plan, and take action to accomplish complex tasks.