Jon AI Document Generator
by Stélio Inácio, Founder at Jon AI and AI Specialist

How Were the AI Models Trained? The Digital Library of Humanity

We've learned that deep learning models learn from data, but the sheer scale of that process is hard to comprehend. How do you teach a machine to understand language, reason, and even write poetry? You give it a library bigger than any that has ever existed: the internet.

At its core, training a large language model (LLM) like ChatGPT or Gemini is like putting a student with a perfect memory and an infinite amount of time into the world's largest library and telling them to read everything. The model scans trillions of words and sentences from websites, books, articles, and scientific papers. It doesn't "understand" in the human sense, but it learns the statistical relationships between words. It learns that "the sky is" is very likely to be followed by "blue." It learns the patterns of grammar, the flow of a story, and the structure of a logical argument. This initial, massive reading phase is what gives the model its general knowledge about the world.
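The "statistical relationships between words" idea can be made concrete with a toy sketch. This is a hypothetical miniature, not how a real LLM works (real models use neural networks over billions of parameters, not simple counts), but it shows the same principle: count which word tends to follow which, then predict the most likely one.

```python
from collections import Counter, defaultdict

# A tiny toy corpus standing in for "the internet".
corpus = (
    "the sky is blue . the sky is clear . "
    "the grass is green . the sky is blue ."
)

# For each word, count which words follow it and how often.
follows = defaultdict(Counter)
tokens = corpus.split()
for current, nxt in zip(tokens, tokens[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the statistically most likely next word."""
    return follows[word].most_common(1)[0][0]

print(predict_next("is"))  # "blue" follows "is" most often here
```

Scale this idea up from a four-sentence corpus to trillions of words, and from raw counts to a deep neural network, and you have the essence of the pre-training described below.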

The Two-Step Training Process

Getting from a raw, knowledgeable model to a helpful AI assistant involves two key stages.

  1. Step 1: Pre-training (Building the Brain): This is the massive library-reading phase. The AI is given a huge dataset—a significant portion of the public internet—and a simple task: predict the next word in a sentence. By doing this trillions of times, it builds a complex neural network that understands language patterns, facts, and concepts. After this stage, the AI is knowledgeable but not necessarily helpful or safe.
  2. Step 2: Fine-Tuning (Teaching it Manners): This is where humans come in to polish the model. In a process called Reinforcement Learning from Human Feedback (RLHF), human trainers have conversations with the AI and rank its responses, showing it which answers are good, helpful, and safe, and which are bad, toxic, or unhelpful. The model is then "rewarded" for producing answers similar to the good examples and "penalized" for the bad ones. This process is like teaching the knowledgeable student how to be a polite, helpful, and safe conversationalist.
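The reward-and-penalty idea in Step 2 can be sketched in a few lines. This is a deliberately simplified toy with made-up answers: real RLHF trains a neural reward model from human preference pairs and then optimizes the language model against it with reinforcement learning, but the core loop — preferred answers gain reward, rejected ones lose it — looks like this:

```python
# Hypothetical human feedback: each pair is (preferred answer, rejected answer).
preferences = [
    ("Here is a step-by-step explanation...", "I refuse to answer."),
    ("Here is a step-by-step explanation...", "idk, google it"),
    ("Sure! Here's a polite, detailed answer.", "idk, google it"),
]

# A crude "reward model": score each answer by how often humans preferred it.
reward = {}
for good, bad in preferences:
    reward[good] = reward.get(good, 0) + 1   # "rewarded"
    reward[bad] = reward.get(bad, 0) - 1     # "penalized"

# Fine-tuning steers the model toward high-reward behaviour; here we
# just pick the candidate answer with the highest learned reward.
best = max(reward, key=reward.get)
print(best)  # the step-by-step answer, preferred twice
```

The real training loop adjusts billions of network weights rather than a lookup table, but the signal it follows is exactly this kind of human ranking.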

Key Concept: We All Contributed to AI Training

AI models were trained on publicly available data from the internet, including content from social media, blogs, and other platforms. This means the AI has learned from a vast range of human expression and knowledge.

Quick Check

What is the main goal of the "fine-tuning" stage of AI training?

Recap: How AI Models Were Trained

What we covered:
  • AI models are trained in a two-step process: massive "pre-training" on public internet data, followed by "fine-tuning" with human feedback.
  • The pre-training phase gives the model its vast general knowledge by learning statistical patterns in language.
  • The fine-tuning phase (using methods like RLHF) teaches the model to be helpful, safe, and conversational.
  • Essentially, the collective knowledge of humanity on the public internet has served as the textbook for modern AI.

Why it matters:
  • Knowing how AI is trained helps you understand both its incredible capabilities and its inherent limitations and biases. It learned from us, so it reflects both the best and the worst of the information we've put online.

Next up:
  • We'll explore how you can train your own AI and what it means to "fine-tune" a model for specific tasks.