100 Years of Artificial Intelligence Explained
In September 2012, Alex Krizhevsky trains an image-recognition system on two gaming GPUs in his bedroom, unknowingly laying the groundwork for modern artificial intelligence.
From Enigma to Claude and Claude Code, this brisk history shows how AI leapt from clever ideas to the everyday backbone of modern tech.
Summary
Nate Herk traces a century of artificial intelligence, beginning with Turing’s codebreaking Bombe and his imitation game, then moving through the birth of a named field at Dartmouth in 1956. The tale follows two competing camps, the symbolic rule-based approach championed by Minsky and the neural networks championed by Rosenblatt, along with the AI winters that froze funding and the commercial pivot to expert systems in the 1980s. The breakthrough era arrives with neural networks enabled by GPUs, data abundance from ImageNet, and the 2012 AlexNet leap that upended image recognition. DeepMind’s AlphaGo demonstrates creative problem solving, while the transformer architecture finally unlocks scalable language models. OpenAI’s GPT lineage and the consumer explosion around ChatGPT redefine public perception, followed by enterprise-focused plays like Claude (Anthropic) and Google Gemini, and developer-centric tools such as Claude Code and Codex. The video culminates with a prediction that AI’s evolution, powered by models, data, and compute, will keep accelerating and redefining industries. Nate emphasizes that the history isn’t over, but the direction is unmistakably toward more capable, more integrated AI across consumer and developer ecosystems.
Key Takeaways
- Turing’s Bombe tested thousands of Enigma settings at once, dramatically narrowing the possible keys; the video credits the effort with shortening World War II by two to four years.
- AlexNet’s 2012 win on ImageNet cut the error rate to 15%, 11 points below the previous year’s winner, and made deep learning the dominant approach in image recognition within a year.
- Backpropagation, popularized by Geoffrey Hinton and colleagues in the 1980s, enabled training deep neural networks by propagating errors backward through layers.
- The transformer architecture, introduced in 2017, reads all words in parallel rather than left-to-right, revolutionizing language tasks and spawning modern AI systems.
- DeepMind’s AlphaGo demonstrated genuine strategic creativity, placing a move that humans hadn’t anticipated and signaling AI’s capacity for original thought.
- Claude and Claude Code exemplify a shift to developer-centric AI tools, with Anthropic and OpenAI competing in consumer and enterprise spaces in the mid-2020s.
- By 2025–2026, major tech players invested billions in AI, intensifying a race among Anthropic, OpenAI, and Google to own the developer and cloud AI stack.
Who Is This For?
Essential viewing for AI enthusiasts, developers, and managers who want a cohesive narrative of how AI evolved from symbolic and neural roots to today’s cloud-and-code-centric landscape.
Notable Quotes
"The Bombe spins through thousands of Enigma settings at once using guessed message phrases to spot contradictions and eliminate impossible keys."
—Explains how Turing’s device dramatically narrowed down possibilities in wartime cryptography.
"If a machine communicating only through text could fool a human into believing it's another human, that can be considered intelligence."
—Turing’s imitation game framing for AI cognition.
"In July of 1958, Rosenblatt builds a working version of his idea and calls it the perceptron."
—Marks the early neural network milestone that shaped decades of AI debate.
"Attention is all you need."
—Introduction of the transformer architecture, a watershed moment in AI.
"AlphaGo won this game. The move demonstrated that AI could actually create new thoughts."
—Showcases AI’s leap from learning to strategic creativity in games.
Questions This Video Answers
- How did the Enigma codebreakers influence modern computing and AI?
- What is backpropagation and why did it revive neural networks in the 1980s?
- What is the transformer model and why did it change natural language processing?
- Why did the AI winters happen and what resurrected AI in the 2000s?
- How did AlexNet change the trajectory of computer vision?
AI history, Enigma machine, Alan Turing, Dartmouth 1956, symbolic AI, neural networks, backpropagation, transformer, Go/AlphaGo, ImageNet, AlexNet, DeepMind, OpenAI, Claude, Claude Code
Full Transcript
September 2012. In a bedroom in his parents' house, 26-year-old Alex Krizhevsky is creating a system inside two gaming graphics cards. He's training the system to recognize pictures. For him, this is just a side project that his friends convinced him to do to win a competition. But what he doesn't know is that he's about to build the system that will become the foundation of one of the greatest inventions in history. This is the 100-year history of artificial intelligence. 1939. Britain is losing the Battle of the Atlantic. German submarines are sinking ships faster than the Allies can replace them.
However, the Allies manage to get information about the German code machine that's sending military orders and tactical instructions, and it's called Enigma. So, if the Allies can crack the Enigma code, they'll basically get the German U-boat positions and attack plans. But there is one problem. The German code machine called Enigma has over 100 quintillion possible settings, which is obviously way too many for any human team to crack by hand. So they called one of the biggest mathematical minds in Britain, a man called Alan Turing. His only task was to crack the Enigma code. Now over the next year, Turing designed an electromechanical machine called the "bomb," or Bombe.
But in this video, I'm just going to be saying "bomb," so don't get mad at me. The Bombe spins through thousands of Enigma settings at once, using guessed message phrases to spot contradictions and eliminate impossible keys. This narrowed millions of options down to just a handful of possible solutions that a team could use to crack Enigma. So by the end of the war, more than 200 Bombes were running across Britain, breaking over 4,000 German messages per day. And they did it. The code revealed strategic data that allowed the Allies to turn the Battle of the Atlantic and shortened the war by two to four years.
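The elimination idea described above can be sketched on a toy cipher. This is only an illustration: the stepping shift cipher, the 26 candidate settings, and the guessed word `WETTER` are all assumptions made for the example, and a real Enigma is far more complex than this.

```python
import string

ALPHABET = string.ascii_uppercase

def encrypt_letter(ch, setting, position):
    """Toy stand-in for a stepping rotor: shift by the setting plus the position."""
    return ALPHABET[(ALPHABET.index(ch) + setting + position) % 26]

def toy_encrypt(plaintext, setting):
    return "".join(encrypt_letter(ch, setting, i) for i, ch in enumerate(plaintext))

def contradicts(setting, ciphertext, guess):
    """A setting is impossible if it encrypts the guessed phrase differently."""
    for i, ch in enumerate(guess):
        if encrypt_letter(ch, setting, i) != ciphertext[i]:
            return True
    return False

def surviving_settings(ciphertext, guess):
    """Try every candidate setting and keep only those with no contradiction."""
    return [s for s in range(26) if not contradicts(s, ciphertext, guess)]

if __name__ == "__main__":
    intercepted = toy_encrypt("WETTERBERICHT", 17)  # hypothetical message and key
    print(surviving_settings(intercepted, "WETTER"))
```

The point of the sketch is the shape of the search: a guessed phrase turns an enormous key space into a short list that people can check by hand.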
Unfortunately though, the Bombe has the same problems as most wartime machines. It's full of vacuum tubes, which are basically glass bulbs that control the flow of electricity, and they burn out constantly. The mechanical switches inside are slow, and the machine can't be reprogrammed without literally rewiring it by hand. So, after the war, most Bombe machines were dismantled or just scrapped. However, even without his machine, Turing continued to build the idea of what AI could be. He published one paper that proposed what he calls the imitation game. He states that the scientific community should stop wondering if machines can think.
They should be wondering what would prove that they could think. If a machine communicating only through text could fool a human into believing it's another human, that can be considered intelligence. And there's actually a great movie about this with Benedict Cumberbatch called The Imitation Game. Anyways, that statement created the first waves of people reframing what an intelligent machine actually was. Unfortunately, Turing would die at 41 and couldn't continue working on the idea of an intelligent machine. But years after he died, the concept of intelligent machines began to be studied. Research studies and methods were being published in different fields like mathematics, psychology, and electrical engineering.
But this was a problem because without a shared name, there's no shared community. And without a shared community, there's no funding, there are no university programs, and there's no way to attract new researchers. So, a field without a name just doesn't really exist. So, a young professor named John McCarthy decides to put an end to that. He believes that if the right people sat in a room together for a whole summer, they could actually come up with a unified name for this field. So in 1955, McCarthy proposed a project and secured funding from the Rockefeller Foundation, gathering signatures from different institutions.
The proposal was co-signed by McCarthy and three more researchers from some of the top institutions in the world like Harvard, IBM, and Bell Labs. Among them was Claude Shannon, and this is who Anthropic named their Claude models after, but we'll talk about this more when we get to the 2020s. So in the summer of 1956, around 10 people gathered at Dartmouth to name the field that will research thinking machines. They have multiple options to choose from, but in the end, the name for the field was artificial intelligence. They chose this name because it sounded ambitious and like something they wanted to fund, you know, a thinking machine.
Now, by the late 1950s, the field had an official name and funding from different institutions. But it was divided by two perceptions on how to actually build a thinking machine. And these two perceptions started as a high school debate. We've got Marvin Minsky and Frank Rosenblatt. And they knew each other from the Bronx High School of Science. They argued about these two perceptions all the time. And as they got older, the debate got bigger. Minsky's idea is a rule book. This idea states that to create a thinking machine, you need to give it rules. If you see this, do that.
If you see this other thing, do this other thing. Then the sequences continue until the machine can handle every single possible situation. The concept is that human intelligence is logic. So if you write enough logic, you'll get intelligence. This perception was called the symbolic approach. But Rosenblatt's perception was the complete opposite. This perception states that an intelligent machine wouldn't need to follow orders. It should actually have something close to how the brain works. Billions of neurons wired together, all firing on and off as we think. So, an intelligent machine should be built with artificial neurons that automatically tune themselves by looking at thousands of examples.
It'll then figure out the rules based on those examples. This approach is called the neural network. And this approach would be the first one to build a thinking machine. So, in July of 1958, Rosenblatt builds a working version of his idea and calls it the perceptron. It's the simplest possible neural network. It runs on an IBM 704 computer the size of a room and it uses a 20x20 grid of light sensors feeding into adjustable connections. Now, these connections act like motorized volume knobs that the machine could turn up or down as it learned. After about 50 practice attempts, the perceptron teaches itself to tell the difference between two kinds of punched cards.
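The "volume knobs" idea above can be sketched with the standard perceptron learning rule. This is a minimal sketch, not Rosenblatt's actual setup: the four-pixel "cards" (marked on the left or on the right), the learning rate, and the function names are all assumptions made for illustration.

```python
def predict(weights, bias, pixels):
    """Fire (1) if the weighted sum of the sensor readings is positive."""
    total = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1 if total > 0 else 0

def train_perceptron(examples, epochs=50, lr=1.0):
    """Nudge each connection up or down whenever the guess is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for pixels, label in examples:
            error = label - predict(weights, bias, pixels)
            if error != 0:
                mistakes += 1
                # Turn each "knob" in the direction that reduces the error.
                weights = [w + lr * error * p for w, p in zip(weights, pixels)]
                bias += lr * error
        if mistakes == 0:  # every card classified correctly, stop early
            break
    return weights, bias

# Hypothetical cards: label 0 = marked on the left, label 1 = marked on the right.
CARDS = [
    ([1, 0, 0, 0], 0), ([0, 1, 0, 0], 0), ([1, 1, 0, 0], 0),
    ([0, 0, 1, 0], 1), ([0, 0, 0, 1], 1), ([0, 0, 1, 1], 1),
]

if __name__ == "__main__":
    w, b = train_perceptron(CARDS)
    print(w, b, [predict(w, b, pixels) for pixels, _ in CARDS])
```

Nobody writes a rule saying "right side means 1"; the weights drift there on their own because every mistake pushes them a little in the correcting direction.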
Now, just think about that. In 1958, a machine that taught itself to do that. Even though it was very simple, it was a miracle. So, the US Navy funds this perceptron project and stages a press conference. During that conference, one of the most famous quotes came from a New York Times article. The Navy expects this device to be the embryo of a computer that will be able to walk, talk, see, write, reproduce itself, and be conscious of its existence. So, for the next 11 years, Rosenblatt and Minsky debated at conferences in front of audiences made up of researchers and graduate students.
Rosenblatt argues that the neural networks can do almost anything, while Minsky argues that they can do nearly nothing. And he proved it in 1969. Minsky and his MIT colleague published a book called Perceptrons. In this book, Minsky proves mathematically that Rosenblatt's machine has a hard ceiling on what it can learn. There are basic patterns it'll never recognize, no matter how much you train it. The math is correct, and it makes the entire neural network research program look like a dead end. Unfortunately, Rosenblatt wouldn't be able to defend his machine because he died in 1978. And with him, so did his approach.
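The "hard ceiling" can be seen on a small scale with XOR, the textbook pattern a single-layer perceptron cannot represent (the video does not name it). The sketch below is an illustration rather than a proof: it searches a coarse grid of weights, and the grid range and function names are assumptions.

```python
from itertools import product

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def classifies(w1, w2, bias, table):
    """True if one threshold neuron reproduces every row of the truth table."""
    return all(
        (1 if w1 * x1 + w2 * x2 + bias > 0 else 0) == target
        for (x1, x2), target in table
    )

def find_weights(table, grid):
    """Return the first (w1, w2, bias) on the grid that fits, or None."""
    for w1, w2, bias in product(grid, repeat=3):
        if classifies(w1, w2, bias, table):
            return (w1, w2, bias)
    return None

GRID = [step / 2 for step in range(-10, 11)]  # -5.0 to 5.0 in steps of 0.5

if __name__ == "__main__":
    print("AND:", find_weights(AND, GRID))
    print("XOR:", find_weights(XOR, GRID))
```

AND has many solutions on the grid, while XOR has none, and no finer grid would help: the two "on" cases and the two "off" cases cannot be separated by a single straight line.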
Now, within months, the US government stops funding the neural network machine and shifts over to Minsky's symbolic camp. But even though Minsky proved that the neural network approach was wrong, he couldn't prove that his approach was right. After less than two years of funding from the US and British governments, the British government sends a mathematician to see if the AI research is producing something useful and it just wasn't. Speech recognition was a joke and the translation systems couldn't translate. So the mathematician writes a report saying that the entire promise of human-level artificial intelligence is just an illusion.
Funding from the British and the US governments collapses and the first AI winter arrives, which meant that for years the entire AI field was just completely silent. No one wants to fund something that can't actually produce results. So in 1980 the entire industry made a shift. The field stops trying to solve the problems that the US and British governments wanted to solve and instead focuses on solving more commercial problems. In 1980, the first big commercial AI system was revealed at Carnegie Mellon University. Its name is XCON and it's made with one goal in mind, which is just to do one tedious job extremely well.
So, when a customer orders a custom DEC VAX computer, somebody has to figure out which exact components will go into it out of millions of possible combinations. Now, humans are slow at this and make mistakes, but XCON does it perfectly in seconds. And by 1986, XCON was saving DEC tens of millions of dollars a year. And people started to call it an expert system, a computer program that pretends to be a human expert in one narrow job by following thousands of handwritten rules. Basically, the symbolic approach from Minsky was finally producing something useful. So throughout the 1980s, the entire AI industry tried to clone XCON across every domain at once.
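A rule-following configurator of the kind described above can be sketched in a few lines. This is not how XCON was built; the part names, the three rules, and the simple "keep applying rules until nothing changes" loop are all hypothetical and chosen only to show the handwritten-rule style.

```python
def configure(order, rules):
    """Apply every rule whose condition is met until no rule adds anything new."""
    parts = set(order)
    changed = True
    while changed:
        changed = False
        for needs, adds in rules:
            # A rule fires when all the parts it needs are already present.
            if needs <= parts and not adds <= parts:
                parts |= adds
                changed = True
    return parts

# Hypothetical handwritten rules: "if the order has X, it also requires Y."
RULES = [
    ({"cpu"}, {"power_supply"}),
    ({"disk_drive"}, {"disk_controller"}),
    ({"disk_controller"}, {"controller_cable"}),
]

if __name__ == "__main__":
    print(sorted(configure({"cpu", "disk_drive"}, RULES)))
```

Every new kind of order needs someone to write another rule by hand, which is fine for three rules and much harder for thousands.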
One expert system diagnoses bacterial infections. Another one analyzes chemical compounds, and another one helps geologists find mineral deposits. All of these were running on Lisp machines, which were specialized computers built to run the Lisp code that these expert systems were written in. By 1985, Fortune 500 companies were spending more than a billion dollars a year on these expert systems. So the AI field regained its presence. However, just 2 years later, that same fast growth would collapse very fast. The expert systems are fragile. They work great inside the specific job that they were built for, but they fail in everything else.
Every weird new situation needs a new rule, and maintaining those rules requires a whole team. But even if the team could add a new rule, there was always the possibility that a new rule could conflict with another rule. And that caused the whole system to just break. The problem could be fixed by pouring more money into hiring AI experts to keep training the machines. But the possibility of more funding turned to zero when a new machine arrived on the market. And by 1987, the regular workstations like the ones that Sun Microsystems was making could do the same thing the Lisp machines did.
And they cost a fraction of the price. There was no reason to spend $70,000 on a Lisp machine when a $10,000 Sun workstation could run the same program. So the AI hardware industry worth half a billion dollars at its peak collapses in months and Lisp Machines Incorporated goes bankrupt. The symbolic approach that Minsky proposed went from being the future of AI to becoming its own downfall. And with it, the second AI winter began. But just one year before the symbolic approach failed, the neural network approach would start to rise again. In 1986, while expert systems were at their commercial peak, neural networks were still considered career-ending research.
However, three men thought that it was still the best approach. Geoffrey Hinton and two more people published a paper showing that the problem that Minsky pointed out about neural networks was actually solvable. Minsky said that if you stack multiple layers of neurons and the network gets an answer wrong, there's no way to know which neuron in which layer caused the mistake. And if you can't figure out what to fix, then you can't train the network. But the solution was actually simple, which was to work backwards. When the network gets the answer wrong, you trace that mistake back through every layer of connections.
Each connection gets a piece of the blame in proportion to how much it caused the error. And once you know who to blame, you know who to adjust. And once you can adjust them, you can train networks with as many layers as you want. And this is called backpropagation. Rosenblatt didn't have time to fix his approach. But Hinton and others continued his work. The backpropagation paper had a big impact on the field and later became one of the most cited papers in all of AI history. However, neural networks still couldn't get real world results that would change the situation of the field.
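The "work backwards and share out the blame" idea can be sketched for a tiny two-layer network. This is a minimal sketch under stated assumptions: sigmoid neurons, a squared-error loss, one hidden layer, and hypothetical names such as `make_net` and `train_step`; it is not the code from the 1986 paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_net(n_in, n_hidden, seed=0):
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": rng.uniform(-1, 1),
    }

def forward(net, x):
    """Return the hidden activations and the single output for one input."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    out = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, out

def loss(net, x, target):
    return 0.5 * (forward(net, x)[1] - target) ** 2

def backprop(net, x, target):
    """Trace the output error backward and return each connection's share of blame."""
    hidden, out = forward(net, x)
    # Blame at the output: the error scaled by the slope of the sigmoid.
    delta_out = (out - target) * out * (1.0 - out)
    # Blame passed back to each hidden neuron through its outgoing connection.
    delta_hidden = [delta_out * w * h * (1.0 - h)
                    for w, h in zip(net["w_out"], hidden)]
    return {
        "w_out": [delta_out * h for h in hidden],
        "b_out": delta_out,
        "w_hidden": [[d * xi for xi in x] for d in delta_hidden],
        "b_hidden": delta_hidden,
    }

def train_step(net, data, lr=0.5):
    """Adjust every connection against its share of the blame."""
    for x, target in data:
        grads = backprop(net, x, target)
        for j in range(len(net["w_hidden"])):
            for i in range(len(x)):
                net["w_hidden"][j][i] -= lr * grads["w_hidden"][j][i]
            net["b_hidden"][j] -= lr * grads["b_hidden"][j]
            net["w_out"][j] -= lr * grads["w_out"][j]
        net["b_out"] -= lr * grads["b_out"]

if __name__ == "__main__":
    xor = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
    net = make_net(2, 3)
    for _ in range(5000):
        train_step(net, xor)
    print([round(forward(net, x)[1], 2) for x, _ in xor])
```

How well the demo at the bottom learns depends on the random starting weights, so treat its printed numbers as illustrative. The part that matters is `backprop`: one backward pass gives every connection in every layer its own correction.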
And the reason is very simple. The mathematical solution worked, but the hardware to make it work didn't yet exist. So, trying to build a multi-layer neural network would take weeks or even months for the computers of the era. But that problem was going to be solved by the gaming industry. Nvidia graphics cards got amazingly powerful by the 2000s. And it turns out that the kind of math a graphics card does is exactly the kind of math a neural network needs. So after 20 years, the compute technology that neural networks needed was finally created. Hinton's students could now train neural networks on GPUs in days instead of months.
But there was another problem. Of course, they didn't have enough data to actually train these networks. Because to teach a neural network to recognize something like a cat in a picture, the system has to review hundreds of thousands of photo examples with different angles, different lighting, different breeds, and different backgrounds. The neural networks of the 2000s had never been shown enough examples to actually learn anything. So a computer scientist called Fei-Fei Li teams up with a handful of grad students to build the largest collection of labeled images in history. And by 2009 this collection, which she calls ImageNet, had more than 3 million labeled photos.
And by 2010 it had 14 million. So ImageNet solved the data problem for computer vision. And for the first time researchers had 1.2 million labeled photographs across a thousand different categories, which was more than enough to actually train a deep network to recognize real world objects. So the compute problem was solved. The data problem was solved. The machine learning era began. ImageNet created an annual competition where every lab in the world could test their best system on the same 1.2 million images. In 2010, the best system in the world got 28% of the answers wrong.
In 2011, a different team pushed it to 26%. The AI field was evolving at a slow rate. But in 2012, a grad student in Toronto would change the entire field. Alex Krizhevsky was convinced to enter the ImageNet competition by one of his friends. But he didn't go with the traditional route. Other research teams were coding the rules to tell the network what to look for to recognize an image. So things like edges or corners, textures and shapes. But Krizhevsky didn't write any rules. Instead, he fed the network with the entire ImageNet database and let it figure out what features matter on its own.
It sounds like the way I like to use Claude Code right now. And the machine learned its own theory of vision. And Alex called this AlexNet. When the competition ended, AlexNet got 15% of the answers wrong. 11 points less than last year's winner. And at this point, less is good because it's your percentage wrong. But anyways, he didn't just beat the competition. He made every other approach look obsolete and proved that machines can actually learn. The results spread through the research community almost instantly. Within 12 months of the competition, every serious image recognition lab on Earth is using neural networks.
And after another 12 months, Google, Facebook, and Microsoft had hired most of the serious deep learning talent out of universities. The entire AlexNet architecture helped reshape products like Google Photos, Google Lens, and Google Search. Each one of these products was trained the same way that Alex trained his own system, AlexNet. And from one day to the next, the entire AI field went from being forgotten to being the future for these big tech companies. And this was just the beginning. Because in the following years, learning machines began to evolve. A small London lab called DeepMind got the world's attention by building an AI that taught itself to play Atari games from scratch.
Both Facebook and Google wanted this technology. But in the end, Google won, acquiring DeepMind in January 2014 for roughly $500 million. And two years later, DeepMind would turn the idea of thinking machines into a reality. So, in March of 2016, DeepMind's AI program, AlphaGo, played a five-game match against a professional Go player, Lee Sedol. A typical Go game has more possible board positions than there are atoms in the observable universe. And Lee is an 18-time world champion. In game two of this match, AlphaGo does something that nobody can explain. It places a stone in a spot that no professional player would have ever even considered.
The commentators on the broadcast initially assume that the AI started glitching. And when Lee sees this move, it takes him more than 12 minutes to figure out a proper response. And in the end, AlphaGo won this game. So that single move demonstrated that AI could actually create new thoughts, which is just so wild. AlphaGo had taught itself what a good Go position looks like and developed its own theory of the game, its own strategy. 3 years later, Lee retires from professional Go and says, "Even if I became number one, there is an entity that cannot be defeated." The poor guy.
Machines went from learning to creating new thoughts just like a human would. And just one year later, the biggest change in AI history would come from an AI experiment. In June of 2017, eight Google researchers published a paper with the title "Attention Is All You Need." The paper proposes a new design for neural networks that they call the transformer. The transformer's big idea is this. The old neural networks for language read text the way that humans read it, one word at a time, left to right, remembering what came before. And that's kind of slow and it loses track of context across long passages.
But the transformer basically reads every word in the sentence at one time in parallel. So instead of reading a book, turning page by page, it reads every single page at the same time. And the transformer was created to make translation between languages faster. But what the eight authors didn't realize is that they just created what we now know as AI. Researchers at an AI company called OpenAI noticed that they could take half of the transformer that's good at generating text and train it on a single task. Read a chunk of text and predict the next word.
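The "every word at once" idea rests on attention, which can be sketched as scaled dot-product self-attention. This is a stripped-down sketch with loud simplifications: the queries, keys, and values are all just the input vectors (a real transformer learns separate projections for each), there is a single head, and the word vectors are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Each word looks at every word in the sentence and blends them by relevance."""
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights.append(softmax(scores))
    # The new vector for each word is a weighted mix of all the word vectors.
    outputs = [[sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
               for row in weights]
    return outputs, weights

if __name__ == "__main__":
    sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 2-D word vectors
    mixed, attn = self_attention(sentence)
    for row in attn:
        print([round(w, 2) for w in row])
```

No step for one word waits on the result for another word, so all rows can be computed at the same time, which is what makes the design a good fit for GPUs.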
So they fed it massive data sets from the web, books, and code, and they had it make predictions billions of times. In June of 2018, they released their first model, GPT-1, followed by GPT-2 in 2019 and GPT-3 in 2020. These neural systems could write code, summarize documents, draft emails, answer questions from a single prompt. Two years later, OpenAI wrapped this new technology into a simple chat box and released it to the world as ChatGPT. And that was the moment that the AI field changed forever and probably where you guys might have stumbled upon AI. That's where I stumbled upon AI.
ChatGPT reaches 1 million users in just 5 days, 100 million in just 2 months, and it became the fastest growing consumer app in history at the time. Then Microsoft announced a $10 billion investment and Google declares an internal code red, panicking that its core search business is about to be disrupted. For the first time since the field was created in 1956, AI has reached the general public. An AI gold rush began. On March 14th, 2023, Anthropic publicly launched Claude, its first AI assistant, named after Claude Shannon, who we talked about earlier in this story. Nine months later, in December of 2023, Google struck back with Gemini.
And then the money just started moving. By the mid-2020s, Microsoft had committed around $13 billion to OpenAI. Amazon gave around $5 billion to Anthropic and Google roughly $2 billion more. The three companies began to race to be the dominant player in the market, but each one bet on something different. OpenAI doubled down on general consumers. ChatGPT added voice, vision, memory, image generation. Google focused on giving more to its current users and embedded Gemini in their whole, you know, Google ecosystem. But Anthropic took an entirely different approach. It focused specifically on developers. In June of 2024, Claude 3.5 Sonnet shipped with Artifacts, a side panel that let users preview generated code as it was being written.
And within months, Claude had become the model of choice for serious developers. Then in February of 2025, Anthropic released a research preview called Claude Code, a command line tool that could read a project, edit files, run commands, and even just build software for you locally. And this made OpenAI and Google realize the market that they were losing. OpenAI then launched their own coding tool called Codex. But Claude Code just kept pulling ahead and by November 2025, it was bringing in over a billion dollars a year. Only 6 months after their launch, one of the fastest revenue jumps in software history.
Even Microsoft, after restructuring its exclusive partnership with OpenAI, committed up to $5 billion to Anthropic. Google tried to keep up in the developer market, releasing coding tools like Antigravity. But by the end of 2025, the picture was pretty clear. ChatGPT still owned the consumer market at this point, but when it came to power users that were building things with AI, Claude Code had become the dominant player. People with no coding background started building complete software in just one weekend. And that's when the term vibe coding became very mainstream. And in April of 2026, Amazon committed up to $25 billion more to Anthropic, while Google followed up with another $40 billion.
Even Claude's competitors know that Claude Code is the biggest, hottest AI tool right now. However, the race certainly isn't over. Every single day, new AI tools are being built and more features are being released by these big three players. The history of AI is far from being over, and I'm super excited.