GPT-3 vs Human Brain

3 minutes 32 seconds

S1

Speaker 1

00:00

The human brain has at least 100 trillion synapses, and it could be as high

S2

Speaker 2

00:04

as 1,000 trillion. And a synapse is a channel connecting two neurons through which an electrical or chemical signal is transferred, and it is the loose inspiration for the synapses, weights, or parameters of an artificial neural network. GPT-3, the recently released language model from OpenAI that has been captivating people's imagination with zero-shot or few-shot learning, has 175 billion synapses, or parameters.

S2

Speaker 2

00:33

As mentioned in the OpenAI paper, the amount of compute used to train the final version of this network was 3.14 × 10²³ FLOPs. And if we use reasonable cost estimates based on Lambda's test of a Tesla V100 cloud instance, the cost of training this neural network is $4.6 million. Now, the natural question I had is: if the model with 175 billion parameters does very well, how well will a model do that has the same number of parameters as our human brain? This sets aside the fact that both the uncertainty in our estimate of the number of synapses and the intricate structure of the brain might mean a much, much larger neural network is required to approximate the brain.
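
A rough back-of-the-envelope sketch of where a figure like $4.6 million comes from. The sustained throughput and hourly price below are illustrative assumptions (roughly in line with a Tesla V100 cloud instance), not numbers quoted in the video:

```python
# Reproduce the ballpark GPT-3 training-cost estimate from total FLOPs.
total_flops = 3.14e23          # training compute reported in the GPT-3 paper
v100_flops_per_sec = 28e12     # assumed sustained V100 throughput (~28 TFLOPS)
price_per_gpu_hour = 1.50      # assumed cloud price in USD per V100-hour

gpu_hours = total_flops / v100_flops_per_sec / 3600
cost = gpu_hours * price_per_gpu_hour
print(f"{gpu_hours:,.0f} GPU-hours -> ~${cost / 1e6:.1f} million")
# ~3,115,079 GPU-hours -> ~$4.7 million, close to the $4.6M figure above
```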

S2

Speaker 2

01:15

But it's very possible that even just this 100 trillion synapse number will allow us to see some magical performance from these systems. And one way of asking how far away we are is: how much does it approximately cost to train a model with 100 trillion parameters? So GPT-3 is 175 billion parameters at $4.6 million in 2020. Let's call the 100-trillion-parameter model GPT-4HB.
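
To make the gap concrete, here is the parameter ratio the question implies, a trivial sketch using the lower synapse estimate from the video:

```python
# How many times larger a hypothetical brain-scale "GPT-4HB" would be than GPT-3.
gpt3_params = 175e9      # GPT-3 parameters
brain_synapses = 100e12  # lower estimate of human-brain synapses

ratio = brain_synapses / gpt3_params
print(f"GPT-4HB would be ~{ratio:.0f}x larger than GPT-3")  # ~571x
```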

S2

Speaker 2

01:45

Assuming linear scaling of compute requirements with respect to the number of parameters, the cost in 2020 of training this neural network is $2.6 billion. Now, another interesting OpenAI paper that I've talked about in the past, titled Measuring the Algorithmic Efficiency of Neural Networks, indicates that for the past seven years, neural network training efficiency has been doubling every 16 months. So if this trend continues, then in 2024 the cost of training this GPT-4HB network would be $325 million, decreasing to $40 million in 2028, and in 2032 coming down to approximately the same price as the GPT-3 network today, at $5 million. Now, it's important to note, as the paper indicates, that as the size of the network and the compute increase, the improvement in the performance of the network follows a power law.
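
A minimal sketch of the two calculations above: linear cost scaling from GPT-3's $4.6 million, then halving the cost every 16 months per the algorithmic-efficiency trend. The exact rounding is mine; the video quotes $2.6 billion, $325 million, $40 million, and roughly $5 million:

```python
# Project the training cost of a 100-trillion-parameter "GPT-4HB" model,
# assuming cost scales linearly with parameter count and training efficiency
# doubles every 16 months.
gpt3_cost_2020 = 4.6e6                     # USD, estimated GPT-3 training cost in 2020
scale = 100e12 / 175e9                     # 100T params vs GPT-3's 175B (~571x)
gpt4hb_cost_2020 = gpt3_cost_2020 * scale  # ~$2.6 billion

for year in (2020, 2024, 2028, 2032):
    months = (year - 2020) * 12
    cost = gpt4hb_cost_2020 / 2 ** (months / 16)  # halve the cost every 16 months
    print(f"{year}: ~${cost / 1e6:,.0f} million")
# 2020: ~$2,629 million  2024: ~$329 million  2028: ~$41 million  2032: ~$5 million
```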

S2

Speaker 2

02:43

Still, given some of the impressive Turing-test-passing performances of GPT-3, it's fascinating to think about what a language model with 100 trillion parameters might be able to accomplish. I might make a few more short videos like this, each focusing on a single, simple idea about the basics of GPT-3, including technical and even philosophical implications, along with highlighting how others are using it. So if you enjoy this kind of thing, subscribe, and remember: try to learn something new every day.