OpenAI’s new ChatGPT AI, o1, was previously
called Project Strawberry. They say that it is much smarter than a regular strawberry and
has PhD-level knowledge, can write video games, solve tricky brain teasers, and
achieve world peace. Yeah! Well, maybe not that one, but we will make
sure to have a look at all the rest and find out if it should be called
Dr. AI or not. So what can it do? Well, it can quickly write up a snake game…in 2D.
But now, how about a snake game…but instead in 3D? Let’s see…just one prompt goes in, and…look
at that! This is the game as it came out on the first try.
Goodness. This not only seems
just fine, but I see a little polish already. For instance, it also checks when you hit an
obstacle and knows when the game is over. Now, more games. It can also write you a chess
game real quick, but with a twist…look. What is happening here? Who is he playing against?
Well, against an AI. Wait…so this is an AI that just programmed another AI. That is, I
think, insanity. What a time to be alive! So, what about the Two Minute Papers special, and
that is, of course, physics simulations. Well, it can’t create these incredibly sophisticated
systems that we talk about here. However, a rudimentary little program? Not a problem!
I will tell you about a PhD-level astrophysics example that is a hundred times more difficult
in a moment and we’ll see if it can do it. A previous research work made a video game
where 25 of these AIs were given an identity and they organized a Valentine’s Day party.
Or they made a video game, yes, a video game inside a video game.
It’s like Inception, but nerdier. How deep can this go? Now imagine putting 25
of these brainy little AIs in a video game, and getting them to work together on something
amazing. Or maybe they would just start to argue about whose code is better. The AI
version of a family reunion. Good times! So how does all this happen? Well, this is an
AI that is given a little more time to think. When we ask for a futuristic Tron-type
game, it gives you the code, code, code, okay, but not everyone is a programmer.
And…look.
Here it has also attached the thought process to help out. And from this,
a playable game comes out that you can play right now if you stop this video here and
check out the video description for the link. It even has camera controls, so you can
see yourself better as you lose. Excellent. And the key here is that this is an AI
that can finally do planning. For instance, when we ask a paradoxical question, like
how many words there are in the answer to this prompt. Not in our prompt, but in your
answer. Tricky…it starts thinking, recognizes the severity of the situation, don’t fall into
the trap, little AI…and… nailed it. Perfect. This planning can also help it write a poem,
but not just any poem: one where the first letters of the odd lines spell this, and more. It is a nice
little brain teaser with tons of constraints. And it thought for how long? Only 35 seconds,
nice…and it was able to create that poem. Now, as promised, some say that it has PhD-level
knowledge, but does it? Because if it does, we should not just test it on simpler questions
like these.
Let’s do the real deal! Oh yes, here is a PhD thesis on astrophysics on
how to measure black hole masses. And now, hold on to your papers, Fellow Scholars, because
the scientist who wrote it says that there is a piece of code for an experiment
that took him an entire year to write, and ChatGPT was able to accomplish that
in… are you serious? Get this…it did all that in one hour.
Wow. By the time the next
version drops, we may have to give it tenure! Okay, so how smart is it? What are its limits?
Well, there are IQ tests out there for humans, and don’t forget, these AI assistants can see now,
so can they take a test? Yes they can. So how well did previous techniques do? On some questions,
“not so great” would be an understatement, if they answer at all. They scored well under 100,
the average human score. And the new one? Whoa, they measured an IQ of 120.
That is really
far beyond previous techniques. And while I say bravo OpenAI, I would also like to note
that this is not peer-reviewed research; it is a somewhat more speculative result,
so take it with a pinch of salt. And what would you Fellow Scholars use this for?
Let me know in the comments below. By the way, our next episode will be on this work from NVIDIA, so make sure to subscribe and
hit the bell icon to not miss it.