AI This Month: Your February AI news roundup
In the headlines this month: OpenAI’s Sora, Gemini for Google Workspace, Gemma, OpenAI memory tools (and that one day the chatbot lost its mind).
Feb 29, 2024 • 8 Minute Read
One of the most exciting things about AI is also what makes it frustrating — it moves at an incredible pace. So fast, in fact, it can be hard to keep up with it all. It’s a surefire recipe for FOMO, especially if you work in the field of technology.
That’s why we’re here to help, with this series, AI This Month. With just a few minutes’ reading, you can catch up on all the big-ticket news items and stay in the know.
AI This Month: Your February AI news roundup
- OpenAI introduced Sora, a photorealistic AI video generator
- Google rebranded Bard as Gemini and launched Gemini for Google Workspace
- Google launched Gemini 1.5 Pro
- Google also announced a new range of open-weight models
- OpenAI is testing memory tools and @mentions for ChatGPT
- Also, in other news, ChatGPT went bonkers for a day
OpenAI introduced Sora, a photorealistic AI video generator
“So you’re covering AI news? How about that Sora, huh? Those AI videos are amazing.”
This was literally every conversation I had at every tech event this month. In February, people were utterly blown away by Sora, a new model from OpenAI that can take your text and turn it into videos up to 60 seconds long. It also creates videos in high resolution (that’s 1920x1080, for those at home).
Now, even though Sora is only a research preview, it’s blown people’s minds and caused a little bit of a panic. Why? Because the videos are 100% fake yet look convincingly real, which means that from now on, any video could be completely AI-generated. In fact, Sora can generate simulations of fantastical worlds, such as a demo of a Wild West village that doesn’t exist.
In short: if trusting an anonymous video on social media was a bad idea before, it’s a really, really bad idea now.
Now, Sora does have some weaknesses, which OpenAI has freely admitted. It doesn’t always model the physics of a complex scene, such as cause and effect: someone might take a bite out of a cookie, but the cookie won’t have a bite mark afterwards. It can also mix up spatial details like left and right, struggle with events like glass shattering, or have objects spontaneously appear.
Now, let it sink in that this is the worst AI-generated video will ever be. Considering how impressive it is now, it’s only going to get more impressive in the future.
Still, it’s worth pointing out there’s a big difference between making a flashy 60-second demo and a whole movie. Plus, you often need to refine AI videos and photos heavily to get the precise details you want. On top of this, there’s no synchronized audio generation yet, so soundtracks have to be created and matched separately.
Google rebranded Bard as Gemini and launched Gemini for Google Workspace
Now, even though everyone’s talking about Sora, Google has made a lot of announcements this month. The biggest is that it’s revealed its answer to Microsoft Copilot, the chatbot that can be used natively inside Windows, Microsoft 365 applications, Bing, and Edge.
(Okay, Google did have Duet AI for Google Workspace, but… that’s dead now, folded into this rebrand, and it never made much of a splash anyway.)
Regardless, this competitor is Gemini for Google Workspace, and it’s big news if you’re a company operating in a Google Workspace environment. Gemini for Google Workspace gives you access to the Gemini chatbot in tools like Google Docs, Gmail, Google Slides, Google Sheets, and Google Meet. As you’d expect, it can help you with writing, designing, organizing, entering text, and generating images.
The Gemini chatbot is Google’s rebrand of the Bard chatbot, a switch it made this month. Now, if Gemini sounds familiar, like you maybe heard the name three months ago, you probably did: Gemini is also the name of Google’s flagship AI model, which launched in December.
You might have seen the social media videos of an AI being able to “see” things through a webcam, like sheet music, and explain what it was seeing: that was Gemini.
There was also a claim that Gemini was the first AI to outperform humans at MMLU tasks. If you’re not familiar with that acronym (and I wouldn’t blame you), it stands for Massive Multitask Language Understanding, a benchmark combining 57 subjects like math, physics, history, medicine, and ethics.
But there was a caveat: the model that had done this was called Gemini Ultra, and it wasn’t available for public use. Well, now it is. They’ve brought out Gemini Ultra under the name Gemini Advanced, which is something you can access with Gemini for Google Workspace.
Now, let’s talk about cost. Google Workspace starts at $6 per user per month for the Business Starter package, which you need as a baseline (because obviously you can’t use an AI add-on for Google Workspace without Google Workspace itself). The AI add-on is an extra $20 per user per month on top of that, which is pretty pricey: at minimum $26 per user per month, or $312 per user per year. And a year is what you’re signing up for, because there’s a one-year commitment, so no month-to-month.
There’s also a usage limit: users can only use Gemini AI features a thousand times per month. And it’s only available for users who have their language set to English, at least for now.
There’s also a Gemini Enterprise plan that costs $10 per user per month more, which adds things like advanced meeting features with translated captions in 15 languages, and “full access and usage of Gemini,” though it’s currently unclear exactly what that means.
Google launched Gemini 1.5 Pro
Continuing with our Gemini news, Google also announced Gemini 1.5 Pro, which is available for early testing. It has a context window of 128k tokens as standard, but a one-million-token window for developers and enterprise customers in private preview.
Now let me break down why that’s so cool. One million tokens means it can remember or process around 700k words; your average novel is 100k words. That many tokens also breaks down to roughly 30k lines of code, 11 hours of audio, or one hour of video.
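If you want a feel for how those conversions work, here’s some back-of-envelope math. The ratios below are rough rules of thumb I’m assuming (around 0.7 words per English token), not official Gemini figures:

```python
# Back-of-envelope context-window math. These ratios are rough,
# commonly cited rules of thumb, not official Gemini figures.
CONTEXT_TOKENS = 1_000_000

WORDS_PER_TOKEN = 0.7   # English prose averages ~0.7-0.75 words per token
TOKENS_PER_LINE = 33    # ~1M tokens / 30k lines of code
NOVEL_WORDS = 100_000   # length of an average novel

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
print(f"~{words:,.0f} words (~{words / NOVEL_WORDS:.0f} average novels)")
print(f"~{CONTEXT_TOKENS / TOKENS_PER_LINE:,.0f} lines of code")
```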
Gemini 1.5 Pro also produces quality comparable to Ultra while using less compute, which is great because less compute means less energy, lower costs, and faster iteration cycles.
However, this is a bit of an odd decision, because Gemini Ultra is the selling point of the new Gemini for Google Workspace offering, and a week after that launched, Google announced there’s a flashier version, which doesn’t make Ultra look, well, all that Ultra.
Google also announced a new range of open-weight models
To add to the long list of Google announcements, the tech giant launched Gemma, a series of free, open-weight models built on similar technology to the more powerful Gemini models. These are lightweight models (available in 2B- and 7B-parameter sizes) that can run on a developer’s laptop or desktop, and they’re designed to be fine-tuned with multiple frameworks (and with Google Cloud, for obvious reasons).
This is likely a play to match Meta, which has been releasing open-weight models such as LLaMA and Llama 2 since February last year. Gemma also contrasts with OpenAI’s GPT-4 Turbo, which you can’t run locally.
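If you’re curious what “runs on a laptop” looks like in practice, here’s a minimal sketch using Hugging Face Transformers. It assumes you’ve installed transformers, torch, and accelerate, and accepted the Gemma license on huggingface.co so the weights can download; gemma-2b-it is the instruction-tuned 2B checkpoint:

```python
# Minimal sketch: running Gemma locally with Hugging Face Transformers.
# Assumes `pip install transformers torch accelerate`, and that you've
# accepted the Gemma license on huggingface.co so the weights download.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # instruction-tuned 2B variant, laptop-friendly

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain open-weight models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```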
OpenAI is testing memory tools and @mentions for ChatGPT
Alright, enough about Google. What about ChatGPT, you might ask? Well, this month OpenAI said they’re testing memory controls for ChatGPT. Currently, ChatGPT doesn’t remember anything across your different sessions, so you’ve got to tell it what’s going on all over again, like Dory from Finding Nemo.
But with the new memory functionality, you can ask it to remember something specific, or let it pick up details itself. So, if you say you like it when it gives you coding snippets, or puts meeting notes in bullet points, or that you own a coffee shop, it takes all these things into consideration when giving you a response. Previously, you had to enter this manually using the Custom Instructions tool, which was annoying, to say the least.
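Memory is a feature of the ChatGPT apps rather than the API, but if you’re building on the API and want a similar effect, the usual approach is to store those details yourself and inject them as a system message. Here’s a minimal sketch, assuming the official openai Python SDK and an OPENAI_API_KEY in your environment (the remembered facts are hypothetical):

```python
# Sketch: approximating ChatGPT-style "memory" over the API by storing
# user facts yourself and injecting them as a system message.
# The remembered facts below are hypothetical examples.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

remembered_facts = [
    "Prefers answers that include code snippets.",
    "Wants meeting notes formatted as bullet points.",
    "Owns a coffee shop.",
]

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",
    messages=[
        {"role": "system",
         "content": "Things to remember about this user:\n- " + "\n- ".join(remembered_facts)},
        {"role": "user", "content": "Draft a short social media post for my business."},
    ],
)
print(response.choices[0].message.content)
```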
You can turn memory off at any time, or wipe it entirely. However, if you’re using temporary chat so your conversations aren’t used to train OpenAI’s models, you unfortunately can’t use this feature: either you open yourself up to that risk, or you forgo memory.
Additionally, ChatGPT now has @mentions that can bring custom personalities called GPTs into any conversation. GPTs are usually built to do a certain task or know about a certain topic, so this puts a set of focused AI specialists at your beck and call. If you’ve got one set up as a chef or a wellness advisor, you can call them in and ask them questions.
Also, in other news, ChatGPT went bonkers for a day
ChatGPT went a bit peculiar for a brief period in February, with users reporting unexpected, rambling outputs from the AI assistant. For example, one user’s simple question about dog food devolved into pages of rambling nonsense.
Yes. Really. Anyway, the cause: LLMs generate text by choosing numbers that map to tokens (words and pieces of words), and a bug meant the model was picking slightly wrong numbers, producing word sequences that made no sense. Apparently this was caused by inference kernels producing incorrect results in certain GPU configurations, and it has now been fixed.
Which, honestly, I’m a little sad about, because there’s no more delightful nonsense.
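If you’re curious why “slightly wrong numbers” turns into fluent-sounding gibberish, here’s a toy illustration. This is a deliberately tiny stand-in with a made-up 14-word vocabulary, not OpenAI’s actual inference code:

```python
# Toy illustration (not OpenAI's actual code): a language model emits
# token IDs, which are decoded back into text via a vocabulary. If the
# ID-choosing step is even slightly off, each word still looks like a
# word, but the sequence stops making sense.
vocab = ["the", "dog", "food", "should", "be", "stored", "in", "a",
         "cool", "dry", "place", "gleaming", "whilst", "betwixt"]

def decode(ids):
    return " ".join(vocab[i % len(vocab)] for i in ids)

correct_ids = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(decode(correct_ids))
# -> "the dog food should be stored in a cool dry place"

# Simulate a buggy kernel that nudges every ID by a small error:
buggy_ids = [i + 3 for i in correct_ids]
print(decode(buggy_ids))
# -> "should be stored in a cool dry place gleaming whilst betwixt"
```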
And that’s it for this month!
I’m sure by the time you’ve read this, several more things have happened in the AI space. No doubt we’ll cover them in AI This Month, next month. As always, keep being awesome, Gurus!
Missed the last few months?
Check our previous editions of AI This Month to see what you might have missed!