AI This Month: Your May AI news roundup

Microsoft’s Copilot AI laptops, Github Copilot extensions, the latest model releases from the major players, and more.

By Adam Ipsen

Jun 3, 2024 • 7 Minute Read

Please set an alt value for this image...

AI moves at an incredible pace, so fast it can feel like you need to be a machine to keep up with it all! We’re here to help by gathering all the month’s biggest news in one place. With just a skim read of this article, you can catch up with all the month’s biggest AI developments and stay in the know. Prefer to listen? Check out the video with all the updates.

Microsoft unveils Copilot+ model and PCs

This month, Microsoft released Copilot+, a generative AI model that lives inside your laptop instead of in the cloud. That might seem backward, considering over the last two decades we’ve been moving away from on prem to the cloud. But by having it on your hardware, it’s obviously low latency, and the AI can use all your local data to help you out. For instance:

Looking at your meeting notes and suggesting scheduling a follow up meeting with your boss at a certain time
Finding an image you’re after in your mess of local directories
Finding a clue for what to get a friend for your birthday based on your recent conversations with them

Basically, you’ve giving the AI context about you to solve your problems, things that would be a pain to give to a cloud-based AI, at least in a secure way and efficient way. The idea is what happens on your laptop, stays on your laptop.

Of course, this is going to take processing grunt. That’s why a whole stable of PC makers have unveiled a new class of laptops called AI PCs, which are designed to be able to run on-device AI apps like Copilot+.

So, what if you want to run Copilot+ without an AI PC? I mean, you can, if it’s a good laptop. You don’t need to run out and buy one just to use a local version of Copilot. It’s just going to drain your battery real fast, so don’t unplug them. The benefit of an AI PC is it’s not going to drain your battery like crazy, and it’ll run more efficiently. In the future, expect to see more AI PCs popping up on the market as the norm.

GitHub Copilot Extensions

Github has just announced GitHub Copilot Extensions, which allows third-party providers to extend the functionality of GitHub Copilot. The result? Developers can go about their work more efficiently.

Right now, if you’re a dev, you might be dealing with a database-related error. You might have to jump into your audit logs in DataStax, go to Sentry for error monitoring. Then figure out the solution, apply the fix, and deploy in Azure. That’s a lot of mucking about.. With Copilot Extensions, you can invoke all these tools from your Copilot Chat, perform actions, generate files, and pull requests. These extensions will be accessible with all your Copilot Chats, like in Visual Studio, Visual Studio Code, and in GitHub.com.

There’s a dozen extensions in the first wave from the likes of DataStax, Docker, LambdaTest, Azure, MongoDB, and Stripe. But the GitHub Marketplace will offer extensions that are open to everyone, and you’ll be able to create private Copilot Extensions from your homegrown developer tooling. To use Copilot Extensions right now, you must be enrolled in the limited public beta.

Azure AI Studio Goes GA

Speaking of having everything all in one place, Azure AI Studio is now generally available. The studio is a place where you can develop and deploy your GenAI apps in a streamlined way, using a mix of visual and code-first tooling. The studio was launched at last year’s build, but now it’s fully open for business.

OpenAI launches GPT 4o

The big news this month was OpenAI’s release of GPT-4o. Now, that’s easy to read as 40, and think… "Hey, did I miss 36 other GPT releases?" But the O stands for Omni, since this new version takes voice, text, and visual input. It can also respond in real time audio, detect a user’s emotional state from audio and video, and adjust its voice to convey different emotion.

Now, I’m going to take a moment to be a bit indulgent and have a big “I told you so moment.” Back in 2023, I wrote an article where I coined the term “Omnimodal AI" to describe an AI capable of taking all five senses as input, and interact back. And now we’re moving closer to that future.

Now, OpenAI’s version isn’t quite omni — it’s still multimodal, since it can’t take touch, smell, or taste as input. If that seems far-fetched, read my article for why it won’t be. And having used GPT-4o, the user interface is still holding it back from being like actually talking to a real-time, true digital assistant. But it’s still a very impressive step forward, and certainly if you’re using GPT models for your organization, it’s another more powerful one for you to use.

Developers can also now access GPT-4o in the API as a text and vision model. GPT-4o is 2x faster, half the price, and has 5x higher rate limits compared to GPT-4 Turbo. OpenAI plans to launch support for GPT-4o's new audio and video capabilities to a small group of partners in the API in the coming weeks.

New Google Models: Gemini 1.5 Flash, Vio, Imagen 3, AlphaFold 3

Google released a ton of new models this month. Let's dive into them:

In the Gemini family, they’ve launched Gemini 1.5 Flash. If you’re a developer that needs lower latency and a lower cost to serve, this new model is a good light-weight alternative to 1.5 Pro.
There’s Veo, a rival to OpenAI’s sora that can create high quality videos from a text prompt.
Imagen 3, a text-to-image model that’s making some very believable photos.
And AlphaFold 3, an advanced AI model that can predict not just the structure of proteins, but can model DNA, RNA, and allegedly the structure of “all life’s molecules.” This will help researchers in medicine, agriculture, materials science, and drug development test potential discoveries

Microsoft Phi-3 family expands, works on MAI-1

Microsoft has expanded the Phi-3 family on Azure this month with Phi-3-vision, a multimodal model that brings together language and vision capabilities. They’ve also made Phi-3-small and Phi-3-medium available on Microsoft Azure.

Behind the scenes, Microsoft is also working in-house on a new LLM model called MAI-1, which will apparently have 500 billion parameters. This is above Llama 3’s 80 billion parameters but below GPT-4’s rumored 1.7 trillion parameters. That said, MAI-1 is still being developed and has not been officially announced.

In other AI news…

IBM has now open sourced its Granite Code Models, which are trained for simplifying the code process. According to IBM, these models enhance productivity by automating routine and complex coding tasks. If you want to play around with them, they’re available on GitHub.
Anthropic has found a new way to see what’s going on beneath the surface of LLMs, cracking their Claude LLM’s black box. You too can dive into these mysteries and see how the AI’s neurons are firing by checking out their research paper.
Google has introduced SynthID, a tool for watermarking AI-generated text and videos to ensure authenticity and reduce misinformation.
And Google Deepmind has introduced the Frontier Safety Framework for guiding the safe development of AI products.

And that's it for this month!

And that wraps up this edition of AI This Month! If you want to stay up to speed with all the latest news, be sure to follow us on our YouTube channel. You can also check out our blog for in-depth articles on these stories and more.

Stay curious, stay informed, and as always, keep being awesome, folks!

Adam I.

Adam is a Lead Content Strategist at Pluralsight, with over 13 years of experience writing about technology. An award-winning game developer, Adam has also designed software for controlling airfield lighting at major airports. He has a keen interest in AI and cybersecurity, and is passionate about making technical content and subjects accessible to everyone. In his spare time, Adam enjoys writing science fiction that explores future tech advancements.

More about this author