Only a week after releasing Gemini 1.0 Ultra, Google has released a preview of Gemini 1.5 Pro, a new model meant to compete directly with one of the biggest AI models in the world, GPT-4. The new model is available via AI Studio and Vertex AI and, according to Google, promises “dramatically enhanced performance.”
As per Google, the new 1.5 Pro model is built to handle larger amounts of data thanks to a bigger context window. But is the new model significantly better than 1.0 Pro, or is this an upgrade that doesn’t make much of a difference? We’ll find out in this Google Gemini 1.5 vs 1.0 Pro comparison.
Google Gemini 1.5 vs 1.0 Pro – what’s new?
Using Google Gemini works the same as before, but there are new additions. From a larger context window to faster response times, the 1.5 Pro model brings a lot to the table. It can not only analyze large blocks of data but also quickly find a particular piece of text inside inputs that consume as much as 1 million tokens. Below, you’ll find everything that’s new in Google Gemini 1.5 Pro.
Faster response time
Compared to Gemini 1.0, Gemini 1.5 Pro has a much faster response time, thanks to a new Transformer-based Mixture-of-Experts (MoE) architecture. A standard Transformer runs as one large neural network, whereas an MoE model is divided into a group of smaller “expert” networks, which lets the system work more efficiently.
Whenever an input is given to the model, a gating mechanism activates only the most relevant expert pathways, so resources aren’t wasted on experts that don’t contribute; the work is effectively divided into subtasks handled by the most suitable experts. All of this translates to faster response times without sacrificing output quality.
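To make the routing idea concrete, here is a minimal, hypothetical MoE sketch in Python. The expert and gate weights below are toy values invented for illustration (nothing here reflects Gemini’s actual implementation); the point is simply that only the top-scoring experts do any work for a given input.

```python
import numpy as np

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route an input through only the top-k experts,
    as in a Mixture-of-Experts layer (illustrative sketch)."""
    # Gating network: score each expert for this input.
    scores = gate_weights @ x
    # Softmax over expert scores.
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    # Activate only the top-k experts; the rest stay idle.
    active = np.argsort(probs)[-top_k:]
    # Output is the probability-weighted sum of the active experts.
    out = sum(probs[i] * experts[i](x) for i in active)
    return out, sorted(active.tolist())

# Toy setup: 4 "experts", each a simple linear map.
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.standard_normal((3, 3)): W @ x for _ in range(4)]
gate_weights = rng.standard_normal((4, 3))

y, active = moe_forward(np.array([1.0, 2.0, 3.0]), experts, gate_weights)
print(f"active experts: {active}")  # only 2 of the 4 experts ran
```

Because only `top_k` experts run per input, compute scales with the number of *active* experts rather than the total parameter count, which is where the speedup comes from.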
Comes with a larger context window
One of the biggest highlights of Gemini 1.5 Pro is the context window. For those who don’t know, a context window is measured in tokens, which can be pieces of text, images, audio, code, or video. The larger the context window, the more information Gemini can take in and process when generating an output. Think of the model as looking through a window: the bigger the window, the more information the model can see, and the better its answers.
Gemini 1.5 Pro comes with a context window of up to 1 million tokens. This is significantly bigger than the context window of Gemini 1.0 Pro, which was capped at 32,000 tokens. Do keep in mind, though, that the 1-million-token window will be part of the paid version of Gemini 1.5 Pro; the free version will come with a 128,000-token window. Even that is still 4 times bigger than what Gemini 1.0 Pro offered. For now, until the paid tier arrives, users can try the 1-million-token window for free in the preview version.
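A quick back-of-envelope comparison puts those numbers in perspective (using only the token counts quoted above):

```python
# Context window sizes discussed above, in tokens.
windows = {
    "Gemini 1.0 Pro": 32_000,
    "Gemini 1.5 Pro (free tier)": 128_000,
    "Gemini 1.5 Pro (preview/paid)": 1_000_000,
}

baseline = windows["Gemini 1.0 Pro"]
for name, size in windows.items():
    # How many times larger each window is than the 1.0 Pro baseline.
    print(f"{name}: {size:,} tokens ({size // baseline}x the 1.0 Pro window)")
```

In other words, the free tier of 1.5 Pro is 4x the old window, and the 1-million-token preview window is roughly 31x it.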
As for what the 1-million-token window can process in one go: roughly an hour of video, 11 hours of audio, or a codebase of over 30,000 lines (about 700,000 words). As an example, Google fed Gemini 1.5 Pro the 402-page transcript of Apollo 11’s mission to the moon and asked it to find three comedic moments. The transcript came to around 330,000 tokens, and the model generated an accurate answer in under a minute.
Better at coding
According to Google, Gemini 1.5 Pro is much better at coding than even Gemini 1.0 Ultra, so 1.0 Pro isn’t really part of the equation here. The free version of Gemini 1.5 Pro delivers better output than the paid version of the previous model. This is partly down to the larger context window, which lets Gemini take in far more information and lines of code at once.
When it comes to problem-solving tasks, Gemini 1.5 Pro performs efficiently across longer code blocks. During the official preview, it worked through a prompt containing more than 100,000 lines of code (over 800,000 tokens) and suggested helpful modifications. It even explained how certain parts of the code worked, and the results were accurate. So, if you’re a developer dealing with large codebases, this could really help you out.
Better performance and learning skills
When it comes to performance, Gemini 1.5 Pro outperforms 1.0 Pro on 87% of the benchmarks Google used, while staying roughly on par with 1.0 Ultra. To test its accuracy, Google used the Needle In A Haystack method: a small piece of text containing a particular fact or statement is placed inside a long block of text, and Gemini 1.5 Pro is asked to find it. During the tests, it found the “needle” 99% of the time, even in blocks of data approaching 1 million tokens.
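The evaluation idea can be sketched as follows. Note that the substring-search “model” here is purely a stand-in I’ve invented for illustration; a real Needle In A Haystack run would prompt Gemini with the haystack and check its answer instead.

```python
import random

def make_haystack(filler_sentences, needle, n_sentences, seed=0):
    """Build a long text with one 'needle' fact buried at a random position."""
    rng = random.Random(seed)
    text = [rng.choice(filler_sentences) for _ in range(n_sentences)]
    pos = rng.randrange(len(text))
    text.insert(pos, needle)
    return " ".join(text), pos

def toy_model_find(haystack, query):
    """Stand-in for the model under test: plain substring search.
    A real evaluation would ask Gemini to retrieve the fact."""
    return query in haystack

filler = [
    "The sky was clear that day.",
    "Sales figures held steady.",
    "The committee adjourned at noon.",
]
needle = "The magic number is 7421."

haystack, pos = make_haystack(filler, needle, n_sentences=500)
found = toy_model_find(haystack, "magic number is 7421")
print(f"needle inserted at sentence {pos}; recovered: {found}")
```

Repeating this over many haystack lengths and needle positions, and scoring the fraction of needles recovered, yields the retrieval-accuracy figure (99% in Google’s tests).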
When it comes to learning new skills, Gemini 1.5 Pro shows impressive results. As per Google, the new model can pick up a new skill quickly from the information it is given. For instance, Google gave it a grammar manual for Kalamang, a language spoken by fewer than 200 people in the world. Gemini 1.5 Pro was able to learn from it and then translate English to Kalamang much as it would a widely spoken language. What this means is that, during a conversation, you can give Gemini information it has never seen before, and it will pick it up quickly enough to apply it for the rest of the conversation.
What is Google Gemini now better for?
While Gemini was already useful for people working with documents, audio, and video files, it has now improved into a tool for those who constantly deal with bigger and longer inputs.
Thanks to its larger context window of up to 1 million tokens, it is now ideal for analyzing lengthy documents like contracts, scripts, data sheets, and similar files, analyzing lengthy videos and audio files, or just having a longer chat with the chatbot. On top of that, the larger context window also now makes it a better choice for coders, as it can quickly go through large code blocks and suggest modifications while explaining how different parts of the code work.
Conclusion
Gemini 1.5 Pro has significantly elevated Google’s position in the field of AI. It now competes directly with GPT-4 and is powerful enough for bigger tasks. The enhancements in version 1.5 – the bigger context window (up to 1 million tokens), improved performance, and quick learning of new skills – make it an effective choice for a wide range of tasks. If you’re interested in learning more about Google Gemini (formerly known as Google Bard) and whether it’s worth it, check out our Google Gemini review.
Also, learn how Google Gemini compares to other AI models through these in-depth guides: