Google Gemini vs ChatGPT: Why is Bard a revolutionary technology?

In the past few hours, Google and Google DeepMind announced the highly anticipated artificial intelligence model Gemini. There still isn’t a lot of hands-on feedback on how well it works, but what the software giant has shown is frankly insane. Giant technology companies have been locked in an incredible artificial intelligence race in recent years, especially since ChatGPT entered our lives. So far, ChatGPT developer OpenAI and its investor Microsoft have been ahead. However, Google has finally shown its hand with Gemini.

This article has been prepared to provide an overview of the information we have obtained so far and the first impressions I have gained from what I have read. We will try to convey what Gemini can do and what it means for the future of artificial intelligence. Hold on tight, we’re getting started.

What is Google Gemini?

First, let’s start simple. Gemini is Google’s newest and most powerful artificial intelligence model, one that can understand not only text but also images, video and audio. Google states that Gemini, a multimodal model, can complete complex tasks in mathematics, physics and other fields, and can understand and produce high-quality code in various programming languages.

It is currently available through Google Bard and Google Pixel 8 integrations and will gradually be added to other Google services. According to Google DeepMind CEO and co-founder Demis Hassabis, “Gemini was designed from the ground up to be multimodal, meaning it can generalize and seamlessly understand, operate across, and combine different types of information, including text, code, audio, images, and video.”

There are 3 different versions of Gemini

Google describes Gemini as a flexible model that can run anywhere, from the company’s data centers to mobile devices. To provide this scalability, Gemini is being released in three sizes: Gemini Nano, Gemini Pro and Gemini Ultra.

Gemini Nano: This is the model aimed at on-device use. Google hasn’t disclosed the number of parameters for Ultra and Pro, but we do know that Nano comes in two tiers, Nano 1 (1.8B parameters) and Nano 2 (3.25B parameters), for low- and high-memory devices respectively. These versions will perform functions such as chat, text summarization and visual creation on the device. Gemini Nano is built into Google’s Pixel 8 Pro, which makes it an AI-enhanced smartphone. Frankly, we can say that this is the beginning of super mobile assistants. Google says Gemini will also be available in more of its products and services, like Search, Ads, Chrome, and Duet AI, but doesn’t specify in which size or when.

Gemini Pro: Comparable to GPT-3.5, this model surpasses its rival in some areas, though not across the board. Google says it optimized this model for responsiveness and cost. If you don’t need the best of the best and cost is a constraint, Pro is likely to be a better choice than Ultra. Gemini Pro is currently available in English in Bard, and Google has said it will come to other countries and languages later.

Gemini Ultra: Gemini Ultra is the most powerful of the family and the version that, according to Google, surpasses GPT-4, OpenAI’s most advanced model. You won’t be installing this on your home computer, since it’s designed to run in data centers. Although it is not yet widely available, Google describes Gemini Ultra as its most capable model, surpassing competitors in 30 of the 32 academic benchmarks commonly used in large language model (LLM) research and development. Designed for highly complex tasks, Gemini Ultra will be released after it completes its current testing phase. In early 2024, it will be available in Google’s chatbot as “Bard Advanced”.
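
To get a feel for why Nano comes in two tiers for low- and high-memory devices, we can estimate how much memory the weights alone would take from the parameter counts quoted above. This is only a rough sketch, not Google’s method: the bit widths (4-bit and 16-bit) are assumptions chosen for comparison, the function name is ours, and the estimate ignores activations, caches and runtime overhead.

```python
# Rough weight-memory estimate for the two Nano tiers.
# Parameter counts come from the article; the bit widths are assumptions.
NANO_PARAMS = {"Nano 1": 1.8e9, "Nano 2": 3.25e9}


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for name, params in NANO_PARAMS.items():
    for bits in (4, 16):
        size = weight_memory_gb(params, bits)
        print(f"{name}: ~{size:.2f} GB of weights at {bits}-bit")
```

Under these assumptions, the gap between the two tiers is large enough to matter on a phone, which is presumably why Google splits Nano by device memory.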

All models have a 32K-token context window, which is considerably smaller than the largest available, Claude 2.1 (200K) and GPT-4 Turbo (128K). However, it is hard to say which context window size is most appropriate (it depends on the task, of course), because models with very large windows are known to lose track of a large part of the context.
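
The practical meaning of these window sizes can be illustrated with a small sketch that checks which models could take a whole document in one go. The window sizes are the ones quoted in this article; reading “K” as 1,000 tokens and using roughly four characters per token for English text are assumptions, and a real tokenizer will give different counts. The function names are ours.

```python
# Context-window sizes as quoted in the article, with K taken as 1,000 tokens.
CONTEXT_WINDOWS = {
    "Gemini": 32_000,
    "GPT-4 Turbo": 128_000,
    "Claude": 200_000,
}

# Assumption: ~4 characters per token, a rough rule of thumb for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; a real tokenizer will differ."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def models_that_fit(text: str) -> list[str]:
    """Return the models whose window could hold the whole text."""
    needed = estimate_tokens(text)
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


# Hypothetical input: 400,000 characters of filler text.
document = "word " * 80_000
print(estimate_tokens(document))
print(models_that_fit(document))
```

Fitting in the window is only half the story: as noted above, a model that accepts a very long input does not necessarily use all of it well.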

Frankly, we do not know much about the technical details of Gemini or how it works, because Google does not share them. It’s pretty funny to say, but we’ll have to wait for Meta to release its next model to find out more. An open source Llama 3, if it’s comparable to GPT-4 and Gemini, could shed some light on how these models are built and what they’re trained on.

Gemini vs ChatGPT 4

First of all, Gemini stands out for its native multimodality, while other models like GPT-4 rely on plugins and integrations (like DALL-E 3 and Whisper) to be truly multimodal. How Gemini compares with the GPT-4 in the paid version of ChatGPT depends on which Google model you use. Gemini Nano is not powerful enough for this comparison, but Gemini Pro and Gemini Ultra can be compared. In Google’s tests, Gemini Pro mostly beats GPT-3.5 but lags behind GPT-4. Let us remind you that this is the model that has been added to the existing Bard; the Gemini Ultra-powered Bard Advanced won’t arrive until 2024.

Speaking of Gemini Ultra, let’s take a closer look.

Google describes it as follows in its blog post:

“Gemini Ultra scores 90.0% on MMLU (massive multitask language understanding), which combines 57 subjects including mathematics, physics, history, law, medicine and ethics to test both world knowledge and problem-solving abilities. It is the first model to outperform human experts… Gemini Ultra also achieves the highest score of 59.4% on the new MMMU benchmark, which consists of multimodal tasks covering different domains that require deliberate reasoning.”

Gemini Ultra surpasses GPT-4 in 17 of the 18 benchmarks Google shows, including MMLU (90% vs. 86.4%, using a new type of chain-of-thought approach) and the new multimodal benchmark MMMU (59.4% vs. 56.8%). I want to reiterate this: Gemini Ultra is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), one of the most popular benchmarks used to test the knowledge and problem-solving abilities of AI models.

When we look at the numbers and tables, we see that there are not really big differences between Gemini Ultra and GPT-4. This does not show any inadequacy on Google’s part; on the contrary, it shows how difficult it is to develop such systems.
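
How small the gaps are becomes clearer once the margins are worked out from the figures quoted in this article. The sketch below is plain arithmetic on those reported scores; the dictionary layout and function name are ours, and nothing here is an independent measurement.

```python
# Scores (in percent) as quoted in the article: (Gemini Ultra, GPT-4).
SCORES = {
    "MMLU": (90.0, 86.4),
    "MMMU": (59.4, 56.8),
}


def margin(ultra: float, gpt4: float) -> float:
    """Gemini Ultra's lead in percentage points, rounded to one decimal."""
    return round(ultra - gpt4, 1)


for name, (ultra, gpt4) in SCORES.items():
    print(f"{name}: Gemini Ultra leads by {margin(ultra, gpt4)} points")

# Share of benchmarks on which Google reports a lead for Gemini Ultra.
for wins, total in ((17, 18), (30, 32)):
    print(f"{wins} of {total} benchmarks: {wins / total:.1%}")
```

In other words, Gemini Ultra is ahead almost everywhere, but on these two headline benchmarks the lead is only a few percentage points.
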
If you want to learn more about Gemini’s capabilities in real-world testing (e.g., reasoning and comprehension, solving math and coding problems), I recommend the videos in Google DeepMind’s interactive blog post and the comprehensive demo video that CEO Sundar Pichai posted on X. To put the numbers above in context, it is worth looking at both.

Why is Google Gemini revolutionary?

Although Gemini is still in development, it is already making a difference with its potential to change the way we interact with computers. Let’s try to explain what makes it special as follows:

Unlike most AI models, it can comprehend and respond to a wide range of information sources, not just text. Gemini is smart enough to speak your language, so it can hold natural and sophisticated conversations much like a human. It can also generate code, and its advanced data analysis capabilities can help surface useful insights in industries ranging from healthcare to finance. Google also plans to produce lighter versions of Gemini that will let developers build new artificial intelligence applications. This is a dream come true for developers.

Gemini is a big step for Google, but it’s not a giant leap for the AI industry as a whole, nor does it need to be. As we said above, Gemini outperforms GPT-4 in 30 of 32 standard benchmarks, but by small margins. Gemini’s hallmark is bringing the best existing capabilities of AI together in one powerful package.

The example that best demonstrates Gemini is one in which it is asked (by voice, not text) whether the omelette in a pan is cooked. Gemini replied, “It’s not ready because the eggs are still runny.” This may seem very simple to us, but it is a difficult process. Gemini has to understand what is said and relate it to the image of the omelette. Once that relationship is established, it connects it with how an omelette should look when cooked. All this happens in one base model.

Last words, hallucinations, and higher-order reasoning

Google Gemini is really impressive, we have to admit that. However, two of the main problems of artificial intelligence are still not solved: hallucinations and higher-order reasoning.

The following statements are included in the results section of the 60-page technical report published by Google:

“Despite their impressive capabilities, we must note that there are limitations to the use of LLMs. Ongoing research and development on ‘hallucinations’ produced by LLMs is still needed to ensure that model outputs are more reliable and verifiable. LLMs also struggle with tasks that require higher-level reasoning skills, such as causal understanding, logical inference, and counterfactual reasoning, despite performing impressively on exam benchmarks.”

Growing rumors that artificial intelligence is developing at a potentially dangerous pace aren’t slowing things down much. A year after OpenAI sparked a race to develop artificial intelligence technology with the launch of ChatGPT, Google is looking to take further steps to re-establish itself as a leader.

Gemini, a new artificial intelligence model that can work with text, images and video, may be the most important algorithm in Google’s history after PageRank, which cemented the search engine in the public mind and created a corporate giant.

Gemini could be the crest of this generative AI wave. However, it is not yet clear where artificial intelligence built on large language models will go next. Some researchers believe this may be a plateau rather than the next peak.

According to CEO Pichai, we are at the beginning of the road: “As we teach these models to do more reasoning, there will be bigger and bigger breakthroughs. Deeper breakthroughs are yet to come. When I consider all this, I really feel like we are just getting started.”
