The video demonstrating the potential of Google's AI model may be too good to be true.
The remarkable exchange featured in the Gemini demo, which has amassed 1.6m views on YouTube, appears to show the AI responding in real time to spoken and visual prompts.
In the video's description, Google noted that for the purposes of the demonstration the responses had been sped up and did not reflect the model's real speed.
However, it has since conceded that the AI was not actually reacting to live audio or video input at all.
Alongside the video, Google published a blog post explaining how the footage was actually made.
Following an initial report by Bloomberg Opinion, Google told the BBC that the demo had in fact been produced by feeding the AI still image frames from the video footage and prompting it with text.
A Google spokesperson said the Hands on with Gemini demonstration video showed real prompts and outputs from Gemini, and that it had been made to showcase what Gemini can do and to inspire developers.
In the video, a person asks Google's AI questions while showing it various objects on screen.
In one example, the demonstrator holds up a rubber duck and asks Gemini whether it will float.
At first the AI is unsure what the duck is made of, but when the person squeezes it and remarks that it squeaks, the AI identifies the material correctly.
At first glance the video appears to show one thing, but a closer look reveals that the prompts behind it were produced rather differently.
The AI was in fact shown a still image of the duck and asked what material it was made from. Only after a text prompt clarified that the duck squeaked when squeezed did it make the correct identification.
In another apparently impressive moment, the person performs a cups and balls trick - a magic routine in which a ball is hidden beneath one of three shifting cups - and the AI works out where the ball has ended up.
Rather than responding to video, however, the AI was shown a series of still images to achieve the result.
In its blog post, Google explained that it had told the AI where the ball was beneath the three cups, using pictures to show the cups being swapped.
Google said the demonstration had been constructed using footage from the video, in order to "review Gemini's skills across a wide range of tasks".
The sequences were shortened and still photographs were used, while the voiceover in the video repeated verbatim the text prompts typed into Gemini.
Yet there is another element of the video that stretches the truth even further.
At one point, the user lays out a world map and asks the AI: "Think of a game concept based on what you observe, and express it by means of emojis."
The AI appears to invent a game called "guess the country", giving the user clues (such as a kangaroo and a koala) and acknowledging a correct answer when the user points to the right nation (in this case, Australia).
Google's blog post reveals, however, that the AI did not invent the game.
Instead, the AI was instructed to play a game. It was told: "Provide me with a clue about a country; make sure the clue is specific enough to only be associated with one country. I will try and guess by pointing at the location on a map."
The user then gave the AI examples of both correct and incorrect answers.
Only after that did Gemini generate clues and determine, from an image of a map, whether the user was pointing to the right country.
That may still be remarkable, but it is not the same as saying the AI created the game.
Even allowing for its reliance on still images and text prompts, Google's AI model remains impressive - but these revelations suggest its capabilities are quite similar to those of OpenAI's GPT-4.
Notably, the video was released just two weeks after a tumultuous period in the AI sector, sparked by Sam Altman's sacking and subsequent re-hiring as chief executive of OpenAI.
It is not clear which model is more sophisticated - but Mr Altman's statement to the Financial Times that Google is developing the next version of its AI may suggest the company is trying to catch up.