Google has admitted that a video showcasing its AI model, Gemini, was edited to make the model appear more impressive than it is. The video, which has gained 1.6 million views on YouTube, portrays the AI responding in real time to spoken-word prompts and video, but Google revealed that the AI was actually prompted using still image frames and text. Google defended the video as a means to inspire developers and demonstrate Gemini’s capabilities, but the revelation raises questions about how the AI’s true capabilities compare with those of competitors like OpenAI’s GPT-4.
Key Points:
- Gemini Demo Deception: Google’s Gemini demo video, which garnered significant attention on YouTube, showed the AI responding to spoken-word prompts and video in real time. However, the company has admitted to editing the video to make the AI appear more capable than it is.
- Use of Still Images and Text Prompts: Instead of responding to live video and audio, the AI in the video was prompted with still image frames and written text. For example, it identified objects and answered questions based on textual cues and static images.
- Misleading Game Invention: The video showcased the AI inventing a game called “guess the country” based on visual cues from a world map. In reality, the AI was directed through text prompts to play the game, and it generated clues and identified countries using still images.
- Implications for AI Capabilities: While Google’s AI model is undoubtedly advanced, the reliance on an edited video and text-based prompts raises questions about what it can actually do with live video and audio. Some observers note that its abilities appear similar to those of OpenAI’s GPT-4.
- Competition with OpenAI: The video was released shortly after a period of upheaval in the AI industry, with Google and OpenAI competing to advance their AI models. Google is reportedly working on the next version of its AI model, a sign of how fierce the competition in the AI space has become.