Google now uses its Gemini AI model to train robots to navigate the world

Hridhya Manoj
2 min read · Jul 12, 2024


Google unveiled its multimodal capabilities in the Gemini 1.5 AI model.

Google has introduced multimodal capabilities in its Gemini 1.5 AI model. This means the large language model can take photos, videos, and audio, along with text, as inputs and use them to inform the responses it generates. The company’s AI unit is now taking advantage of this capability to train robots to navigate their surroundings.

Google added that “Robots can use human instructions, video tours, and common sense reasoning to find their way successfully around a space.” The trainers took the robots on tours of specific areas in real-world settings, pointing out the key places to recall along the way. Later, they asked the robots to lead them to those locations.

For instance, a robot can be used as a guide in an office once it has been trained to navigate that office. The robot could be asked to learn the location of the company chairman’s office and later take guests to the chairman’s cabin.

Recently, Google DeepMind published a research paper on how robots can be trained to understand multimodal instructions, including natural language and images, and to carry out useful navigation. Google shared on X that, since a limited context length makes it a challenge for many AI models to recall environments, Gemini 1.5 Pro’s 1 million token context length helped the company train the robots for navigation.

Google DeepMind indicates that the architecture takes in these inputs and also forms a topological graph, a simplified representation of the space. This graph is constructed from frames within the tour videos and captures the general connectivity of the surroundings, which allows the robot to find a path without a map.
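To make the idea of a topological graph concrete, here is a minimal Python sketch. It is not DeepMind’s implementation: the function names, the use of frame indices as nodes, and the breadth-first search are all assumptions chosen for illustration. Each tour frame becomes a node, consecutive frames are linked, and extra links can join frames assumed to show the same place.

```python
from collections import deque


def build_tour_graph(num_frames, extra_edges=()):
    """Build an undirected graph whose nodes are tour-frame indices.

    Consecutive frames are linked because the tour moved directly between
    them. extra_edges (hypothetical) links frames assumed to show the same
    place, for example when the tour passes a doorway twice.
    """
    graph = {i: set() for i in range(num_frames)}
    for i in range(num_frames - 1):
        graph[i].add(i + 1)
        graph[i + 1].add(i)
    for a, b in extra_edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of frame indices, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# A six-frame tour where frames 1 and 4 are assumed to show the same hallway.
tour_graph = build_tour_graph(6, extra_edges=[(1, 4)])
path = shortest_path(tour_graph, 0, 5)
print(path)
```

In this toy setup the shortcut between frames 1 and 4 lets the search skip part of the original tour, which is the kind of connectivity information a simplified graph of the space can capture.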

“In the future, users could simply record a tour of their environment with a smartphone for their personal robot assistant to understand and navigate,” Google said.

Follow Hridhya Manoj for more Tech and AI updates
