Google Lens Update: New AI Features Answer Questions About Videos

October 4, 2024

Google has taken a significant step forward with its visual search app, Google Lens, by introducing the ability to answer questions based on video recordings. English-speaking users on both Android and iOS platforms can now capture videos and inquire about objects within them. This new functionality leverages a customized Gemini AI model to interpret the video content and the user’s queries, generating informative responses.

Enrolling in Google’s Search Labs

To access these capabilities, users must enroll in Google’s Search Labs program and opt into the experiment called “AI Overviews and more.” The process is straightforward: in the Lens app, holding the shutter button activates video capture, and users can then ask a question aloud about what they are filming. The Gemini model identifies the frames most relevant to the question and returns a detailed answer drawing on sources from around the web, enhancing the overall search experience.
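For readers curious what such a capture-then-ask pipeline looks like in code, below is a minimal sketch built on Google’s public Gemini API (the google-generativeai Python package). It is an illustrative approximation, not Lens’s actual implementation; the file name, model choice, and prompt are assumptions.

```python
# Sketch of video question answering via the public Gemini API.
# Approximates the Lens flow described above; NOT Lens's internal code.
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumption: key from Google AI Studio

# Upload the captured clip; the File API must finish processing video
# before it can be referenced in a prompt.
video = genai.upload_file(path="hike_clip.mp4")  # hypothetical clip
while video.state.name == "PROCESSING":
    time.sleep(2)
    video = genai.get_file(video.name)

# Ask a question about the footage; the model attends to the frames
# most relevant to the query, much like the behavior described above.
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice
response = model.generate_content(
    [video, "What species of plant appears about halfway through this clip?"]
)
print(response.text)
```

Lens performs the equivalent steps behind the scenes, so users never handle uploads or prompts directly; the sketch only makes the sequence of operations concrete.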

Keeping Pace with Competitors

This upgrade arrives amid similar advances from competitors such as Meta and OpenAI. Meta is adding real-time AI features to its Ray-Ban Meta smart glasses, aiming to change how users interact with their surroundings. Likewise, OpenAI is developing a premium ChatGPT feature capable of understanding and analyzing video in real time. Google’s approach, however, remains asynchronous for now, providing answers after video capture rather than during it. Even so, it represents a significant leap in the app’s capabilities.

Expanding Utility with Image and Text Searches

Beyond video analysis, Google Lens now supports combined image and text searches: users can take a photo and immediately ask a question aloud, considerably broadening the app’s utility. Lens also adds e-commerce features, displaying detailed product information such as price, deals, brand, reviews, and availability for recognized products. This feature is currently limited to selected countries and certain shopping categories, including electronics, toys, and beauty products, but is expected to roll out more broadly over time.
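The combined photo-and-voice search can be approximated the same way. The sketch below, again using the public Gemini API rather than anything Lens-specific, sends an image and a question in a single multimodal request; the file name and prompt are invented for illustration.

```python
# Sketch of a combined image-and-text query via the public Gemini API.
# Mirrors Lens's photo-plus-spoken-question search; NOT Lens's internal code.
import PIL.Image
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumption: key from Google AI Studio

photo = PIL.Image.open("sneaker_photo.jpg")  # hypothetical product photo
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice

# The image and the question travel together in one request, so the
# answer can reference what is actually visible in the photo.
response = model.generate_content(
    [photo, "What brand is this sneaker, and what should I check before buying?"]
)
print(response.text)
```

Sending both modalities in one request is what lets the answer ground itself in the specific photo rather than in a generic text match.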

Strategic Integration of Advertising

Google’s strategic inclusion of advertising within Lens’ search results is driven by the sheer volume of shopping-related searches, estimated at around 4 billion per month. Integrating shopping ads presents a lucrative opportunity for Google, given its heavy reliance on advertising revenue. The move is designed to capitalize on the app’s growing user base while providing information that enriches the user experience.

The Bigger Picture of AI Integration

The new video capability extends the utility of Google Lens beyond static images to dynamic footage. For example, while watching a video of a nature hike, you can ask about the plants or landmarks you see, and Lens will provide detailed answers. The technology bridges the gap between seeing and understanding, offering a more interactive and engaging way to learn about the world, and it marks a significant step forward in AI-assisted visual search through the smartphone camera.
