Context-Aware Command Understanding for Tabletop Scenarios

2024-10-08

Paul Gajewski, Antonio Galiza Cerdeira Gonzalez, Bipin Indurkhya

Abstract

This paper presents a novel hybrid algorithm designed to interpret natural human commands in tabletop scenarios. By integrating multiple sources of information, including speech, gestures, and scene context, the system extracts actionable instructions for a robot, identifying relevant objects and actions. The system operates in a zero-shot fashion, without reliance on predefined object models, enabling flexible and adaptive use in various environments. We assess the integration of multiple deep learning models, evaluating their suitability for deployment in real-world robotic setups. Our algorithm performs robustly across different tasks, combining language processing with visual grounding. In addition, we release a small dataset of video recordings used to evaluate the system. This dataset captures real-world interactions in which a human provides instructions in natural language to a robot, a contribution to future research on human-robot interaction. We discuss the strengths and limitations of the system, with particular focus on how it handles multimodal command interpretation, and its ability to be integrated into symbolic robotic frameworks for safe and explainable decision-making.

Tasks

Decision Making, Visual Grounding
