
Interpreting and Generating Gestures with Embodied Human Computer Interactions

2020-09-18 · ACM IVA Workshop GENEA 2020

Anonymous


Abstract

In this paper, we discuss the role that gesture plays for an embodied intelligent virtual agent (IVA) in the context of multimodal task-oriented dialogues with a human. We have developed a simulation platform, VoxWorld, for modeling and building Embodied Human-Computer Interactions (EHCI), where communication is facilitated through language, gesture, action, facial expressions, and gaze tracking. We believe that EHCI is a fruitful approach for studying and enabling robust interaction and communication between humans and intelligent agents and robots. Gesture, language, and action are generated and interpreted by an IVA in a situated meaning context, which facilitates grounded and contextualized interpretations of communicative expressions in a dialogue. The framework enables multiple methods for evaluating gesture generation and recognition. We discuss four separate scenarios involving the generation of non-verbal behavior in dialogue: (1) deixis (pointing) gestures, generated to request information regarding an object, a location, or a direction when performing a specific action; (2) iconic action gestures, generated to clarify how (in what manner) to perform a specific task; (3) affordance-denoting gestures, generated to describe how the IVA can interact with an object, even when it does not know what it is or what it might be used for; and (4) direct situated actions, where the IVA responds to a command or request by acting directly in the environment.
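The four scenarios can be read as a small decision space over the agent's epistemic state: which behavior the IVA generates depends on what it currently knows about the target, the manner of the action, and the object itself. The sketch below is purely illustrative and is not the VoxWorld API; the class names, fields, and the selection policy are assumptions introduced here to make that decision space concrete.

```python
from dataclasses import dataclass
from enum import Enum, auto


class NonVerbalBehavior(Enum):
    """The four behavior classes enumerated in the abstract."""
    DEIXIS = auto()               # point to request an object, location, or direction
    ICONIC_ACTION = auto()        # demonstrate the manner of an action
    AFFORDANCE_DENOTING = auto()  # show how an unfamiliar object can be manipulated
    DIRECT_ACTION = auto()        # act on the environment directly


@dataclass
class DialogueState:
    """Minimal, hypothetical stand-in for the agent's situated context."""
    target_known: bool        # does the agent know which object/location is meant?
    manner_known: bool        # does it know how the action should be performed?
    object_identified: bool   # can it classify the object it must use?


def select_behavior(state: DialogueState) -> NonVerbalBehavior:
    """Toy policy: choose which non-verbal behavior to generate next."""
    if not state.target_known:
        return NonVerbalBehavior.DEIXIS
    if not state.manner_known:
        return NonVerbalBehavior.ICONIC_ACTION
    if not state.object_identified:
        return NonVerbalBehavior.AFFORDANCE_DENOTING
    return NonVerbalBehavior.DIRECT_ACTION


if __name__ == "__main__":
    # The agent knows the target and the manner but cannot classify the object,
    # so it falls back to an affordance-denoting gesture.
    state = DialogueState(target_known=True, manner_known=True, object_identified=False)
    print(select_behavior(state))  # NonVerbalBehavior.AFFORDANCE_DENOTING
```

In the actual framework the choice is driven by the situated dialogue context rather than three boolean flags; the sketch only shows how each gesture type fills a distinct gap in the agent's knowledge.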
