Learn2Smile: Learning non-verbal interaction through observation
Abstract
Interactive agents are becoming increasingly common in many application domains, such as education, healthcare and personal assistance. The success of such embodied agents relies on their ability to sustain engagement with their human users. Such engagement requires agents to be socially intelligent, equipped with the ability to understand and reciprocate both verbal and non-verbal cues. While there has been tremendous progress in verbal communication, mostly driven by the success of speech recognition and question-answering, teaching agents to appropriately react to facial expressions has received less attention. In this paper, we focus on non-verbal facial cues for face-to-face communication between a user and an embodied agent. We propose a method that automatically learns to update the agent's facial expressions based on the user's expressions. We train a deep neural network on hundreds of videos containing pairs of people engaged in conversation, without external human supervision. Our experimental results show the efficacy of our model in sustained long-term prediction of the agent's facial landmarks. We present comparative results showing that our model significantly outperforms baseline approaches, and provide insightful human studies to better understand our model's qualitative performance. We release our dataset to further encourage research in this field.
Attached Files
Accepted Version - learn2smile-learning-verbal.pdf (2.9 MB, md5: a67dc46792936f7195d1e5a77dae1a30)
Additional details
- Eprint ID
- 118370
- Resolver ID
- CaltechAUTHORS:20221215-789739000.9
- Created
- 2022-12-20 (from EPrint's datestamp field)
- Updated
- 2022-12-20 (from EPrint's last_modified field)