AveRobot
An Audio-Visual Dataset for People Re-Identification and
Verification in Human-Robot Interaction

overview

AveRobot is an audio-visual dataset of people vocalizing short sentences in simulated robot-assistance scenarios.

111 People

They span different nationalities (e.g. Chinese, Indian, Spanish), ages (mean 27, S.D. 11), and heights (mean 1.74 m, S.D. 0.10 m).

2,664 Videos

They are recorded in three locations (stairs, corridor, lift) on each of the three floors of a common university building.

8 Devices per User

Two floors host a video camera, a smartphone camera, and a compact camera; the remaining floor hosts only the latter two device types.

9 Sentences per User

Each person vocalizes 9 different sentences selected from a predefined list of 34 sentences (e.g. "Where is the lift?").

Sample recorded on the first-floor lift

Sample recorded on the second-floor stairs

Sample recorded on the third-floor corridor

details

Some relevant specifications and statistics about the dataset.

| # | Model | Type | Resolution | FPS | Format | Height | Floor |
|---|-------|------|------------|-----|--------|--------|-------|
| 1 | Casio Exilim EXFH20 | Compact Camera | 1280 x 720 | 30 | AVI | 130 cm | 0 |
| 2 | Huawei P10 Lite | Smartphone Camera | 1920 x 1080 | 30 | MP4 | 130 cm | 0 |
| 3 | Sony HDR-XR520VE | Video Camera | 1920 x 1080 | 30 | MTS | 120 cm | 1 |
| 4 | Samsung NX1000 | Compact Camera | 1920 x 1080 | 30 | MP4 | 120 cm | 1 |
| 5 | iPhone 6S | Smartphone Camera | 1280 x 720 | 30 | MOV | 120 cm | 1 |
| 6 | Sony DCR-SR90 | Video Camera | 702 x 576 | 25 | MPG | 150 cm | 2 |
| 7 | Olympus VR310 | Compact Camera | 1280 x 720 | 30 | AVI | 150 cm | 2 |
| 8 | Samsung Galaxy A5 | Smartphone Camera | 1280 x 720 | 30 | MP4 | 150 cm | 2 |

The specifications of the recording devices used for the dataset construction
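
As a quick consistency check, the device table can be combined with the three recording locations per floor to recover the video counts reported in the overview. The following is a minimal sketch, not part of the dataset distribution: the tuple layout and variable names are illustrative, and it assumes that every participant was recorded by every device at each location of that device's floor.

```python
from collections import Counter

# (device id, model, floor) copied from the device table; the layout is illustrative.
DEVICES = [
    (1, "Casio Exilim EXFH20", 0),
    (2, "Huawei P10 Lite", 0),
    (3, "Sony HDR-XR520VE", 1),
    (4, "Samsung NX1000", 1),
    (5, "iPhone 6S", 1),
    (6, "Sony DCR-SR90", 2),
    (7, "Olympus VR310", 2),
    (8, "Samsung Galaxy A5", 2),
]
LOCATIONS_PER_FLOOR = 3  # three recording locations on each floor
PARTICIPANTS = 111

# Count how many devices are installed on each floor.
devices_per_floor = Counter(floor for _, _, floor in DEVICES)

# Assumption: one video per (device, location) combination for each participant.
videos_per_person = sum(n * LOCATIONS_PER_FLOOR for n in devices_per_floor.values())
total_videos = videos_per_person * PARTICIPANTS

print(videos_per_person)  # 24
print(total_videos)       # 2664
```

Under that assumption, the 8 devices and 3 locations per floor give 24 videos per person, which matches the overall count of 2,664 videos for 111 participants.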

Gender per age distribution

User height distribution

Pronounced sentence distribution

applications

The AveRobot videos are annotated with the participant's identity, the floor and the location on that floor where the recording took place, the pronounced sentence, and the device id. The gender, age, and height of each participant are also provided. Hence, AveRobot is tailored for testing several applications within the Human-Robot Interaction (HRI) scenario, including:

Audio-Visual Re-Identification

Audio-Visual Verification

Audio-Visual Speech Recognition
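
To make the annotation scheme more concrete, the sketch below models the labels of one video as a small record and builds genuine/impostor pairs of the kind a verification experiment typically needs. The class name, field names, and sample values are hypothetical; the actual annotation file format is not described on this page.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class VideoAnnotation:
    # Hypothetical field names mirroring the annotated attributes of each video.
    user_id: int
    floor: int
    location: str
    sentence_id: int
    device_id: int


def verification_pairs(videos):
    """Yield (video_a, video_b, is_same_person) for every unordered pair."""
    for a, b in combinations(videos, 2):
        yield a, b, a.user_id == b.user_id


# Made-up sample records, for illustration only.
samples = [
    VideoAnnotation(user_id=1, floor=0, location="lift", sentence_id=3, device_id=1),
    VideoAnnotation(user_id=1, floor=1, location="stairs", sentence_id=7, device_id=4),
    VideoAnnotation(user_id=2, floor=0, location="lift", sentence_id=3, device_id=2),
]

pairs = list(verification_pairs(samples))
genuine = sum(1 for _, _, same in pairs if same)
impostor = len(pairs) - genuine
print(genuine, impostor)  # 1 2
```

Because each record also carries the floor, location, and device id, the same pairing idea could be restricted to cross-device or cross-floor pairs when a harder evaluation setting is of interest.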

citation

If you use AveRobot in your research, please cite the following publication:

AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction

Marras M., Marin-Reyes P. A., Lorenzo-Navarro J., Castrillon-Santana M., Fenu G.

{mirko.marras, fenu}@unica.it, {javier.lorenzo, modesto.castrillon}@ulpgc.es, pedro.marin102@alu.ulpgc.es

8th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2019)

download

Please complete the following steps to get access to AveRobot:

  1. Download, fill in, and sign the End User License Agreement (EULA).
  2. Send an email to the authors with the subject "AveRobot Download Request" and the signed EULA as an attachment.
  3. Follow the instructions that you will receive in our email response.