Project Description

This project falls within the field of real-time artificial vision for interaction, and in particular Perceptual User Interfaces (PUIs). The objective is to define, build and evaluate a framework for devising and developing Natural Communication Interfaces, especially for interaction with people in environments with changing conditions. The design of the system is based on combining Active Vision techniques with Homeostatic Adaptation, which facilitates communication under changing illumination conditions and aids the exchange of visual information. The framework also makes it possible to study and model perception-action interaction processes through gesture recognition.

The general problems to be solved include detection of presence, tracking and identification of people, and tracking and analysis of facial, hand and body gestures. Our research group has extensive experience with these problems, and some solutions have been produced in the context of previous research projects. A fundamental aspect of real-time artificial vision problems is that efficient solutions can be obtained only under restricted conditions. Such solutions are typically strongly tied to those conditions and degrade when the conditions change. The first objective of this project is precisely to address this limitation. The idea is to build a stabilizing module, based on concepts of homeostatic control, that keeps the essential variables of the input image flow within a range of values that maximizes the efficiency of the whole system. The module will guarantee adaptation to changes in the illumination, color and size of the object, as well as in the precision (resolution) of the images, using the information available in the sequence of objects, faces or other moving patterns. A second objective is the integration of this module with previously obtained solutions for tracking, detection and identification. A third group of objectives covers research into the detection and analysis of gestures. Finally, a fourth group aims to produce an integrated solution and an application for a concrete domain.
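As an illustration of the homeostatic-control idea described above, the following sketch keeps one "essential variable" of the image stream, its mean brightness, inside a viability range by nudging a camera gain parameter. This is a minimal toy example, not the project's actual implementation; the variable chosen, the range bounds and the `homeostatic_step` function are assumptions made purely for illustration.

```python
# Toy homeostatic control loop (illustrative only, not the project's system).
# The "essential variable" is mean image brightness; the effector is a
# hypothetical camera gain. The controller acts only when the variable
# leaves its viability range, which is the defining trait of homeostasis.

def homeostatic_step(mean_brightness, gain, low=100.0, high=150.0, step=0.05):
    """Adjust the gain only when the essential variable leaves [low, high]."""
    if mean_brightness < low:
        return gain * (1.0 + step)   # scene too dark: raise the gain
    if mean_brightness > high:
        return gain * (1.0 - step)   # scene too bright: lower the gain
    return gain                      # inside the viability zone: no change

if __name__ == "__main__":
    # Simple simulation: assume brightness responds proportionally to gain.
    gain = 1.0
    scene_luminance = 60.0           # arbitrary units, below the target range
    for _ in range(100):
        brightness = scene_luminance * gain
        gain = homeostatic_step(brightness, gain)
    # After a few iterations the variable settles inside the range.
    assert 100.0 <= scene_luminance * gain <= 150.0
```

In the real system the same principle would apply to several essential variables at once (illumination, color, apparent object size, resolution), with the camera and processing parameters as effectors.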

The working hypothesis is based on recent research results in sociology and psychology indicating that people's interaction with technological elements, computers and other information systems is fundamentally social and natural. It is therefore advisable to have robust processing mechanisms that guarantee the stability of these systems and allow them to be used in environments with variable conditions.

The project has antecedents in two previous research projects. On the one hand, the CICYT project “Sistema Percepto-Efector para la Detección y Seguimiento de Objetos Móviles” (Ref. TAP95/0288) produced a percepto-effector system based on Active Vision techniques for the detection, tracking and identification of moving objects, especially people. It uses a high-performance binocular robotic head from Helpmate Robotics and processing architectures based on the Texas Instruments C80 DSP. On the other hand, the project “Sistema Interactivo Multimodal basado en Técnicas de Visión Activa y Arquitecturas Multiagente” (Ref. DGUI PI2000/042) produced part of the tracking modules as well as the implementation of gesture-recognition algorithms.