Proceedings of the 2006 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI06), 2006.
MFI 2006, Heidelberg, September 3-6, 2006
This paper presents an efficient approach to the combined audio-visual localization of a fallen object. The localization algorithm is incorporated into a multi-sensor robotic platform to supervise and dynamically adapt the discrete plan of a robot performing a pick-and-place task. Initial audio position estimates of the fallen object are obtained from time-delay estimates in two microphone pairs. To ensure successful operation even in acoustically adverse environments, reliability criteria for the time-delay estimates are introduced. The audio position estimates are then refined with visual data, using the disparity information in the image pairs of a stereo-camera system. The performance of the audio-visual localization algorithm is evaluated on real data recorded in a common office environment. The proposed real-time system robustly and accurately localizes a fallen wooden brick.
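As an illustration of the time-delay step, the following minimal sketch estimates the delay between the two signals of one microphone pair by plain cross-correlation and attaches a simple reliability score. It is not the estimator or the reliability criterion used in this paper: the function name `estimate_delay`, the `guard` parameter, and the peak-to-rival ratio are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def estimate_delay(sig_a, sig_b, fs, max_delay_s, guard=2):
    """Estimate how much sig_b lags sig_a, in seconds (hypothetical helper).

    Returns (delay_seconds, reliability). The reliability score is an
    assumed criterion, not the one from the paper: the height of the main
    correlation peak divided by the largest absolute correlation value
    more than `guard` samples away from that peak.
    """
    sig_a = np.asarray(sig_a, dtype=float)
    sig_b = np.asarray(sig_b, dtype=float)

    # Full cross-correlation; entry i corresponds to lag lags[i] in samples.
    corr = np.correlate(sig_b, sig_a, mode="full")
    lags = np.arange(-(len(sig_a) - 1), len(sig_b))

    # Keep only lags that are physically possible for the microphone spacing.
    max_lag = int(round(max_delay_s * fs))
    keep = np.abs(lags) <= max_lag
    corr, lags = corr[keep], lags[keep]

    # The lag of the strongest correlation is the delay estimate.
    peak = int(np.argmax(corr))

    # Compare the main peak with the strongest competing value.
    outside = np.abs(np.arange(len(corr)) - peak) > guard
    rival = np.max(np.abs(corr[outside])) if outside.any() else 0.0
    reliability = corr[peak] / rival if rival > 0 else np.inf

    return lags[peak] / fs, reliability
```

A positive delay means the sound reached the first microphone before the second. An estimate with a low reliability score could be discarded before the delays of the two pairs are combined into a position estimate; the threshold for that decision would have to be tuned to the room and is not specified here.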