Abstract: The rapid development of unmanned aerial vehicles (UAVs) has intensified the need for advanced classification techniques. This paper presents a novel approach that leverages audio data ...
Abstract: The Audio-Visual Event Localization (AVEL) task aims to temporally locate and classify video events that are both audible and visible. Most research in this field assumes a closed-set ...