AI models still lose track of who is who and what's happening in a movie. A new system orchestrates face recognition and staged summarization, keeping characters straight and plots coherent across ...
Abstract: Multimodal sentiment analysis has attracted extensive research attention as increasing numbers of users share images and texts to express their emotions and opinions on social media.
Abstract: This paper presents a novel approach incorporating Facial Expression Recognition (FER) to improve emotional and contextual understanding in Vision-Language Pretraining (VLP) model-generated ...