Related Research
Dec. 1, 2024

MAGMaR Workshop accepted for co-location at ACL 2025
We are happy to announce that our workshop, MAGMaR (Multimodal Augmented Generation via MultimodAl Retrieval), has been accepted for co-location at ACL 2025 in Vienna. The workshop centers on multimodal retrieval and retrieval-augmented generation, and includes a shared task on multilingual event-based video retrieval using the MultiVENT 2.0 dataset. We look forward to sharing more information about submissions and the shared task soon.
Nov. 15, 2024

MultiVENT-Grounded presented at FuturED 2024
Alongside a poster at EMNLP 2024, our video event extraction dataset, MultiVENT-Grounded, was presented in an oral talk at the Workshop on the Future of Event Detection (FuturED). The workshop focuses on detecting real-world events for applications such as emergency response and public health, and covers how the field has developed and should continue to develop, how it can extend beyond NLP, and how it applies in practice. See the live session recording here, and check out the workshop website here.
Aug. 9, 2024

2024 Summer Workshop on Video-Based Event Retrieval
The HLTCOE (Human Language Technology Center of Excellence) hosted the 2024 iteration of the SCALE summer research workshop. SCALE’24 focused on the retrieval of event-based visual content found in both professional and non-professional videos. The goals of the workshop were to understand how current state-of-the-art computer vision technologies perform on the retrieval of multilingual event-based visual content and to explore how different modalities can help with this task. The workshop resulted in the MultiVENT 2.0 dataset and benchmark approaches.
June 17, 2024

Video Events Survey paper presented at CVPR Workshop
A Survey of Video Datasets for Grounded Event Understanding was presented at the Video Datasets Understanding workshop co-located with CVPR 2024. The survey paper considers the scope of video datasets that implicitly or explicitly target event understanding in multimodal data, compares how these datasets present events, and explores how they relate to multimodal event extraction tasks introduced in the last few years. This survey motivates our MultiVENT dataset and can be viewed here.