Volume 22, Issue No. 4, 2024 (CVAS)
Temporal Scene Understanding in Video Surveillance Using Multi-Scale Memory-Augmented Recurrent Networks

This study introduces a memory-augmented recurrent network that captures multi-scale temporal dependencies, enhancing long-term scene understanding and behavioral analysis in complex video surveillance environments.

Jeremy Isaac Caldwell, Zhang Lixuan, Niharika Manohar Pillai, Remi François Laporte, Yuki Haruna Kobayashi, Ana Camila Vargas

Paper ID: 32422401

Self-Adaptive Visual Feedback Mechanisms for Robotic Manipulators Using Predictive Control Models

This research proposes a self-adaptive visual feedback mechanism for robotic manipulators, employing predictive control models to adjust trajectories in real time based on dynamic visual observations of the task.

Leonard Paul Jennings, Huang Qingyuan, Sharvani Rajiv Menon, Christophe Alexandre Lemoine, Naomi Erika Tanaka, Luisa Valentina Rojas

Paper ID: 32422402

Weakly Supervised Visual Scene Graph Generation Using Contrastive Relationship Learning and Object Context Clustering

This paper presents a weakly supervised scene graph generation method using contrastive relationship learning and object context clustering, enabling robust semantic mapping from limited labeled visual data.

Graham Elliot Winters, Li Yuchao, Anuja Sandeep Raval, François Lucien Bouchard, Hyejin Miura, Beatriz Cristina Salazar

Paper ID: 32422403

Semantic Visual Place Recognition in Indoor Environments Using Topological Memory and Hybrid Feature Matching

This study proposes a semantic place recognition system that combines topological memory graphs with hybrid feature matching, facilitating robust indoor navigation for service robots in repetitive environments.

Christopher Allen Barnett, Zhang Yuqing, Nikita Suresh Nair, Julien Étienne Moreau, Akemi Fuyuki Sasaki, Claudia Elisa Fuentes

Paper ID: 32422404

Attention-Based Vision-Language Pretraining for Multi-Task Robotic Perception and Instruction Following

This research presents a multi-task vision-language pretraining framework that uses attention mechanisms to improve robotic perception and instruction following, supporting complex visual grounding and action planning across varied tasks.

Tristan Michael Holloway, Zhang Xinyue, Anjali Renu D’Souza, Etienne Louis Girard, Sakura Mei Tanaka, Clara Juliana Herrera

Paper ID: 32422405
