Volume 21, Issue No. 5, 2023 (CVAS)
Object-Aware Scene Reconstruction from Monocular Input Using Geometry-Guided Neural Rendering Networks

This paper proposes a geometry-guided neural rendering approach for reconstructing full scenes from monocular video input, accurately modeling object boundaries and spatial structures for immersive autonomous simulation environments.

Tristan Oliver Hadley, Zhang Yuefeng, Megha Ramesh Iyer, Henrik Gustav Meier, Farah Yasmeen Qureshi, Liu Hongzhi

Paper ID: 32321501

Interactive Visual Grounding for Assistive Robots Using Multimodal Attention in Indoor Service Scenarios

This study presents an interactive visual grounding method for assistive robots, combining multimodal attention mechanisms to interpret user commands and localize referenced objects within cluttered indoor environments.

Nicolas James Watterson, Zhou Xinyu, Preeti Rajesh Menon, Jacques Olivier Dumas, Helena Grace Browning, Yu Shentao

Paper ID: 32321502

Self-Supervised Learning for Video Representation Using Temporal Order Restoration and Masked Frame Modeling

We propose a self-supervised learning method for video representation, restoring temporal order and predicting masked frames to capture semantic and motion cues from unlabeled surveillance video data.

Dominic Carl Fletcher, Liu Wenqing, Ananya Sharvani Nair, Ethan Grant Douglas, Yasmin Aline Haddad, Zhou Tianhao

Paper ID: 32321503

Unifying 3D Semantic Segmentation and Instance Recognition for Urban Scene Understanding

This paper unifies 3D semantic segmentation with instance recognition in a single architecture, enabling precise urban scene parsing for autonomous navigation and infrastructure monitoring using point cloud and RGB data.

Geoffrey Allan Strauss, Huang Zixuan, Rina Latha Raghunathan, Thomas William Hunt, Noemi Elisabetta Caruso, Chen Junhao

Paper ID: 32321504

Real-Time Visual SLAM for Aerial Drones with Adaptive Feature Reweighting and Drift Correction

This work develops a real-time visual SLAM algorithm for drones using adaptive feature reweighting and loop-aware drift correction, enhancing mapping reliability in outdoor environments with fast motion and occlusions.

Vincent Marcus Lindholm, Zhang Xiaoxiao, Raghavi Karthikeyan Mohan, Luca Francesco Marchetti, Alice Naomi Bradshaw, Wu Liangjie

Paper ID: 32321505

Spatiotemporal Activity Forecasting in Smart Surveillance Using Recurrent Vision Transformers and Human-Centric Cues

This research presents a recurrent vision transformer model integrating human-centric cues to forecast future activities in surveillance footage, enhancing proactive threat detection and urban security monitoring capabilities.

Lewis Andrew Cromwell, Fang Xiaorui, Nandita Ravi Subramaniam, Johan Erik Lindström, Aiko Haruka Miyazaki, Elena Claire D’Angelo

Paper ID: 32321506

Depth-Aware Object Interaction Recognition in Industrial Settings Using Dual-Cue Visual Reasoning Networks

This paper introduces dual-cue visual reasoning networks that incorporate RGB and depth cues for robust recognition of object interactions in industrial workflows, enhancing automation in collaborative manufacturing systems.

Marcus Tobias Riedel, Zhang Leiwen, Pooja Neeraj Gupta, Federico Alessandro Romano, Hanna Elin Margareta Nyström, Wu Zhengqi

Paper ID: 32321507