Multimodal Vision Research Laboratory

About Us

The Multimodal Vision Research Laboratory (MVRL) focuses on the development of multimodal representation learning techniques to interpret complex data at a global scale. Our research is primarily driven by challenges in ecology and geospatial science—from monitoring planetary-scale environmental changes to bridging the gap between overhead and ground-level perspectives. We extend these core methodologies to medical imaging, applying advanced visual understanding to improve diagnostic tools. At MVRL, we are dedicated to creating high-impact AI that transforms multi-source data into actionable insights for both planetary sustainability and human health.

Spotlight Publications

  1. Sarkar A, Sastry S, Pirinen A, Jacobs N, Vorobeychik Y. 2026. DiffVAS: Diffusion-Guided Visual Active Search in Partially Observable Environments. In: International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
  2. Cher D, Wei B, Sastry S, Jacobs N. 2026. VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics. In: IEEE Winter Conference on Applications of Computer Vision (WACV).
  3. Qiao F, Xiong Z, Xing E, Jacobs N. 2025. Towards Open-World Generation of Stereo Images and Unsupervised Matching. In: IEEE International Conference on Computer Vision (ICCV).
  4. Sastry S, Dhakal A, Xing E, Khanal S, Jacobs N. 2025. Global and Local Entailment Learning for Natural World Imagery. In: IEEE International Conference on Computer Vision (ICCV).
  5. Xing E, Kolouju P, Pless R, Stylianou A, Jacobs N. 2025. ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  6. Dhakal A, Sastry S, Khanal S, Ahmad A, Xing E, Jacobs N. 2025. RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
