Multimodal Vision Research Laboratory (MVRL)

Research Areas

Explore our research areas spanning computer vision, multimodal learning, geospatial science, and their applications.

We develop novel algorithms for monitoring environmental changes and biodiversity at a global scale. Our work focuses on using computer vision and remote sensing techniques to track changes in ecosystems,...

We develop methods for camera calibration and geometric understanding from images. Our research includes structure-aware methods for direct pose estimation, extending absolute pose regression to multiple scenes, and using natural...

Our generative AI research pushes the boundaries of how AI synthesizes and represents information across diverse sensors and scales. Recent work includes open-world generation of stereo images with unsupervised matching...

We develop methods for geometric understanding and 3D scene reconstruction from images. Recent work includes open-world generation of stereo images with unsupervised matching (GenStereo), consistent text-to-360 scene generation (PanoDreamer), generative-free...

Geospatial AI applies artificial intelligence to geospatial data to solve real-world problems. Our research spans multiple domains including localization, remote sensing, ecology, webcam...

The goal of image localization is to estimate the geographic location (or scene-relative location) of an image (or video). We have worked on this problem for over 15 years, from...

We apply advanced visual representation learning to improve diagnostic tools and automated healthcare systems. Our research includes self-supervised learning for COVID-19 chest X-ray classification using masked autoencoders, efficient training methods...

Our reinforcement learning research focuses on active search and decision-making in geospatial and visual domains. Recent work includes diffusion-guided visual active search in partially observable environments (DiffVAS), goal modality agnostic...

We develop novel algorithms for monitoring environmental changes and biodiversity at a global scale using remote sensing data. Our recent work includes retrieval-augmented neural fields for multi-resolution geo-embeddings (RANGE), language-driven...

We develop novel representation learning methods for computer vision and multimodal understanding. Recent work includes Frobenius norm minimization for self-supervised learning (FroSSL), dynamic feature alignment for semi-supervised domain adaptation, and...

What can you learn about the world by looking at pictures uploaded to social networking websites? We explore the relationship between visual content and geographic location through large-scale analysis of...

Our transportation research applies computer vision and machine learning to improve roadway safety, traffic modeling, and transportation infrastructure assessment. Recent work includes beta distribution learning for reliable roadway crash risk...

We are interested in developing novel algorithms for uncertainty estimation in computer vision. Our research includes improved trainable calibration methods for neural networks, neural network calibration for medical imaging classification...

We explore the use and development of vision-language models for a range of vision tasks. We are interested in developing novel algorithms for these models and in applying them to...