2022
This work develops an approach for scene understanding purely based on binaural sounds.
A novel UDA method, DAFormer, consisting of a Transformer encoder and a multi-level context-aware feature fusion decoder, improves the state of the art by 10.8 mIoU for GTA->Cityscapes and 5.4 mIoU for Synthia->Cityscapes.
2021
We present Scale-aware Domain Adaptive Faster R-CNN, a model aiming at improving the cross-domain robustness of object detection.
We propose a projection-based method for semantic segmentation of LiDAR data, called Multi-scale Interaction Network (MINet), which is very efficient and accurate.
We present a domain flow generation (DLOW) model to bridge two different domains by generating a continuous sequence of intermediate domains flowing from one domain to the other.
We introduce Task Switching Networks (TSNs), a task-conditioned architecture with a single unified encoder/decoder for efficient multi-task learning.
Preprint
We propose a novel co-learning framework (CoSSL) with decoupled representation learning and classifier learning for imbalanced SSL.
Preprint
Zero-shot semantic segmentation (ZS3) aims to segment novel categories that were not seen during training. We propose to decouple ZS3 into two sub-tasks: 1) a class-agnostic grouping task that groups pixels into segments, and 2) a zero-shot classification task on those segments.
This work presents a novel method for LiDAR-based 3D object detection in foggy weather by simulating the effects of fog on standard LiDAR data.
ACDC is a large-scale dataset for training and testing semantic segmentation methods for four adverse visual conditions: fog, nighttime, rain, and snow.
A novel method that leverages guidance from self-supervised depth estimation to bridge the domain gap for semantic segmentation, achieving state-of-the-art performance.
A novel way to learn end-to-end driving from a reinforcement learning expert that maps bird's-eye view images to continuous low-level actions, achieving state-of-the-art performance.