Publications

Conference Papers


AlignCAT: Visual-Linguistic Alignment of Category and Attribute for Weakly Supervised Visual Grounding

Published in ACM International Conference on Multimedia (ACMMM) 2025

AlignCAT introduces a query-based semantic matching framework for weakly supervised visual grounding, employing coarse-grained category alignment and fine-grained attribute alignment to enhance visual-linguistic reasoning and achieve state-of-the-art performance on RefCOCO, RefCOCO+, and RefCOCOg.

Recommended citation: Yidan Wang, Chenyi Zhuang, Wutao Liu, Pan Gao, Nicu Sebe. (2025). "AlignCAT: Visual-Linguistic Alignment of Category and Attribute for Weakly Supervised Visual Grounding." ACMMM 2025.

First RAG, Second SEG: A Training-Free Paradigm for Camouflaged Object Detection

Published in Conference Paper (Preprint on arXiv), 2025

RAG-SEG presents a training-free paradigm for camouflaged object detection (salient object detection) that decouples the task into retrieval-augmented generation of coarse masks and SAM-based refinement, eliminating conventional training while achieving competitive results on benchmark datasets using only a personal laptop.

Recommended citation: Wutao Liu, Yidan Wang, Pan Gao. (2025). "First RAG, Second SEG: A Training-Free Paradigm for Camouflaged Object Detection." Conference Paper (Preprint on arXiv).

ACGFormer: Attribute Classification Guided Transformer for Camouflaged Object Detection

Published in PRCV 2025

ACGFormer introduces an Attribute Classification Guided Transformer for camouflaged object detection, leveraging attribute-aware guidance and feature refinement to achieve state-of-the-art performance.

Recommended citation: Wutao Liu, Yao Yuan, Pan Gao, Zheng Lin, Jie Qin. (2025). "ACGFormer: Attribute Classification Guided Transformer for Camouflaged Object Detection." PRCV 2025.

Unified Unsupervised Salient Object Detection via Knowledge Transfer

Published in International Joint Conference on Artificial Intelligence (IJCAI) 2024

A unified framework for unsupervised salient object detection (USOD), featuring curriculum learning-based saliency distilling and knowledge transfer across tasks.

Recommended citation: Yao Yuan, Wutao Liu, Pan Gao, Qun Dai, Jie Qin. (2024). "Unified Unsupervised Salient Object Detection via Knowledge Transfer." IJCAI 2024.