My personal website

Yue Fan

Ph.D. candidate
University of California, Santa Cruz
yfan71 AT ucsc.edu
GitHub
Google Scholar
X
LinkedIn

I am currently in my fifth year as a Ph.D. student in the Computer Science and Engineering (CSE) department at the University of California, Santa Cruz, advised by Professor Xin Eric Wang. I earned my Bachelor's degree in Automation from Shandong University, followed by a Master's degree in Robotics from Johns Hopkins University. My research interests lie predominantly in AI agents, reinforcement learning for reasoning, and post-training of multimodal LLMs.
I am on the job market, seeking full-time research opportunities starting in the first half of 2026. Please feel free to reach out to me if you have any opportunities. Thanks.

News

- [Sept 2025] Our GRIT paper was accepted to NeurIPS 2025.
- [Sept 2025] Our GUI-Bee paper was accepted to EMNLP 2025.
- [Sept 2025] I finished my full-time summer internship at Adobe Research.
- [May 2025] Our MMIR paper was accepted to ACL 2025 as a Findings paper.
- [Jan 2025] Our LLM-Coordination paper was accepted to NAACL 2025.
- [Sept 2024] Our Read Anywhere Pointed paper was accepted to EMNLP 2024.
- [June 2024] Our Muffin or Chihuahua paper was accepted to ACL 2024.
- [Apr 2024] I am glad to share that I have passed my Ph.D. qualifying exam and become a Ph.D. candidate.
- [Oct 2023] Our R2H paper was accepted to EMNLP 2023.
- [Sept 2023] The Athena team that I proudly lead secured second place in the scientific innovation category of the Amazon Alexa Prize SocialBot Grand Challenge 5.

Publications (First/Co-first authorship)

GRIT: Teaching MLLMs to Think with Images
NeurIPS 2025
Yue Fan, Xuehai He, Diji Yang, Kaizhi Zheng, Ching-Chen Kuo, Yuting Zheng, Sravana Jyothi Narayanaraju, Xinze Guan, Xin Eric Wang

GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration
EMNLP 2025
Yue Fan, Handong Zhao, Ruiyi Zhang, Yu Shen, Xin Eric Wang, Gang Wu

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
EMNLP 2024
Yue Fan, Lei Ding, Ching-Chen Kuo, Shan Jiang, Yang Zhao, Xinze Guan, Jie Yang, Yi Zhang, Xin Eric Wang

Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA
ACL 2024
Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Yang Zhao, Xinze Guan, Xin Eric Wang

R2H: Building Multimodal Navigation Helpers that Respond to Help Requests
EMNLP 2023
Yue Fan, Jing Gu, Kaizhi Zheng, Xin Eric Wang

Athena 3.0: Personalized Multimodal ChatBot with Neuro-Symbolic Dialogue Generators
Alexa Prize SocialBot Grand Challenge 5
Yue Fan, Kevin K Bowden, Wen Cui, Winson Chen, Vrindavan Harrison, Angela Ramirez, Saaket Agashe, XG Liu, N Pullabhotla, NQJ Bheemanpally, S Garg, M Walker, XE Wang

Aerial Vision-and-Dialog Navigation
ACL 2023
Yue Fan, Winson Chen, Tongzhou Jiang, Chun Zhou, Yi Zhang, Xin Eric Wang

JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
Preprint 2022
Kaizhi Zheng*, Kaiwen Zhou*, Jing Gu*, Yue Fan*, Jialu Wang*, Zonglin Di, Xuehai He, Xin Eric Wang

Learn by Observation: Imitation Learning for Drone Patrolling from Videos of A Human Navigator
IROS 2020
Yue Fan, Shilei Chu, Wei Zhang, Ran Song, and Yibin Li

Earlier Projects

Unsupervised Adrenomyeloneuropathy disease data analysis

- Apply feature selection to find the dominant factors in disease progression.
- Design the extra-data-dimension heatmap toolkit for visualizing the patient clusters.
- Use a Bayesian Neural Network to classify progressors with uncertainty.

Heatmap Toolkit

Learn by Observation: Imitation Learning for Drone Patrol from Raw Videos of A Human Navigator (IROS 2020)

- Design a data auto-labeling method using inter-frame geometric consistency.
- Develop a DNN called UAVPatrolNet for road detection.
- Build a dataset for autonomous drone navigation.

Project page

Object Detection in Aerial Image
- I contributed to the team effort by reproducing established algorithms, e.g. RPN and Faster R-CNN.
- I ran simulated experiments and tuned the parameters to get the best training results, improving object detection performance on aerial images.

ECCV Workshop - VisDrone2018

Control and Monitoring System of DJI Drones through PC
- Designed the control interface on PC with various functions such as "vehicle detection".
- Developed a system to transmit data between the UAV and the PC using Qt and the DJI SDK.
- Applied the system to city traffic, successfully improving management efficiency.

GitHub

Control of Carbon-free Car
- Developed a circuit board and selected the proper sensor by studying the control system of the carbon-free car.
- Designed the control algorithm and implemented it for the machine using object-oriented programming.
- Awarded the First Prize in the Engineering Training Integration Ability Competition of Shandong Province.

Competitions

Amazon Alexa Prize competition: Socialbot Grand Challenge 5

- The challenge aims to advance conversational AI. University teams are tasked with developing a "socialbot", an AI chatbot that can interact naturally and intelligently with humans on a variety of topics through Amazon's Alexa platform.
- I serve as the team leader of our Athena3 team.
- Our Athena team secured second place in the scientific innovation category of the Alexa Prize SocialBot Grand Challenge 5.

Alexa Prize Socialbot Grand Challenge

Our Team

Amazon Alexa Prize competition: Simbot Challenge

- The challenge focuses on advancing the development of next-generation virtual assistants that will assist humans in completing real-world tasks by continuously learning and gaining the ability to perform commonsense reasoning.
- Our SlugJARVIS team won third place in the Simbot Challenge.
- Our SlugJARVIS team won the Public Benchmark Challenge.

Alexa Prize SimBot Challenge Public Benchmark Challenge

Our Team

Academic Service and Teaching

Conference Reviewer
- ACL, EMNLP, NAACL, NeurIPS, COLM, ECCV, ICRA, IROS

Workshop Organization
- SpLU-RoboNLP 2023: The Third Combined Workshop on Spatial Language Understanding and Grounded Communication for Robotics at EMNLP 2023

Teaching Experience
- Course Assistant, University of California Santa Cruz Science Internship Program (SIP)
  An open-ended STEM/STEAM research program exclusively for high school students

- Teaching Assistant, CSE 140 Machine Learning
  University of California, Santa Cruz

- Course Assistant, EN 601.475 Machine Learning & EN 601.783 Vision as Bayesian Inference
  Johns Hopkins University

