My personal website

Yue Fan



Ph.D. candidate
University of California, Santa Cruz
yfan71 AT ucsc.edu
Github
Google Scholar
X
Linkedin


I am currently a fourth-year Ph.D. student in the Computer Science and Engineering (CSE) department at the University of California, Santa Cruz, advised by Professor Xin Eric Wang. I earned my Bachelor's degree in Automation from Shandong University, followed by a Master's degree in Robotics from Johns Hopkins University. My research interests lie mainly in Embodied AI, Computer Vision, and Natural Language Processing.


News


- [Sept 2024] Our Read Anywhere Pointed paper was accepted to EMNLP 2024.
- [Sept 2024] I finished my full-time summer internship at Adobe Research. A publication is in preparation.
- [June 2024] Our Muffin or Chihuahua paper was accepted to ACL 2024.
- [Apr 2024] I am glad to share that I have passed my Ph.D. qualifying exam and become a Ph.D. candidate.
- [Oct 2023] Our R2H paper was accepted to EMNLP 2023.
- [Sep 2023] Team Athena, which I proudly lead, secured second place in the scientific innovation category of the Amazon Alexa Prize SocialBot Grand Challenge 5.
- [Jun 2023] Our SlugJARVIS Team won third place in the Amazon Alexa Prize SimBot Challenge.
- [May 2023] Team Athena, which I have the honor of leading, advanced to the semifinals of the Amazon Alexa Prize SocialBot Grand Challenge 5.
- [May 2023] Our AVDN paper was accepted to ACL 2023. The AVDN Challenge is released.



Publications (First/Co-first authorship)


Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
EMNLP 2024
Yue Fan, Lei Ding, Ching-Chen Kuo, Shan Jiang, Yang Zhao, Xinze Guan, Jie Yang, Yi Zhang, Xin Eric Wang


Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA
ACL 2024
Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Yang Zhao, Xinze Guan, Xin Eric Wang


R2H: Building Multimodal Navigation Helpers that Respond to Help Requests
EMNLP 2023
Yue Fan, Jing Gu, Kaizhi Zheng, Xin Eric Wang


Athena 3.0: Personalized Multimodal ChatBot with Neuro-Symbolic Dialogue Generators
Alexa Prize SocialBot Grand Challenge 5
Yue Fan, Kevin K Bowden, Wen Cui, Winson Chen, Vrindavan Harrison, Angela Ramirez, Saaket Agashe, XG Liu, N Pullabhotla, NQJ Bheemanpally, S Garg, M Walker, XE Wang


Aerial Vision-and-Dialog Navigation
ACL 2023
Yue Fan, Winson Chen, Tongzhou Jiang, Chun Zhou, Yi Zhang, Xin Eric Wang


JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
Preprint 2022
Kaizhi Zheng*, Kaiwen Zhou*, Jing Gu*, Yue Fan*, Jialu Wang*, Zonglin Di, Xuehai He, Xin Eric Wang


Learn by Observation: Imitation Learning for Drone Patrolling from Videos of A Human Navigator
IROS 2020
Yue Fan, Shilei Chu, Wei Zhang, Ran Song, Yibin Li


My projects


Building Multimodal Web AI Agent

- We first built the MultipanelVQA benchmark to test how well Large Vision-Language Models understand multipanel images, such as web screenshots and posters.
- We are now developing a specialized AI agent that interacts with all kinds of UIs, including web and mobile.



Respond to Help Requests (R2H) Project (EMNLP 2023)

- We establish the R2H benchmark, featuring tasks that assess an agent's ability to guide users or other agents through unknown areas via dialogue.
- We propose two multimodal navigation-helper agents: a fine-tuned SeeRee model for multimodal response generation, and a large language model used in a zero-shot manner. Both are analyzed via benchmarking and human evaluations.

Project page



Amazon Alexa Prize competition: Socialbot Grand Challenge 5

- The challenge aims at advancing conversational AI. University teams are tasked with developing a "socialbot", an AI chatbot that can interact naturally and intelligently with humans on a variety of topics through Amazon's Alexa platform.
- I serve as the team leader of our Athena3 team.
- Our Athena team secured second place in the scientific innovation category of the Alexa Prize SocialBot Grand Challenge 5.

Alexa Prize Socialbot Grand Challenge

Athena3 Team




Students from the ERIC Lab and the Natural Language and Dialogue Systems Lab are making their fifth appearance in the competition. The goal of the team is to leverage advanced algorithms and AI models to build a smart chatbot.




Location: Santa Cruz, California
Faculty advisor: Xin Wang
Team lead: Yue Fan


Amazon Alexa Prize competition: Simbot Challenge

- The challenge focuses on advancing the development of next-generation virtual assistants that help humans complete real-world tasks by continuously learning and gaining the ability to perform commonsense reasoning.
- Our SlugJARVIS Team won third place in the SimBot Challenge.
- Our SlugJARVIS Team won the Public Benchmark Challenge.

Alexa Prize SimBot Challenge Public Benchmark Challenge

SlugJARVIS Team




UC Santa Cruz is one of America's Public Ivy universities and a member of the prestigious Association of American Universities (AAU). The ERIC Lab is led by Prof. Xin Eric Wang and stands for Embodiment, Reasoning, Intelligence, and language Communication. The ERIC Lab’s research topics include natural language processing, computer vision, and machine learning, with an emphasis on building embodied AI agents that can communicate with humans in natural language to perform real-world multimodal tasks.




Location: Santa Cruz, California
Faculty advisor: Xin Wang


Aerial Vision-and-Dialog Navigation (AVDN) Project (ACL 2023)

- Our AVDN project aims at building drones that understand and follow natural language commands, facilitating hands-free control and accessibility.
- We built the AVDN dataset of over 3k recorded dialogs and navigation trajectories, along with a drone simulator with a photorealistic environment.
- We successfully hosted the public AVDN Challenge at the ICCV 2023 CLVL workshop.

Project page AVDN Challenge


Unsupervised Adrenomyeloneuropathy Disease Data Analysis

- Apply feature selection to find the dominant factors in disease progression.
- Design the extra-data-dimension heatmap toolkit for visualizing patient clusters.
- Use a Bayesian Neural Network to classify progressors with uncertainty estimates.

Heatmap Toolkit



Learn by Observation: Imitation Learning for Drone Patrol from Raw Videos of A Human Navigator (IROS 2020)

- Design a data auto-labeling method using inter-frame geometric consistency.
- Propose a DNN called UAVPatrolNet for road detection.
- Build a dataset for autonomous drone navigation.

Project page


Training Quadruped Robot with Reinforcement Learning

- Use Unity3D ML-Agents.
- Train the quadruped robot to crawl.

RL for Robot







Object Detection in Aerial Image
- I contributed to the team effort by reproducing established algorithms, e.g., RPN and Faster R-CNN.
- I conducted simulated experiments and tuned the parameters for the best training results, improving object detection performance on aerial images.

ECCV Workshop - Visdrone2018

Carbon-free Car






After testing current state-of-the-art object detection networks, we adopted the Faster R-CNN algorithm, with a few adjustments to adapt it to the VisDroneDet dataset. The dataset contains proposals of widely varying sizes, which leads to a multi-scale object detection problem. To mitigate the impact of the relatively rapid changes in bounding-box size, we added more large anchors to fit the larger objects and kept the small anchors unchanged for detecting tiny objects, such as people and cars at long distance. Moreover, the VisDroneDet dataset has an unbalanced object distribution. When testing on the validation set, we found that classification performance for cars was much better than for other classes because cars appear more frequently. To alleviate this problem, we manually masked out some car bounding boxes in pursuit of better classification performance.
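
The anchor adjustment described above can be sketched roughly as follows. This is only an illustration, not the code we submitted: the scale values, the aspect ratios, and the helper name `make_anchors` are assumptions chosen for the example. It shows the idea of keeping the small anchors unchanged while appending larger ones.

```python
import numpy as np

def make_anchors(scales, ratios=(0.5, 1.0, 2.0)):
    """Build anchors centered at the origin as (x1, y1, x2, y2).

    Each anchor has area scale**2 and aspect ratio height / width = ratio.
    Hypothetical helper for illustration only.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / np.sqrt(r)  # width shrinks as the box gets taller
            h = s * np.sqrt(r)  # height grows so that w * h == s**2
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)

# Assumed values: the real scales used for VisDroneDet are not stated here.
BASE_SCALES = (32, 64, 128)       # small anchors, kept unchanged
EXTRA_LARGE_SCALES = (256, 512)   # added anchors for larger objects

base_anchors = make_anchors(BASE_SCALES)
extended_anchors = make_anchors(BASE_SCALES + EXTRA_LARGE_SCALES)

print(base_anchors.shape, extended_anchors.shape)
```

Because the extended set only appends scales, the anchors for tiny objects stay exactly as they were, and only the coverage of large boxes changes.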


Control and Monitoring System of DJI Drones through PC
- Designed the control interface on PC with various functions such as “vehicle detection”.
- Developed a system to transmit data between the UAV and PC using Qt and the DJI SDK.
- Applied the system in city traffic to successfully improve the management efficiency.

Github

Drone with ROS





Control interface on PC.



DJI M100 Drone.



Binocular distance measurement with one camera on the UAV.


Control of Carbon-free Car
- Developed a circuit board and selected the proper sensor by studying the control system of the carbon-free car.
- Programmed the car in an object-oriented style, designing and applying the control algorithm.
- Awarded the First Prize in Engineering Training Integration Ability Competition of Shandong Province.

Carbon-free Car






- Realizes autonomous obstacle avoidance.

- Because the car is powered by the gravitational potential energy of the hammer block, it is called the Carbon-free Car.

- Won the First Prize in Engineering Training Integration Ability Competition of Shandong Province.


Assembling and Programming Drones with ROS
- Assembled drones from scratch.
- Designed programs for the STM32 flight controller (Pixhawk) running Robot Operating System (ROS).
- Won the municipal First Prize in China RoboWork Competition.

Github

Drone with ROS











Patent


A Method to Precisely Apply Screen Protector
(Patent No.: ZL 2015 1 0853152.0)


Extracurricular Experience


Career Exploration Program
Team Leader of 10, The University of Hong Kong, 2016


Filming in Support of College Entrance Examination Takers
Initiator, Shandong Experimental High School, 2015
video >>


Honors & Awards


2nd-class Scholarship
2018, 2017 & 2016 • Issued by Shandong University


1st Prize, Quadrotor Aircraft Group & Hexacopter Aircraft Group of China RoboWork Competition
2018 & 2016 • Issued by The International Federation of Robotics


3rd Prize, National Undergraduate Engineering Training Integration Ability Competition
2017 • Issued by Department of Higher Education, Ministry of Education of PRC


1st Prize, National Olympiad in Informatics in Shandong Province
2014 • Issued by China Computer Federation



Copyright 2023 by Yue Fan. All rights reserved.
All designs are the property of the owner.