About me
I'm a third-year Ph.D. student at Purdue CS. I did my internship at Tursio.ai this summer, where we turned databases into generative AI machines together.
My interests span Data Management for LLMs and their applications in different domains while ensuring security. I have publications both in data management for data-centered LLM application workloads (Vexless@SIGMOD2024) and in integrating AI into data management systems (LLM-based Semantic File System for AIOS, submitted to ICLR 2025). Currently, I focus on enabling large language models (LLMs) to reason effectively over varied data sources, in order to enhance the trustworthiness and efficiency of ML models/LLMs that use real-world knowledge. These data sources include:
Unstructured data (vector data): e.g., semantic embeddings of images, text, etc.
Structured data: e.g., tabular data from RDBMS.
Semi-structured data: e.g., graph data.
Before my Ph.D. journey, I was a CS undergraduate at the School of Computer Science and Engineering at Sun Yat-sen University from 2018 to 2022. I also worked as a research intern at the System Research Group at Microsoft Research Asia and at SAP Labs.
Please don't hesitate to reach out to me if you are interested in a discussion or collaboration.
Thanks to the easygoing upbringing my family and alma mater gave me, I've got a bunch of hobbies, like hitting the trails and snapping some pics. I used to practice Nunchucks and Taekwondo too, but many of the punches and kicks have slipped my mind these days. To learn more about me, please take a look at my Miscs and my Adventures.
News:
Our work "LLM-based Semantic File System for AIOS" has been submitted to ICLR 2025.
Will serve as a reviewer for AISTATS 2025.
Will serve as a reviewer for SIGKDD 2025.
Will serve as a reviewer for ICLR 2025.
Served as a reviewer for NeurIPS 2024.
Served on the AE committee of EuroSys '25 (Spring Round)!
Started my internship at Tursio.ai, working with a group of sharp minds in Azure databases and making great innovations… of course, I’m on vector databases.
I'm very honored to receive the invitation from ACM to become an ACM member.
Grateful to receive the ACM SIGMOD 2024 Student Scholarship!
Grateful to receive the NSF ICDE Travel Award!
I will present my work to some DB/ML/AI/LLM companies 🎤. Feel free to reach out if you are also interested.
[Paper accepted at SIGMOD 2024] Our work "Vexless: A Serverless Vector Data Management System Using Cloud Functions" has been accepted at SIGMOD 2024. Can't wait to see you all in Santiago, Chile! 🇨🇱 My deepest gratitude to my collaborators!
Served on the AE committee of EuroSys '24 (Autumn Round)!
[Google Cloud Next '24] Grateful to receive Datawhale & Google's generous support to attend Google Cloud Next '24 (Apr 9-11). Looking forward to seeing how databases can better serve Generative AI. See you in Vegas 🎰!
Served on the AE committee of EuroSys '24 (Spring Round)!
My first first-author paper was accepted! Many thanks to my collaborators!
Talks & Interviews:
"Serverless Vector Database on Cloud Functions", invited talk at FedML.
"Cloud Techniques & Challenges", interview by Google Cloud & SegmentFault. (Video available@Google CN)