Hi, I'm Haonan (Eric) Gao.
Welcome
I am currently a second-year Master's student in Biostatistics (Data Science track) at Yale University, working at the intersection of machine learning, statistics, and practical decision systems.
Previously, I completed my undergraduate studies in Computer Science and Statistics at the University of Toronto. Since then, I have worked in industry and research roles on machine learning, large-scale data systems, and statistical modeling, with a focus on building scalable data pipelines and applying ML methods to real-world datasets.
I build and deploy machine learning models for real-world applications, covering the full workflow from data preparation and feature engineering to model training, evaluation, and production deployment.
I apply statistical modeling and data science techniques to extract insights from complex and high-dimensional datasets, with an emphasis on robust analysis and data-driven decision making.
I design scalable data and AI systems for large datasets, working with distributed data processing, modern ML infrastructure, and efficient pipelines for large-scale machine learning workflows.
With Prof. Yize Zhao, I develop statistical/computational methods for structural-functional brain network modeling, including low-rank + sparse factorization, proximal coordinate methods, and synthetic pipelines for identifiability studies.
With Prof. Song Ma and Prof. Allen Hu, I build a RAG system over 2TB+ of financial text data, reaching ~85% forecasting accuracy while reducing token usage by ~30% through memory optimization.
Learn More
Outside of work, I am a certified instructor with the Canadian Association of Snowboard Instructors (CASI). If you enjoy snowboarding, or want an excuse to start, feel free to reach out.
