Resume
General Information
| Full Name | Albert (Geyang) Xu |
| Email | gexu@ucsd.edu |
| Phone | 858-568-6771 |
| Website | albertgy9910.github.io |
| Location | Dayton, OH |
Research Interests
- Data-centric robustness of ML pipelines
- ML/MLOps reliability & monitoring
- Responsible & transparent data management
- Streaming & distributed data systems
- Data provenance & explanations
- Governance of ML systems
Education
-
2022.09 - 2024.03 Master of Science in Computer Science
University of California, San Diego - GPA: 3.88/4.0
- Courses: Operating Systems, Networks, Programming Languages, ML, Probabilistic Modeling, Data Ethics
-
2020.10 - 2022.06 Bachelor of Science in Computer Science
University of Liverpool, UK - GPA: 3.93/4.0 (WES Approved, Top 3%)
-
2018.09 - 2020.06 Bachelor of Science in Information and Computing Science
Xi'an Jiaotong-Liverpool University, China - Years 0–1 under the UoL 2+2 Programme
Experience
-
2025.10 - Present Manufacturing Data Analyst
Fuyao Glass America Inc.
- Consolidated production data from machines and inspection systems into a clean analytics layer using Python and SQL ETL, then delivered role-based dashboards that track OEE, FPY, scrap, and changeover in near real time.
- Ran set-point experiments on magnetron sputtering and tempering using A/B testing and DOE, governed changes with Git-tracked schema migrations, SOPs, and parameter logs, and delivered reproducible Jupyter notebooks.
- Hardened pipeline reliability with access control, code review, and cron scheduling with alerts, reaching a 99.3% job success rate and cutting downtime by 2.4 h/week.
-
2023.10 - 2025.05 Research Engineer
UCSD Halıcıoğlu Data Science Institute
- Built a modular 'Injector' pipeline (pattern generation → sampling → injection → evaluation) with beam-search pruning and Optuna/TPE-based Bayesian tuning to systematically surface structured corruptions that reduce downstream AUC by >0.25 compared to random-parameter attacks.
- Co-authored SAVAGE, designing corruption dependency graphs and bi-level black-box search to model mechanism-aware missingness, selection-bias, and outlier patterns for pipeline-level stress-testing of ML systems.
- Ran Inject → Clean → Retrain benchmarks across missing-value, selection-bias, and outlier scenarios to evaluate state-of-the-art cleaning, debiasing, and UQ pipelines and expose their data-centric robustness gaps.
-
2024.10 - 2025.10 Back-end Engineer
4Pexonic Inc.
- Developed an academic impact analysis platform tracking citation patterns across tens of thousands of records using Node.js, MongoDB, and Next.js with Docker, serving university researchers and policy institutions.
- Built two-tier caching with in-process LRU and Redis for bibliometric queries, reducing median and p95 latency by 40% for complex research trend analysis and cross-disciplinary impact calculations.
- Integrated gRPC microservices for data processing pipelines, implemented secure authentication with bcrypt and session management, and configured Nginx SSL to protect academic publication data and researcher information.
Publications
-
2025 Stress-Testing ML Pipelines with Adversarial Data Corruption (SAVAGE)
PVLDB 18(11): 4668–4681 (2025)
- Preprint: https://arxiv.org/abs/2506.01230
- Code: https://github.com/lodino/savage
Skills
-
Programming Languages
- Python, Java, Go, SQL, C/C++, JavaScript, R
-
Frameworks and Tools
- Docker, Git, gRPC, Node.js, Next.js, Django, React, MongoDB, Redis, Jupyter, Nginx, AWS