resume

General Information

Full Name Albert (Geyang) Xu
Email gexu@ucsd.edu
Phone 858-568-6771
Website albertgy9910.github.io
Location Dayton, OH

Research Interests

  • Data-centric robustness of ML pipelines
  • ML/MLOps reliability & monitoring
  • Responsible & transparent data management
  • Streaming & distributed data systems
  • Data provenance & explanations
  • Governance of ML systems

Education

  • 2022.09 - 2024.03
    Master of Science in Computer Science
    University of California, San Diego
    • GPA: 3.88/4.0
    • Courses: Operating Systems, Networks, Programming Languages, ML, Probabilistic Modeling, Data Ethics
  • 2020.10 - 2022.06
    Bachelor of Science in Computer Science
    University of Liverpool, UK
    • GPA: 3.93/4.0 (WES Approved, Top 3%)
  • 2018.09 - 2020.06
    Bachelor of Science in Information and Computing Science
    Xi'an Jiaotong-Liverpool University, China
    • Years 0–1 under the UoL 2+2 Programme

Experience

  • 2025.10 - Present
    Manufacturing Data Analyst
    Fuyao Glass America Inc.
    • Consolidated production data from machines and inspection systems into a clean analytics layer using Python and SQL ETL, then delivered role-based dashboards that track OEE, FPY, scrap, and changeover in near real time.
    • Ran set-point experiments on magnetron sputtering and tempering using A/B tests and design of experiments (DOE), governed changes with Git-tracked schema migrations, SOPs, and parameter logs, and delivered reproducible Jupyter notebooks.
    • Hardened reliability through access control, code review, cron scheduling, and alerting, reaching a 99.3% job success rate and cutting downtime by 2.4 h/week.
  • 2023.10 - 2025.05
    Research Engineer
    UCSD Halıcıoğlu Data Science Institute
    • Built a modular 'Injector' pipeline (pattern generation → sampling → injection → evaluation) with beam-search pruning and Optuna/TPE-based Bayesian tuning to systematically surface structured corruptions that reduce downstream AUC by more than 0.25 relative to random-parameter attacks.
    • Co-authored SAVAGE, designing corruption dependency graphs and bi-level black-box search to model mechanism-aware missingness, selection-bias, and outlier patterns for pipeline-level stress-testing of ML systems.
    • Ran Inject → Clean → Retrain benchmarks across missing-value, selection-bias, and outlier scenarios to evaluate state-of-the-art cleaning, debiasing, and UQ pipelines and expose their data-centric robustness gaps.
  • 2024.10 - 2025.10
    Back-end Engineer
    4Pexonic Inc.
    • Developed an academic impact analysis platform tracking citation patterns across tens of thousands of records using Node.js, MongoDB, and Next.js with Docker, serving university researchers and policy institutions.
    • Built two-tier caching with in-process LRU and Redis for bibliometric queries, reducing median and p95 latency by 40% for complex research trend analysis and cross-disciplinary impact calculations.
    • Integrated gRPC microservices for data processing pipelines, implemented secure authentication with bcrypt and session management, and configured Nginx SSL to protect academic publication data and researcher information.

Publications

Skills

  • Programming Languages
    • Python, Java, Go, SQL, C/C++, JavaScript, R
  • Frameworks and Tools
    • Docker, Git, gRPC, Node.js, Next.js, Django, React, MongoDB, Redis, Jupyter, Nginx, AWS