Yilin (Larry) Li (李亦林)

Data Scientist @ Arteria AI • B.A.Sc., University of Waterloo

Hi! I'm Larry, a Data Scientist at Arteria AI and a graduate of the University of Waterloo's Mechatronics Engineering program (AI Option). Previously, I worked as a Data Scientist at Telus and as a Machine Learning Engineer Intern at Huawei Canada.

Technical Focus

My work focuses on document extraction systems, multi-agent workflows, and production ML pipelines. I’m particularly interested in:

  • Vision Language Models (VLMs) for document understanding
  • Information retrieval and semantic search
  • Applied reinforcement learning for decision-making systems

Fun fact: I earned the CCOM performance certificate in Erhu (a traditional Chinese instrument) in grade 10!

Skills

Languages

Python, C++, C#, SQL, Java, MATLAB, R

Frameworks

PyTorch, TensorFlow, Hugging Face (Transformers), CrewAI, Scikit-learn

Cloud & MLOps

GCP (Vertex AI, BigQuery), AWS (Textract), Docker, Databricks, ONNX Runtime

Work Experience

May 2024 - Present

Data Scientist

Arteria AI • Toronto, ON

Building production document extraction systems for financial institutions using AWS Textract, Vision Language Models (Qwen, Idefics), and multi-agent workflows with CrewAI.

Oct 2022 - May 2024

Data Scientist

Telus • Toronto, ON

Designed offline RL agents for HVAC optimization (20% energy cost reduction) and built production ML pipelines on GCP Vertex AI. Led a fiber network fault detection project that reduced MTTR by 40%.

Sep - Dec 2020

Machine Learning Engineer Intern

Huawei Canada • Montreal, QC

Implemented 8-bit quantization-aware training on BERT encoders, compressing model size by 75% while retaining 98% accuracy. Applied knowledge distillation and structured pruning.

Jan - Apr 2020

Deep Learning Engineering Intern

Synapse Technology • Palo Alto, CA

Developed 2D and 3D object detection algorithms for weapon detection in security applications.

May - Aug 2019

Machine Learning Developer Intern

Primate Labs • Toronto, ON

Developed deep learning applications for Android device benchmarking and performance testing.

Jan - Apr 2018

Robotics Software Developer Intern

Engineering Services Inc. • Toronto, ON

Developed navigation algorithms for autonomous indoor robots.

Research Experience

May - Dec 2021

Undergraduate Research Assistant

University of Waterloo (Data Systems Group) • Advisor: Prof. Jimmy Lin

Co-developed a multi-stage retrieval system using Neural Query Synthesis with T5-3B. Achieved 1st place at TREC 2021 (0.71 nDCG) by solving the cross-encoding bottleneck for long document pairs.

Sep - Dec 2018

Undergraduate Research Assistant

University of Waterloo (Kimia Lab) • Advisor: Prof. Hamid Tizhoosh

Researched medical image search and keyword extraction for pathology report analysis.

Projects

2025

Smart Kitchen Multi-Agent System (SKMS)

Python, Google ADK, Gemini, Arize Phoenix

Built a hierarchical agent system with a stateful "Soft-Lock" protocol for concurrency control. Features 98% unit conversion accuracy and LLM trace observability via Arize Phoenix.

Education

2017 - 2022

University of Waterloo

B.A.Sc. in Mechatronics Engineering with AI Option

GPA: 90.6/100 • Dean's List

Publications

ECIR 2022

Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-encoder Effectiveness for Reranking

Ronak Pradeep, Yuqi Liu, Xinyu Zhang, Yilin (Larry) Li, Andrew Yates, Jimmy Lin

TREC 2021

New Nails for Old Hammers: Anserini and Pyserini at TREC 2021

Jimmy Lin, Haonen Chen, Chengcheng Hu, Sheng-Chieh Lin, Yilin (Larry) Li, Xueguang Ma, Ronak Pradeep, Jheng-Hong Yang, Xinyu Zhang

arXiv 2019

Automatic Classification of Pathology Reports using TF-IDF Features

Shivam Kalra, Larry Li, Hamid R. Tizhoosh
