A selection of things I've built.
ML Data Platform at Apple
2022
Designed and built a data management platform for creating, versioning, and governing ML datasets at scale. Includes a large-scale ingestion service handling 100+ TB and 100M assets per job, with optimizations that improved data load performance by 10x.
ML Infrastructure, SDK & APIs, PySpark, Apache Iceberg, Golang, AWS S3, Kubernetes
Pre-Training Data Infrastructure at Luma AI
2025
Co-built the data processing infrastructure for foundation model pre-training across 3,000 GPUs. Developed an internal library for custom data processors and a multithreaded data loader achieving sub-20ms per-batch loading for multimodal datasets.
ML Infrastructure, Systems, Python, Ray, Lance, PyTorch
Windows Kernel Driver Subsystem at Microsoft
2019
Contributed to the Driver Plug and Play (PnP) Subsystem in the Windows Kernel, which handles driver installation, device-to-driver matching, driver upgrades, and device migration across OS upgrades. Built a new PnP diagnostics module to improve debugging and observability of driver installation.
Systems, C/C++, Windows Kernel
Mastering OpenCV Android Application Programming
2015
Co-authored a book on building computer vision applications for Android using OpenCV, published by Packt Publishing.
Book, OpenCV, Android, Java