Sabari H

Software Engineer
Chennai, India · sabari.h.dev@gmail.com · GitHub · LinkedIn
I'm Sabari. I work mostly on making AI models run faster on hardware: LLM inference, profiling, quantization, and related performance work. I like Python and Go, and I usually have a side project going. This page is where I keep track of what I've been doing.
Experience
Multicoreware Inc Jun 2025 - Present
Software Engineer
Got a Vision-Language-Action model running 5.7× faster on AI accelerators (4000 ms → 700 ms) through profiling and quantization. Wrote a Go installer, with rollback handling, that manages Linux kernel and AMD driver setup for NPU/iGPU. Currently building an agentic AI system that reasons over simulation environments and calls tools.
Multicoreware Inc Dec 2024 - Jun 2025
Junior Software Engineer (Intern)
Implemented Heavy Hitter Oracle, a sparse KV-cache pruning method, in vLLM. It gave 20-30% more throughput at the same sparsity levels.
Finequs Jan 2024 - Mar 2024
Software Developer Intern
Built a Selenium automation that cut data entry from hours to minutes, and integrated the Tata Telecom API into the product dashboard.
Projects
Heavy Hitter Oracle in vLLM
A sparse KV-cache mechanism for vLLM. It keeps the high-attention keys during decoding and drops the rest, which works out to 20-30% more throughput at the same sparsity.
vLLM · LLM Inference · Sparse Attention
Decentralized EHR System
A blockchain-based health records system built at the PLI Hackathon (Sathyabama Institute). Won 50k XDC tokens.
Blockchain · XDC · Hackathon
Decentralized VPN - Hackverse Finalist
A VPN protocol where regular user devices act as relays and earn incentives for doing so. Reached the finals at Hackverse 2024.
Networking · P2P · Privacy
Skills
Languages · Python, Go, C++, JavaScript
ML & Inference · ONNX, PyTorch, JAX, vLLM, Ollama
Infra & DevOps · Docker, CI/CD, Git, Linux, Bash
Tools & APIs · SQLite, REST APIs, WebSockets, Makefile