ML Engineer, NLP Researcher & GPU Whisperer

Neat Freak Coder obsessed with Systems
Working on Rust and CUDA these days

Hi, I’m Herumb. Thanks for dropping by.
I'm currently a Research Assistant at the Center for Research on Foundation Models (CRFM) at Stanford University, where most of my work revolves around building infrastructure for evaluating and post-training LLMs. I also work with other labs, including Hazy Research and the Scaling Intelligence Lab with Jon Saad Falcon, on projects such as multi-model inference, LLM routing, and agent evaluations.
In my free time, I (try to) write posts on Twitter or LinkedIn that introduce lesser-known topics to beginners. Teaching people is something I love a lot, so if you have any questions, feel free to DM. I'll try my very best to help you out!
I hate sports but I love sports anime and I love beatboxing as well. Ping me if you are looking for research collaborators, let's brainstorm together!
Areas of Interest
Here are a few domains that I've explored and what I'm up to...
ML Systems
Systems became an area of major interest for me this year. I've worked on projects in Rust, CUDA and Triton, to name a few. I'm currently building DSRs, a Rust port of DSPy, and SparkPuppies, a collection of performance-optimized kernels for sparse matrix operations.
LLM Post-Training
The fun starts after pretraining ends! In my current and past work, I've focused on post-training techniques and on building infrastructure for downstream tasks such as tool use. I'm currently looking for techniques to improve output format adaptation in LLMs.
Information Retrieval
Information Retrieval is something I've been working on for a while now. I've seen its power and how it can be used to solve real-world problems. I've worked on both traditional and modern IR models, and I've done in-depth research on training retrieval models for LLM routing. I love it!
Performance Optimization
Performance is as important as the model itself. Research should be limited by ideas, not by infrastructure. Building fast and efficient research infrastructure is something I'm passionate about and have been doing in labs for the past few years.
Deep Learning Research
I've implemented papers for personal learning, for work, and as a freelancer for student researchers, working with them to improve their implementations. I'm up for hearing your ideas and helping you brainstorm how to go about the task!
Reinforcement Learning
Reinforcement Learning was my gateway to ML, so it has always been something I wanted to try. After reading the AlphaTensor paper, I became even more fascinated with it. Currently, I'm working with the RL for LLM post-training team at Stanford.
30+ Blogs
20+ Projects
50+ Talks & Sessions
1000+ Doubts Solved
2 Python Libraries
Research Experience

Research Assistant
Sep 2025 - Present · 3 mos
- Working on The Marin Project's RL Infrastructure and Output Format Adaptation at CRFM under David Hall and Percy Liang
Research Collaborator
May 2023 - Present · 2 yrs 7 mos
- Working on DSPy and ColBERT at Future Data Systems Lab under Omar Khattab and Chris Potts
- Working on Inference Systems at Hazy Research Lab under Jon Saad Falcon and Azalia Mirhoseini
Research Assistant
Sep 2024 - Jun 2025 · 10 mos
- Worked on The Marin Project for Open LLM Training and Research at CRFM under David Hall and Percy Liang
Research Assistant
Jun 2024 - Sep 2024 · 4 mos
- Worked on LLM Chain optimization at SNAP Labs under Shirley Wu and Jure Leskovec
Talks @ Cohere Labs
Started Beginner in Research Group
2022
Lead Paper Implementation Sprints
2022-2023
Lead CUDA Programming Cohort
2023
Co-Lead AI Alignment Cohort
2023
Co-Lead NLP Reading Group
2024
Work Experience

Researcher @Fundamental Research Lab
July, 2025 - September, 2025
- Worked on building training and evaluation pipelines for post-training LLMs for Tool Use.
- Built a general agentic evaluation framework over hosted environments. Added support for 15+ environments.
- Helped set up distributed infrastructure for cluster training.

NLP Engineer @SixDegrees AI
April, 2023 - September, 2024
- Worked on training LLMs, deploying them, and using them to power the SixAI pipeline.
- Kept up with the latest research in LLMs and used it to improve product performance.
- Came up with new ideas and product logic to improve product performance.

Machine Learning Engineer @Simplified
June, 2022 - April, 2023
- Simplified is an AI-powered content creation platform for creators backed by tier 1 investors.
- Research, implement and improve generative models to incorporate into the product.
- Train and deploy image editing models for the Design Platform.
- Setting up infrastructure and deployment strategies to deploy and scale models.
Data Science Intern @Simplified
January, 2022 - June, 2022
- Creating and optimizing GPT-3 prompts and finding new use cases for them.
- Training and deploying image editing models for the Design Platform.
- Working with the SEO team on trend analysis and data scraping for landing page creation.

Data Science Intern @CrowdANALYTIX
July, 2021 - January, 2022
- Researching and fine-tuning models for given tasks.
- Supporting model deployment team in model code analysis and optimizations for DeployX.
- Part of Platform Data Team.
- Experimentation with deep learning models & architecture for DeployX.
- Any other assignment communicated by team lead over email as needed.

NLP Research Intern @CAIR, DRDO
April, 2021 - August, 2021
- Building and Training Language Models for the provided task.
- Deploying the model as an API via Django, along with a GUI for interacting with it.
- Tasks belonged to Audio and NLP Domain.
- Task information confidential.

Jr. ML Engineer @Omdena
March, 2021 - May, 2021
- Building Sustainable Livestock Farming Computer Vision Models on Edge Device.
- Implemented and experimented with chicken detectors based on YOLO, Mask R-CNN, etc.
- Also implemented frame-by-frame object tracking of each chicken's movement.
- The model runs on a Raspberry Pi 4 with a Google Coral Edge TPU.

Technical Content Intern @GeeksforGeeks
Dec, 2020 - July, 2021
- Writing articles related to Machine Learning explaining the process.
- Writing code related to the topic on which the blog was written.
- Topics Included: PyTorch Lightning, Model Evaluation, Deep Learning, R Lang, etc.

Data Science and Machine Learning Teaching Assistant @Coding Ninjas
May, 2020 - September, 2020
- Mentored a group of students in their course Data Science and Machine Learning.
- Evaluated and improved the projects developed by students as a part of the course.
Data Science Intern @Coding Ninjas
December, 2019 - April, 2020
- Mentored a group of students in their course Data Structure and Algorithm using C++.
- Served as an influential contributor to projects developed by the students.
Publications
2025
2025
2025
2024
2023
Adapting to the low-resource double-bind: investigating low-compute methods on low-resource African languages
Masakhane NLP
2022
Perceiving the level of depression from web text

Herumb Shandilya | Made with Mantine — @krypticmouse