ML Engineer, NLP Researcher & GPU Whisperer

Neat Freak Coder obsessed with Systems

Working on Rust and CUDA these days

Hi, I’m Herumb. Thanks for dropping by.

I'm currently a Research Assistant at the Center for Research on Foundation Models (CRFM) at Stanford University, where most of my work revolves around building infrastructure for evaluating and post-training LLMs. I also work with other labs, including Hazy Research and the Scaling Intelligence Lab, with Jon Saad Falcon on projects ranging from multi-model inference to LLM routing and agent evaluations.

In my free time, I (try to) write posts on Twitter or LinkedIn that introduce lesser-known topics to beginners. I love teaching, so if you have any questions, feel free to DM me. I'll try my very best to help you out!

I hate sports but I love sports anime and I love beatboxing as well. Ping me if you are looking for research collaborators, let's brainstorm together!

Areas of Interest

Here are a few domains that I've explored and what I'm up to...

ML Systems

Systems became an area of major interest for me this year. I've worked on projects in Rust, CUDA and Triton, to name a few. I'm currently building DSRs, a Rust port of DSPy, and SparkPuppies, a collection of performance-optimized kernels for sparse matrix operations.

LLM Post-Training

The fun begins after pretraining ends! In my current and past work, I've worked on post-training techniques and built infrastructure for downstream tasks such as tool use. I'm currently looking for techniques to improve output format adaptation in LLMs.

Information Retrieval

Information Retrieval is something I've been working on for a while now. I've seen its power and how it can be used to solve real-world problems. I've worked on both traditional and modern IR models, and I've done in-depth research on training retrieval models for LLM routing. I love it!

Performance Optimization

Performance is as important as the model itself. Research should be limited by ideas, not by infrastructure. Building fast and efficient research infrastructure is something I'm passionate about and have been doing in labs for the past few years.

Deep Learning Research

I've implemented papers for personal learning, for work, and as a freelancer for student researchers, working with them to improve their implementations. I'm up for hearing your ideas and helping you brainstorm how to go about the task!

Reinforcement Learning

Reinforcement Learning was my gateway to ML, so it has always been something I wanted to try. Reading AlphaTensor made me even more fascinated with it. Currently, I'm working with a team at Stanford on RL for LLM post-training.

30+ Blogs

20+ Projects

50+ Talks & Sessions

1000+ Doubts Solved

Python Libraries

Research Experience

Research Assistant

Sep 2025 - Present · 3 mos

Working on The Marin Project's RL Infrastructure and Output Format Adaptation at CRFM under David Hall and Percy Liang

Research Collaborator

May 2023 - Present · 2 yrs 7 mos

Working on DSPy and ColBERT at Future Data Systems Lab under Omar Khattab and Chris Potts
Working on Inference Systems at Hazy Research Lab under Jon Saad Falcon and Azalia Mirhoseini

Research Assistant

Sep 2024 - Jun 2025 · 10 mos

Worked on The Marin Project for Open LLM Training and Research at CRFM under David Hall and Percy Liang

Research Assistant

Jun 2024 - Sep 2024 · 4 mos

Worked on LLM Chain optimization at SNAP Labs under Shirley Wu and Jure Leskovec

Talks @ Cohere Labs

Started Beginner in Research Group

2022

Lead Paper Implementation Sprints

2022-2023

Lead CUDA Programming Cohort

2023

Co-Lead AI Alignment Cohort

2023

Co-Lead NLP Reading Group

2024

Work Experience

Researcher @Fundamental Research Lab

July, 2025 - September, 2025

Worked on building training and evaluation pipelines for post-training LLMs for Tool Use.
Built a general agentic evaluation framework over hosted environments. Added support for 15+ environments.
Helped set up distributed infrastructure for cluster training.

NLP Engineer @SixDegrees AI

April, 2023 - September, 2024

Worked on training and deploying LLMs and using them to power the SixAI pipeline.
Kept up with the latest LLM research and applied it to improve product performance.
Came up with new ideas and product logic to improve product performance.

Machine Learning Engineer @Simplified

June, 2022 - April, 2023

Simplified is an AI-powered content creation platform for creators backed by tier 1 investors.
Researched, implemented and improved generative models to incorporate into the product.
Trained and deployed image editing models for the Design Platform.
Set up infrastructure and deployment strategies to deploy and scale models.

Data Science Intern @Simplified

January, 2022 - June, 2022

Created and optimized GPT-3 prompts and found new use cases for them.
Trained and deployed image editing models for the Design Platform.
Worked with the SEO team on trend analysis and data scraping for landing page creation.

Data Science Intern @CrowdANALYTIX

July, 2021 - January, 2022

Researched and fine-tuned models for the given task.
Supported the model deployment team with model code analysis and optimizations for DeployX.
Part of the Platform Data Team.
Experimented with deep learning models and architectures for DeployX.
Any other assignment communicated by team lead over email as needed.

NLP Research Intern @CAIR, DRDO

April, 2021 - August, 2021

Built and trained language models for the provided task.
Deployed the model as an API via Django, with a GUI for interacting with it.
Tasks were in the audio and NLP domains.
Task information confidential.

Jr. ML Engineer @Omdena

March, 2021 - May, 2021

Built computer vision models for sustainable livestock farming on an edge device.
Implemented and experimented with chicken detectors based on YOLO, Mask R-CNN, etc.
Also implemented frame-by-frame tracking of each chicken's movement.
The model runs on a Raspberry Pi 4 with a Google Coral Edge TPU.

Technical Content Intern @GeeksforGeeks

Dec, 2020 - July, 2021

Wrote articles explaining Machine Learning concepts and processes.
Wrote the accompanying code for each article's topic.
Topics Included: PyTorch Lightning, Model Evaluation, Deep Learning, R Lang, etc.

Data Science and Machine Learning Teaching Assistant @Coding Ninjas

May, 2020 - September, 2020

Mentored a group of students in their course Data Science and Machine Learning.
Evaluated and improved the projects developed by students as a part of the course.

Data Science Intern @Coding Ninjas

December, 2019 - April, 2020

Mentored a group of students in their course Data Structure and Algorithm using C++.
Served as an influential contributor to projects developed by the students.