Technical Talks

Fast ReActions: Planning and Reasoning Quickly with LLMs

Ashwin Ramesh | Machine Learning Engineer | Continual

For Copilot products built around a chat interface, response latency is one of the most important quality factors. Users expect chat to feel instant, which makes it difficult to deliver high-quality results for complex actions like API calls and text summarization.

This talk will quickly describe how Continual tackled some of these challenges when building an AI copilot platform.
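The talk's title nods to ReAct-style agents, where an LLM interleaves reasoning with tool calls. To make the latency problem concrete, here is a minimal sketch of such a loop; `stub_llm` and `get_weather` are hypothetical stand-ins (not Continual's implementation), hard-coded purely to illustrate why each reasoning step adds a sequential round trip.

```python
# Minimal ReAct-style loop (illustrative sketch, not Continual's system).
# The model alternates between emitting an "Action" and receiving an
# "Observation" until it produces a final answer.

def stub_llm(transcript: str) -> str:
    # Hypothetical stand-in for a real LLM call; replies are hard-coded.
    if "Observation: 72F and sunny" in transcript:
        return "Final Answer: It is 72F and sunny."
    return "Action: get_weather[San Francisco]"

def get_weather(city: str) -> str:
    # Toy tool; a real copilot would make an API call here.
    return "72F and sunny"

TOOLS = {"get_weather": get_weather}

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = stub_llm(transcript)  # each step is one sequential model call
        if reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer:").strip()
        # Parse "Action: tool[arg]", run the tool, append the observation.
        name, _, arg = reply.removeprefix("Action: ").partition("[")
        observation = TOOLS[name](arg.rstrip("]"))
        transcript += f"\n{reply}\nObservation: {observation}"
    return "gave up"

print(react_loop("What's the weather in San Francisco?"))
```

Each pass through the loop is a full model invocation that cannot start until the previous observation arrives, which is why multi-step agents strain the low-latency expectations of chat.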

Ashwin Ramesh
Machine Learning Engineer | Continual

Hey! My name is Ashwin Ramesh. I'm 28 years old. I grew up going back and forth between Bangalore and the Bay Area. I did my undergrad and master's degrees in Computer Science at the University of Illinois at Urbana-Champaign, which is where I got into the ML/AI space.
My first job was in 2020 as a software dev on the Triton Inference Server team at NVIDIA where I worked on tooling that enabled users to find optimal configurations for their server deployments.
In 2021 I moved to Continual, where we've been building AI/ML platforms for the last three years.
My current career interests are in AI Copilots as well as autonomous agent workflows. I've worked a lot with LLMs, trying to understand their ability to reason and plan and how that enables AI Copilot applications.
I also love geeking out about a wide range of topics: Econ, Bio, Math, Politics, and Music! Come chat with me about any of these.