Esha Choukse

Principal Researcher

About

I am a Principal Researcher in the Azure Research - Systems team. I currently lead the efficient AI research project, which focuses on the power, energy, and thermal bottlenecks of GenAI deployment in the cloud, and on datacenter sustainability.

For publications, visit this tab; my resume is here.

Most recent news:

April 2025: Invited talk at the UPenn NSF Expedition workshop on "AI Infrastructure: Foundations for Energy Efficiency and Scalability"
April 2025: Our paper "Towards Resource-Efficient Compound AI Systems" will be presented at HotOS 2025
April 2025: Our papers were presented at EuroMLSys and ASPLOS 2025
April 2025: Panel member at the YArch and SESAME workshops at ASPLOS 2025, Rotterdam
March 2025: Gave invited talks at Cornell Tech (guest lecture) and the AMD Power Summit
March 2025: DynamoLLM received the best paper award at HPCA 2025!
February 2025: Two of our papers, Splitwise and GreenSKUs, were selected as IEEE MICRO Top Picks 2024!
February 2025: Our EcoServe paper is now on arXiv
January 2025: Serving as Associate Editor for IEEE Computer Architecture Letters (CAL)
December 2024: Gave an invited talk at University of Minnesota Twin Cities
November 2024: Gave an invited talk at IIT Bombay
November 2024: Our droidspeak paper is out on arXiv
November 2024: Presented our collaborative paper with AMD on optimizing GPU datacenter power at APCCAS 2024 in Taipei
November 2024: Attended MICRO 2024, where two of our papers were presented: Mosaic and "Memory Allocation under Hardware Compression"
October 2024: Gave an invited talk at Georgia Tech.
September 2024: We published a preprint of Mnemosyne, our work on 10-million-token long-context LLM inference, at https://arxiv.org/abs/2409.17264

Some of the amazing students I am working with or have worked with:

I received my PhD in 2019 from the University of Texas at Austin, with a thesis on main memory compression for higher effective capacity and bandwidth. I am generally interested in hardware-software co-design for systems challenges.