IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis
arXiv Preprint, 2024
IDA-Bench is a benchmark for evaluating Large Language Models (LLMs) on interactive, guided data analysis tasks. It tests an LLM's ability to understand data, generate appropriate analyses, and adapt to user guidance as it arrives over the course of a session.
Authors: H. Li†, H. Liu†, T. Zhu†, T. Guo†, Z. Zheng, X. Deng, M. Jordan (†co-first author)
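To make the "interactive guided" setting concrete, here is a minimal sketch of a multi-turn session loop in which an agent receives user instructions one at a time and can condition each response on the accumulated history. All names (`run_session`, `toy_agent`, `Turn`) are illustrative assumptions, not part of the IDA-Bench implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Turn:
    """One round of interaction: a user instruction and the agent's reply."""
    instruction: str
    response: str

def run_session(
    instructions: List[str],
    agent: Callable[[str, List[Turn]], str],
    max_turns: int = 10,
) -> List[Turn]:
    """Feed instructions to the agent one at a time, passing the history
    so far, and collect the resulting turns."""
    history: List[Turn] = []
    for instr in instructions[:max_turns]:
        reply = agent(instr, history)
        history.append(Turn(instr, reply))
    return history

def toy_agent(instruction: str, history: List[Turn]) -> str:
    """Stand-in agent: acknowledges each instruction, tagged with the
    number of prior turns it has seen."""
    return f"ack[{len(history)}]: {instruction}"

session = run_session(["load the data", "plot the trend"], toy_agent)
```

In a real benchmark run, `toy_agent` would be replaced by an LLM call that also executes code against the dataset; the key structural point is that each response depends on the guidance received so far, not just the initial task description.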
