IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis
arXiv Preprint, 2024
IDA-Bench is a benchmark for evaluating Large Language Models (LLMs) on interactive, guided data analysis tasks. It tests an LLM's ability to understand data, generate appropriate analyses, and adapt to user guidance as it arrives over the course of a session.
Authors: H. Li†, H. Liu†, T. Zhu†, T. Guo†, Z. Zheng, X. Deng, M. Jordan (†co-first author)
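To make the "interactive guided" setting concrete, here is a minimal sketch of a multi-turn session loop in which an agent receives user instructions one at a time and can condition each response on the accumulated history. All names (`run_session`, `toy_agent`, `Turn`) are illustrative assumptions, not part of the IDA-Bench implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Turn:
    """One round of interaction: a user instruction and the agent's reply."""
    instruction: str
    response: str

def run_session(
    instructions: List[str],
    agent: Callable[[str, List[Turn]], str],
    max_turns: int = 10,
) -> List[Turn]:
    """Feed instructions to the agent one at a time, passing the history
    so far, and collect the resulting turns."""
    history: List[Turn] = []
    for instr in instructions[:max_turns]:
        reply = agent(instr, history)
        history.append(Turn(instr, reply))
    return history

def toy_agent(instruction: str, history: List[Turn]) -> str:
    """Stand-in agent: acknowledges each instruction, tagged with the
    number of prior turns it has seen."""
    return f"ack[{len(history)}]: {instruction}"

session = run_session(["load the data", "plot the trend"], toy_agent)
```

In a real benchmark run, `toy_agent` would be replaced by an LLM call that also executes code against the dataset; the key structural point is that each response depends on the guidance received so far, not just the initial task description.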
