Evaluating Open-QA Evaluation

Published in Datasets and Benchmarks Track, Neural Information Processing Systems (NeurIPS), 2023