WANLI: Worker-and-AI NLI

2022
WANLI is a natural language inference (NLI) dataset of 108K examples, created through a novel approach based on worker and AI collaboration that brings together the generative strength of language models and the evaluative strength of humans. Models trained on WANLI alone achieve better out-of-domain performance across a wide variety of test sets than models trained on existing NLI resources (such as MNLI and Adversarial NLI), even when those resources are combined or supplemented with augmentation sets.
License: CC BY

Authors

Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi