The Minecraft Dialogue Corpus

see here

Multi-Premise Entailment

see here

Flickr30K Entities

see here

Flickr30K and Denotation Graph

see here

Image Descriptions

This section provides the data from the following papers:
  • Cyrus Rashtchian, Peter Young, Micah Hodosh and Julia Hockenmaier. Collecting Image Annotations Using Amazon's Mechanical Turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. [.pdf]
  • Ali Farhadi, Mohsen Hejrati, Mohammad Amin Sadeghi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier and David Forsyth. Every picture tells a story: generating sentences from images. In Proceedings of ECCV 2010, Greece. [.pdf]
  • Micah Hodosh, Peter Young, Cyrus Rashtchian and Julia Hockenmaier. Cross-Caption Coreference Resolution for Automatic Image Understanding. In Proceedings of the 2010 Conference on Natural Language Learning (CoNLL). [.pdf]
The datasets use images from Pascal and Flickr. For copyright reasons we cannot republish the Flickr images, so links to them are provided instead. If you discover any problems with links or annotations, please contact