One major issue with research on question answering (Q&A) is the lack of a controlled yet large-scale standard benchmark dataset. A number of open-ended Q&A datasets exist, but they often require a system to have access to external resources, which makes it difficult for researchers to compare different models in a unified way.
Recently, one such large-scale standard Q&A dataset was proposed by Hermann et al. (2015). In this dataset, each question comes with a context, and one of the words in that context is the answer to the question. And... wait... I just realized that I don't have to explain the dataset or the task here at all. After all, it's not my data, my task, or my paper. I will just leave a link to the original paper:
So, what is the issue with this dataset? The dataset itself was never published online. I can understand why, even without asking them (though I neither confirm nor deny any interaction between me and DeepMind or anyone there), and you can probably guess why as well (though, of the two reasons you guessed, it is probably the less evil one). Instead, they released a script to generate the dataset, and I am grateful for their effort.
Unfortunately, it is never fun to spend a few hours generating a dataset, is it? Well, no need to worry about your laziness anymore! I have generated the dataset and am making it available for you to download at