DeepCT-enhanced Lexical Argument Retrieval


The recent Touché lab’s argument retrieval task focuses on controversial topics like ‘Should bottled water be banned?’ and asks participants to retrieve relevant pro/con arguments. Interestingly, the most effective systems submitted to that task are still based on lexical retrieval models like BM25. In other domains, neural retrievers that capture semantics are more effective than lexical baselines. To add more “semantics” to argument retrieval, we propose to combine lexical models with DeepCT-based document term weights. Our evaluation shows that our approach is more effective than all the systems submitted to the Touché lab while being on par with modern neural re-rankers, which are computationally more expensive.

Retrieving relevant arguments from the Web is essential to support discussions on controversial topics like ‘Should bottled water be banned?’.

For this task, lexical retrievers still outperform neural models.

Lexical retrieval models, which rely on an exact match between query and document terms, may conversely suffer from “ignoring” the semantic similarity between the query and document terms. Hence, we propose to combine lexical retrievers (that are effective for argument retrieval) with document expansion based on estimated semantic term importance (term weights) predicted by DeepCT.
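
The idea of feeding term weights into a lexical retriever can be illustrated with a minimal sketch. It assumes one common way of using such weights, which the text above does not spell out: each predicted weight is scaled to an integer pseudo term frequency, the term is repeated that many times, and the expanded document is scored with plain BM25. The function names, the scaling factor, and the example weights are all hypothetical; no real DeepCT model is called here.

```python
import math
from collections import Counter


def expand_document(term_weights, scale=100):
    """Repeat each term round(weight * scale) times (assumed expansion scheme)."""
    tokens = []
    for term, weight in term_weights.items():
        tokens.extend([term] * int(round(weight * scale)))
    return tokens


def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query with a basic BM25 formula."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_tokens)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        if df == 0 or tf[term] == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(doc_tokens) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


# Hypothetical term weights, standing in for the output of a term-weighting model.
weights_a = {"bottled": 0.6, "water": 0.5, "ban": 0.4, "the": 0.0}
weights_b = {"water": 0.1, "tap": 0.7, "quality": 0.5}

corpus = [expand_document(weights_a), expand_document(weights_b)]
query = ["bottled", "water", "ban"]
scores = [bm25_score(query, doc, corpus) for doc in corpus]
```

In this sketch, a term with weight 0 disappears from the expanded document, while highly weighted terms dominate the term-frequency component of BM25. The lexical retriever itself stays unchanged; only the indexed text differs.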