Neural Search Talks — Zeta Alpha
Zeta Alpha
21 episodes
6 days ago
A monthly podcast where we discuss recent research and developments in the world of Neural Search, LLMs, RAG and Natural Language Processing with our co-hosts Jakub Zavrel (AI veteran and founder at Zeta Alpha) and Dinos Papakostas (AI Researcher at Zeta Alpha).
Technology
All content for Neural Search Talks — Zeta Alpha is the property of Zeta Alpha and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Evaluating Extrapolation Performance of Dense Retrieval: How does DR compare to cross encoders when it comes to generalization?
Neural Search Talks — Zeta Alpha
58 minutes 30 seconds
3 years ago

How much of the training and test sets in TREC or MS Marco overlap? Can we evaluate on different splits of the data to isolate the extrapolation performance?
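The overlap question above can be made concrete with a small sketch. The paper defines train-test overlap via entity and query-intent labels; the toy label lists, function names, and the simple filtering strategy below are illustrative assumptions, not the authors' actual code or datasets.

```python
# Hypothetical sketch of the two ideas discussed in the episode:
# (1) measuring how much of the test set's labels (e.g., entities or
#     query intents) already appear in training, and
# (2) resampling a training split that is disjoint from the test set,
#     so evaluation isolates extrapolation rather than interpolation.

def label_overlap(train_labels, test_labels):
    """Fraction of test labels that also occur in the training labels."""
    train_set = set(train_labels)
    if not test_labels:
        return 0.0
    shared = sum(1 for lbl in test_labels if lbl in train_set)
    return shared / len(test_labels)

def resample_disjoint(train_pairs, test_labels):
    """Drop training examples whose label leaks into the test set,
    yielding a train split disjoint from test (extrapolation setup)."""
    test_set = set(test_labels)
    return [(query, lbl) for query, lbl in train_pairs if lbl not in test_set]

# Toy example (labels here stand in for entities/query intents):
train = [("who founded apple", "Apple"),
         ("apple hq address", "Apple"),
         ("tesla ceo", "Tesla")]
test_labels = ["Apple", "Microsoft"]

print(label_overlap([lbl for _, lbl in train], test_labels))  # 0.5
print(resample_disjoint(train, test_labels))  # only the "Tesla" pair survives
```

With overlapping splits, a model can score well by memorizing training entities; the disjoint resampling forces it to generalize to unseen labels, which is the extrapolation regime the paper evaluates.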

In this episode of Neural Information Retrieval Talks, Andrew Yates and Sergi Castella i Sapé discuss the paper "Evaluating Extrapolation Performance of Dense Retrieval" by Jingtao Zhan, Xiaohui Xie, Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma.


📄 Paper: https://arxiv.org/abs/2204.11447

❓ About MS Marco: https://microsoft.github.io/msmarco/

❓ About TREC: https://trec.nist.gov/

🪃 Feedback form: https://scastella.typeform.com/to/rg7a5GfJ  


Timestamps:

00:00 Introduction
01:08 Evaluation in Information Retrieval: why it is exciting
07:40 Extrapolation performance in Dense Retrieval
10:30 Learning in High Dimension Always Amounts to Extrapolation
11:40 Three research questions
16:18 Defining train-test label overlap: entity and query intent overlap
21:00 Train-test overlap in existing benchmarks (TREC)
23:29 Resampling evaluation methods: constructing distinct train-test sets
25:37 Baselines and results: ColBERT, SPLADE
29:36 Table 6: interpolation vs. extrapolation performance in TREC
33:06 Table 7: interpolation vs. extrapolation in MS Marco
35:55 Table 8: comparing different DR training approaches
40:00 Research question 1 resolved: cross-encoders are more robust than dense retrieval in extrapolation
42:00 Extrapolation and domain transfer: the BEIR benchmark
44:46 Figure 2: correlation between extrapolation performance and domain transfer performance
48:35 Broad-strokes takeaways from this work
52:30 Is there any intuition behind the results where dense retrieval generalizes worse than cross-encoders?
56:14 Will this have an impact on the IR benchmarking culture?
57:40 Outro


Contact: castella@zeta-alpha.com
