GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
This study introduces GSM-Symbolic, a benchmark that reveals inconsistent mathematical reasoning in LLMs: performance drops when questions are altered or made more complex, which calls their genuine logical reasoning abilities into question.
https://arxiv.org/abs/2410.05229
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support