Podcasters Assemble (Probably) is a crowdsourced hype/re-watch podcast where each season we revisit one movie series at a time, plus our monthly "Disassembled" episodes with Zack Derby! Check out https://probablywork.com/podcasters-assemble/ for more! Follow the podcast on Twitter and Instagram @CastersAssemble
"Epik Fails of History" is a podcast about the most epic fails of... history! Created, hosted, and edited by Erik Slader (author of the "Epic Fails" history book series), co-hosted with fellow podcasters, Chris Carroll, and Justin Ache. (Recommended for ages 13 and up due to some mild language, occasional crude humor, as well as discussions of war, disasters, and historical facts.)
…
continue reading
TroytlePower Presents: The Power Play-Throughs Podcast, with TroytlePower - Let’s Play Video Games!?
We Can Make This Work Probably
TPPTPPTPWTP is the podcast where I, your host, TroytlePower, play through games in a powerful way! See the video version of some episodes at tiny.cc/tpptpwv
Henry must escape his hunger and Brexit
A show where podcasters rate and review dead podcasts
Welcome to the show formerly known as "Your Day, Week, Month, and Year Reviews"! A podcast where we review games after playing them for 1 day, 1 week, 1 month, and 1 year! Part of the 'We Can Make This Work (Probably)' Network
Welcome to Thoughts Cast, Australia's #1 Car-Based (usually) Podcast, from the great state of West "By-God" Virginia. (The "By-God" is very important, and a lot of people forget about it.) Join Evan, along with occasional guest-hosts Arjuna, Troy, Bill, and Tyler, as they all navigate the road of life with deep discussions such as the definition of art, road-rage-inducing driving, toilets, life in general, and much more. New episodes every (ERROR). Support this podcast: https://podcasters.spoti ...
Sometimes a group of people who are interested in playing video games, but not interested in showing you what the game actually looks like, decide to build an entire podcast format around Let's Plays. Bill is one of those people; each week (probably) on Audio Only Experience (AOXP), Bill plays a game and lets you enjoy the pure audio of video game noise and his soothing British voice. If you are tired of all those other audio-only Let's Plays, why not try this one?
The Coordinate: Berserk Mode | A Berserk Podcast (formerly an Attack on Titan Podcast)
We Can Make This Work Probably
What started as a love affair with Attack on Titan (Shingeki no Kyojin) has evolved into a more general manga/anime podcast where Tyler and Bill attempt to coordinate with each other to record despite being on different continents. In Season 1 we covered Attack on Titan, with a few other anime/manga series thrown in. Currently we are working our way through Berserk via the 1997 anime and the original manga.
Three guys spun out by the wheel to correct the pattern… or destroy it? Who knows! Follow along as our hosts go through the Wheel of Time, chapter by chapter. Listen in as Bill, Rob, and Rich attempt to bend the Pattern... in their own special way. If you're curious about this incredible book series, this podcast is the perfect complement to listen to while reading the books. Come experience Robert Jordan's fantasy world that is so deep and fleshed out with great locations and greater character ...
Did you know that 80% of high school students nationally choose their college based on where their friends are going? Don’t let that be your student. The Cut Throat College Planning podcast is your all-in-one, start-to-finish guide to get your student on the right path for life after high school. We want to welcome parents, students, teachers, and any high school staff interested in helping students plan for their next step into adulthood. Choosing a college, or whether to go to college at all, can be on ...
Smart machines based upon the principles of artificial intelligence and machine learning are now prevalent in our everyday life. For example, artificially intelligent systems recognize our voices, sort our pictures, make purchasing suggestions, and can automatically fly planes and drive cars. In this podcast series, we examine questions such as: How do these devices work? Where do they come from? And how can we make them even smarter and more human-like? These are the questions that wil ...
Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format. Code behind this work: https://github.com/imelnyk/ArxivPapers Support this podcast: https://podcasters.spotify.com/pod/s ...
In The Case Interview Podcast, you will learn what it takes to get offers at top consulting firms (such as McKinsey, BCG and Bain). Every Monday morning, we share the hard-earned wisdom and stories from our experience helping thousands of candidates prepare for their management consulting interviews, so you can avoid the mistakes that 99% of candidates make and be among the top 1% who get multiple offers. We are Bruno and Julio, former McKinsey and Bain consultants, former consulting intervi ...
Here in the Popular Science office, we could talk about tech all day. Every week has new products, events, and controversies to cover. However, our boss said that if we’re going to sit around talking technology, we had to “make some content,” and thus this tech podcast was born. Every week, we’ll take a look at the biggest stories, from new gadget releases to overarching topics like artificial intelligence, virtual reality, and beyond. Check out a new episode every Monday and email us ques ...
Their names are part of football folklore. They are often turned to in times of need. They can be a fan’s last resort... but who are “The Football Gods”? This podcast gives famous faces the ultimate footballing role: total power over the beautiful game. Listen as they shape football to their whim and ultimately become The Football Gods. Join our hosts, broadcaster Kate Mason and journalist Tim Spiers, as we ponder all-important questions such as: “Which player would you send to hell?”, “ ...
[QA] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
7:59
This study analyzes layer-wise gradients in LLMs, revealing that slow thinking enhances learning stability and response correctness, while fast thinking shows larger gradient variations. https://arxiv.org/abs//2410.23743 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple…
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
15:27
This study analyzes layer-wise gradients in LLMs, revealing that slow thinking enhances learning stability and response correctness, while fast thinking shows larger gradient variations. https://arxiv.org/abs//2410.23743 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple…
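The measurement behind both versions of this episode is easy to reproduce: after a backward pass, group parameter gradients by transformer layer and compare their norms. A minimal PyTorch sketch (the 'layers.<i>.' naming is an assumption borrowed from common decoder implementations, not something the paper prescribes):

    import re
    from collections import defaultdict

    import torch

    def layerwise_grad_norms(model: torch.nn.Module) -> dict:
        """Collect the L2 norm of gradients, grouped by transformer layer index.

        Assumes parameter names contain 'layers.<i>.' as in most HF-style
        decoders; adjust the regex for other naming schemes.
        """
        sq_norms = defaultdict(float)
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            m = re.search(r"layers\.(\d+)\.", name)
            if m is None:
                continue
            sq_norms[int(m.group(1))] += p.grad.float().pow(2).sum().item()
        return {i: v ** 0.5 for i, v in sorted(sq_norms.items())}

    # Usage sketch: run one backward pass on a "slow thinking" (CoT) batch and
    # one on a "fast thinking" (answer-only) batch, then compare the profiles:
    # model.zero_grad(); loss_cot.backward(); cot_profile = layerwise_grad_norms(model)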
[QA] Tokenformer: Rethinking Transformer Scaling with Tokenized Model Parameters
7:28
Tokenformer introduces a scalable architecture that enhances Transformers' efficiency by treating model parameters as tokens, allowing for flexible scaling without retraining, significantly reducing computational costs. https://arxiv.org/abs//2410.23168 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple…
Tokenformer: Rethinking Transformer Scaling with Tokenized Model Parameters
19:38
Tokenformer introduces a scalable architecture that enhances Transformers' efficiency by treating model parameters as tokens, allowing for flexible scaling without retraining, significantly reducing computational costs. https://arxiv.org/abs//2410.23168 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple…
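The core trick is to replace linear projections with attention over learnable "parameter tokens", so capacity can grow by appending tokens instead of retraining from scratch. A rough sketch of that idea (simplified: the paper's Pattention uses a modified normalization rather than the plain softmax used here):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TokenParamLayer(nn.Module):
        """Parameters-as-tokens: a linear-like layer computed as attention of
        the input over a set of learnable key/value 'parameter tokens'."""
        def __init__(self, d_model: int, n_param_tokens: int):
            super().__init__()
            self.keys = nn.Parameter(torch.randn(n_param_tokens, d_model) * d_model ** -0.5)
            self.values = nn.Parameter(torch.zeros(n_param_tokens, d_model))

        def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d)
            scores = x @ self.keys.T / self.keys.shape[-1] ** 0.5
            return F.softmax(scores, dim=-1) @ self.values

        @torch.no_grad()
        def grow(self, extra_tokens: int):
            """Scale up by appending new parameter tokens; existing ones are
            kept, so training can resume instead of restarting."""
            d = self.keys.shape[1]
            self.keys = nn.Parameter(torch.cat([self.keys, torch.randn(extra_tokens, d) * d ** -0.5]))
            self.values = nn.Parameter(torch.cat([self.values, torch.zeros(extra_tokens, d)]))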
E37 - Ranking the Presidents! (Part One: 1789-1901)
1:32:00
“We should not look back unless it is to derive useful lessons from past errors, and for the purpose of profiting by dearly bought experience.” - George Washington With the upcoming 2024 Election, Erik and Justin attempt to rank the US Presidents! (Part 1: Washington to McKinley / 1789-1901) You can help support Erik by buying a copy of his book, "…
[QA] Where Do Large Learning Rates Lead Us?
8:30
This study investigates optimal initial learning rates for neural networks, finding a narrow range enhances generalization by locating high-quality minima and focusing on relevant features, unlike extreme rates. https://arxiv.org/abs//2410.22113 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcast…
Where Do Large Learning Rates Lead Us?
28:43
This study investigates optimal initial learning rates for neural networks, finding a narrow range enhances generalization by locating high-quality minima and focusing on relevant features, unlike extreme rates. https://arxiv.org/abs//2410.22113 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcast…
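The experiment shape is simple: train copies of the same initialization under different initial learning rates and compare where they end up. A toy stand-in (synthetic data and no annealing phase, so purely illustrative of the setup, not the paper's protocol):

    import copy

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    X, y = torch.randn(512, 20), torch.randint(0, 2, (512,))
    base = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

    for lr in (1e-3, 1e-1, 10.0):  # small / moderate / extreme initial LR
        model = copy.deepcopy(base)                 # identical initialization
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(200):                        # initial training phase
            opt.zero_grad()
            nn.functional.cross_entropy(model(X), y).backward()
            opt.step()
        acc = (model(X).argmax(1) == y).float().mean().item()
        print(f"initial lr={lr:g}: train acc={acc:.2f}")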
[QA] Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
7:10
The paper introduces a Fourier series-based neural network layer to improve continuous token modeling in decision-making and time series tasks, enhancing performance in various benchmarks. https://arxiv.org/abs//2410.22269 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.app…
Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
13:56
The paper introduces a Fourier series-based neural network layer to improve continuous token modeling in decision-making and time series tasks, enhancing performance in various benchmarks. https://arxiv.org/abs//2410.22269 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.app…
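The proposed layer swaps an unconstrained linear-to-bins map for one that reads a truncated Fourier series out at the bin centers, biasing the head toward smooth distributions. A simplified sketch (the paper additionally enforces a valid non-negative density; here the series values are just softmaxed):

    import torch
    import torch.nn as nn

    class FourierHead(nn.Module):
        """Predict cos/sin coefficients of a truncated Fourier series, then
        evaluate the (smooth) series at bin centers to get bin log-probs."""
        def __init__(self, d_model: int, n_freqs: int, n_bins: int):
            super().__init__()
            self.coef = nn.Linear(d_model, 2 * n_freqs)   # cos/sin coefficients
            centers = torch.linspace(-1, 1, n_bins)       # bins over [-1, 1]
            k = torch.arange(1, n_freqs + 1)
            self.register_buffer("cos_basis", torch.cos(torch.pi * k[None] * centers[:, None]))
            self.register_buffer("sin_basis", torch.sin(torch.pi * k[None] * centers[:, None]))

        def forward(self, h: torch.Tensor) -> torch.Tensor:   # h: (batch, d_model)
            a, b = self.coef(h).chunk(2, dim=-1)               # (batch, n_freqs) each
            series = a @ self.cos_basis.T + b @ self.sin_basis.T  # (batch, n_bins)
            return torch.log_softmax(series, dim=-1)           # smooth bin log-probs

Because the series is bandlimited by n_freqs, neighboring bins cannot receive wildly different mass, which is the inductive bias the paper argues helps on continuous targets.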
[QA] LoRA vs Full Fine-tuning: An Illusion of Equivalence
7:47
This study analyzes the differences between full fine-tuning and LoRA in large language models, revealing distinct weight matrix structures and generalization behaviors despite similar performance on tasks. https://arxiv.org/abs//2410.21228 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: ht…
LoRA vs Full Fine-tuning: An Illusion of Equivalence
13:44
This study analyzes the differences between full fine-tuning and LoRA in large language models, revealing distinct weight matrix structures and generalization behaviors despite similar performance on tasks. https://arxiv.org/abs//2410.21228 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: ht…
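For context: LoRA trains a low-rank additive update on top of frozen weights, so the learned delta can only occupy a handful of directions, and the paper compares the structure of those deltas against full fine-tuning. A minimal sketch of both pieces (the SVD probe is a generic way to inspect the update, not the paper's full analysis):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen base weight plus low-rank update: W_eff = W + (alpha/r) * B @ A."""
        def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
            super().__init__()
            self.base = base.requires_grad_(False)    # full weights stay frozen
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    def top_singular_vectors(delta: torch.Tensor, k: int = 10) -> torch.Tensor:
        """Inspect the structure of a weight update: LoRA deltas concentrate
        all their mass in at most r directions, unlike full fine-tuning."""
        U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
        return U[:, :k]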
[QA] Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?
6:57
Vision-Language Models show promise in reasoning across text and images but struggle with basic visual concepts, revealing significant gaps in their understanding and generalization abilities. https://arxiv.org/abs//2410.19546 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts…
Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?
8:44
Vision-Language Models show promise in reasoning across text and images but struggle with basic visual concepts, revealing significant gaps in their understanding and generalization abilities. https://arxiv.org/abs//2410.19546 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts…
[QA] Computational Bottlenecks of Training Small-scale Large Language Models
8:10
This study investigates the training behavior and computational requirements of Small-scale Large Language Models (SLMs), focusing on hyperparameters and configurations to enhance efficiency and support low-resource AI research. https://arxiv.org/abs//2410.19456 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_pap…
Computational Bottlenecks of Training Small-scale Large Language Models
9:57
This study investigates the training behavior and computational requirements of Small-scale Large Language Models (SLMs), focusing on hyperparameters and configurations to enhance efficiency and support low-resource AI research. https://arxiv.org/abs//2410.19456 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_pap…
[QA] Physics-informed Neural Networks for Functional Differential Equations: Cylindrical Approximation and Its Convergence Guarantees
9:12
This paper introduces a hybrid approach combining physics-informed neural networks and cylindrical approximation to efficiently solve functional differential equations, addressing computational challenges and improving numerical analysis. https://arxiv.org/abs//2410.18153 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/…
Physics-informed Neural Networks for Functional Differential Equations: Cylindrical Approximation and Its Convergence Guarantees
19:53
This paper introduces a hybrid approach combining physics-informed neural networks and cylindrical approximation to efficiently solve functional differential equations, addressing computational challenges and improving numerical analysis. https://arxiv.org/abs//2410.18153 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/…
[QA] A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration
8:04
This paper shows that integrating coherent reasoning in Few-shot Chain-of-Thought prompting enhances transformer performance, revealing sensitivity to errors in intermediate steps and proposing improvements using varied reasoning paths. https://arxiv.org/abs//2410.16540 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@a…
A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration
18:20
This paper shows that integrating coherent reasoning in Few-shot Chain-of-Thought prompting enhances transformer performance, revealing sensitivity to errors in intermediate steps and proposing improvements using varied reasoning paths. https://arxiv.org/abs//2410.16540 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@a…
HALLOWEEN (1978 / 2018) - A Disassembled Double Feature!
2:32:53
"Death has come to your little town, Sheriff." - Dr. Loomis For our very first "Disassembled" Double Feature, Zack and Erik are joined once again by Stephen White from Horror Ramblings to talk about some creepy movies for spooky season... that's right, we're finally watching "Halloween"! Both the John Carpenter classic from 1978 *and* the 2018 lega…
[QA] LEGO: Language Model Building Blocks
7:19
LEGO is a novel technique for extracting and recombining small language models from large language models, enhancing efficiency, robustness, and user data privacy while reducing costs. https://arxiv.org/abs//2410.18287 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.c…
LEGO: Language Model Building Blocks
16:46
LEGO is a novel technique for extracting and recombining small language models from large language models, enhancing efficiency, robustness, and user data privacy while reducing costs. https://arxiv.org/abs//2410.18287 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.c…
[QA] Knowledge Distillation Using Frontier Open-Source LLMs: Generalizability and the Role of Synthetic Data
8:13
This study explores knowledge distillation from Llama-3.1-405B to smaller models, demonstrating improved accuracy and efficiency through synthetic data and diverse evaluation methods across various tasks. https://arxiv.org/abs//2410.18588 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: http…
Knowledge Distillation Using Frontier Open-Source LLMs: Generalizability and the Role of Synthetic Data
19:45
This study explores knowledge distillation from Llama-3.1-405B to smaller models, demonstrating improved accuracy and efficiency through synthetic data and diverse evaluation methods across various tasks. https://arxiv.org/abs//2410.18588 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: http…
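The paper's pipeline distills through synthetic data generated by the teacher; as a generic reference point rather than the paper's exact recipe, the classic logit-matching form of distillation looks like this:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, lam=0.5):
        """Soft-label distillation: KL between temperature-softened teacher and
        student distributions, mixed with cross-entropy on hard labels."""
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * T * T               # rescale gradients to offset the temperature
        hard = F.cross_entropy(student_logits, labels)
        return lam * soft + (1 - lam) * hard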
College Experience | The Military Should ALWAYS Be An Option
1:09:37
Guest episode with IT system administrator Nathan Jewell! Takeaways: Preparation for life after high school is crucial. Many students lack guidance in college applications. Community college can be a stepping stone. Military service offers educational benefits. Work experience is valuable for personal growth. Practical skills can lead to career opp…
[QA] Beyond position: how rotary embeddings shape representations and memory in autoregressive transformers
8:09
This paper explores how Rotary Positional Embeddings (RoPE) affect Transformer model dynamics, introducing phase shifts that influence embeddings, information retention, and attention through oscillatory behaviors and frequency components. https://arxiv.org/abs//2410.18067 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com…
Beyond position: how rotary embeddings shape representations and memory in autoregressive transformers
20:50
This paper explores how Rotary Positional Embeddings (RoPE) affect Transformer model dynamics, introducing phase shifts that influence embeddings, information retention, and attention through oscillatory behaviors and frequency components. https://arxiv.org/abs//2410.18067 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com…
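For readers who want the object under study in front of them: rotary embeddings rotate each pair of channels by a position-dependent angle, which is the "phase shift" the paper analyzes. A standard interleaved implementation:

    import torch

    def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
        """Rotary positional embedding: rotate each consecutive pair of channels
        by a position-dependent angle. x: (batch, seq, dim), dim even."""
        b, s, d = x.shape
        inv_freq = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)
        angles = torch.arange(s, dtype=torch.float32)[:, None] * inv_freq[None, :]
        cos, sin = angles.cos(), angles.sin()          # each (seq, d/2)
        x1, x2 = x[..., 0::2], x[..., 1::2]
        out = torch.empty_like(x)
        out[..., 0::2] = x1 * cos - x2 * sin           # the position-dependent rotation
        out[..., 1::2] = x1 * sin + x2 * cos
        return out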
[QA] ALTA: Compiler-Based Analysis of Transformers
7:29
ALTA is a new programming language and compiler that maps programs to Transformer weights, enabling loop expression and improved algorithm representation, while providing tools for analyzing training challenges. https://arxiv.org/abs//2410.18077 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcast…
ALTA: Compiler-Based Analysis of Transformers
22:56
ALTA is a new programming language and compiler that maps programs to Transformer weights, enabling loop expression and improved algorithm representation, while providing tools for analyzing training challenges. https://arxiv.org/abs//2410.18077 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcast…
[QA] UnStar: Unlearning with Self-Taught Anti-Sample Reasoning for LLMs
8:06
This paper introduces UNSTAR, a novel unlearning method for large language models using anti-samples to efficiently and selectively reverse learned associations, enhancing privacy and model modification capabilities. https://arxiv.org/abs//2410.17050 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Po…
UnStar: Unlearning with Self-Taught Anti-Sample Reasoning for LLMs
16:43
This paper introduces UNSTAR, a novel unlearning method for large language models using anti-samples to efficiently and selectively reverse learned associations, enhancing privacy and model modification capabilities. https://arxiv.org/abs//2410.17050 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Po…
[QA] Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing
7:48
This paper explores how Knowledge Editing algorithms can unintentionally distort model representations, leading to decreased factual recall and reasoning abilities, a phenomenon termed "representation shattering." https://arxiv.org/abs//2410.17194 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podca…
Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing
18:06
This paper explores how Knowledge Editing algorithms can unintentionally distort model representations, leading to decreased factual recall and reasoning abilities, a phenomenon termed "representation shattering." https://arxiv.org/abs//2410.17194 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podca…
The paper proposes GenRM, a hybrid approach combining RLHF and RLAIF, improving synthetic preference labels' quality and outperforming existing models in both in-distribution and out-of-distribution tasks. https://arxiv.org/abs//2410.12832 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: htt…
[QA] Debug Smarter, Not Harder: AI Agents for Error Resolution in Computational Notebooks
8:07
This paper presents an AI agent for error resolution in computational notebooks, enhancing bug-fixing capabilities while evaluating user experience and collaboration within the JetBrains Datalore service. https://arxiv.org/abs//2410.14393 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: http…
Debug Smarter, Not Harder: AI Agents for Error Resolution in Computational Notebooks
10:28
This paper presents an AI agent for error resolution in computational notebooks, enhancing bug-fixing capabilities while evaluating user experience and collaboration within the JetBrains Datalore service. https://arxiv.org/abs//2410.14393 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: http…
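The agent pattern here reduces to a loop: run the cell, capture the traceback, ask a model for a corrected cell, retry. A toy sketch (llm_suggest_fix is a hypothetical stand-in for whatever completion client you use, not a JetBrains API):

    import traceback

    def llm_suggest_fix(code: str, error: str) -> str:
        """Hypothetical LLM call; swap in any chat-completion client."""
        raise NotImplementedError

    def run_with_auto_fix(cell_source: str, max_attempts: int = 3) -> dict:
        """Toy notebook error-resolution loop: execute a cell, feed the
        traceback to an LLM, and retry with the suggested fix."""
        namespace: dict = {}
        for _ in range(max_attempts):
            try:
                exec(cell_source, namespace)       # run the cell
                return namespace
            except Exception:
                tb = traceback.format_exc()
                cell_source = llm_suggest_fix(cell_source, tb)
        raise RuntimeError("could not repair the cell")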
[QA] Decomposing The Dark Matter of Sparse Autoencoders
7:58
This study explores "dark matter" in sparse autoencoders, revealing that much unexplained variance can be predicted and proposing methods to reduce nonlinear error in model activations. https://arxiv.org/abs//2410.14670 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.…
Decomposing The Dark Matter of Sparse Autoencoders
15:36
This study explores "dark matter" in sparse autoencoders, revealing that much unexplained variance can be predicted and proposing methods to reduce nonlinear error in model activations. https://arxiv.org/abs//2410.14670 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.…
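The headline check can be phrased in a few lines: take a trained SAE's reconstruction residual and ask how much of it a linear probe on the input activation can predict. A sketch under those assumptions (encode/decode are stand-ins for your trained SAE's maps):

    import torch

    def sae_residual_probe(acts: torch.Tensor, encode, decode) -> float:
        """How much of the SAE's reconstruction error ('dark matter') is
        linearly predictable from the input activation?
        acts: (n, d) model activations; encode/decode: a trained SAE."""
        with torch.no_grad():
            resid = acts - decode(encode(acts))          # unexplained part, (n, d)
        # Least-squares linear probe from activations to residuals.
        W = torch.linalg.lstsq(acts, resid).solution     # (d, d)
        pred = acts @ W
        frac = 1 - (resid - pred).pow(2).sum() / resid.pow(2).sum()
        return frac.item()                               # variance explained by probe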
[QA] A Hitchhiker's Guide to Scaling Law Estimation
10:08
The paper analyzes scaling laws in machine learning, providing best practices for estimating model performance using a large dataset of pretrained models and emphasizing the importance of intermediate training checkpoints. https://arxiv.org/abs//2410.11840 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Ap…
A Hitchhiker's Guide to Scaling Law Estimation
17:28
The paper analyzes scaling laws in machine learning, providing best practices for estimating model performance using a large dataset of pretrained models and emphasizing the importance of intermediate training checkpoints. https://arxiv.org/abs//2410.11840 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Ap…
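A minimal version of the estimation task the paper studies: fit a saturating power law to (model size, loss) pairs. Sketch with made-up numbers:

    import numpy as np
    from scipy.optimize import curve_fit

    # Fit L(N) = a * N**(-alpha) + c; the data below is synthetic, for illustration.
    def power_law(N, a, alpha, c):
        return a * N ** (-alpha) + c

    N = np.array([1e7, 3e7, 1e8, 3e8, 1e9])       # parameter counts
    L = np.array([4.2, 3.8, 3.4, 3.1, 2.9])       # eval losses (synthetic)

    (a, alpha, c), _ = curve_fit(power_law, N, L, p0=(10.0, 0.1, 2.0), maxfev=10000)
    print(f"L(N) ~= {a:.2f} * N^-{alpha:.3f} + {c:.2f}")

The paper's point about intermediate checkpoints fits this frame too: each checkpoint contributes an extra (compute, loss) observation, which stabilizes the fit of a and alpha.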
[QA] Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
7:39
This paper presents a novel method for image inversion and editing using rectified flow models, achieving superior performance in zero-shot tasks compared to existing diffusion model approaches. https://arxiv.org/abs//2410.10792 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcas…
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
10:29
This paper presents a novel method for image inversion and editing using rectified flow models, achieving superior performance in zero-shot tasks compared to existing diffusion model approaches. https://arxiv.org/abs//2410.10792 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcas…
[QA] Looking Inward: Language Models Can Learn About Themselves by Introspection
7:28
The paper explores whether large language models (LLMs) can introspect, finding that finetuned models can predict their own behavior, suggesting a form of internal knowledge access. https://arxiv.org/abs//2410.13787 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/…
Looking Inward: Language Models Can Learn About Themselves by Introspection
26:21
The paper explores whether large language models (LLMs) can introspect, finding that finetuned models can predict their own behavior, suggesting a form of internal knowledge access. https://arxiv.org/abs//2410.13787 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/…
[QA] Thinking LLMs: General Instruction Following with Thought Generation
7:25
The paper proposes a method to enhance LLMs' thinking abilities for better instruction following, improving performance across various tasks without additional human data through iterative search and optimization. https://arxiv.org/abs//2410.10630 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podca…
Thinking LLMs: General Instruction Following with Thought Generation
15:52
The paper proposes a method to enhance LLMs' thinking abilities for better instruction following, improving performance across various tasks without additional human data through iterative search and optimization. https://arxiv.org/abs//2410.10630 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podca…
[QA] Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
7:44
The paper investigates extreme-token phenomena in transformer-based LLMs, revealing mechanisms behind attention sinks and proposing strategies to mitigate their impact during pretraining. https://arxiv.org/abs//2410.13835 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.appl…
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
17:44
The paper investigates extreme-token phenomena in transformer-based LLMs, revealing mechanisms behind attention sinks and proposing strategies to mitigate their impact during pretraining. https://arxiv.org/abs//2410.13835 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.appl…
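A quick diagnostic for the phenomenon discussed here: measure how much attention mass each head assigns to the first token. A sketch (assumes you already have softmaxed attention weights, e.g. from a Hugging Face model called with output_attentions=True):

    import torch

    def sink_fraction(attn: torch.Tensor) -> torch.Tensor:
        """Fraction of attention mass each head puts on the first (BOS) token,
        a simple diagnostic for 'attention sink' heads.
        attn: (layers, heads, seq, seq) softmaxed attention weights."""
        # Average over query positions; keep the column for token 0.
        return attn[..., :, 0].mean(dim=-1)   # -> (layers, heads)

Heads whose fraction sits near 1.0 are the dormant/sink heads the paper characterizes; active heads spread their mass over content tokens.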
[QA] MOVIE GEN: A Cast of Media Foundation Models
8:52
https://arxiv.org/abs//2410.13720 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers --- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/supp…