papers Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published Jan 29 • 23 Gated recurrent neural networks discover attention Paper • 2309.01775 • Published Sep 4, 2023 • 10
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published Jan 29 • 23
Network Architectures Gated recurrent neural networks discover attention Paper • 2309.01775 • Published Sep 4, 2023 • 10