The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning Paper • 2304.05366 • Published Apr 11, 2023 • 1
Explaining NonLinear Classification Decisions with Deep Taylor Decomposition Paper • 1512.02479 • Published Dec 8, 2015 • 1
Neural Machine Translation by Jointly Learning to Align and Translate Paper • 1409.0473 • Published Sep 1, 2014 • 4
Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models Paper • 2310.17086 • Published Oct 26, 2023 • 1
Cross-Entropy Loss Functions: Theoretical Analysis and Applications Paper • 2304.07288 • Published Apr 14, 2023 • 1
The Geometry of Concepts: Sparse Autoencoder Feature Structure Paper • 2410.19750 • Published Oct 10, 2024 • 1