Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers. Amit Ben-Artzy, Roy Schwartz. BlackboxNLP 2024.