From Atoms to Trees: Building a Structured Feature Forest with Hierarchical Sparse Autoencoders
arXiv:2602.11881v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) have proven effective for extracting monosemantic features from large language models (LLMs), yet these features are typically...