Scaling and evaluating sparse autoencoders
How exactly does replacing the activation function at test time work (as in e.g. §5.3.2)? Which parts stay the same, and where do the JumpReLU parameters come from?
Claude 4 Opus (T) via @hdevalence.bsky.social
When replacing the activation function at test time (as described in Section 5.3.2), here's what happens:
What stays the same:
- All trained weights: encoder weights W_enc, decoder weights W_dec
- All biases: b_enc, b_pre
- The overall architecture: encoder → activation → decoder
What changes:
Only the activation function itself is swapped out (see the code sketch after these examples). For example:
- If trained with TopK(k=32), you might test with TopK(k'=64) or JumpReLU(θ)
- If trained with ReLU, you might test with TopK(k') or JumpReLU(θ)
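Here is a minimal PyTorch sketch of that swap. This is my own illustration rather than the authors' code; the tensor names (W_enc, W_dec, b_enc, b_pre) and the toy shapes are assumptions based on the standard SAE formulation:

```python
import torch

def topk_activation(pre_acts: torch.Tensor, k: int) -> torch.Tensor:
    # Keep the k largest pre-activations per example, zero out the rest
    values, indices = torch.topk(pre_acts, k, dim=-1)
    acts = torch.zeros_like(pre_acts)
    acts.scatter_(-1, indices, values)
    return acts

def jumprelu_activation(pre_acts: torch.Tensor, theta: float) -> torch.Tensor:
    # Pass a pre-activation through unchanged if it exceeds theta, else zero it
    return pre_acts * (pre_acts > theta)

def forward(x, W_enc, b_enc, b_pre, W_dec, activation):
    # Encoder and decoder reuse the trained parameters; only `activation` is swapped
    pre_acts = (x - b_pre) @ W_enc + b_enc   # encoder (unchanged at test time)
    latents = activation(pre_acts)           # <-- the only part that changes
    x_hat = latents @ W_dec + b_pre          # decoder (unchanged at test time)
    return x_hat, latents

# Toy stand-ins for the trained parameters and a batch of model activations
d_model, n_latents = 16, 256
x = torch.randn(8, d_model)
W_enc, W_dec = torch.randn(d_model, n_latents), torch.randn(n_latents, d_model)
b_enc, b_pre = torch.zeros(n_latents), torch.zeros(d_model)

# Same weights, different activations at test time
x_hat, z = forward(x, W_enc, b_enc, b_pre, W_dec, lambda a: topk_activation(a, 64))
x_hat, z = forward(x, W_enc, b_enc, b_pre, W_dec, lambda a: jumprelu_activation(a, 0.1))
```

Nothing is retrained here: the same W_enc, W_dec, and biases are reused, and only the callable passed as `activation` differs between training and test-time evaluation.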
Where JumpReLU parameters come from:
The threshold parameter θ for JumpReLU is chosen at test time as a hyperparameter. The authors sweep across different values of θ to generate the curves in Figure 10. Specifically:
For JumpReLU, the activation is JumpReLU_θ(x) = x · 1[x > θ]: a pre-activation passes through unchanged if it exceeds the threshold θ and is zeroed otherwise.
The process is (sketched in code below):
- Train the autoencoder with its original activation function (e.g., TopK or ReLU)
- At test time, replace that activation with JumpReLU_θ, keeping all trained weights and biases fixed
- Sweep θ over a range of values, recording the sparsity (L0) and reconstruction error at each value to trace out the curve
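A rough sketch of that sweep, reusing forward, jumprelu_activation, and the toy tensors from the snippet above (the threshold range here is an arbitrary placeholder, not the values used in the paper):

```python
# Sweep the JumpReLU threshold at test time; each value gives one (L0, MSE) point
thetas = torch.linspace(0.0, 0.5, steps=25)
curve = []
for theta in thetas:
    x_hat, z = forward(x, W_enc, b_enc, b_pre, W_dec,
                       lambda a: jumprelu_activation(a, theta.item()))
    l0 = (z != 0).float().sum(dim=-1).mean().item()   # average number of active latents
    mse = ((x_hat - x) ** 2).mean().item()            # reconstruction error
    curve.append((theta.item(), l0, mse))
```

Larger thresholds zero out more latents (lower L0, typically worse reconstruction), so the swept points trace out a sparsity/reconstruction frontier like the curves in Figure 10.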