Stealing Part of a Production Language Model
Thread (64 posts)
What's the significance of the SVD on line 7 of Algorithm 1?
Thinking Summary
The user is asking about the significance of the SVD (Singular Value Decomposition) on line 7 of Algorithm 1 in the paper "Stealing Part of a Production Language Model". Looking at Algorithm 1 (Hidden-Dimension Extraction Attack), line 7 says:

```
State $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n \gets \text{SingularValues}(\mathbf{Q})$
```

This is computing the singular values of the matrix $\mathbf{Q}$ that contains the logit responses from querying the model. The significance is explained in the paper's intuition and formalization sections.

The key insight is that even though each output logit vector is an $l$-dimensional vector (where $l$ is the vocabulary size), they all actually lie in an $h$-dimensional subspace, because the embedding projection layer up-projects from $h$ dimensions. By computing the SVD and looking at the singular values, they can identify where there is a sharp drop-off in the magnitudes: this indicates the transition from the "real" dimensions (the hidden dimension $h$) to numerical noise. The algorithm identifies the largest multiplicative gap between consecutive singular values to determine $h$.

This is a clever application of linear algebra, essentially using the rank-revealing properties of SVD to discover the intrinsic dimensionality of the data.
The SVD (Singular Value Decomposition) on line 7 of Algorithm 1 is the key mechanism that reveals the hidden dimension of the language model. Here's why it's significant:
Core Insight
The fundamental observation is that while the API returns logit vectors in $\mathbb{R}^l$ (where $l$ is the vocabulary size, often 50,000-100,000), these vectors actually lie in a much lower-dimensional subspace of dimension $h$ (the hidden dimension, typically 1,024-8,192).
This happens because of the model architecture:

$$\text{logits} = \mathbf{W} \cdot g_\theta(p)$$

where:
- $\mathbf{W} \in \mathbb{R}^{l \times h}$ is the embedding projection matrix
- $g_\theta(p) \in \mathbb{R}^h$ is the $h$-dimensional hidden state
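
To see this concretely, here is a minimal numpy sketch; the dimensions and random matrices are toy stand-ins for the real model, not anything from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
l, h, n = 1000, 64, 256  # toy vocab size, hidden dim, and query count

W = rng.standard_normal((l, h))   # stand-in for the embedding projection matrix
G = rng.standard_normal((n, h))   # stand-in for n hidden states g_theta(p_i)
Q = G @ W.T                       # each row is a logit vector in R^l

# Even though each row lives in R^1000, together they span only 64 dimensions.
print(np.linalg.matrix_rank(Q))   # 64
```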
Why SVD Reveals the Hidden Dimension
When you collect $n$ logit vectors into the matrix $\mathbf{Q} \in \mathbb{R}^{n \times l}$:

- Rank constraint: Since $\mathbf{Q} = \mathbf{H} \cdot \mathbf{W}^\top$, where $\mathbf{H} \in \mathbb{R}^{n \times h}$ stacks the hidden states, the rank of $\mathbf{Q}$ is at most $h$
- Singular value pattern: The SVD reveals this rank structure through the singular values:
  - The first $h$ singular values will be large (representing the true signal)
  - Singular values beyond position $h$ will be near zero (representing numerical noise)
- Gap detection: The algorithm identifies $h$ by finding the largest multiplicative gap between consecutive singular values (see the sketch below)
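
Continuing the toy setup above, a quick sketch of that singular value pattern (again with assumed toy sizes, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
l, h, n = 1000, 64, 256
Q = rng.standard_normal((n, h)) @ rng.standard_normal((h, l))  # rank-h "logit" matrix

s = np.linalg.svd(Q, compute_uv=False)  # singular values, sorted descending
print(s[h - 1], s[h])  # large value at index h-1, then a collapse to numerical noise
```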
Practical Visualization
The paper provides nice visualizations (Figures 2-3) showing:
- When $n \leq h$: all singular values are significant (can't determine $h$ yet)
- When $n > h$: exactly $h$ significant singular values appear, with a sharp drop after index $h$
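
The same contrast can be reproduced in the toy setup; this is an illustration under the assumptions above, not the paper's actual figures:

```python
import numpy as np

rng = np.random.default_rng(1)
l, h = 1000, 64

def singular_values(n):
    """Singular values of n toy logit vectors drawn through a rank-h projection."""
    Q = rng.standard_normal((n, h)) @ rng.standard_normal((h, l))
    return np.linalg.svd(Q, compute_uv=False)

print(singular_values(32).min())  # n <= h: every singular value is still large
print(singular_values(256)[h])    # n > h: the value at index h has collapsed to ~0
```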
This is why line 8 computes:

```
count ← argmax_i { log |λᵢ| − log |λᵢ₊₁| }
```
The logarithmic difference identifies the multiplicative gap, making the dimension extraction robust even with floating-point precision limitations.
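
In numpy, line 8 amounts to a one-liner over the singular values. Here's a sketch (the function name and the guard against exact zeros are mine, not the paper's):

```python
import numpy as np

def extract_hidden_dim(Q):
    """Estimate h as the position of the largest multiplicative gap
    between consecutive singular values of the query matrix Q."""
    s = np.linalg.svd(Q, compute_uv=False)     # lambda_1 >= lambda_2 >= ...
    s = np.maximum(s, np.finfo(s.dtype).tiny)  # avoid log(0) on exact zeros
    gaps = np.log(s[:-1]) - np.log(s[1:])      # log lambda_i - log lambda_{i+1}
    return int(np.argmax(gaps)) + 1            # gap after index i => h = i + 1
```

On the toy $\mathbf{Q}$ from the sketches above, this recovers $h = 64$.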
This elegant use of SVD essentially performs dimensionality analysis to reverse-engineer architectural information from black-box API access alone.