so, fun fact, turns out you can fit exponentially many (normalized) vectors in a list such that the pairwise inner product of any two distinct vectors is ≤ eps
In fact for small eps, you can do slightly better than the random construction via Reed-Solomon codes, see e.g. arxiv.org/pdf/1206.5725
So one angle for intuition here is good error correcting codes exist => you can find lots of nearly orthogonal vectors
Can you explain the quoted comment in terms of the referenced paper?
Thinking Summary
The user is asking about the connection between the original quote about fitting exponentially many nearly orthogonal vectors and the paper they linked about Reed-Solomon codes for deterministic sketching. The original comment said: "so, fun fact, turns out you can fit exponentially many (normalized) vectors in a list such that the pairwise inner product of any two distinct vectors is ≤ eps" And then mentioned that Reed-Solomon codes can do slightly better than the random construction.

Looking at the paper, the key connection is in the section "Incoherent matrices from codes" where they explain how error-correcting codes can be used to construct matrices with nearly orthogonal columns. Specifically, they show:

1. You can take a code with alphabet size q, block length t, and relative distance 1-ε
2. From this code, you can construct a matrix where each column has unit ℓ2 norm and the inner product between any two distinct columns is at most ε
3. Reed-Solomon codes provide a particularly efficient construction

Let me explain this connection clearly.
The paper shows exactly how the quoted comment works through the construction of "incoherent matrices": matrices in which every column has unit $\ell_2$ norm and the inner product of any two distinct columns is at most $\varepsilon$.
Here's the key construction from the paper:
Given an error-correcting code $\mathcal{C}$ with:
- Alphabet size $q$
- Block length $t$
- Relative distance $1 - \varepsilon$ (meaning any two distinct codewords differ in at least $(1-\varepsilon)t$ positions, i.e. agree in at most $\varepsilon t$ positions)
You can build an $m \times n$ matrix $A$ with $m = qt$ rows, indexed by pairs $(i, a) \in [t] \times [q]$, as follows:
- Each column $A_c$ corresponds to a codeword $c \in \mathcal{C}$
- Set $A_{(i,a),c} = 1/\sqrt{t}$ if the $i$-th symbol of $c$ equals $a$, and $0$ otherwise
This gives you $n = |\mathcal{C}|$ columns (vectors) where:
- Each column has exactly $t$ non-zero entries of value $1/\sqrt{t}$, so $\|A_c\|_2 = 1$
- For distinct columns $c \neq c'$:
$$\langle A_c, A_{c'} \rangle = \frac{|\{i : c_i = c'_i\}|}{t} = \frac{t - \Delta(c, c')}{t} \leq \varepsilon$$
where $\Delta(c, c')$ is the Hamming distance between the two codewords. (A quick numerical check of both facts is sketched below.)
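Here is a minimal numerical sketch of the construction above (my own toy example, not code from the paper; the parameters $q, t, n$ and the random strings standing in for codewords are arbitrary). It checks both bullet points: every column is a unit vector, and the inner product of two distinct columns equals the fraction of positions on which the corresponding words agree.

```python
import numpy as np

def column_for_word(c, q):
    """Embed a length-t word over {0,...,q-1} as a unit vector:
    rows are indexed by (position i, symbol a); put 1/sqrt(t) where c[i] == a."""
    t = len(c)
    v = np.zeros(q * t)
    for i, a in enumerate(c):
        v[i * q + a] = 1.0 / np.sqrt(t)
    return v

def hamming(c1, c2):
    return sum(a != b for a, b in zip(c1, c2))

rng = np.random.default_rng(0)
q, t, n = 5, 8, 6                        # small arbitrary toy parameters
words = [tuple(rng.integers(0, q, size=t)) for _ in range(n)]
A = np.stack([column_for_word(c, q) for c in words], axis=1)

# Each column is a unit vector ...
assert np.allclose(np.linalg.norm(A, axis=0), 1.0)
# ... and <A_c, A_c'> is exactly the agreement fraction (t - Delta)/t.
for j in range(n):
    for k in range(j + 1, n):
        agreement = (t - hamming(words[j], words[k])) / t
        assert np.isclose(A[:, j] @ A[:, k], agreement)
print("unit columns; inner products = agreement fractions")
```

With an actual code of relative distance $1 - \varepsilon$ plugged in instead of random strings, every agreement fraction, and hence every off-diagonal inner product, is at most $\varepsilon$.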
For Reed-Solomon codes specifically, they take:
- Each codeword corresponds to the evaluations of a polynomial of degree at most $d$ at all $t = q$ points of $\mathbb{F}_q$; two distinct such polynomials agree in at most $d$ positions, so the code has relative distance $1 - d/q$ and hence $\varepsilon = d/q$
- This gives $q^{d+1}$ codewords (exponentially many in $d$)
The Reed-Solomon construction achieves $m = qt = q^2 = O\!\left(\left(\tfrac{\log n}{\varepsilon \log(1/\varepsilon)}\right)^2\right)$ rows, which beats the random construction's $m = O(\varepsilon^{-2} \log n)$ when $\varepsilon$ is sufficiently small, roughly $\varepsilon \leq 2^{-\Theta(\sqrt{\log n})}$.
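Here is a toy Reed-Solomon instance of the same recipe (my own sketch; the choices $q = 7$, $d = 2$ are illustrative, not parameters from the paper): enumerate all polynomials of degree at most $d$ over $\mathbb{F}_q$, evaluate each at every field element, embed the resulting codewords as above, and confirm we get $q^{d+1}$ unit vectors in dimension $q^2$ with coherence at most $d/q$.

```python
import itertools
import numpy as np

q, d = 7, 2       # toy parameters: field F_7, polynomials of degree <= 2
t = q             # block length: evaluate at every field element
eps = d / q       # coherence bound implied by the code's distance

def rs_codeword(coeffs):
    """Evaluate the polynomial sum_i coeffs[i] * x^i at x = 0,...,q-1 (mod q)."""
    return tuple(sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q
                 for x in range(q))

def column(c):
    """Same (position, symbol) -> 1/sqrt(t) embedding as in the previous sketch."""
    v = np.zeros(q * t)
    for i, a in enumerate(c):
        v[i * q + a] = 1.0 / np.sqrt(t)
    return v

codewords = [rs_codeword(coeffs)
             for coeffs in itertools.product(range(q), repeat=d + 1)]
assert len(codewords) == q ** (d + 1)      # q^(d+1) = 343 codewords

A = np.stack([column(c) for c in codewords], axis=1)
G = A.T @ A                                # Gram matrix of the columns
np.fill_diagonal(G, 0.0)
print(f"{A.shape[1]} unit vectors in R^{A.shape[0]}, "
      f"max off-diagonal inner product {G.max():.3f} <= {eps:.3f}")
```

Here $343$ unit vectors live in $\mathbb{R}^{49}$ with pairwise inner products at most $2/7$; growing $d$ (and $q \approx d/\varepsilon$ with it) makes the number of vectors $q^{d+1}$ exponential while the ambient dimension stays $q^2$.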
So the "fun fact" is realized through error-correcting codes: good codes with large distance naturally give you many vectors with small pairwise inner products!