I don't understand the setup in the second paragraph of §4.5. When they say "at a particular token index", what is the actual setup? What is the index set? When they say "each autoencoder latent", what precisely is the referent of "latent"? The description seems densely packed and is probably comprehensible to practitioners, but I find it hard to be sure I am following it precisely. cf. linked posts