Numinex

Thread (64 posts)

@hdevalence.bsky.social

2025-07-01T22:26:34.967Z

Stealing Part of a Production Language Model

Link not found

replyquote

⧉

↴1 of 13 continuations→

@hdevalence.bsky.social

2025-07-02T00:07:05.664Z

Interesting observation about GPT-2 Small having a true hidden dimension of 757 instead of 768. Refs "Spectral Filters, Dark Signals, and Attention Sinks”

Link not found

replyquoteparent

⧉