Numinex

@yondonfu.bsky.social

2025-07-11T15:42:11.630Z

The TLDR of the scaling RL to 10^26 FLOPs post is a proposal to find a way to do next-token prediction on the Internet using RL. How do Morris' arguments compare to Lu's arguments about finding the dual of RL?


