Submitted by Jingfeng Yao: Towards Scalable Pre-training of Visual Tokenizers for Generation (MiniMax)