Load model as torch.bfloat16

#2
by martin-q-ma - opened

Hi Authors,

Thank you very much for releasing the code.

On line 14 of the example inference code, should it be:

model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda().to(torch.bfloat16)

instead of

model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()

according to the model card at https://huggingface.co/OpenGVLab/InternVideo2_5_Chat_8B?
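
For context, a small sketch (not the authors' code; it only assumes `torch` is installed) of why the chained call is worth questioning: `.half()` first casts fp32 weights to fp16, and a subsequent `.to(torch.bfloat16)` then re-rounds those fp16 values to bf16, so the result can differ from converting to bf16 directly.

```python
import torch

# Toy illustration: converting via .half() first goes fp32 -> fp16 -> bf16
# (two lossy roundings), while a direct conversion goes fp32 -> bf16 in one step.
x = torch.tensor([1.0 / 3.0], dtype=torch.float32)

via_half = x.half().to(torch.bfloat16)  # what the chained call would do
direct = x.to(torch.bfloat16)           # convert straight to bf16

# Both end up as bfloat16, but only the direct path avoids the fp16 round-trip.
print(via_half.dtype, direct.dtype)
```

If the intent is simply to load the model in bf16, passing `torch_dtype=torch.bfloat16` to `from_pretrained` (a standard `transformers` argument) avoids the intermediate fp16 cast entirely.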

Thanks!
