Tokenizer
- vLLM tokenizer mismatch on finetuned Mistral model
vLLM tokenizer mismatch on finetuned Mistral model Spent the afternoon benchmarking a Mistral‑7B finetune with vLLM. First prompt returned gibberish tokens. from vllm import LLM llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2", tokenizer="mistralai/Mistral-7B-Instruct-v0.2") print(llm.generate("Hello")) Output contained repeating � characters. Ubuntu 22.04, CUDA 12.1, vllm 0.4.3, transformers 4.40.0, Python 3.11. Checkpoint trained with --trust-remote-code.