Learn how to deploy Prem fine-tuned models as an OpenAI-compatible API, running entirely locally.
Install vLLM
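A minimal sketch of the install step, assuming a Linux machine with a recent Python and a supported GPU; the exact package extras and versions you need may differ for your setup:

```bash
# Create an isolated environment (optional but recommended) and install vLLM.
python -m venv .venv && source .venv/bin/activate
pip install vllm
```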
Verify Your Model Access
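If your fine-tuned model lives in a private Hugging Face repository, the machine running vLLM needs credentials that can read it. One way to check, assuming the `huggingface_hub` CLI and the repository name from the upload guide (these specific commands are an illustration, not the only way to verify access):

```bash
# Authenticate once so downloads of private repositories succeed.
huggingface-cli login

# Confirm which account the stored token belongs to.
huggingface-cli whoami

# Fetch one small file to prove the repository is readable.
huggingface-cli download your-username/your-model-name-full config.json
```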
Start the vLLM Server
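A minimal sketch of launching vLLM's OpenAI-compatible server for a fully merged fine-tuned model; the host and port below are illustrative defaults you can change:

```bash
# Serve the fine-tuned model behind an OpenAI-compatible HTTP API on port 8000.
vllm serve your-username/your-model-name-full \
  --host 0.0.0.0 \
  --port 8000
```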
Replace `your-username/your-model-name-full` with either your local model path or your actual Hugging Face model repository name from the upload guide.

Test Your API
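Once the server is up, any OpenAI-style client can talk to it. A sketch using `curl` against the default local endpoint (the prompt and the `localhost:8000` address are just examples):

```bash
# List the models the server exposes.
curl http://localhost:8000/v1/models

# Send a chat completion request to the fine-tuned model.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "your-username/your-model-name-full",
        "messages": [{"role": "user", "content": "Hello! What can you do?"}]
      }'
```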
Start vLLM with LoRA Support
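If you downloaded a LoRA adapter rather than a fully merged model, vLLM can load the adapter on top of its base model. A sketch, where the adapter name `prem-lora` and the path `./your-adapter` are placeholders for your own values:

```bash
# Serve the base model and register the LoRA adapter under the name "prem-lora".
vllm serve Qwen/Qwen2.5-1.5B \
  --enable-lora \
  --lora-modules prem-lora=./your-adapter \
  --port 8000
```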
Replace `Qwen/Qwen2.5-1.5B` with the appropriate base model ID from the model mapping table in the upload guide.

Test LoRA Model API
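Requests select the adapter by the name it was registered under, so the call looks the same as before except for the `model` field. A sketch, again assuming the hypothetical adapter name `prem-lora` from the command above:

```bash
# Route the request through the LoRA adapter instead of the bare base model.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "prem-lora",
        "messages": [{"role": "user", "content": "Hello! What can you do?"}]
      }'
```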