Added working vLLM offline serve code.

#107

In this commit, I have attached working inference code for the gpt-oss-20b model via vLLM. The original code in the cookbook (https://cookbook.openai.com/articles/gpt-oss/run-vllm) was not working; with a few modifications, it now runs.

What GPU are you using? Ampere or Ada Lovelace?

Thank you @hrithiksagar-tih! Could you instead PR into github.com/openai/gpt-oss, and I'll copy it back into both model cards? Thanks!

dkundel-openai changed pull request status to closed

Yes, I will do it.
Thanks!
