Initial vLLM deploy
- Dockerfile +16 -0
- app.py +12 -0
- requirement.txt +1 -0
Dockerfile
ADDED
@@ -0,0 +1,16 @@
+FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    python3 python3-pip git && \
+    ln -s /usr/bin/python3 /usr/bin/python
+
+# Install vLLM
+RUN pip install --upgrade pip
+RUN pip install vllm
+
+# Add the app script
+COPY app.py /app.py
+
+# Launch the vLLM server
+CMD ["python", "/app.py"]
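To try the image outside of Spaces, a minimal build-and-run sketch (assuming a local NVIDIA GPU with the NVIDIA Container Toolkit installed; the vllm-space tag is arbitrary):

$ docker build -t vllm-space .
$ docker run --gpus all -p 7860:7860 vllm-space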
app.py
ADDED
@@ -0,0 +1,12 @@
+import subprocess
+
+# Example with a lightweight model (adapt as needed); vLLM loads HF-format weights, not GGUF files
+model = "mistralai/Mistral-7B-Instruct-v0.1"
+
+# Launch vLLM in API server mode
+subprocess.run([
+    "python3", "-m", "vllm.entrypoints.api_server",
+    "--model", model,
+    "--host", "0.0.0.0",
+    "--port", "7860"
+])
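Once the container is up, the server can be smoke-tested from the host. This assumes the /generate route of vLLM's demo api_server (the OpenAI-compatible server lives at vllm.entrypoints.openai.api_server instead); the prompt and max_tokens values are illustrative:

$ curl http://localhost:7860/generate \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Hello, my name is", "max_tokens": 32}'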
requirement.txt
ADDED
@@ -0,0 +1 @@
+vllm