Commit 2d458d6
Parent(s): ee9899e
update

Files changed:
- Dockerfile +3 -2
- README.md +1 -3
- requirements.txt +4 -5
Dockerfile
CHANGED
@@ -10,12 +10,13 @@ RUN apt-get update && \
 
 # Install Python dependencies
 COPY requirements.txt .
-RUN pip install --no-cache-dir --upgrade -r requirements.txt
+RUN pip install --no-cache-dir --upgrade -r requirements.txt && \
+    pip install --no-cache-dir llama-cpp-python==0.3.8 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
 
 # Pre-download the model
 ENV HF_HOME=/code/.cache/huggingface
 RUN mkdir -p /code/.cache/huggingface && \
-    pip install huggingface_hub && \
+    pip install --no-cache-dir huggingface_hub && \
     python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='muhammadnoman76/cortex_q4', filename='unsloth.Q4_K_M.gguf', local_dir='/code', local_dir_use_symlinks=False)"
 
 # Copy application code
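For reference, the pre-download step above is easier to read unrolled into plain Python. This is only a sketch of what the one-liner does: the repo id and filename come straight from the Dockerfile, and `local_dir_use_symlinks` is deprecated (and ignored) in recent huggingface_hub releases but kept here to match.

from huggingface_hub import hf_hub_download

# Fetch the quantized GGUF weights into /code at build time so the
# container does not download them on every startup.
model_path = hf_hub_download(
    repo_id="muhammadnoman76/cortex_q4",
    filename="unsloth.Q4_K_M.gguf",
    local_dir="/code",
    local_dir_use_symlinks=False,  # deprecated in newer huggingface_hub; kept to match the Dockerfile
)
print(model_path)  # /code/unsloth.Q4_K_M.gguf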
README.md
CHANGED
@@ -15,6 +15,4 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
 This Space provides a FastAPI application that streams responses from the Cortex LLM model.
 
 - Send GET requests to `/stream?task=<your_task>` to receive a streamed response from the model.
-- Example: `/stream?task=make an agent which send mail by searching top 5 website from google`
-
-**Note**: The `/ui` endpoint is not implemented in the current version.
+- Example: `/stream?task=make an agent which send mail by searching top 5 website from google`
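The FastAPI app itself is not part of this commit, but given the README's description and the llama-cpp-python pin added to the Dockerfile, the `/stream` endpoint plausibly looks something like the sketch below. Everything except the route and the `task` query parameter is an assumption, including the prompt handling and generation settings.

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from llama_cpp import Llama

app = FastAPI()

# Path matches where the Dockerfile pre-downloads the GGUF weights.
llm = Llama(model_path="/code/unsloth.Q4_K_M.gguf", n_ctx=2048)

@app.get("/stream")
def stream(task: str):
    def token_stream():
        # stream=True makes llama-cpp-python yield completion chunks incrementally
        for chunk in llm(task, max_tokens=512, stream=True):
            yield chunk["choices"][0]["text"]
    return StreamingResponse(token_stream(), media_type="text/plain")

A client then consumes it as a plain GET request, e.g. curl 'http://localhost:7860/stream?task=hello' when running locally (7860 is the usual Spaces port; the actual port is set elsewhere in the Dockerfile).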
requirements.txt
CHANGED
@@ -1,5 +1,4 @@
-fastapi
-uvicorn
-pydantic
-
-huggingface_hub>=0.25.0
+fastapi==0.115.12
+uvicorn==0.34.2
+pydantic==2.11.4
+huggingface_hub==0.30.2
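With every dependency now pinned, a quick sanity check that the installed versions match the pins can be done with the standard-library importlib.metadata. This is a small sketch assuming Python 3.10+, where distribution-name lookup is normalized; llama-cpp-python is pinned in the Dockerfile rather than here.

from importlib.metadata import version

# Distribution names as published on PyPI.
for pkg in ("fastapi", "uvicorn", "pydantic", "huggingface_hub", "llama-cpp-python"):
    print(f"{pkg}=={version(pkg)}")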