Christoph Holthaus
committed on
Commit · b8c846d
1 Parent(s): d65f135

switch over to gradio "native"

Files changed:
- README.md (+4, -8)
- gradio_app.py → app.py (+2, -2)
- requirements.txt (+2, -1)
README.md
CHANGED

@@ -1,11 +1,11 @@
 ---
 title: Test
 emoji: 🔥
-colorFrom:
+colorFrom: red
 colorTo: yellow
-sdk:
+sdk: gradio
 pinned: false
-license:
+license: mit
 ---

 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

@@ -14,17 +14,13 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 This is a test ...

 TASKS:
-- for fast debug: add a debug mode that lets me run direct CLI commands? -> Never for prod!
-- prod-harden Docker with proper users etc., OR mention that this is only a dev build intended for experimenting with: no read-only filesystem etc.
 - rewrite generation from scratch, or reuse the Mistral space's version if possible; alternatively use https://github.com/abetlen/llama-cpp-python#chat-completion or https://huggingface.co/spaces/deepseek-ai/deepseek-coder-7b-instruct/blob/main/app.py (see the chat-completion sketch after this diff)
 - write IN LARGE LETTERS that this is not the original model but a quantized one that is able to run on free CPU inference
 - test multimodal with llama?
-- can I use swap in Docker to maximize usable memory?
 - proper token handling - make it a real chat (if not handled automatically by the chat-completion interface ...)
-- maybe run a web server locally and have Gradio use it only as a backend? (better for async but maybe harder to control - just an idea)
 - check how much parallel generation is possible, or whether there is only one queue, and set limits accordingly
 - move the model download into an env var, with proper error handling
-- chore: cleanup ignore,
+- chore: cleanup ignore, etc.
 - update all deps to one up-to-date version, then PIN them!
 - write a short note on how to clone and run custom 7B models in separate Spaces
 - make a PR for popular repos to include this in their README etc.
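The chat-completion task above leans on llama-cpp-python's high-level API. A minimal sketch of what that could look like for this Space, assuming the same ./model.bin path that app.py already uses (the prompt and max_tokens value are illustrative, not from the commit):

```python
# Sketch: OpenAI-style chat completion via llama-cpp-python.
# model_path matches app.py; the messages and max_tokens are illustrative.
from llama_cpp import Llama

llm = Llama(model_path="./model.bin")

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello in one sentence."},
    ],
    max_tokens=128,
)

# The result follows the OpenAI-style response schema.
print(response["choices"][0]["message"]["content"])
```

Using create_chat_completion rather than raw llm(prompt) calls would also help with the "proper token handling - make it a real chat" task, since the library formats the message list with a chat template instead of leaving prompt construction to the caller.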
gradio_app.py → app.py
RENAMED

@@ -5,8 +5,8 @@ import gradio as gr
 import psutil

 # Initing things
-print("
-llm = Llama(model_path="./model.bin")
+print("debug: init model")
+llm = Llama(model_path="./model.bin")  # LLaMa model
 llama_model_name = "TheBloke/dolphin-2.2.1-AshhLimaRP-Mistral-7B-GGUF"
 print("! INITING DONE !")

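One of the README tasks is to move the model download into an env var with proper error handling, rather than hard-coding the path as this file does. A rough sketch of how the top of app.py could do that (MODEL_PATH is a hypothetical variable name, not something this commit introduces):

```python
# Sketch: read the model location from the environment instead of hard-coding it.
# MODEL_PATH is a hypothetical env var; "./model.bin" is the current default.
import os
import sys

from llama_cpp import Llama

model_path = os.environ.get("MODEL_PATH", "./model.bin")

if not os.path.isfile(model_path):
    # Fail early with a readable message instead of a cryptic loader error.
    sys.exit(
        f"Model file not found at {model_path!r}; "
        "set MODEL_PATH or place model.bin next to app.py"
    )

llm = Llama(model_path=model_path)
print("! INITING DONE !")
```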
requirements.txt
CHANGED

@@ -1,2 +1,3 @@
 psutil
-gradio
+gradio
+llama_cpp