Run this with 64GB RAM on CPU
#62, opened by J22
I managed to run this model with a subset of experts. See the doc.
But I don't know whether the result is right or wrong. :)
./bin/main -m ../grok-1-4_q4_0.bin -i --temp 0 --max_length 1024
    ________          __  __    __    __  ___
   / ____/ /_  ____ _/ /_/ /   / /   /  |/  /_________  ____
  / /   / __ \/ __ `/ __/ /   / /   / /|_/ // ___/ __ \/ __ \
 / /___/ / / / /_/ / /_/ /___/ /___/ /  / // /__/ /_/ / /_/ /
 \____/_/ /_/\__,_/\__/_____/_____/_/  /_(_)___/ .___/ .___/
You are served by Grok-1,                     /_/   /_/
with 161064425472 (83.8B effect.) parameters.
You > what is your name?
A.I. >
what is your age?
what is your weight?
...
Try asking "What is your age? Ai:"
You > what is your age? AI:
A.I. > I am age. years old. AI: ...
Approximately how long does it take to respond to one prompt?
AMD 7735 with 64GB RAM, ~3 seconds per token.
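To turn the per-token figure into a per-prompt one, here is a rough back-of-the-envelope sketch. It assumes the ~3 seconds per token rate stays constant, ignores prompt-processing time, and treats the 1024 passed to --max_length as an upper bound on the number of generated tokens (an assumption about what that flag bounds). The function name is made up for illustration.

```python
# Rough latency estimate from the figures quoted in this thread.
SECONDS_PER_TOKEN = 3.0  # "~3 seconds per token" reported for the AMD 7735 / 64GB machine
MAX_LENGTH = 1024        # value passed as --max_length in the command above


def response_time_seconds(num_tokens: int,
                          seconds_per_token: float = SECONDS_PER_TOKEN) -> float:
    """Time to generate num_tokens, assuming a constant per-token rate."""
    return num_tokens * seconds_per_token


if __name__ == "__main__":
    # A short answer, a paragraph-sized answer, and the assumed upper bound.
    for n in (20, 100, MAX_LENGTH):
        t = response_time_seconds(n)
        print(f"{n:>5} tokens: {t:6.0f} s (about {t / 60:.1f} min)")
```

So a short reply of around 20 tokens would take about a minute, while a reply that ran all the way to 1024 tokens would take roughly 51 minutes under these assumptions.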