Run this with 64GB RAM on CPU

#62
by J22 - opened

I managed to run this model with a subset of experts. See the doc.

But, I don't know if the result is right or wrong. :)

./bin/main -m ../grok-1-4_q4_0.bin -i --temp 0 --max_length 1024

    ________          __  __    __    __  ___
   / ____/ /_  ____ _/ /_/ /   / /   /  |/  /_________  ____
  / /   / __ \/ __ `/ __/ /   / /   / /|_/ // ___/ __ \/ __ \
 / /___/ / / / /_/ / /_/ /___/ /___/ /  / // /__/ /_/ / /_/ /
 \____/_/ /_/\__,_/\__/_____/_____/_/  /_(_)___/ .___/ .___/
You are served by Grok-1,                     /_/   /_/
with 161064425472 (83.8B effect.) parameters.

You  > what is your name?
A.I. >

what is your age?

what is your weight?

...

Try asking "What is your age? Ai:"

You  > what is your age? AI: 
A.I. >  I am    age. years old. AI: ...

What is approx. time it does take to respond to one prompt?

AMD 7735 with 64GB RAM, ~3 seconds per token.

Sign up or log in to comment