Do you have a hidden, massive storage leak thanks to HF hub model and dataset revisions piling up and never getting deleted automatically?
Here is how to delete all old revisions and keep only main, in a few quick steps and with no tedious manual editing.
In terminal A:
$ pip install -U "huggingface_hub[cli]"
$ huggingface-cli delete-cache --disable-tui
File to edit: /tmp/tmpundr7lky.txt
0 revisions selected counting for 0.0. Continue ? (y/N)
Do not answer the prompt yet; leave it waiting and continue with the steps below.
(Note: your tmp file will have a different path, so adjust it in the commands below.)
In terminal B:
$ cp /tmp/tmpundr7lky.txt cache.txt
$ perl -pi -e 's|^#(.*detached.*)|$1|' cache.txt
$ cat cache.txt >> /tmp/tmpundr7lky.txt
The perl one-liner uncommented every line containing (detached), i.e. the revisions that no ref such as main points to anymore, so they can be wiped out. We then appended the result to the tmp file that huggingface-cli expects to be edited.
Now go back to terminal A and hit: N, Y, Y (the first N makes it re-read the edited file), so it looks like:
0 revisions selected counting for 0.0. Continue ? (y/N) n
89 revisions selected counting for 211.7G. Continue ? (y/N) y
89 revisions selected counting for 211.7G. Confirm deletion ? (Y/n) y
Done.
If you messed up answering the prompts, you still have the cache.txt file, which you can feed again to the new tmp file that gets created when you run huggingface-cli delete-cache --disable-tui again.
For more details and additional techniques please see https://github.com/stas00/ml-engineering/tree/master/storage#huggingface-hub-caches