Evaluation using UltraEval

#3 opened by hustzxd

Hello, thank you for providing the ProSparse models. I intend to evaluate them using UltraEval. However, UltraEval relies on vLLM, which unfortunately does not support 'SparseLlamaForCausalLM':

ValueError: Model architectures ['SparseLlamaForCausalLM'] are not supported for now.

Could you please share the vLLM code that supports SparseLlamaForCausalLM? Thanks.

Sorry for the trouble. Here are the steps for adapting vLLM to ProSparse models.

  1. Replace the file vllm/model_executor/models/llama.py in original vLLM with this file.
  2. Replace the original config.json of Huggingface ProSparse-LLaMA-2-7B with this file.
  3. Set the environment variable ACT_INFO. To test the version without activation threshold shifting, export ACT_INFO=relu. To test the version with activation threshold shifting, export ACT_INFO=fatrelu_0.01. (A minimal usage sketch follows below.)
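For reference, here is a minimal sketch of what the adapted setup might look like in Python. It assumes the patched llama.py and config.json from steps 1 and 2 are already in place; the repo id SparseLLM/prosparse-llama-2-7b, the prompt, and the sampling settings are only illustrative, so adjust them to your local copy and use case.

```python
import os

# Step 3: pick the activation variant *before* vLLM loads the model.
# "relu" = no activation threshold shifting, "fatrelu_0.01" = with shifting.
os.environ["ACT_INFO"] = "fatrelu_0.01"

from vllm import LLM, SamplingParams

# Assumed model id; point this at your local ProSparse-LLaMA-2-7B checkout
# that contains the replaced config.json from step 2.
llm = LLM(model="SparseLLM/prosparse-llama-2-7b", trust_remote_code=True)

outputs = llm.generate(
    ["The capital of France is"],
    SamplingParams(temperature=0.0, max_tokens=32),
)
print(outputs[0].outputs[0].text)
```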

If you encounter any other problems, do not hesitate to contact us!

Thanks very much. I got an MMLU result of 0.4536, which aligns with the paper. However, upon loading the ProSparse models in Hugging Face format and evaluating them with https://github.com/EleutherAI/lm-evaluation-harness, the results were unsatisfactory. I will check the problem again.
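For comparison, an MMLU run with recent 0.4.x versions of lm-evaluation-harness can be launched roughly as sketched below; the repo id and few-shot setting are assumptions and may need to match the paper's setup.

```python
import lm_eval

# Assumed repo id; trust_remote_code is needed for the custom SparseLlama classes.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=SparseLLM/prosparse-llama-2-7b,trust_remote_code=True",
    tasks=["mmlu"],
    num_fewshot=5,
)
print(results["results"])
```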

SparseLLMs org

Thank you for the response! In the Evaluation Issues with LM-Eval section of our model card, we have already listed some potential problems with LM-Eval. You can check these tips for reference.

Thank you very much.

hustzxd changed discussion status to closed
