microsoft/Phi-3-small-8k-instruct

Text Generation

custom GEGLU implementation

#32 opened 7 months ago by

Independent evaluation results

#30 opened 10 months ago by

Getting the error: "triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 180224, Hardware limit: 166912. Reducing block sizes or `num_stages` may help."

#27 opened about 1 year ago by

Why is inference so slow compared with Qwen at the same 7B parameter count?

#26 opened about 1 year ago by

Upload triton_flash_blocksparse_attn.py

#25 opened about 1 year ago by

Phi-3-small doesn't load with TGI

#24 opened about 1 year ago by

Multi-GPU training fails when using device_map = "auto"

#23 opened about 1 year ago by

Shared memory error

#15 opened about 1 year ago by