Triangle104 committed on
Commit c2315c1 · verified · 1 Parent(s): c961a1e

Update README.md

Files changed (1): README.md (+152, -0)
README.md CHANGED
@@ -17,6 +17,158 @@ language:
  This model was converted to GGUF format from [`Spestly/Athena-1-7B`](https://huggingface.co/Spestly/Athena-1-7B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/Spestly/Athena-1-7B) for more details on the model.
 
+ ---
+ Model details:
+
+ Athena-1 is a fine-tuned, instruction-following large language model derived from Qwen/Qwen2.5-7B-Instruct. Designed to balance efficiency and performance, Athena-1-7B provides powerful text-generation capabilities, making it suitable for a variety of real-world applications, including conversational AI, content creation, and structured data processing.
+
+ Key Features
+
+ 🚀 Enhanced Performance
+
+ Instruction Following: Fine-tuned for excellent adherence to user prompts and instructions.
+ Coding and Mathematics: Proficient in solving coding problems and mathematical reasoning.
+ Lightweight: At 7.62 billion parameters, Athena-1-7B offers powerful performance while maintaining efficiency.
+
+ 📖 Long-Context Understanding
+
+ Context Length: Supports up to 128K tokens, ensuring accurate handling of large documents or conversations.
+ Token Generation: Can generate up to 8K tokens of output.
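A minimal sketch (not part of the original card) of exercising these limits with the Transformers pipeline shown in the Quickstart below; the input file name and generation settings are illustrative assumptions, with `max_new_tokens` capped at the card's stated 8K output limit:

```python
# Illustrative only: summarize a long local file within the card's stated
# limits (up to ~128K input tokens, up to 8K generated tokens).
from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-7B")

with open("report.txt") as f:  # hypothetical long document
    long_document = f.read()

messages = [
    {"role": "user", "content": "Summarize the following document:\n\n" + long_document},
]
out = pipe(messages, max_new_tokens=8192)  # stay within the 8K output cap
print(out[0]["generated_text"][-1]["content"])  # last turn is the model's reply
```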
+
+ 🌍 Multilingual Support
+
+ Supports 29+ languages, including:
+ English, Chinese, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
+
+ 📊 Structured Data & Outputs
+
+ Structured Data Interpretation: Understands and processes structured formats like tables and JSON.
+ Structured Output Generation: Generates well-formatted outputs, including JSON and other structured formats.
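As a hedged illustration (not from the original card), one way to request and parse JSON output; the prompt, keys, and parsing strategy are assumptions:

```python
# Illustrative only: ask for JSON-formatted output and parse it.
import json
from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-7B")

messages = [
    {"role": "user", "content": 'Extract the name and year from: "Ada Lovelace, born 1815." '
                                'Reply with JSON only, using the keys "name" and "year".'},
]
out = pipe(messages, max_new_tokens=128)
reply = out[0]["generated_text"][-1]["content"]

data = json.loads(reply)  # may fail if the model wraps the JSON in extra text
print(data["name"], data["year"])
```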
+
+ Model Details
+
+ Base Model: Qwen/Qwen2.5-7B-Instruct
+ Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias.
+ Parameters: 7.62B total (6.53B non-embedding).
+ Layers: 28
+ Attention Heads: 28 for Q, 4 for KV.
+ Context Length: Up to 131,072 tokens.
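A small sketch (not part of the original card) of checking these figures against the checkpoint's configuration with Transformers; the attribute names follow the Qwen2-style config and are assumptions here:

```python
# Illustrative only: inspect the config behind the figures listed above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Spestly/Athena-1-7B")
print(config.num_hidden_layers)        # layers (card lists 28)
print(config.num_attention_heads)      # query heads (card lists 28)
print(config.num_key_value_heads)      # KV heads (card lists 4)
print(config.max_position_embeddings)  # context length as configured in the checkpoint
```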
+
+ Applications
+
+ Athena-1 is designed for a broad range of use cases:
+
+ Conversational AI: Create natural, human-like chatbot experiences.
+ Code Generation: Generate, debug, or explain code snippets.
+ Mathematical Problem Solving: Assist with complex calculations and reasoning.
+ Document Processing: Summarize or analyze large documents.
+ Multilingual Applications: Support for diverse languages for translation and global use cases.
+ Structured Data: Process and generate structured data, including tables and JSON.
+
+ Quickstart
+
+ Here’s how you can use Athena-1-7B for quick text generation:
+
+ # Use a pipeline as a high-level helper
+ from transformers import pipeline
+
+ messages = [
+     {"role": "user", "content": "Who are you?"},
+ ]
+ pipe = pipeline("text-generation", model="Spestly/Athena-1-7B")
+ pipe(messages)
+
+ # Load model directly
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("Spestly/Athena-1-7B")
+ model = AutoModelForCausalLM.from_pretrained("Spestly/Athena-1-7B")
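A possible follow-up (not part of the original card) showing one way to run a chat turn with the directly loaded model and tokenizer; the dtype, device handling, and generation settings are illustrative assumptions:

```python
# Illustrative only: generate a reply with the directly loaded model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Spestly/Athena-1-7B")
model = AutoModelForCausalLM.from_pretrained(
    "Spestly/Athena-1-7B", torch_dtype="auto", device_map="auto"  # device_map needs `accelerate`
)

messages = [{"role": "user", "content": "Who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```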
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)