prithivMLmods committed on
Commit 9d2039a · verified · 1 Parent(s): 0af2b7a

Update README.md

Files changed (1):
  1. README.md +64 -1

tags:
  - thinker
  - llama
  - express
---
# **Llama-Express.1-Tiny**

Llama-Express.1-Tiny is a 1B-parameter model based on Llama 3.2 (1B), fine-tuned on long chain-of-thought "thinker" datasets. This instruction-tuned, text-only model is optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks, and outperforms many available open-source and closed chat models.

# **Use with transformers**

Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or the Auto classes with the `generate()` function.

Make sure to update your Transformers installation via `pip install --upgrade transformers`.

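If you are unsure whether your environment meets this requirement, a quick check (a minimal sketch; it assumes the `packaging` library, which ships with most Python environments, is importable):

```python
from importlib.metadata import PackageNotFoundError, version
from packaging.version import Version

REQUIRED = Version("4.43.0")

try:
    installed = Version(version("transformers"))
except PackageNotFoundError:
    installed = None  # transformers is not installed at all

if installed is None or installed < REQUIRED:
    print("Run: pip install --upgrade transformers")
else:
    print(f"transformers {installed} satisfies >= {REQUIRED}")
```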
```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Llama-Express.1-Tiny"

# Load the model in bfloat16 and place it automatically on the available device(s)
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A chat-style request: a system persona followed by a user turn
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)

# The pipeline returns the whole conversation; the last entry is the model's reply
print(outputs[0]["generated_text"][-1])
```
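The Auto classes mentioned above can be used instead of the pipeline. A minimal sketch, wrapped in a hypothetical `chat()` helper so the model only loads when the function is called; the decoding step strips the prompt tokens so only the new reply is returned:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def chat(messages, model_id="prithivMLmods/Llama-Express.1-Tiny", max_new_tokens=256):
    """Run one conversational turn with the Auto classes and generate()."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    # Render the chat template; add_generation_prompt appends the assistant header
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt
    return tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

Usage mirrors the pipeline example, e.g. `print(chat([{"role": "user", "content": "Who are you?"}]))`.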

# **Intended Use**
1. **Multilingual Dialogue**:
   - Designed for high-quality, multilingual conversations, making it suitable for applications requiring natural, fluid dialogue across languages.

2. **Agentic Retrieval**:
   - Optimized for retrieval-based tasks where reasoning and contextual chaining are crucial for extracting and summarizing relevant information.

3. **Summarization Tasks**:
   - Effective in generating concise and accurate summaries from complex and lengthy texts, suitable for academic, professional, and casual use cases.

4. **Instruction-Following Applications**:
   - Fine-tuned for tasks requiring adherence to user-provided instructions, making it ideal for automation workflows, content creation, and virtual assistant integrations.

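The summarization use case above can be driven through the same chat pipeline by wrapping the text in a suitable request. A minimal sketch — the helper name and system prompt below are illustrative choices, not part of the model's training setup:

```python
def build_summary_messages(text, max_words=60):
    """Build a chat request asking the model to summarize `text`.

    The system prompt and word budget are illustrative, not prescribed.
    """
    return [
        {"role": "system", "content": "You are a precise assistant that writes faithful summaries."},
        {"role": "user", "content": f"Summarize the following in at most {max_words} words:\n\n{text}"},
    ]


# Feed the result to the text-generation pipeline shown earlier, e.g.:
# outputs = pipe(build_summary_messages(article), max_new_tokens=256)
# print(outputs[0]["generated_text"][-1])
```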
# **Limitations**
1. **Text-Only Modality**:
   - As a text-only model, it cannot process multimodal inputs such as images, audio, or video, limiting its use in multimedia applications.

2. **Context-Length Constraints**:
   - While optimized for long chain-of-thought reasoning, very large contexts may still lead to degraded performance or truncation issues.

3. **Bias and Ethics**:
   - The model may reflect biases present in its training datasets, potentially producing outputs that are culturally insensitive or inappropriate.

4. **Performance in Low-Resource Languages**:
   - Although multilingual, its effectiveness may vary across languages, with possible performance drops in underrepresented or low-resource languages.

5. **Dependency on Input Quality**:
   - Output quality depends heavily on the clarity and specificity of the input instructions; ambiguous or vague prompts may lead to suboptimal results.

6. **No Real-Time Internet Access**:
   - Without real-time retrieval capabilities, it cannot provide up-to-date information or verify facts against the latest data.