Update app.py
app.py
CHANGED
@@ -6,12 +6,14 @@ api = gr.Interface.load("models/bigscience/bloom")
 
 def complete_with_gpt(text):
     # Use the last 50 characters of the text as context
-    return text[:-50] + api(text[-50:])
+    # return text[:-50] + api(text[-50:])
+    # Use the last 100 characters of the text as context
+    return text[:-100] + api(text[-100:])
 
 
 with gr.Blocks() as demo:
     with gr.Row():
-        textbox = gr.Textbox(placeholder="Type here and press enter...", lines=
+        textbox = gr.Textbox(placeholder="Type here and press enter...", lines=14)
         with gr.Column():
             btn = gr.Button("Generate")
 
@@ -19,9 +21,10 @@ with gr.Blocks() as demo:
 
     with gr.Row():
         gr.Markdown("""
-
+
+# Big Science and Huggingface create 176 Billion Parameter Transformer Large Language Model
 
-## Bloom Is Setting New Record for Most Performant and Efficient AI Model for Science Ever!
+## Bloom Is Setting A New Record for Most Performant and Efficient AI Model for Science Ever!
 
 Bloom stands for:
 B: Big Science
@@ -30,7 +33,7 @@ O: Open Science
 O: Open Access
 M: Multi Lingual Language Model
 
-1. Video Playlist
+1. [Video Playlist](https://www.youtube.com/playlist?list=PLHgX2IExbFouqnsIqziThlPCX_miiDq14)
 2. Summary of Important Models and Sizes:
 
 # Model Sizes to Date
@@ -54,8 +57,6 @@ DistilBERT|66 million
 
 3. Background Information on ChatGPT, Bloom from BigScience on HuggingFace Platform, and RLHF DeepRL and One to Few Shot Learning and Generators:
 
-
-
 # ChatGPT Datasets:
 1. WebText
 2. Common Crawl
@@ -64,43 +65,41 @@ DistilBERT|66 million
 5. Toronto Books Corpus
 6. OpenWebText
 
-# Comparison to BigScience Model
+# Comparison to BigScience Model - Big Science - How to get started
 
-
+Big Science is a 176B parameter ML model trained on a set of datasets for Natural Language processing, and many other tasks that are not yet explored..
+Below is the set of the papers, models, links, and datasets around big science which promises to be the best,
+most recent large model of its kind benefitting all science pursuits.
 
-
-
-# Model: https://huggingface.co/bigscience/bloom
+# [Model](https://huggingface.co/bigscience/bloom)
 
 # Papers:
-1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model https://arxiv.org/abs/2211.05100
-2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism https://arxiv.org/abs/1909.08053
-3. 8-bit Optimizers via Block-wise Quantization https://arxiv.org/abs/2110.02861
-4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation https://arxiv.org/abs/2108.12409
-5. https://huggingface.co/models?other=doi:10.57967/hf/0003
-6. 217 Other Models optimizing use of bloom via specialization: https://huggingface.co/models?other=bloom
+1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
+2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
+3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
+4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
+5. [Paper](https://huggingface.co/models?other=doi:10.57967/hf/0003)
+6. 217 Other Models optimizing use of bloom via specialization: [Paper](https://huggingface.co/models?other=bloom)
 
 # Datasets
-1. Universal Dependencies
-2. WMT 2014
-3. The Pile
-4. HumanEval
-5. FLORES-101
-6. CrowS-Pairs
-7. WikiLingua
-8. MTEB
-9. xP3
-10. DiaBLa
+1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
+2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
+3. [The Pile](https://paperswithcode.com/dataset/the-pile)
+4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
+5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
+6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
+7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
+8. [MTEB](https://paperswithcode.com/dataset/mteb)
+9. [xP3](https://paperswithcode.com/dataset/xp3)
+10. [DiaBLa](https://paperswithcode.com/dataset/diabla)
 
 # Deep RL ML Strategy
-
 1. Language Model Preparation, Human Augmented with Supervised Fine Tuning
 2. Reward Model Training with Prompts Dataset Multi-Model Generate Data to Rank
 3. Fine Tuning with Reinforcement Reward and Distance Distribution Regret Score
 4. Proximal Policy Optimization Fine Tuning
 
 # Variations - Preference Model Pretraining
-
 1. Use Ranking Datasets Sentiment - Thumbs Up/Down, Distribution
 2. Online Version Getting Feedback
 3. OpenAI - InstructGPT - Humans generate LM Training Text