abhishekmaroon5 committed on
Commit 146fb0b · verified · 1 Parent(s): dce6539

Upload Lex Fridman chatbot model

README.md ADDED
@@ -0,0 +1,140 @@
---
language: en
license: mit
tags:
- conversational
- chatbot
- lex-fridman
- dialogue
- pytorch
- transformers
- fine-tuned
base_model: microsoft/DialoGPT-medium
datasets:
- custom
widget:
- text: "Human: What do you think about artificial intelligence?\n\nLex:"
  example_title: "AI Discussion"
- text: "Human: How do you approach difficult conversations?\n\nLex:"
  example_title: "Conversation Approach"
- text: "Human: What's your view on the future of technology?\n\nLex:"
  example_title: "Future of Technology"
---

# Lex Fridman AI Chatbot

This is a fine-tuned conversational AI model that mimics the conversational style of Lex Fridman, host of the Lex Fridman Podcast. It was fine-tuned on podcast transcript data to capture his thoughtful, philosophical, and engaging way of speaking.

## Model Details

- **Base Model**: microsoft/DialoGPT-medium
- **Model Type**: Conversational AI / Chatbot
- **Language**: English
- **Parameters**: ~355M (DialoGPT-medium)
- **Training Data**: Lex Fridman podcast transcripts
- **Fine-tuning**: Custom dialogue dataset of Human-Lex conversation pairs (format sketched below)

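The conversation pairs mentioned above follow the same `Human:` / `Lex:` layout used in the widget prompts and in the usage code further down. A single training example is assumed to look roughly like this (the text is made up for illustration, not a transcript excerpt):

```python
# Illustrative shape of one fine-tuning example. The eos token marks the end of
# Lex's reply (assumed convention, consistent with the tokenizer's chat_template,
# which appends eos_token after each message).
example = (
    "Human: What do you think about artificial intelligence?\n\n"
    "Lex: I think it is one of the most fascinating problems we get to work on."
    "<|endoftext|>"
)
```
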
## Training Details

- **Training Examples**: 3,100
- **Validation Examples**: 345
- **Training Duration**: ~20.5 minutes (0:20:30)
- **Training Loss**: 4.1058 (run average; eval loss ~3.71 at step 500)
- **Epochs**: 5
- **Hardware**: 2x NVIDIA A100 GPUs
- **Framework**: PyTorch + Transformers (a comparable `Trainer` setup is sketched below)

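The training script is not part of this upload, but `trainer_state.json` and `training_summary.json` in this repository pin down the key settings: 5 epochs, a peak learning rate of 3e-5, per-device batch size 2, evaluation every 500 steps, and early stopping with patience 3. A minimal sketch of a comparable `Trainer` setup (with a made-up toy example standing in for the real transcript pairs) could look like this:

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Toy stand-in for the real Human/Lex transcript pairs (not included in this repo).
examples = {
    "text": [
        "Human: What do you think about artificial intelligence?\n\n"
        "Lex: It is a fascinating question." + tokenizer.eos_token
    ]
}
dataset = Dataset.from_dict(examples).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=["text"],
)

# Hyperparameters below are read off trainer_state.json / training_summary.json.
args = TrainingArguments(
    output_dir="models/lex_chatbot",
    num_train_epochs=5,
    per_device_train_batch_size=2,
    learning_rate=3e-5,
    eval_strategy="steps",
    eval_steps=500,
    logging_steps=50,
    save_steps=1000,
    bf16=True,                      # matches "torch_dtype": "bfloat16" in config.json
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    eval_dataset=dataset,           # the real run used a separate 345-example split
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```
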
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("your-username/lex-fridman-chatbot")
model = AutoModelForCausalLM.from_pretrained("your-username/lex-fridman-chatbot")

# Generate response
def chat_with_lex(question):
    prompt = f"Human: {question}\n\nLex:"
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_length=inputs.shape[1] + 100,
            temperature=0.8,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id
        )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("Lex:")[-1].strip()

# Example usage
response = chat_with_lex("What do you think about artificial intelligence?")
print(response)
```

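The helper above handles a single question. For a back-and-forth exchange, one option (an assumption, not something shipped with the model) is to keep the running transcript in the same `Human:` / `Lex:` format and re-feed it on every turn, reusing the `tokenizer` and `model` loaded above:

```python
def chat_multi_turn(history, question, max_new_tokens=100):
    """history: list of (human, lex) tuples from earlier turns."""
    prompt = ""
    for human, lex in history:
        prompt += f"Human: {human}\n\nLex: {lex}\n\n"
    prompt += f"Human: {question}\n\nLex:"

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,                      # passes attention_mask along with input_ids
            max_new_tokens=max_new_tokens,
            temperature=0.8,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    reply = text.split("Lex:")[-1].strip()
    history.append((question, reply))
    return reply

history = []
print(chat_multi_turn(history, "What do you think about artificial intelligence?"))
print(chat_multi_turn(history, "And where do you see it in ten years?"))
```
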
## Sample Conversations

**Human**: What do you think about artificial intelligence?
**Lex**: I think it's a big idea that we're seeing in the way computers are becoming self aware.

**Human**: How do you approach difficult conversations?
**Lex**: You approach them with your hands and you have a conversation with the viewer.

## Web Interface

The model ships with a Streamlit web interface for interactive use:

```bash
streamlit run web_app/lex_chatbot_app.py
```

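The `web_app/lex_chatbot_app.py` script referenced above is not included in this upload. A minimal sketch of what such a Streamlit front end might look like (the layout, widget names, and caching choices are assumptions) is:

```python
# web_app/lex_chatbot_app.py -- illustrative sketch, not the shipped app
import streamlit as st
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-username/lex-fridman-chatbot"  # placeholder repo id, as in the README

@st.cache_resource
def load_model():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()
    return tokenizer, model

st.title("Lex Fridman AI Chatbot")
tokenizer, model = load_model()

question = st.text_input("Ask Lex something:")
if st.button("Generate") and question:
    prompt = f"Human: {question}\n\nLex:"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=100,
            do_sample=True,
            temperature=0.8,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )
    reply = tokenizer.decode(outputs[0], skip_special_tokens=True).split("Lex:")[-1].strip()
    st.write(reply)
```
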
## Model Architecture

- **Architecture**: GPT-2 based transformer (DialoGPT)
- **Vocabulary Size**: 50,257 tokens
- **Context Length**: 1,024 tokens
- **Precision**: BF16 weights (`torch_dtype: bfloat16`) for efficient inference
- **Multi-GPU**: Supports distributed inference (see the loading sketch below)

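No multi-GPU inference script ships with this upload. One common way to load the BF16 weights and let them be placed across whatever GPUs are visible is Accelerate's `device_map="auto"` (an assumed setup, not a documented requirement; it needs `pip install accelerate`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-username/lex-fridman-chatbot")
model = AutoModelForCausalLM.from_pretrained(
    "your-username/lex-fridman-chatbot",
    torch_dtype=torch.bfloat16,   # matches "torch_dtype": "bfloat16" in config.json
    device_map="auto",            # single- or multi-GPU placement handled by Accelerate
)

inputs = tokenizer("Human: What is consciousness?\n\nLex:", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=80,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
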
## Limitations and Biases

- The model is trained on a limited dataset from podcast transcripts
- May reflect biases present in the training data
- Responses are generated based on patterns learned from transcripts
- Not suitable for providing factual information or professional advice
- Best used for casual conversation and entertainment

## Ethical Considerations

- This model is for educational and entertainment purposes
- Does not represent the actual views or opinions of Lex Fridman
- Should not be used to impersonate or misrepresent the real person
- Users should be transparent about AI-generated content

## Citation

```bibtex
@misc{lex-fridman-chatbot,
  title={Lex Fridman AI Chatbot},
  author={Your Name},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/your-username/lex-fridman-chatbot}
}
```

## License

MIT License - See LICENSE file for details.

## Acknowledgments

- Lex Fridman for the inspiring podcast content
- Microsoft for the DialoGPT base model
- Hugging Face for the transformers library
- The open-source AI community
checkpoint-965/config.json ADDED
@@ -0,0 +1,38 @@
{
  "_name_or_path": "microsoft/DialoGPT-medium",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_inner": null,
  "n_layer": 24,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "conversational": {
      "max_length": 1000
    }
  },
  "torch_dtype": "bfloat16",
  "transformers_version": "4.46.2",
  "use_cache": true,
  "vocab_size": 50257
}
checkpoint-965/generation_config.json ADDED
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.46.2"
}
checkpoint-965/merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
checkpoint-965/model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8913fc51f3d8b7ad1e272dbcc060fc84d4a5da6714b71920e47e79462291df90
size 709676664
checkpoint-965/optimizer.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a4fea19ed2d0a009b3aeba1700053d91a99cf9e0a0bf6ce08c3db0de6fa4c266
size 1419536634
checkpoint-965/rng_state.pth ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:37b4712e69b70b808f19db8c1b2273bf2a7379d41a2b4a53635c9d2fcd761d5d
size 14244
checkpoint-965/scheduler.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1717724523643f30deded7cd205d277ea4e367e8f3b5dffcb921f61c48f2363a
size 1064
checkpoint-965/special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<|endoftext|>",
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
checkpoint-965/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
checkpoint-965/tokenizer_config.json ADDED
@@ -0,0 +1,23 @@
{
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "50256": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|endoftext|>",
  "chat_template": "{% for message in messages %}{{ message.content }}{{ eos_token }}{% endfor %}",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "errors": "replace",
  "model_max_length": 1024,
  "pad_token": "<|endoftext|>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
checkpoint-965/trainer_state.json ADDED
@@ -0,0 +1,184 @@
{
  "best_metric": null,
  "best_model_checkpoint": null,
  "epoch": 4.985806451612904,
  "eval_steps": 500,
  "global_step": 965,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {
      "epoch": 0.25806451612903225,
      "grad_norm": 60.0,
      "learning_rate": 1.5e-05,
      "loss": 6.9241,
      "step": 50
    },
    {
      "epoch": 0.5161290322580645,
      "grad_norm": 28.375,
      "learning_rate": 3e-05,
      "loss": 6.2276,
      "step": 100
    },
    {
      "epoch": 0.7741935483870968,
      "grad_norm": 20.125,
      "learning_rate": 2.8265895953757228e-05,
      "loss": 4.8431,
      "step": 150
    },
    {
      "epoch": 1.033548387096774,
      "grad_norm": 18.0,
      "learning_rate": 2.653179190751445e-05,
      "loss": 4.1853,
      "step": 200
    },
    {
      "epoch": 1.2916129032258064,
      "grad_norm": 15.4375,
      "learning_rate": 2.4797687861271675e-05,
      "loss": 3.9644,
      "step": 250
    },
    {
      "epoch": 1.5496774193548388,
      "grad_norm": 24.0,
      "learning_rate": 2.3063583815028902e-05,
      "loss": 3.838,
      "step": 300
    },
    {
      "epoch": 1.8077419354838709,
      "grad_norm": 18.125,
      "learning_rate": 2.132947976878613e-05,
      "loss": 3.7908,
      "step": 350
    },
    {
      "epoch": 2.067096774193548,
      "grad_norm": 13.8125,
      "learning_rate": 1.9595375722543353e-05,
      "loss": 3.7595,
      "step": 400
    },
    {
      "epoch": 2.3251612903225807,
      "grad_norm": 19.625,
      "learning_rate": 1.7861271676300577e-05,
      "loss": 3.7634,
      "step": 450
    },
    {
      "epoch": 2.5832258064516127,
      "grad_norm": 12.4375,
      "learning_rate": 1.6127167630057804e-05,
      "loss": 3.7074,
      "step": 500
    },
    {
      "epoch": 2.5832258064516127,
      "eval_loss": 3.7095940113067627,
      "eval_perplexity": 40.784299067771805,
      "eval_runtime": 216.8944,
      "eval_samples_per_second": 1.591,
      "eval_steps_per_second": 0.798,
      "step": 500
    },
    {
      "epoch": 2.841290322580645,
      "grad_norm": 14.875,
      "learning_rate": 1.439306358381503e-05,
      "loss": 3.6857,
      "step": 550
    },
    {
      "epoch": 3.1006451612903225,
      "grad_norm": 14.4375,
      "learning_rate": 1.2658959537572255e-05,
      "loss": 3.6973,
      "step": 600
    },
    {
      "epoch": 3.358709677419355,
      "grad_norm": 23.25,
      "learning_rate": 1.092485549132948e-05,
      "loss": 3.7022,
      "step": 650
    },
    {
      "epoch": 3.616774193548387,
      "grad_norm": 14.5,
      "learning_rate": 9.190751445086706e-06,
      "loss": 3.6774,
      "step": 700
    },
    {
      "epoch": 3.8748387096774195,
      "grad_norm": 14.75,
      "learning_rate": 7.45664739884393e-06,
      "loss": 3.6879,
      "step": 750
    },
    {
      "epoch": 4.134193548387096,
      "grad_norm": 15.625,
      "learning_rate": 5.722543352601156e-06,
      "loss": 3.6654,
      "step": 800
    },
    {
      "epoch": 4.392258064516129,
      "grad_norm": 13.8125,
      "learning_rate": 3.988439306358381e-06,
      "loss": 3.6742,
      "step": 850
    },
    {
      "epoch": 4.650322580645161,
      "grad_norm": 16.75,
      "learning_rate": 2.254335260115607e-06,
      "loss": 3.6861,
      "step": 900
    },
    {
      "epoch": 4.908387096774193,
      "grad_norm": 17.5,
      "learning_rate": 5.202312138728324e-07,
      "loss": 3.6569,
      "step": 950
    }
  ],
  "logging_steps": 50,
  "max_steps": 965,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 5,
  "save_steps": 1000,
  "stateful_callbacks": {
    "EarlyStoppingCallback": {
      "args": {
        "early_stopping_patience": 3,
        "early_stopping_threshold": 0.0
      },
      "attributes": {
        "early_stopping_patience_counter": 0
      }
    },
    "TrainerControl": {
      "args": {
        "should_epoch_stop": false,
        "should_evaluate": false,
        "should_log": false,
        "should_save": true,
        "should_training_stop": true
      },
      "attributes": {}
    }
  },
  "total_flos": 2.867827744702464e+16,
  "train_batch_size": 2,
  "trial_name": null,
  "trial_params": null
}
checkpoint-965/training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3849d00172e1d207cba3f4ac3120165d22f71ccdd4fed2cd97e66dfc3358dc2a
size 5240
checkpoint-965/vocab.json ADDED
The diff for this file is too large to render. See raw diff
 
config.json ADDED
@@ -0,0 +1,38 @@
{
  "_name_or_path": "microsoft/DialoGPT-medium",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_inner": null,
  "n_layer": 24,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "conversational": {
      "max_length": 1000
    }
  },
  "torch_dtype": "bfloat16",
  "transformers_version": "4.46.2",
  "use_cache": true,
  "vocab_size": 50257
}
generation_config.json ADDED
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.46.2"
}
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8913fc51f3d8b7ad1e272dbcc060fc84d4a5da6714b71920e47e79462291df90
size 709676664
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<|endoftext|>",
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,23 @@
{
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "50256": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|endoftext|>",
  "chat_template": "{% for message in messages %}{{ message.content }}{{ eos_token }}{% endfor %}",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "errors": "replace",
  "model_max_length": 1024,
  "pad_token": "<|endoftext|>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3849d00172e1d207cba3f4ac3120165d22f71ccdd4fed2cd97e66dfc3358dc2a
size 5240
training_summary.json ADDED
@@ -0,0 +1,11 @@
{
  "model_name": "microsoft/DialoGPT-medium",
  "training_loss": 4.105761517144238,
  "training_duration": "0:20:30.474449",
  "num_epochs": 5,
  "num_gpus": 2,
  "training_examples": 3100,
  "validation_examples": 345,
  "completed_at": "2025-06-20T04:32:48.471531",
  "output_dir": "models/lex_chatbot"
}
vocab.json ADDED
The diff for this file is too large to render. See raw diff