prince-canuma committed on
Commit 32f1666 · verified · 1 Parent(s): da977b0

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +1 -0
  2. README.md +189 -0
  3. chat_template.json +3 -0
  4. config.json +330 -0
  5. generation_config.json +13 -0
  6. model-00001-of-00072.safetensors +3 -0
  7. model-00002-of-00072.safetensors +3 -0
  8. model-00003-of-00072.safetensors +3 -0
  9. model-00004-of-00072.safetensors +3 -0
  10. model-00005-of-00072.safetensors +3 -0
  11. model-00006-of-00072.safetensors +3 -0
  12. model-00007-of-00072.safetensors +3 -0
  13. model-00008-of-00072.safetensors +3 -0
  14. model-00009-of-00072.safetensors +3 -0
  15. model-00010-of-00072.safetensors +3 -0
  16. model-00011-of-00072.safetensors +3 -0
  17. model-00012-of-00072.safetensors +3 -0
  18. model-00013-of-00072.safetensors +3 -0
  19. model-00014-of-00072.safetensors +3 -0
  20. model-00015-of-00072.safetensors +3 -0
  21. model-00016-of-00072.safetensors +3 -0
  22. model-00017-of-00072.safetensors +3 -0
  23. model-00018-of-00072.safetensors +3 -0
  24. model-00019-of-00072.safetensors +3 -0
  25. model-00020-of-00072.safetensors +3 -0
  26. model-00021-of-00072.safetensors +3 -0
  27. model-00022-of-00072.safetensors +3 -0
  28. model-00023-of-00072.safetensors +3 -0
  29. model-00024-of-00072.safetensors +3 -0
  30. model-00025-of-00072.safetensors +3 -0
  31. model-00026-of-00072.safetensors +3 -0
  32. model-00027-of-00072.safetensors +3 -0
  33. model-00028-of-00072.safetensors +3 -0
  34. model-00029-of-00072.safetensors +3 -0
  35. model-00030-of-00072.safetensors +3 -0
  36. model-00031-of-00072.safetensors +3 -0
  37. model-00032-of-00072.safetensors +3 -0
  38. model-00033-of-00072.safetensors +3 -0
  39. model-00034-of-00072.safetensors +3 -0
  40. model-00035-of-00072.safetensors +3 -0
  41. model-00036-of-00072.safetensors +3 -0
  42. model-00037-of-00072.safetensors +3 -0
  43. model-00038-of-00072.safetensors +3 -0
  44. model-00039-of-00072.safetensors +3 -0
  45. model-00040-of-00072.safetensors +3 -0
  46. model-00041-of-00072.safetensors +3 -0
  47. model-00042-of-00072.safetensors +3 -0
  48. model-00043-of-00072.safetensors +3 -0
  49. model-00044-of-00072.safetensors +3 -0
  50. model-00045-of-00072.safetensors +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
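The added rule routes `tokenizer.json` through Git LFS, next to the patterns already in the file. As a rough sketch of how such rules select files, the snippet below matches base names with Python's `fnmatch`. This is only an approximation: Git's own attribute matching follows gitignore-style rules that `fnmatch` does not fully reproduce, and the list holds only the patterns visible in this hunk, not the whole file. The helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch

# Patterns visible in the hunk above; the full .gitattributes has more rules.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "tokenizer.json"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: match the file's base name against each pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pattern) for pattern in patterns)
```

With these four patterns, `tokenizer.json` matches the new rule while `config.json` stays a regular Git blob.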
README.md ADDED
@@ -0,0 +1,189 @@
1
+ ---
2
+ library_name: transformers
3
+ language:
4
+ - ar
5
+ - de
6
+ - en
7
+ - es
8
+ - fr
9
+ - hi
10
+ - id
11
+ - it
12
+ - pt
13
+ - th
14
+ - tl
15
+ - vi
16
+ base_model:
17
+ - meta-llama/Llama-4-Maverick-17B-128E
18
+ tags:
19
+ - facebook
20
+ - meta
21
+ - pytorch
22
+ - llama
23
+ - llama-4
24
+ - mlx
25
+ extra_gated_prompt: '**LLAMA 4 COMMUNITY LICENSE AGREEMENT**
26
+
27
+ Llama 4 Version Effective Date: April 5, 2025
28
+
29
+ "**Agreement**" means the terms and conditions for use, reproduction, distribution
30
+ and modification of the Llama Materials set forth herein.
31
+
32
+ "**Documentation**" means the specifications, manuals and documentation accompanying
33
+ Llama 4 distributed by Meta at [https://www.llama.com/docs/overview](https://llama.com/docs/overview).
34
+
35
+ "**Licensee**" or "**you**" means you, or your employer or any other person or entity
36
+ (if you are entering into this Agreement on such person or entity’s behalf), of
37
+ the age required under applicable laws, rules or regulations to provide legal consent
38
+ and that has legal authority to bind your employer or such other person or entity
39
+ if you are entering in this Agreement on their behalf.
40
+
41
+ "**Llama 4**" means the foundational large language models and software and algorithms,
42
+ including machine-learning model code, trained model weights, inference-enabling
43
+ code, training-enabling code, fine-tuning enabling code and other elements of the
44
+ foregoing distributed by Meta at [https://www.llama.com/llama-downloads](https://www.llama.com/llama-downloads).
45
+
46
+ "**Llama Materials**" means, collectively, Meta’s proprietary Llama 4 and Documentation
47
+ (and any portion thereof) made available under this Agreement.
48
+
49
+ "**Meta**" or "**we**" means Meta Platforms Ireland Limited (if you are located
50
+ in or, if you are an entity, your principal place of business is in the EEA or Switzerland)
51
+ and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). 
52
+
53
+ By clicking "I Accept" below or by using or distributing any portion or element
54
+ of the Llama Materials, you agree to be bound by this Agreement.
55
+
56
+ 1\. **License Rights and Redistribution**.
57
+
58
+ a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable
59
+ and royalty-free limited license under Meta’s intellectual property or other rights
60
+ owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy,
61
+ create derivative works of, and make modifications to the Llama Materials.  
62
+
63
+ b. Redistribution and Use.  
64
+
65
+ i. If you distribute or make available the Llama Materials (or any derivative works
66
+ thereof), or a product or service (including another AI model) that contains any
67
+ of them, you shall (A) provide a copy of this Agreement with any such Llama Materials;
68
+ and (B) prominently display "Built with Llama" on a related website, user interface,
69
+ blogpost, about page, or product documentation. If you use the Llama Materials or
70
+ any outputs or results of the Llama Materials to create, train, fine tune, or otherwise
71
+ improve an AI model, which is distributed or made available, you shall also include
72
+ "Llama" at the beginning of any such AI model name.
73
+
74
+ ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee
75
+ as part of an integrated end user product, then Section 2 of this Agreement will
76
+ not apply to you. 
77
+
78
+ iii. You must retain in all copies of the Llama Materials that you distribute the
79
+ following attribution notice within a "Notice" text file distributed as a part of
80
+ such copies: "Llama 4 is licensed under the Llama 4 Community License, Copyright
81
+ © Meta Platforms, Inc. All Rights Reserved."
82
+
83
+ iv. Your use of the Llama Materials must comply with applicable laws and regulations
84
+ (including trade compliance laws and regulations) and adhere to the Acceptable Use
85
+ Policy for the Llama Materials (available at [https://www.llama.com/llama4/use-policy](https://www.llama.com/llama4/use-policy)),
86
+ which is hereby incorporated by reference into this Agreement.    2\. **Additional
87
+ Commercial Terms**. If, on the Llama 4 version release date, the monthly active
88
+ users of the products or services made available by or for Licensee, or Licensee’s
89
+ affiliates, is greater than 700 million monthly active users in the preceding calendar
90
+ month, you must request a license from Meta, which Meta may grant to you in its
91
+ sole discretion, and you are not authorized to exercise any of the rights under
92
+ this Agreement unless or until Meta otherwise expressly grants you such rights.
93
+
94
+ 3\. **Disclaimer of Warranty**. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS
95
+ AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES
96
+ OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED,
97
+ INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY,
98
+ OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING
99
+ THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY
100
+ RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
101
+
102
+ 4\. **Limitation of Liability**. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE
103
+ UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY,
104
+ OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT,
105
+ SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META
106
+ OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
107
+
108
+ 5\. **Intellectual Property**.
109
+
110
+ a. No trademark licenses are granted under this Agreement, and in connection with
111
+ the Llama Materials, neither Meta nor Licensee may use any name or mark owned by
112
+ or associated with the other or any of its affiliates, except as required for reasonable
113
+ and customary use in describing and redistributing the Llama Materials or as set
114
+ forth in this Section 5(a). Meta hereby grants you a license to use "Llama" (the
115
+ "Mark") solely as required to comply with the last sentence of Section 1.b.i. You
116
+ will comply with Meta’s brand guidelines (currently accessible at [https://about.meta.com/brand/resources/meta/company-brand/](https://about.meta.com/brand/resources/meta/company-brand/)).
117
+ All goodwill arising out of your use of the Mark will inure to the benefit of Meta.
118
+
119
+ b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for
120
+ Meta, with respect to any derivative works and modifications of the Llama Materials
121
+ that are made by you, as between you and Meta, you are and will be the owner of
122
+ such derivative works and modifications.
123
+
124
+ c. If you institute litigation or other proceedings against Meta or any entity (including
125
+ a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or
126
+ Llama 4 outputs or results, or any portion of any of the foregoing, constitutes
127
+ infringement of intellectual property or other rights owned or licensable by you,
128
+ then any licenses granted to you under this Agreement shall terminate as of the
129
+ date such litigation or claim is filed or instituted. You will indemnify and hold
130
+ harmless Meta from and against any claim by any third party arising out of or related
131
+ to your use or distribution of the Llama Materials.
132
+
133
+ 6\. **Term and Termination**. The term of this Agreement will commence upon your
134
+ acceptance of this Agreement or access to the Llama Materials and will continue
135
+ in full force and effect until terminated in accordance with the terms and conditions
136
+ herein. Meta may terminate this Agreement if you are in breach of any term or condition
137
+ of this Agreement. Upon termination of this Agreement, you shall delete and cease
138
+ use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of
139
+ this Agreement. 
140
+
141
+ 7\. **Governing Law and Jurisdiction**. This Agreement will be governed and construed
142
+ under the laws of the State of California without regard to choice of law principles,
143
+ and the UN Convention on Contracts for the International Sale of Goods does not
144
+ apply to this Agreement. The courts of California shall have exclusive jurisdiction
145
+ of any dispute arising out of this Agreement.'
146
+ extra_gated_fields:
147
+ First Name: text
148
+ Last Name: text
149
+ Date of birth: date_picker
150
+ Country: country
151
+ Affiliation: text
152
+ Job title:
153
+ type: select
154
+ options:
155
+ - Student
156
+ - Research Graduate
157
+ - AI researcher
158
+ - AI developer/engineer
159
+ - Reporter
160
+ - Other
161
+ geo: ip_location
162
+ ? By clicking Submit below I accept the terms of the license and acknowledge that
163
+ the information I provide will be collected stored processed and shared in accordance
164
+ with the Meta Privacy Policy
165
+ : checkbox
166
+ extra_gated_description: The information you provide will be collected, stored, processed
167
+ and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
168
+ extra_gated_button_content: Submit
169
+ extra_gated_heading: Please be sure to provide your full legal name, date of birth,
170
+ and full organization name with all corporate identifiers. Avoid the use of acronyms
171
+ and special characters. Failure to follow these instructions may prevent you from
172
+ accessing this model and others on Hugging Face. You will not have the ability to
173
+ edit this form after submission, so please ensure all information is accurate.
174
+ license: other
175
+ license_name: llama4
176
+ ---
177
+
178
+ # mlx-community/Llama-4-Maverick-17B-128E-Instruct-4bit
179
+ This model was converted to MLX format from [`meta-llama/Llama-4-Maverick-17B-128E-Instruct`](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct) using mlx-vlm version **0.1.21**.
180
+ Refer to the [original model card](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct) for more details on the model.
181
+ ## Use with mlx
182
+
183
+ ```bash
184
+ pip install -U mlx-vlm
185
+ ```
186
+
187
+ ```bash
188
+ python -m mlx_vlm.generate --model mlx-community/Llama-4-Maverick-17B-128E-Instruct-4bit --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
189
+ ```
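The README command can also be assembled from Python, which helps when the prompt or image path comes from another program. The sketch below only builds the argument list, using the flags that appear in the README (`--model`, `--max-tokens`, `--temperature`, `--prompt`, `--image`). The function name `build_generate_command` and the file name `example.jpg` are hypothetical, and nothing is executed, since running the command would download the full set of weight shards.

```python
import shlex
import sys

MODEL_ID = "mlx-community/Llama-4-Maverick-17B-128E-Instruct-4bit"


def build_generate_command(image_path: str,
                           prompt: str = "Describe this image.",
                           max_tokens: int = 100,
                           temperature: float = 0.0) -> list:
    """Build the argv for the mlx_vlm.generate call shown in the README."""
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", MODEL_ID,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
        "--prompt", prompt,
        "--image", image_path,
    ]


# Print a shell-safe version of the command rather than running it.
print(shlex.join(build_generate_command("example.jpg")))
```

Passing the resulting list to `subprocess.run` would launch the same command as the README, assuming `mlx-vlm` is installed in the active environment.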
chat_template.json ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "chat_template": "{{- bos_token }}\n{%- if custom_tools is defined %}\n {%- set tools = custom_tools %}\n{%- endif %}\n{%- if not tools_in_user_message is defined %}\n {%- set tools_in_user_message = true %}\n{%- endif %}\n{%- if not date_string is defined %}\n {%- if strftime_now is defined %}\n {%- set date_string = strftime_now(\"%d %b %Y\") %}\n {%- else %}\n {%- set date_string = \"26 Jul 2024\" %}\n {%- endif %}\n{%- endif %}\n{%- if not tools is defined %}\n {%- set tools = none %}\n{%- endif %}\n\n{#- This block extracts the system message, so we can slot it into the right place. #}\n{%- if messages[0]['role'] == 'system' %} \n {%- if messages[0]['content'] is string %}\n {%- set system_message = messages[0]['content']|trim %}\n {%- else %}\n {#- FIXME: The processor requires an array, always. #}\n {%- set system_message = messages[0]['content'][0]['text']|trim %}\n {%- endif %}\n {%- set messages = messages[1:] %}\n {%- set user_supplied_system_message = true %}\n{%- else %}\n {%- set system_message = \"\" %}\n {%- set user_supplied_system_message = false %}\n{%- endif %}\n\n{#- System message if the user supplied one #}\n{%- if user_supplied_system_message %}\n {{- \"<|header_start|>system<|header_end|>\\n\\n\" }}\n {%- if tools is not none %}\n {{- \"Environment: ipython\\n\" }}\n {%- endif %}\n {%- if tools is not none and not tools_in_user_message %}\n {{- \"You have access to the following functions. To call a function, please respond with JSON for a function call.\" }}\n {{- 'Respond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}.' 
}}\n {{- \"Do not use variables.\\n\\n\" }}\n {%- for t in tools %}\n {{- t | tojson(indent=4) }}\n {{- \"\\n\\n\" }}\n {%- endfor %}\n {%- endif %}\n {{- system_message }}\n {{- \"<|eot|>\" }}\n{%- endif %}\n\n{#- Custom tools are passed in a user message with some extra guidance #}\n{%- if tools_in_user_message and not tools is none %}\n {#- Extract the first user message so we can plug it in here #}\n {%- if messages | length != 0 %}\n {%- set first_user_message = messages[0]['content']|trim %}\n {%- set messages = messages[1:] %}\n {%- else %}\n {{- raise_exception(\"Cannot put tools in the first user message when there's no first user message!\") }}\n{%- endif %}\n {{- '<|header_start|>user<|header_end|>\\n\\n' -}}\n {{- \"Given the following functions, please respond with a JSON for a function call \" }}\n {{- \"with its proper arguments that best answers the given prompt.\\n\\n\" }}\n {{- 'Respond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}.' 
}}\n {{- \"Do not use variables.\\n\\n\" }}\n {%- for t in tools %}\n {{- t | tojson(indent=4) }}\n {{- \"\\n\\n\" }}\n {%- endfor %}\n {{- first_user_message + \"<|eot|>\"}}\n{%- endif %}\n\n{%- for message in messages %}\n {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}\n {{- '<|header_start|>' + message['role'] + '<|header_end|>\\n\\n' }}\n {%- if message['content'] is string %}\n {{- message['content'] }}\n {%- else %}\n {%- for content in message['content'] %}\n {%- if content['type'] == 'image' %}\n {{- '<|image|>' }}\n {%- elif content['type'] == 'text' %}\n {{- content['text'] }}\n {%- endif %}\n {%- endfor %}\n {%- endif %}\n {{- \"<|eot|>\" }}\n {%- elif 'tool_calls' in message and message.tool_calls|length > 0 %}\n {{- '<|header_start|>assistant<|header_end|>\\n\\n' -}}\n {{- '<|python_start|>' }}\n {%- if message['content'] is string %}\n {{- message['content'] }}\n {%- else %}\n {%- for content in message['content'] %}\n {%- if content['type'] == 'image' %}\n {{- '<|image|>' }}\n {%- elif content['type'] == 'text' %}\n {{- content['text'] }}\n {%- endif %}\n {%- endfor %}\n {%- endif %}\n {{- '<|python_end|>' }}\n {%- for tool_call in message.tool_calls %}\n {{- '{\"name\": \"' + tool_call.function.name + '\", ' }}\n {{- '\"parameters\": ' }}\n {{- tool_call.function.arguments | tojson }}\n {{- \"}\" }}\n {%- endfor %}\n {{- \"<|eot|>\" }}\n {%- elif message.role == \"tool\" or message.role == \"ipython\" %}\n {{- \"<|header_start|>ipython<|header_end|>\\n\\n\" }}\n {%- if message.content is mapping or message.content is iterable %}\n {{- message.content | tojson }}\n {%- else %}\n {{- message.content }}\n {%- endif %}\n {{- \"<|eot|>\" }}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|header_start|>assistant<|header_end|>\\n\\n' }}\n{%- endif %}\n"
3
+ }
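To make the template easier to follow, here is a minimal Python sketch of its simplest path: no tools, an optional system message, and messages whose content is either a string or a list of `image` and `text` parts. It mirrors the `<|header_start|>`, `<|header_end|>`, `<|eot|>` and `<|image|>` markers from the template. The function name `render_llama4_chat` is hypothetical, and `bos_token` must be supplied by the caller because its literal value does not appear in this diff. Tool calls and `ipython` messages are deliberately left out.

```python
def render_llama4_chat(messages, bos_token, add_generation_prompt=True):
    """Render messages along the template's no-tools path (a sketch, not the real renderer)."""
    out = [bos_token]

    # An initial system message is pulled out first and trimmed, as in the template.
    if messages and messages[0]["role"] == "system":
        content = messages[0]["content"]
        text = content if isinstance(content, str) else content[0]["text"]
        out.append("<|header_start|>system<|header_end|>\n\n" + text.strip() + "<|eot|>")
        messages = messages[1:]

    for message in messages:
        out.append("<|header_start|>" + message["role"] + "<|header_end|>\n\n")
        content = message["content"]
        if isinstance(content, str):
            out.append(content)
        else:
            # List content: images become a placeholder token, text is copied as-is.
            for part in content:
                if part["type"] == "image":
                    out.append("<|image|>")
                elif part["type"] == "text":
                    out.append(part["text"])
        out.append("<|eot|>")

    # The trailing assistant header cues the model to start its reply.
    if add_generation_prompt:
        out.append("<|header_start|>assistant<|header_end|>\n\n")
    return "".join(out)
```

For real use, the tokenizer or processor should render the template itself; this sketch is only meant to show the shape of the resulting prompt.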
config.json ADDED
@@ -0,0 +1,330 @@
1
+ {
2
+ "_attn_implementation_autoset": false,
3
+ "add_cross_attention": false,
4
+ "architectures": [
5
+ "Llama4ForConditionalGeneration"
6
+ ],
7
+ "bad_words_ids": null,
8
+ "begin_suppress_tokens": null,
9
+ "boi_token_index": 200080,
10
+ "bos_token_id": null,
11
+ "chunk_size_feed_forward": 0,
12
+ "cross_attention_hidden_size": null,
13
+ "decoder_start_token_id": null,
14
+ "diversity_penalty": 0.0,
15
+ "do_sample": false,
16
+ "early_stopping": false,
17
+ "encoder_no_repeat_ngram_size": 0,
18
+ "eoi_token_index": 200081,
19
+ "eos_token_id": null,
20
+ "exponential_decay_length_penalty": null,
21
+ "finetuning_task": null,
22
+ "forced_bos_token_id": null,
23
+ "forced_eos_token_id": null,
24
+ "id2label": {
25
+ "0": "LABEL_0",
26
+ "1": "LABEL_1"
27
+ },
28
+ "image_token_index": 200092,
29
+ "is_decoder": false,
30
+ "is_encoder_decoder": false,
31
+ "label2id": {
32
+ "LABEL_0": 0,
33
+ "LABEL_1": 1
34
+ },
35
+ "length_penalty": 1.0,
36
+ "max_length": 20,
37
+ "min_length": 0,
38
+ "model_type": "llama4",
39
+ "no_repeat_ngram_size": 0,
40
+ "num_beam_groups": 1,
41
+ "num_beams": 1,
42
+ "num_return_sequences": 1,
43
+ "output_attentions": false,
44
+ "output_hidden_states": false,
45
+ "output_scores": false,
46
+ "pad_token_id": null,
47
+ "prefix": null,
48
+ "problem_type": null,
49
+ "pruned_heads": {},
50
+ "quantization": {
51
+ "group_size": 64,
52
+ "bits": 4
53
+ },
54
+ "remove_invalid_values": false,
55
+ "repetition_penalty": 1.0,
56
+ "return_dict": true,
57
+ "return_dict_in_generate": false,
58
+ "sep_token_id": null,
59
+ "suppress_tokens": null,
60
+ "task_specific_params": null,
61
+ "temperature": 1.0,
62
+ "text_config": {
63
+ "return_dict": true,
64
+ "output_hidden_states": false,
65
+ "output_attentions": false,
66
+ "torchscript": false,
67
+ "torch_dtype": "bfloat16",
68
+ "use_bfloat16": false,
69
+ "tf_legacy_loss": false,
70
+ "pruned_heads": {},
71
+ "tie_word_embeddings": false,
72
+ "chunk_size_feed_forward": 0,
73
+ "is_encoder_decoder": false,
74
+ "is_decoder": false,
75
+ "cross_attention_hidden_size": null,
76
+ "add_cross_attention": false,
77
+ "tie_encoder_decoder": false,
78
+ "max_length": 20,
79
+ "min_length": 0,
80
+ "do_sample": false,
81
+ "early_stopping": false,
82
+ "num_beams": 1,
83
+ "num_beam_groups": 1,
84
+ "diversity_penalty": 0.0,
85
+ "temperature": 1.0,
86
+ "top_k": 50,
87
+ "top_p": 1.0,
88
+ "typical_p": 1.0,
89
+ "repetition_penalty": 1.0,
90
+ "length_penalty": 1.0,
91
+ "no_repeat_ngram_size": 0,
92
+ "encoder_no_repeat_ngram_size": 0,
93
+ "bad_words_ids": null,
94
+ "num_return_sequences": 1,
95
+ "output_scores": false,
96
+ "return_dict_in_generate": false,
97
+ "forced_bos_token_id": null,
98
+ "forced_eos_token_id": null,
99
+ "remove_invalid_values": false,
100
+ "exponential_decay_length_penalty": null,
101
+ "suppress_tokens": null,
102
+ "begin_suppress_tokens": null,
103
+ "architectures": null,
104
+ "finetuning_task": null,
105
+ "id2label": {
106
+ "0": "LABEL_0",
107
+ "1": "LABEL_1"
108
+ },
109
+ "label2id": {
110
+ "LABEL_0": 0,
111
+ "LABEL_1": 1
112
+ },
113
+ "tokenizer_class": null,
114
+ "prefix": null,
115
+ "bos_token_id": 200000,
116
+ "pad_token_id": 200018,
117
+ "eos_token_id": [
118
+ 200001,
119
+ 200007,
120
+ 200008
121
+ ],
122
+ "sep_token_id": null,
123
+ "decoder_start_token_id": null,
124
+ "task_specific_params": null,
125
+ "problem_type": null,
126
+ "_name_or_path": "",
127
+ "_attn_implementation_autoset": true,
128
+ "attention_bias": false,
129
+ "for_llm_compressor": false,
130
+ "model_type": "llama4_text",
131
+ "attn_temperature_tuning": 4,
132
+ "attn_scale": 0.1,
133
+ "floor_scale": 8192,
134
+ "vocab_size": 202048,
135
+ "max_position_embeddings": 1048576,
136
+ "hidden_size": 5120,
137
+ "intermediate_size": 8192,
138
+ "intermediate_size_mlp": 16384,
139
+ "num_hidden_layers": 48,
140
+ "num_attention_heads": 40,
141
+ "rope_scaling": null,
142
+ "num_key_value_heads": 8,
143
+ "hidden_act": "silu",
144
+ "initializer_range": 0.02,
145
+ "rms_norm_eps": 1e-05,
146
+ "use_cache": true,
147
+ "rope_theta": 500000.0,
148
+ "attention_dropout": 0.0,
149
+ "head_dim": 128,
150
+ "use_qk_norm": false,
151
+ "num_experts_per_tok": 1,
152
+ "num_local_experts": 128,
153
+ "output_router_logits": false,
154
+ "router_aux_loss_coef": 0.001,
155
+ "router_jitter_noise": 0.0,
156
+ "no_rope_layers": [
157
+ 1,
158
+ 1,
159
+ 1,
160
+ 0,
161
+ 1,
162
+ 1,
163
+ 1,
164
+ 0,
165
+ 1,
166
+ 1,
167
+ 1,
168
+ 0,
169
+ 1,
170
+ 1,
171
+ 1,
172
+ 0,
173
+ 1,
174
+ 1,
175
+ 1,
176
+ 0,
177
+ 1,
178
+ 1,
179
+ 1,
180
+ 0,
181
+ 1,
182
+ 1,
183
+ 1,
184
+ 0,
185
+ 1,
186
+ 1,
187
+ 1,
188
+ 0,
189
+ 1,
190
+ 1,
191
+ 1,
192
+ 0,
193
+ 1,
194
+ 1,
195
+ 1,
196
+ 0,
197
+ 1,
198
+ 1,
199
+ 1,
200
+ 0,
201
+ 1,
202
+ 1,
203
+ 1,
204
+ 0
205
+ ],
206
+ "interleave_moe_layer_step": 2,
207
+ "moe_layers": [
208
+ 1,
209
+ 3,
210
+ 5,
211
+ 7,
212
+ 9,
213
+ 11,
214
+ 13,
215
+ 15,
216
+ 17,
217
+ 19,
218
+ 21,
219
+ 23,
220
+ 25,
221
+ 27,
222
+ 29,
223
+ 31,
224
+ 33,
225
+ 35,
226
+ 37,
227
+ 39,
228
+ 41,
229
+ 43,
230
+ 45,
231
+ 47
232
+ ],
233
+ "attention_chunk_size": 8192
234
+ },
235
+ "tf_legacy_loss": false,
236
+ "tie_encoder_decoder": false,
237
+ "tie_word_embeddings": false,
238
+ "tokenizer_class": null,
239
+ "top_k": 50,
240
+ "top_p": 1.0,
241
+ "torch_dtype": "bfloat16",
242
+ "torchscript": false,
243
+ "transformers_version": "4.51.0",
244
+ "typical_p": 1.0,
245
+ "use_bfloat16": false,
246
+ "vision_config": {
247
+ "hidden_size": 1408,
248
+ "hidden_act": "gelu",
249
+ "num_hidden_layers": 34,
250
+ "num_channels": 3,
251
+ "intermediate_size": 5632,
252
+ "image_size": 336,
253
+ "vision_output_dim": 4096,
254
+ "patch_size": 14,
255
+ "norm_eps": 1e-05,
256
+ "num_attention_heads": 16,
257
+ "initializer_range": 0.02,
258
+ "pixel_shuffle_ratio": 0.5,
259
+ "projector_input_dim": 4096,
260
+ "projector_output_dim": 4096,
261
+ "multi_modal_projector_bias": false,
262
+ "projector_dropout": 0.0,
263
+ "attention_dropout": 0.0,
264
+ "vision_feature_layer": -1,
265
+ "vision_feature_select_strategy": "default",
266
+ "rope_theta": 10000,
267
+ "return_dict": true,
268
+ "output_hidden_states": false,
269
+ "output_attentions": false,
270
+ "torchscript": false,
271
+ "torch_dtype": null,
272
+ "use_bfloat16": false,
273
+ "tf_legacy_loss": false,
274
+ "pruned_heads": {},
275
+ "tie_word_embeddings": true,
276
+ "chunk_size_feed_forward": 0,
277
+ "is_encoder_decoder": false,
278
+ "is_decoder": false,
279
+ "cross_attention_hidden_size": null,
280
+ "add_cross_attention": false,
281
+ "tie_encoder_decoder": false,
282
+ "max_length": 20,
283
+ "min_length": 0,
284
+ "do_sample": false,
285
+ "early_stopping": false,
286
+ "num_beams": 1,
287
+ "num_beam_groups": 1,
288
+ "diversity_penalty": 0.0,
289
+ "temperature": 1.0,
290
+ "top_k": 50,
291
+ "top_p": 1.0,
292
+ "typical_p": 1.0,
293
+ "repetition_penalty": 1.0,
294
+ "length_penalty": 1.0,
295
+ "no_repeat_ngram_size": 0,
296
+ "encoder_no_repeat_ngram_size": 0,
297
+ "bad_words_ids": null,
298
+ "num_return_sequences": 1,
299
+ "output_scores": false,
300
+ "return_dict_in_generate": false,
301
+ "forced_bos_token_id": null,
302
+ "forced_eos_token_id": null,
303
+ "remove_invalid_values": false,
304
+ "exponential_decay_length_penalty": null,
305
+ "suppress_tokens": null,
306
+ "begin_suppress_tokens": null,
307
+ "architectures": null,
308
+ "finetuning_task": null,
309
+ "id2label": {
310
+ "0": "LABEL_0",
311
+ "1": "LABEL_1"
312
+ },
313
+ "label2id": {
314
+ "LABEL_0": 0,
315
+ "LABEL_1": 1
316
+ },
317
+ "tokenizer_class": null,
318
+ "prefix": null,
319
+ "bos_token_id": null,
320
+ "pad_token_id": null,
321
+ "eos_token_id": null,
322
+ "sep_token_id": null,
323
+ "decoder_start_token_id": null,
324
+ "task_specific_params": null,
325
+ "problem_type": null,
326
+ "_name_or_path": "",
327
+ "_attn_implementation_autoset": true,
328
+ "model_type": "llama4_vision_model"
329
+ }
330
+ }
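Several fields in this config are related by simple arithmetic, and a short script can cross-check them. The sketch below copies values from `text_config`, `vision_config` and `quantization` and recomputes the `moe_layers` and `no_rope_layers` lists from the pattern they visibly follow. Two points are assumptions rather than facts from the diff: the interval of 4 for `no_rope_layers` is inferred from the list itself, and the bits-per-weight estimate assumes each quantization group stores one 16-bit scale and one 16-bit bias.

```python
# Values copied from config.json above.
num_hidden_layers = 48
interleave_moe_layer_step = 2
hidden_size = 5120
num_attention_heads = 40
num_key_value_heads = 8
head_dim = 128
image_size, patch_size = 336, 14
group_size, bits = 64, 4

# moe_layers lists every second layer, starting at index 1.
moe_layers = list(range(interleave_moe_layer_step - 1,
                        num_hidden_layers,
                        interleave_moe_layer_step))

# no_rope_layers repeats the pattern 1, 1, 1, 0 (interval of 4 inferred from the list).
no_rope_layers = [int((i + 1) % 4 != 0) for i in range(num_hidden_layers)]

# Grouped-query attention: number of query heads sharing each key/value head.
queries_per_kv_head = num_attention_heads // num_key_value_heads

# Vision tower: patches produced from one square image before any pixel shuffling.
patches_per_image = (image_size // patch_size) ** 2

# ASSUMPTION: one 16-bit scale and one 16-bit bias per group of 64 weights.
bits_per_weight = bits + (16 + 16) / group_size

print(len(moe_layers), queries_per_kv_head, patches_per_image, bits_per_weight)
```

Under that storage assumption, the 4-bit setting costs slightly more than 4 bits per weight once per-group metadata is counted.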
generation_config.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "bos_token_id": 200000,
3
+ "do_sample": true,
4
+ "eos_token_id": [
5
+ 200001,
6
+ 200007,
7
+ 200008
8
+ ],
9
+ "pad_token_id": 200018,
10
+ "temperature": 0.6,
11
+ "top_p": 0.9,
12
+ "transformers_version": "4.51.0.dev0"
13
+ }
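This file turns sampling on (`do_sample: true`) with `temperature: 0.6` and `top_p: 0.9`, whereas `config.json` carries the library defaults (`do_sample: false`, `temperature: 1.0`). As an illustration of what those two settings do, here is a plain-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is a textbook version under simplifying assumptions, not the code path of any particular library, and the function name `sample_top_p` is hypothetical.

```python
import math
import random


def sample_top_p(logits, temperature=0.6, top_p=0.9, rng=random):
    """Sample a token index with temperature scaling, then nucleus (top-p) filtering."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Keep the smallest set of most-likely tokens whose cumulative probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # choices() normalizes the kept weights, so no explicit renormalization is needed.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

A temperature below 1 sharpens the distribution before the cutoff is applied, so fewer tokens tend to survive the top-p filter than at the default temperature.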
model-00001-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:96a4ce933080317f696016a1adf5ee4053574b5d40427ba5eb353db9fbe8db54
3
+ size 4308004373
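Each `.safetensors` entry in this commit is a three-line Git LFS pointer rather than the weights themselves: a spec version, a SHA-256 object id, and the byte size of the real file. The sketch below parses the pointer shown above and converts the size to GiB. The helper name `parse_lfs_pointer` is hypothetical, and the parser assumes exactly the three keys seen in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from model-00001-of-00072.safetensors above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:96a4ce933080317f696016a1adf5ee4053574b5d40427ba5eb353db9fbe8db54
size 4308004373
"""

info = parse_lfs_pointer(pointer)
size_gib = info["size"] / 2**30
print(f"{info['hash_algo']} object, {size_gib:.2f} GiB")
```

Summing the `size` fields across all pointers gives the total download size without fetching any shard.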
model-00002-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c070297a0d3d79f7655b59e169c3d2014803c2cae42ac8f8726a76b19868f993
3
+ size 3019899344
model-00003-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:99508aacb146412d365b70b5547dba0912b237216ddc801722451ee3cd9708c1
3
+ size 3303430951
model-00004-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a15125fa2807b83130bcb6e761c9d3337c7fb2c6f2646698151896f5aa45418f
3
+ size 3019899350
model-00005-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:31b4f2e5594ebdbe1135f941ab42f21d9dd065de0b8adcd811f4ec79cb44fa73
3
+ size 3019899344
model-00006-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8b068ecf623f86405c8537b14a439eb0dfc4cb46b000b884a7de7c1928548d7e
3
+ size 3303430927
model-00007-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:86aa984c6d32ec4780d133245d2dbbdf10dc603acd3c0d08e11abfafaa03c756
3
+ size 3019899350
model-00008-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e16ca3814f602001a337f110ed17fe254eeaaadc3a631f63e5868071080a6fd9
3
+ size 3019899344
model-00009-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4c7e7565090f1a98335ab6f4c3f65ce2686dea69ed8cd95226332fbc99b7ce8b
3
+ size 3303430951
model-00010-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5f7492de0813043da7cdb80e01ce7d0e2aa6c92a897dc232b4a9e310eb24c859
3
+ size 3019899350
model-00011-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6741aef63f058c727708609f99b96282595c97123d4e7b59c5712122e1f65bcc
3
+ size 3019899344
model-00012-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0524ea28c7c8eb9e6bcb95e32b8cd9d12ae490a7798fbe31d68ef25974491190
3
+ size 3303430947
model-00013-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:85f87d3921074aadc41ee045943e05667ccce6abad0a0c9c804b46b4989a64d5
3
+ size 3019899350
model-00014-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3f89fb34e7ce0a88fde4c3faeb49fa297b895f89f08d490964a6e3401ce68686
3
+ size 3019899344
model-00015-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d3ecbc36b215a281601d6fea8c4b6d9ba82a04195cbfb401af9aeb9fb701839a
3
+ size 3303430970
model-00016-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c34d1d8ad9e4d75c53246061088e7930b32a44d7d3d280e69a61c1198865dffc
3
+ size 3019899353
model-00017-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6454995111aa5f43a5b28b11530cd4774cfa526df592dccc1602b1851f6a4cfe
3
+ size 3019899347
model-00018-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:aadfa21ef4fdf115bda4a6bdd22a0d50b340d4cbd6a50f6c2d0f52a92751f8fa
3
+ size 3303430973
model-00019-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d6171ad882cad20fd67477cea85b6c36a21c6d87fd1fb7b8c6ee036e424045d8
3
+ size 3019899353
model-00020-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:07f89288d8ab988ae05e5c264559e2482a77589c22e75f24dcfb8b5d91bc6757
3
+ size 3019899347
model-00021-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dc4dce8f0bb164d10b256daf1fa7ba22559e22e2ccf2f64fa278ce77c4d08378
3
+ size 3303430983
model-00022-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:00c0cfe6e70a407042d7d542961eb59c49b665daa8e42b6904e2dad3966fc4fc
+ size 3019899353
model-00023-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:330fa98b8c4e1f24a056e09d71fc6e4fd7dae60667c0b0a11795b876319432eb
+ size 3019899347
model-00024-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c48329c9acb150da52f2d9c3d15309bbd9fcaa3a21cf3c9d8c8bec26a802cbbc
+ size 3303431003
model-00025-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ed4578168ff4ad11762f1598e5e4d0fe81964cdbcef786d510af3f5276106a6
+ size 3019899353
model-00026-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6c1db4858193b41d6082adb8742c2e7d93e9920d85dc16027de92cacb1889029
+ size 3019899347
model-00027-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:428cb579ec999fd9f7d061b824fa53a1fa2e4b31c8eba870128870d7b746ba4a
+ size 3303431019
model-00028-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:183692aba6fe3bd78138ab66dad6bd3ea48abfd396db47d1c2fea59351b77123
+ size 3019899353
model-00029-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c0d9f52705a2ffbedbda440800701ead33d1791f4db7f0de4aab145833102c4
+ size 3019899347
model-00030-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9d9d244d7175ce8394fbfdb09d8c974e5769701a0ec9c5687133f8625ac72b39
+ size 3303431017
model-00031-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e07467a79080294282e20eed310841c2e8c40ee9297ea9263a80de2aa49abca6
+ size 3019899353
model-00032-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a3634e7f77aa7453b01cf817300c443ab58c084af487744496d442be4c8429b6
+ size 3019899347
model-00033-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7322e1fa2530f401e5da5fe876e6aa1fd2e72342b44b0a72dabe9c7b9754cc1b
+ size 3303430987
model-00034-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:98b2e1855bc6865d863e35e5fa128486bcd12ea4cbd5d85bf1fce98bbde5fb62
+ size 3019899353
model-00035-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c0ecd65c0d326f6eff44805f3424000573867613d9ad980cd84d8e777ef684ab
+ size 3019899347
model-00036-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b3499357293285c49c13eace190f62cbdb8859d43218c236aa55560dcc460542
+ size 3303431003
model-00037-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:81bcc430a822ddb51346cff92f3715df4331a2c6cac48d6d9c244d132259c821
+ size 3019899353
model-00038-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d801e366330b523a6e227b411e570d9b04373b3a705bf8a24fc90793e9004991
+ size 3019899347
model-00039-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:170249a943340b4d10015fb2171330306aa888f02e19e18662bc0d15a325c483
+ size 3303430997
model-00040-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f6be027228b540138827fe255d2a042e1abc5eb0b453d603d64b533d6576cdb6
+ size 3019899353
model-00041-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:74af69c97d246e0dfa068762cfeff681d168a15ead52868b4ef00d9685b4d76f
+ size 3019899347
model-00042-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e0b2e3a1c2068ff917b61c8704a489ffe4bff9c6084728b041c41a45b68a9c2
+ size 3303431023
model-00043-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:60284ae2d175666b013d7285efba3b6f686ce3ffa9e5e39b020ae469dc3d2be2
+ size 3019899353
model-00044-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7a3acfbecd71b7d6f89fb144a87682cfeb6e03ddbaa1d171ad3035b116e775d0
+ size 3019899347
model-00045-of-00072.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9afa236ee407b0a9f8d12e8759a0c6e0eb41866dd8b86aed2f26c51d402e5d57
+ size 3303430993