openfree committed on
Commit 2361314 · verified · 0 Parent(s):

Duplicate from VIDraft/Gemma-3-R1984-1B

.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,181 @@
+ ---
+ license: gemma
+ library_name: transformers
+ base_model: google/gemma-3-1b-it
+ language:
+ - en
+ - ko
+ - ja
+ - zh
+ - es
+ - ru
+ - ar
+ - hi
+ - id
+ - ml
+ - fr
+ - de
+ pipeline_tag: image-text-to-text
+ ---
+
+ # Gemma3-R1984-1B
+
+ # Model Overview
+ Gemma3-R1984-1B is a robust agentic AI platform built on Google's Gemma-3-1B model. It integrates state-of-the-art deep research via web search with multimodal file processing (images, videos, and documents) and handles long contexts of up to 8,000 tokens. Designed for local deployment on independent servers using NVIDIA L40S, L4, or A100 (ZeroGPU) GPUs, it provides high security, prevents data leakage, and delivers uncensored responses.
+
+ # Key Features
+ **Multimodal Processing:**
+ Supports multiple file types, including images (PNG, JPG, JPEG, GIF, WEBP), videos (MP4), and documents (PDF, CSV, TXT).
+
+ **Deep Research (Web Search):**
+ Automatically extracts keywords from user queries and uses the SERPHouse API to retrieve up to 20 real-time search results. The model incorporates multiple sources and explicitly cites them in the response.
+
+ **Long Context Handling:**
+ Processes inputs of up to 8,000 tokens, ensuring comprehensive analysis of lengthy documents or conversations.
+
+ **Robust Reasoning:**
+ Employs extended chain-of-thought reasoning for systematic and accurate answer generation.
+
+ **Secure Local Deployment:**
+ Operates on independent local servers using NVIDIA L40S GPUs to maximize security and prevent information leakage.
+
+ **Experience the Power of Gemma3-R1984-1B**
+
+ - ✅ **Agentic AI Platform:** An autonomous system designed to make intelligent decisions and act independently.
+ - ✅ **Reasoning & Uncensored:** Delivers clear, accurate, and unfiltered responses by harnessing advanced reasoning capabilities.
+ - ✅ **Multimodal & VLM:** Seamlessly processes and interprets multiple input types (text, images, videos), empowering versatile applications.
+ - ✅ **Deep-Research & RAG:** Integrates state-of-the-art deep research and retrieval-augmented generation to provide comprehensive, real-time insights.
+
+ **Cutting-Edge Hardware for Maximum Security**
+
+ Gemma3-R1984-1B is engineered to operate on a dedicated **NVIDIA L40S GPU** within an independent local server environment. This setup guarantees strong performance and rapid processing, and it enhances security by isolating the model from external networks, effectively preventing information leakage. Whether handling sensitive data or complex queries, the platform keeps your information secure and your AI interactions uncompromised.
+
+ # Use Cases
+ - Fast-response conversational agents
+ - Deep research and retrieval-augmented generation (RAG)
+ - Document comparison and detailed analysis
+ - Visual question answering from images and videos
+ - Complex reasoning and research-based inquiries
+
+ # Supported File Formats
+ - Images: PNG, JPG, JPEG, GIF, WEBP
+ - Videos: MP4
+ - Documents: PDF, CSV, TXT
+
+ # Model Details
+ - Parameter Count: Approximately 1B parameters (estimated)
+ - Context Window: Up to 8,000 tokens
+ - Hugging Face Model Path: VIDraft/Gemma-3-R1984-1B (see the loading sketch below)
+ - License: MIT (Agentic AI) / Gemma (gemma-3-1B)
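+
+ The following is a minimal, illustrative sketch of loading the checkpoint for plain text generation with the standard Transformers API. It is not the bundled Gradio application; the prompt, dtype, and device handling are assumptions for demonstration only.
+
+ ```py
+ # Minimal sketch: load VIDraft/Gemma-3-R1984-1B and generate a reply.
+ # Assumes transformers >= 4.51 and a CUDA GPU with bfloat16 support.
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "VIDraft/Gemma-3-R1984-1B"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")
+
+ messages = [{"role": "user", "content": "Summarize the key features of this model."}]
+ inputs = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ # Sampling settings mirror generation_config.json (do_sample, top_k=64, top_p=0.95).
+ outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, top_k=64, top_p=0.95)
+ print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```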
+
+ # Installation and Setup
+ ## Requirements
+ Ensure you have Python 3.8 or higher installed. The model relies on several libraries:
+
+ - PyTorch (with bfloat16 support)
+ - Transformers
+ - Gradio
+ - OpenCV (opencv-python)
+ - Pillow (PIL)
+ - PyPDF2
+ - Pandas
+ - Loguru
+ - Requests
+
+ Install the dependencies with pip:
+
+ `pip install torch transformers gradio opencv-python pillow PyPDF2 pandas loguru requests`
+
+ # Environment Variables
+ Set the following environment variables before running the model. A short sketch of how an application might read them follows this list.
+
+ ## SERPHOUSE_API_KEY
+ Your SERPHouse API key for web search functionality.
+
+ Example: `export SERPHOUSE_API_KEY="your_api_key_here"`
+
+ ## MODEL_ID
+ (Optional) The model identifier; the default is VIDraft/Gemma-3-R1984-1B.
+
+ ## MAX_NUM_IMAGES
+ (Optional) Maximum number of images allowed per query (default is 5).
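+
+ As a rough illustration (not the repository's actual code), a Python entry point could read these variables with sensible defaults. The variable names follow this README; everything else is an assumption.
+
+ ```py
+ # Hypothetical configuration helper: one way the app could read the
+ # environment variables documented above.
+ import os
+
+ SERPHOUSE_API_KEY = os.getenv("SERPHOUSE_API_KEY", "")  # required for web search
+ MODEL_ID = os.getenv("MODEL_ID", "VIDraft/Gemma-3-R1984-1B")
+ MAX_NUM_IMAGES = int(os.getenv("MAX_NUM_IMAGES", "5"))
+
+ if not SERPHOUSE_API_KEY:
+     print("Warning: SERPHOUSE_API_KEY is not set; deep research (web search) will be unavailable.")
+ ```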
+
+ # Running the Model
+ Gemma3-R1984-1B comes with a Gradio-based multimodal chat interface. To run the model locally:
+
+ 1. Clone the repository:
+ Ensure you have the repository containing the model code.
+
+ 2. Launch the application:
+ Execute the main Python file:
+
+ `python your_filename.py`
+
+ This will start a local Gradio interface. Open the provided URL in your browser to interact with the model.
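+
+ For reference, a stripped-down Gradio chat front end around the model might look like the sketch below. This is not the repository's application (file handling, video processing, and web search are omitted); it only assumes the gradio and transformers packages from the Requirements section.
+
+ ```py
+ # Minimal sketch of a Gradio chat UI wrapping the model via the Transformers
+ # text-generation pipeline. Illustrative only; the real app adds multimodal
+ # and deep-research features.
+ import gradio as gr
+ from transformers import pipeline
+
+ generator = pipeline("text-generation", model="VIDraft/Gemma-3-R1984-1B")
+
+ def chat(message, history):
+     # The pipeline accepts a list of chat messages and returns the conversation
+     # with the assistant's reply appended as the last message.
+     messages = [{"role": "user", "content": message}]
+     result = generator(messages, max_new_tokens=256)
+     return result[0]["generated_text"][-1]["content"]
+
+ gr.ChatInterface(fn=chat, title="Gemma3-R1984-1B").launch()
+ ```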
+
+ # Example Code: Server and Client Request
+ ## Server Example
+ You can deploy the model server locally using the provided Gradio code. Make sure your server is accessible at your designated URL.
+
+ ## Client Request Example
+ Below is an example of how to interact with the model using an HTTP API call:
+
+ ```py
+ import requests
+ import json
+
+ # Replace with your server URL and token
+ url = "http://<your-server-url>:8000/v1/chat/completions"
+ headers = {
+     "Content-Type": "application/json",
+     "Authorization": "Bearer your_token_here"
+ }
+
+ # Construct the message payload
+ messages = [
+     {"role": "system", "content": "You are a powerful AI assistant."},
+     {"role": "user", "content": "Compare the contents of two PDF files."}
+ ]
+
+ data = {
+     "model": "VIDraft/Gemma-3-R1984-1B",
+     "messages": messages,
+     "temperature": 0.15
+ }
+
+ # Send the POST request to the server
+ response = requests.post(url, headers=headers, data=json.dumps(data))
+
+ # Print the response from the model
+ print(response.json())
+ ```
+
+ **Important Deployment Notice:**
+
+ For optimal performance, it is highly recommended to clone the repository using the command below. This model is designed to run on a server equipped with at least an NVIDIA L40S, L4, or A100 (ZeroGPU) GPU. The minimum VRAM requirement is 24 GB, and VRAM usage may temporarily peak at approximately 82 GB during processing.
+
+ ```bash
+ git clone https://huggingface.co/spaces/VIDraft/Gemma-3-R1984-1B
+ ```
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "<image_soft_token>": 262144
+ }
config.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "architectures": [
+     "Gemma3ForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "attn_logit_softcapping": null,
+   "bos_token_id": 2,
+   "cache_implementation": "hybrid",
+   "eos_token_id": [
+     1,
+     106
+   ],
+   "final_logit_softcapping": null,
+   "head_dim": 256,
+   "hidden_activation": "gelu_pytorch_tanh",
+   "hidden_size": 1152,
+   "initializer_range": 0.02,
+   "intermediate_size": 6912,
+   "max_position_embeddings": 32768,
+   "model_type": "gemma3_text",
+   "num_attention_heads": 4,
+   "num_hidden_layers": 26,
+   "num_key_value_heads": 1,
+   "pad_token_id": 0,
+   "query_pre_attn_scalar": 256,
+   "rms_norm_eps": 1e-06,
+   "rope_local_base_freq": 10000,
+   "rope_scaling": null,
+   "rope_theta": 1000000,
+   "sliding_window": 512,
+   "sliding_window_pattern": 6,
+   "torch_dtype": "float32",
+   "transformers_version": "4.51.3",
+   "use_cache": true,
+   "vocab_size": 262144
+ }
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "bos_token_id": 2,
+   "cache_implementation": "hybrid",
+   "do_sample": true,
+   "eos_token_id": [
+     1,
+     106
+   ],
+   "pad_token_id": 0,
+   "top_k": 64,
+   "top_p": 0.95,
+   "transformers_version": "4.51.3"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:454b9a6c5b06d2157518f2bbe1c96f9ab0b030980be1604731c199697bcc65a4
+ size 3999582960
preprocessor_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "bos_token_id": 2,
+   "cache_implementation": "hybrid",
+   "do_sample": true,
+   "eos_token_id": [
+     1,
+     106
+   ],
+   "pad_token_id": 0,
+   "top_k": 64,
+   "top_p": 0.95,
+   "transformers_version": "4.50.0.dev0"
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "boi_token": "<start_of_image>",
+   "bos_token": {
+     "content": "<bos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eoi_token": "<end_of_image>",
+   "eos_token": {
+     "content": "<eos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "image_token": "<image_soft_token>",
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
+ size 33384568
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
+ size 4689074
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff