You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

First Experiment (Qwen/Qwen3-Coder-30B-A3B-Instruct)

A creds.txt file containing the base URL, API key, and model name will be available within the hour.

Update - August 2, 2025:

The first experiment has concluded, and it was an overwhelming success! I fully anticipated server crashes, freezes, and frequent downtime—especially considering I publicly shared the credentials online!

Several key lessons emerged from this experiment:

  • We had more GPUs than necessary.
  • The 30K context length proved too restrictive; given the number of available H100 GPUs, we should consider at least a 128K context length in future tests.
  • The request rate limit of 1 request per second per IP was likely too conservative, although this measure significantly contributed to server stability. Given the intended use case—coding assistants, DSPy apps, and similar applications—this limit still helped prevent overload.

Initial statistics from the experiment are as follows:

📊 OVERALL STATISTICS
Total requests: 2,246
Unique IP addresses: 41
Time period: 2025-07-31 19:22:39+00:00 to 2025-08-01 21:21:24+00:00
Duration: 1:58:45

⚡ TRAFFIC METRICS
Peak requests per minute: 452 (at 2025-08-01 21:08)
Average response time: 1.903s (from first token to the last token)

👥 TOP USERS (by request count)
 1. x.x.x.91     831 requests ( 37.0%)
 2. x.x.x..1        724 requests ( 32.2%)
 3. x.x.x..133     314 requests ( 14.0%)
 4. x.x.x..25     122 requests (  5.4%)
 5. x.x.x..81       59 requests (  2.6%)
 6. x.x.x..41      45 requests (  2.0%)
 7. x.x.x..165     26 requests (  1.2%)
 8. x.x.x..24      16 requests (  0.7%)
 9. x.x.x..67       13 requests (  0.6%)
10. x.x.x..244      12 requests (  0.5%)

🎯 TOP ENDPOINTS
/v1/chat/completions           2,122 requests ( 94.5%)
/v1/models                        59 requests (  2.6%)

🤖 USER AGENTS
OpenAI/Python 1.96.1                                 830 requests ( 37.0%)
python-requests/2.32.3                               600 requests ( 26.7%)
Ws/JS 4.83.0                                         335 requests ( 14.9%)
RooCode/3.25.5                                       148 requests (  6.6%)
python-requests/2.25.1
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support