Model Overview

  • Model Architecture: GPT-J
    • Input: Text
    • Output: Text
  • Model Optimizations:
    • Beam search optimization (beam width = 4) for MLPerf; this model does not support greedy search, top-k, or top-p sampling
  • Maximum Context Length: 2,048 tokens
    • Maximum Prompt Length: 1920 tokens
    • Maximum Generation Length: 2048 tokens
  • Intended Use Cases: Intended for commercial and non-commercial use. Like EleutherAI/gpt-j-6b, this model is intended for text summarization.
  • Release Date: 04/12/2025
  • Version: v2025.2
  • License(s): Apache License 2.0
  • Supported Inference Engine(s): Furiosa LLM
  • Supported Hardware Compatibility: FuriosaAI RNGD
  • Preferred Operating System(s): Linux
  • Fine-tunes: This model is fine-tuned for text summarization. More details can be found under Datasets & Models in mlcommons/inferences/gpt-j/README.md
  • Quantization:
    • Tool: Furiosa Model Compressor v0.6.2, included in Furiosa SDK 2025.2
    • Weight: float8, Activation: float8, KV cache: float8
    • Calibration: cnn_dailymail (instruction)

Description:

This is a pre-compiled model of a fine-tuned and quantized version of EleutherAI/gpt-j-6b. The model was fine-tuned for text summarization, and the cnn_dailymail dataset was used for calibration. Details about how this model was fine-tuned and calibrated can be found in mlcommons/inferences/gpt-j/README.md.

As mentioned above, this model is fine-tuned for the text summarization task. Use the following prompt when querying this model, replacing the {INPUTS} placeholder with the article to summarize:

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Summarize the following news article:

### Input:
{INPUTS}

### Response:
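The prompt can also be assembled programmatically. The sketch below is a minimal illustration: the template string is copied verbatim from above, while the article text and the helper name `build_prompt` are placeholders for your own input:

```python
# Minimal sketch: fill the summarization prompt template from this card.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\nSummarize the following news article:\n\n"
    "### Input:\n{INPUTS}\n\n"
    "### Response:\n"
)

def build_prompt(article: str) -> str:
    """Replace the {INPUTS} placeholder with the article to summarize."""
    return PROMPT_TEMPLATE.replace("{INPUTS}", article)

prompt = build_prompt("(CNN) -- Example article text goes here.")
```

Keep in mind the prompt-length limit above (1920 tokens): long articles must be truncated before being inserted into the template.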

Usage

Furiosa-LLM

Follow the example command below after installing Furiosa-LLM and its prerequisites.

furiosa-llm serve furiosa-ai/gpt-j-6b-FP8-MLPerf
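Once the server is running, it can be queried over HTTP. The sketch below builds a completion request using only Python's standard library; the endpoint URL (`http://localhost:8000/v1/completions`) and the payload shape are assumptions based on a typical OpenAI-compatible serving setup and may differ in your deployment:

```python
import json
import urllib.request

# Assumed local endpoint of the furiosa-llm server (check your deployment).
URL = "http://localhost:8000/v1/completions"

# Request payload: the model name from this card plus a prompt built from
# the summarization template above.
payload = {
    "model": "furiosa-ai/gpt-j-6b-FP8-MLPerf",
    "prompt": "...",  # insert the filled-in summarization prompt here
    "max_tokens": 128,
}

def summarize() -> str:
    """POST the completion request and return the generated text."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```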

MLPerf Benchmark using RNGD

Follow the example command below after installing furiosa-mlperf and its prerequisites.

furiosa-mlperf gpt-j-offline furiosa-ai/gpt-j-6b-FP8-MLPerf ./mlperf-result