Model Overview

  • Model Architecture: GPT-J
    • Input: Text
    • Output: Text
  • Model Optimizations:
    • Beam search optimization (beam width = 4) for MLPerf; this model does not support greedy search, top-k, or top-p sampling
  • Maximum Context Length: 2,048 tokens
    • Maximum Prompt Length: 1920 tokens
    • Maximum Generation Length: 2048 tokens
  • Intended Use Cases: Intended for commercial and non-commercial use. Like EleutherAI/gpt-j-6b, this model is intended for text summarization.
  • Release Date: 04/12/2025
  • Version: v2025.2
  • License(s): Apache License 2.0
  • Supported Inference Engine(s): Furiosa LLM
  • Supported Hardware Compatibility: FuriosaAI RNGD
  • Preferred Operating System(s): Linux
  • Fine-tunes: This model is fine-tuned for text summarization. More details can be found under Datasets & Models in mlcommons/inferences/gpt-j/README.md
  • Quantization:
    • Tool: Furiosa Model Compressor v0.6.2, included in Furiosa SDK 2025.2
    • Weight: float8, Activation: float8, KV cache: float8
    • Calibration: cnn_dailymail (instruction)

Description:

This is a pre-compiled model of a fine-tuned and quantized version of EleutherAI/gpt-j-6b. The model was fine-tuned for text summarization, and the cnn_dailymail dataset was used for calibration. Details about how this model was fine-tuned and calibrated can be found in mlcommons/inferences/gpt-j/README.md.

As mentioned above, this model is fine-tuned for the text summarization task. Use the following prompt when querying this model, replacing the {INPUTS} placeholder with the article to summarize:

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Summarize the following news article:

### Input:
{INPUTS}

### Response:
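The prompt can also be assembled programmatically. The sketch below is a minimal illustration: the template string is copied verbatim from above, while the article text and the helper name `build_prompt` are placeholders for your own input:

```python
# Minimal sketch: fill the summarization prompt template from this card.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\nSummarize the following news article:\n\n"
    "### Input:\n{INPUTS}\n\n"
    "### Response:\n"
)

def build_prompt(article: str) -> str:
    """Replace the {INPUTS} placeholder with the article to summarize."""
    return PROMPT_TEMPLATE.replace("{INPUTS}", article)

prompt = build_prompt("(CNN) -- Example article text goes here.")
```

Keep in mind the prompt-length limit above (1920 tokens): long articles must be truncated before being inserted into the template.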

Usage

Furiosa-LLM

Follow the example command below after installing Furiosa-LLM and its prerequisites.

furiosa-llm serve furiosa-ai/gpt-j-6b-FP8-MLPerf
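Once the server is running, it can be queried over HTTP. The sketch below builds a completion request using only Python's standard library; the endpoint URL (`http://localhost:8000/v1/completions`) and the payload shape are assumptions based on a typical OpenAI-compatible serving setup and may differ in your deployment:

```python
import json
import urllib.request

# Assumed local endpoint of the furiosa-llm server (check your deployment).
URL = "http://localhost:8000/v1/completions"

# Request payload: the model name from this card plus a prompt built from
# the summarization template above.
payload = {
    "model": "furiosa-ai/gpt-j-6b-FP8-MLPerf",
    "prompt": "...",  # insert the filled-in summarization prompt here
    "max_tokens": 128,
}

def summarize() -> str:
    """POST the completion request and return the generated text."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```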

MLPerf Benchmark using RNGD

Follow the example command below after installing furiosa-mlperf and its prerequisites.

furiosa-mlperf gpt-j-offline furiosa-ai/gpt-j-6b-FP8-MLPerf ./mlperf-result