Model Description: Fine-Tuned Pythia-160M on Reversed Tokens Using FLAN Subset

This model is a fine-tuned version of Pythia-160M, trained on a customized dataset derived from the FLAN instruction-tuning collection. The dataset was preprocessed by reversing the token order of each input-output pair, encouraging the model to learn syntactic and semantic patterns in right-to-left order. The approach is intended to explore the model's ability to generalize beyond the conventional left-to-right token order and to evaluate its robustness in understanding and generating citation attributions.
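The exact preprocessing script is not published here; the snippet below is only a minimal sketch of the reversal idea, assuming the base Pythia tokenizer and a toy instruction/response pair (reverse_example is an illustrative helper, not part of the released code).

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")

def reverse_example(instruction, response):
    # Tokenize the concatenated pair, flip the token order, and decode back to text
    ids = tokenizer(instruction + " " + response)["input_ids"]
    return tokenizer.decode(ids[::-1])

print(reverse_example("Name the largest planet in the Solar System.", "Jupiter is the largest planet."))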

Key Features:

- Base Model: afterless/reverse-pythia-160m
- Fine-Tuning Objective: Language modeling on reversed token sequences, aimed at improving citation attribution.
- Dataset: Subset of the Open-Orca/FLAN instruction-tuning corpus, selected for zero-shot QA tasks with summarization tasks excluded (a filtering sketch follows this list).

Performance Notes:

- The model retains its ability to generate coherent reversed outputs aligned with instruction-following behavior.
- Not intended for deployment in standard NLP tasks without re-reversing outputs.
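As a rough illustration of the selection step, the sketch below streams Open-Orca/FLAN with the datasets library and drops summarization tasks; the split name and the _task_name column are assumptions about the corpus layout, and the predicate is a placeholder rather than the actual training pipeline.

from datasets import load_dataset

# Stream the corpus to avoid downloading everything at once
flan = load_dataset("Open-Orca/FLAN", split="train", streaming=True)

def keep_example(example):
    # Placeholder heuristic: drop anything whose task name looks like summarization
    task = str(example.get("_task_name", "")).lower()
    return "summar" not in task

filtered = flan.filter(keep_example)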

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_path = "janbol/pythia-reverse-160m-Flan"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Move model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)


def generate_flipped(prompt, max_length=100):
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        return_token_type_ids=False
    ).to(device)

    # Flip input sequence
    inputs['input_ids'] = torch.flip(inputs.input_ids, (1,))

    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_length=max_length,
            do_sample=True,
            temperature=0.4,
            top_k=50,
            top_p=0.92,
            repetition_penalty=1.2,
            no_repeat_ngram_size=3,
            early_stopping=True
        )

    # Flip back and decode
    return tokenizer.decode(torch.flip(output, (1,))[0], skip_special_tokens=True)

print(generate_flipped("Thus this planet is theoretically habitable"))

Because generation runs over reversed tokens, the model effectively writes the text that leads up to the prompt; after flipping the output back, the sampled continuation appears before the prompt, which ends up at the end of the decoded string.

####### Output ##########
'''being roughly one "billion-years old". This may indicate that it was formed at the end of star formation (about 250 million years ago),
along with many other environmental factors such as volcanic activity during the last ice age (about 500 million years) and ultraviolet
radiation from the sun (which has been on Earth for about 2 billion years). They also have very high surface temperatures,
which are thought to be determined by the amount of oxygen in the atmosphere.
Thus this planet is theoretically habitable'''