Model Details

Model Description


train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

Instruction:

μ•„λž˜ λ‰΄μŠ€λ₯Ό 읽고 '경제', '금리', 'μ™Έν™˜' 쀑 ν•˜λ‚˜λ‘œ λΆ„λ₯˜ν•˜μ„Έμš”.

Question:

{}

Response:

{} {}""" -------------------------------------------------

Inference Code


import os
import pandas as pd
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from tqdm.auto import tqdm
import time

1) Path settings

base_dir = ...      # set your base directory
test_excel = ...    # set the path of the test set Excel file
output_excel = ...  # set the path of the output Excel file

2) Hugging Face Hub repo ID

model_id = ...      # set the repo ID, e.g. "junghan/News_category_segmentation"

3) λͺ¨λΈ & ν† ν¬λ‚˜μ΄μ € λ‘œλ“œ

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    use_fast=True,
    trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map={"": "cuda"},     # place all parameters on the GPU only
    # low_cpu_mem_usage=True,    # (optional) load option that reduces memory usage
)
model.config.use_cache = True

4) Define the inference prompt style

inference_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

Instruction:

μ•„λž˜ λ‰΄μŠ€λ₯Ό 읽고 '경제', '금리', 'μ™Έν™˜' 쀑 ν•˜λ‚˜λ‘œ λΆ„λ₯˜ν•˜μ„Έμš”.

Question:

{}

Response:

{} {}"""

5) Load the test set

df = pd.read_excel(test_excel, engine='openpyxl')
print(f"Loaded {len(df)} examples from {test_excel}")

6) Modified inference function: take only THEME_HIST as input and generate a summary

def predict_label(text: str) -> str:
    # Put only THEME_HIST (the news body) into the question
    question = text.strip()
    # Fill the first {} of inference_prompt_style with the question; the remaining two slots get empty strings ("")
    prompt = inference_prompt_style.format(question, "", "") + tokenizer.eos_token

    inputs = tokenizer(
        prompt,
        return_tensors='pt',
        truncation=True,
        max_length=2048
    ).to('cuda')
    outputs = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_new_tokens=100,
        eos_token_id=tokenizer.eos_token_id,
        use_cache=True,
    )
    decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    # Take the text after "### Response:" as the summary
    summary = decoded.split("### Response:")[-1].strip()
    return summary
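
Applied serially, this function can fill a summary column directly; a minimal sketch of that serial baseline (assuming the news body is in the THEME_HIST column, as in section 7-1), which is what section 7 below then parallelizes:

# Serial baseline: one generate() call per row (simple, but slow)
df['summary'] = [
    predict_label(str(row['THEME_HIST']))
    for _, row in tqdm(df.iterrows(), total=len(df))
]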


7) Parallel inference with a ThreadPoolExecutor instead of a serial loop

import torch
from concurrent.futures import ThreadPoolExecutor, as_completed
from langsmith import trace  # LangSmith tracing context manager used in infer_one below

7-1) Build prompts from every THEME_HIST in advance

prompts = [
    inference_prompt_style.format(row['THEME_HIST'].strip(), "", "") + tokenizer.eos_token
    for _, row in df.iterrows()
]

7-2) Run inference one item at a time per thread

def infer_one(prompt: str) -> str:
    # Create a LangSmith run of type "llm"
    with trace(name="Qwen3-8B Summarization", run_type="llm", inputs={"prompt": prompt}) as run:
        start = time.time()

        # Tokenize
        inputs_tok = tokenizer(
            prompt,
            return_tensors="pt",
            truncation=True,
            max_length=2048
        ).to("cuda")
        input_tokens = inputs_tok.input_ids.numel()

        # Model generation
        outputs = model.generate(
            input_ids=inputs_tok.input_ids,
            attention_mask=inputs_tok.attention_mask,
            max_new_tokens=100,
            eos_token_id=tokenizer.eos_token_id,
            use_cache=True,
        )
        output_tokens = outputs.shape[1]  # total sequence length (generate returns a plain tensor here)

        # Decode and extract the summary
        decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
        summary = decoded.split("### Response:")[-1].strip()
        latency_ms = int((time.time() - start) * 1000)

        # Record metadata
        run.metadata["input_tokens"] = int(input_tokens)
        run.metadata["output_tokens"] = int(output_tokens)
        run.metadata["latency_ms"] = latency_ms

        # Store the result and end the run
        run.end(outputs={"summary": summary})

    return summary
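
The dispatch step (7-3 through 7-5) is not included in this card. A minimal sketch of collecting infer_one results back into df['summary'] in the original row order, which is what the post-processing in 7-6 expects; max_workers is an assumed value, and since all threads share one model on a single GPU the pool mainly overlaps tokenization and decoding rather than scaling linearly:

# Sketch: run infer_one over all prompts and keep results in row order
results = [None] * len(prompts)
with ThreadPoolExecutor(max_workers=4) as executor:   # max_workers=4 is an assumption
    futures = {executor.submit(infer_one, p): i for i, p in enumerate(prompts)}
    for future in tqdm(as_completed(futures), total=len(futures)):
        results[futures[future]] = future.result()
df['summary'] = results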

7-6) Post-processing: keep only the text after the tag

df['summary'] = (
    df['summary']
    .astype(str)
    .str.split(r'', n=1)   # split on the tag and keep only the trailing part
    .str[-1]
    .str.strip()
)

8) Excel-saving code to be written separately (see the sketch below)
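
A minimal sketch of that save step, assuming the predictions simply go to the output_excel path defined in step 1:

# Write the dataframe with the new 'summary' column to the output path
df.to_excel(output_excel, index=False, engine='openpyxl')
print(f"Saved {len(df)} rows to {output_excel}")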

Framework versions

  • PEFT 0.15.2
Model tree for junghan/News_category_segmentation

  • Base model: Qwen/Qwen3-8B-Base
  • Finetuned from: Qwen/Qwen3-8B
  • Adapter: this model