---
language:
- en
- ko
license: other
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
- llama-3-ko
pipeline_tag: text-generation
license_name: llama3
license_link: LICENSE
---

# Model Card for Model ID

## Model Details

Llama-3-Open-Ko-8B is a language model continually pretrained from Llama-3-8B. It was trained entirely on publicly available resources, comprising 60GB+ of deduplicated text. With the new Llama-3 tokenizer, the pretraining covered 17.7B+ tokens, which is slightly more than the same corpus yields under the Korean tokenizer (the Llama-2-Ko tokenizer). A short tokenizer-comparison sketch is included at the end of this card.

**Sample usage**

This card does not name the model repository, so the snippet below uses a placeholder repo ID; replace it with the actual one.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Placeholder: this card does not name the model repository.
model_id = "<model-repo-id>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # load weights in bfloat16
    device_map="auto",
)

pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    truncation=True,
)

def extract_response_llama3(question):
    messages = [
        {"role": "system", "content": ""},
        {"role": "user", "content": question},
    ]
    # Render the chat messages into a single prompt string.
    prompt = pipe.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    # Stop on either the regular EOS token or Llama-3's end-of-turn token.
    terminators = [
        pipe.tokenizer.eos_token_id,
        pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ]
    outputs = pipe(
        prompt,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.1,
        top_p=0.9,
        num_return_sequences=1,
    )
    # The pipeline output contains the prompt followed by the completion;
    # keep only the last line of the generated text.
    return outputs[0]["generated_text"].split("\n")[-1]

# "What do you call the method of setting project priorities and providing
# differentiated support when allocating a budget?"
question = "예산을 분배할 때 사업의 우선 순위를 정해서 차등 지원하는 방법을 뭐라고 하지"
response = extract_response_llama3(question)
print(response)

# "Where was the law enacted to reduce emissions of fine-dust-forming
# substances and manage them comprehensively?"
question = "미세먼지 생성물질의 배출을 저감하고 종합적으로 관리하기 위한 법을 어디서 제정했니"
response = extract_response_llama3(question)
print(response)

# "For what kind of place was the legal basis of an air-pollution prevention
# policy prepared through the enactment of a special act?"
question = "어떤 장소의 대기오염을 방지하기 위한 정책의 법적 근거가 특별법의 제정으로 준비되었지"
response = extract_response_llama3(question)
print(response)
```

**Sample Output**

```
선택과 집중
환경부
항만
```

(Respectively: "selection and concentration"; the Ministry of Environment; ports.)
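**Tokenizer comparison (sketch)**

As a minimal sketch of the tokenizer comparison mentioned above, the snippet below counts the tokens each tokenizer produces for the same Korean sentence. The repo IDs `beomi/Llama-3-Open-Ko-8B` and `beomi/llama-2-ko-7b` are illustrative assumptions, since this card does not name its repositories.

```python
from transformers import AutoTokenizer

# Repo IDs below are illustrative assumptions, not taken from this card.
llama3_ko_tok = AutoTokenizer.from_pretrained("beomi/Llama-3-Open-Ko-8B")
llama2_ko_tok = AutoTokenizer.from_pretrained("beomi/llama-2-ko-7b")

text = "미세먼지 생성물질의 배출을 저감하고 종합적으로 관리하기 위한 법"

# A lower count means each token covers more Korean text, which is why the
# same corpus amounts to a different total token count under each tokenizer.
print("Llama-3 tokenizer:   ", len(llama3_ko_tok.tokenize(text)))
print("Llama-2-Ko tokenizer:", len(llama2_ko_tok.tokenize(text)))
```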