---
license: apache-2.0
datasets:
- bengsoon/volve_alpaca
language:
- en
base_model:
- Meta/Meta-Llama-3-8B
pipeline_tag: summarization
tags:
- oil-and-gas
- energy
- drilling
---

# DriLLM Summarizer

## Background

This model is fine-tuned from [Meta/Meta-Llama-3-8B](https://huggingface.co/Meta/Meta-Llama-3-8B). It was trained on the [Volve DDR dataset](https://huggingface.co/datasets/bengsoon/volve_alpaca) in the Alpaca prompt format, using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl).

The motivation was to fine-tune an LLM that understands the nuances of drilling operations and produces 24-hour summaries from the hourly activities recorded in Daily Drilling Reports (DDRs).

## How to use

### Sample Colab

Here's a [Google Colab notebook](https://colab.research.google.com/drive/10Txp14M-yeJG3hRAB8U2ydPrWFE1bypW?usp=sharing) to get started with the model.

### Recommended template for DriLLM-Summarizer:

``` python
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""
```

### Inferencing using Transformers Pipeline

The code below was tested on Google Colab with the free T4 GPU.

``` python
import transformers
import torch

model_id = "bengsoon/DriLLM-Summarizer"

# Load the model in bfloat16 and let Accelerate place it on the available device(s).
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto"
)

# Alpaca-style prompt template (same as the recommended template above).
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

# Each trailing backslash joins the next line, so the instruction is a single paragraph.
INSTRUCTION = """You are a Rig Supervisor working at an oil and gas offshore drilling operation. \
Your company is currently on a drilling campaign and you are the on-site Drilling Engineer (DE). \
As a DE, one of your jobs is to oversee the operations at the drilling rigs. As such, you know the ins and outs of the operation, down to the hourly activities. \
Every day, activities are recorded either by the Driller, Mud Logger, MWD / LWD engineer or the Drilling Operations Coordinator throughout the day. \
As a DE representative for your company, you are required to prepare the 24-hour summary for the Daily Drilling Report (DDR) based on the hourly activities reported. \
You must always maintain the language of report along with the terminologies and mnemonics of the Drilling Engineer. \
Given the following activities for well XX, please prepare the 24-hour summary for the Daily Drilling Report (DDR). \
Only return the 24-hour summary, and nothing else.
"""

# Hourly activities exactly as logged in the DDR, one time window per line.
hourly_events = """00:00 - 11:00: Packed equipment and prepared for backload. Cleaned drillfloor and cantilever.
11:00 - 17:00: Performed are inspection with barge engineer. Cleaned and tidied offices and workspace. Demobilized all personell. End of operation
"""

# Named `prompt` rather than `input` to avoid shadowing the Python built-in.
prompt = TEMPLATE.format(instruction=INSTRUCTION, input=hourly_events)

output = pipeline(prompt)

# The generated text echoes the prompt, so keep only what follows the response marker.
print("Response: ", output[0]["generated_text"].split("### Response:")[1].strip())

# > Response: Packed equipment and prepared for backload. Cleaned drillfloor and cantilever. Performed are inspection with barge engineer. Cleaned and tidyied offices and workspaces.
```

### Quantized model

If you are GPU-constrained, you can try loading the model with 8-bit quantization:

``` python
import torch
import transformers
from transformers import BitsAndBytesConfig

model_id = "bengsoon/DriLLM-Summarizer"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        # Load the weights in 8-bit (requires the bitsandbytes package).
        "quantization_config": BitsAndBytesConfig(load_in_8bit=True),
    },
    device_map="auto"
)
```
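
### Building the prompt from structured activities

The inference example passes the hourly activities as a hand-written string with one `HH:MM - HH:MM: description` entry per line. If your activities live in a structured form instead, the prompt can be assembled programmatically. The following is a minimal sketch, assuming each activity is a `(start, end, description)` tuple. The helper names `format_hourly_events`, `build_prompt`, and `extract_summary` are hypothetical and not part of this model or of Transformers. The helpers do not append a trailing newline after the last activity, which may differ slightly from the hand-written string.

``` python
# Hypothetical helpers for illustration; they are not shipped with the model.

# Alpaca-style template, matching the recommended template in this card.
ALPACA_TEMPLATE = (
    "<|begin_of_text|>Below is an instruction that describes a task, "
    "paired with an input that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

RESPONSE_MARKER = "### Response:"


def format_hourly_events(events):
    """Render (start, end, description) tuples as 'HH:MM - HH:MM: description' lines."""
    return "\n".join(
        f"{start} - {end}: {description.strip()}"
        for start, end, description in events
    )


def build_prompt(instruction, events):
    """Fill the Alpaca template with the instruction and the formatted activities."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction,
        input=format_hourly_events(events),
    )


def extract_summary(generated_text):
    """Return the text after the first response marker, or the whole text if absent."""
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if marker else generated_text.strip()


if __name__ == "__main__":
    # Assumed example activities, shaped like the DDR entries in this card.
    events = [
        ("00:00", "11:00", "Packed equipment and prepared for backload."),
        ("11:00", "17:00", "Cleaned and tidied offices and workspace."),
    ]
    print(build_prompt("Prepare the 24-hour summary for the DDR.", events))
```

With the `pipeline` object and `INSTRUCTION` string from the inference example, `extract_summary(pipeline(build_prompt(INSTRUCTION, events))[0]["generated_text"])` should return only the generated summary, since the pipeline output echoes the prompt before the completion.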