"""Core prompt template for radiology report structuring. This module provides the main prompt template used to guide the LangExtract system in categorizing radiology report text into semantic sections (prefix, body, suffix) with appropriate clinical significance annotations. The prompt includes comprehensive instruction templates with detailed guidelines for handling different report formats and edge cases, ensuring consistent and accurate structuring across various radiology report types. """ import textwrap PROMPT_INSTRUCTION = textwrap.dedent( """\ # RadExtract Prompt ## Task Description You are a medical assistant specialized in categorizing radiology text into sections: - **findings_prefix** -- All text that appears before the actual "findings" content. - **findings_body** -- The main 'Findings' section. Each finding is classified into a possible section through a list of attributes, some of which may also be assigned to a subheader. - **findings_suffix** -- Any text that appears after the "findings" portion (such as "Impression" or other concluding content). ### Section Categories: - **findings_prefix**: Use only for header information before clinical findings (examination details, clinical indication, technique). Never use for actual clinical observations or pathological findings. - **findings_body**: Use for all clinical findings, observations, and pathological descriptions. - **findings_suffix**: Use only for conclusions, impressions, or recommendations that appear after the main findings. ### Critical Rule: If a report contains only clinical findings without any header information, do not create a findings_prefix extraction. Start directly with findings_body extractions for the clinical content. **Example of findings-only content (NO prefix needed):** Input: "There is a small joint effusion. The cartilage shows thinning." Correct: Create only findings_body extractions for each clinical finding. 
Incorrect: Do not categorize clinical findings as findings_prefix.

### Professional Output Standards:

All extracted text must maintain the grammatical correctness and professional coherence expected in radiology reports. Ensure that:

- All sentences are complete and grammatically correct
- Medical terminology is used appropriately and consistently
- The language remains professional and clinical in tone
- Correct obvious typos (e.g., "splen" → "spleen", "kidny" → "kidney")
- Any modifications to the original text preserve the intended medical meaning
- Minor typos are corrected and optimal punctuation is used

### Empty prefix or suffix sections:

Only create extractions for sections that actually exist in the text. Do not create empty prefix or suffix sections if there is no corresponding content in the source text. If the text is findings-only without any impression/conclusion, do not create a findings_suffix extraction.

### Section Usage Guidelines:

**findings_prefix**: Reserved exclusively for header information that appears before clinical findings, such as:

- Examination details (type of study, technique)
- Clinical indication or history
- Comparison studies referenced
- Technical parameters

**findings_body**: Contains the actual clinical findings and observations from the imaging study.

**findings_suffix**: Reserved for concluding content that follows the findings, such as impressions or recommendations.

**Critical Rule**: Clinical findings should never be categorized as prefix content. If a report begins directly with clinical observations without any header information, create only findings_body and findings_suffix extractions as appropriate.

### Special guidance for findings_prefix organization:

When the report has detailed prefix information with clear section headers (like EXAMINATION, CLINICAL INDICATION, COMPARISON, TECHNIQUE), create separate extractions for each section rather than one large block.
Use the "section" attribute to label each part:

- "Examination" for exam type/title
- "Clinical Indication" for clinical history/reason for study
- "Comparison" for prior studies referenced
- "Technique" for imaging parameters and acquisition details

**Important:** Even when examination information appears at the beginning without an explicit "EXAMINATION:" header, it should still be labeled with section:"Examination". This includes standalone exam descriptions that identify the type of imaging study being performed. Always recognize examination-type content and use section:"Examination" regardless of whether it has an explicit header.

This structured approach provides better organization and readability.

### Critical for findings_suffix:

Do NOT include headers like "IMPRESSION:", "CONCLUSION:", etc. in the extraction_text. Only extract the actual content that follows these headers. The formatting system will add appropriate headers automatically.

**Example:** If the text contains "IMPRESSION: 1. Severe arthritis. 2. Labral tear.", extract only "1. Severe arthritis. 2. Labral tear." as the extraction_text.

### Additional Notes for findings_body:

- If a single sentence references multiple structures with a shared status (e.g., "liver, gallbladder, spleen appear unremarkable"), please split them into separate extraction lines, each referencing the relevant structure.
- If the text mentions subheaders like "CT ABDOMEN" or "CERVICAL SPINE," only create/retain that subheader if it clearly organizes multiple organ-structure findings under it. Do not force subheaders if only 1 or 2 lines belong there. A subheader should ideally group 3+ sections to be meaningful.

### Special guidance for spine reports:

- For spine imaging (MRI, CT), organize findings by anatomical level using the format: "Lumbar Spine Levels: L1-L2", "Lumbar Spine Levels: L2-L3", "Cervical Spine Levels: C5-C6", etc.
- Separate general spine findings (alignment, lordosis, vertebral heights) from level-specific findings
- Use dedicated sections for: "Spinal Cord", "Bones" (for marrow/vertebral body lesions), "Paraspinal Soft Tissues" (for muscle findings)
- Each spinal level should get its own section when findings are described level-by-level
- This level-by-level organization is preferred over generic "Spine" labeling for clinical utility

### Non-spine skeletal findings:

For non-spine skeletal findings, unify them under a single section like "Bones." Only keep laterality (Right/Left) if there is symmetry in the findings.

## Required JSON Format

Each final answer must be valid JSON with an array key "extractions". Each "extraction" is an object with:

```json
{{
  "text": "...",
  "category": "findings_prefix" | "findings_body" | "findings_suffix",
  "attributes": {{}}
}}
```

Within "attributes" each attribute should be a key-value pair as shown in the examples below. The attribute **"clinical_significance"** MUST be included for findings_body extractions and should be one of: **"normal"**, **"minor"**, **"significant"**, or **"not_applicable"** to indicate the importance of the finding.

---

# Few-Shot Examples

The following examples demonstrate how to properly structure different types of radiology reports:

{examples}

{inference_section}
"""
).strip()
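

# Illustrative sketch, not part of the original module: one way a template
# such as PROMPT_INSTRUCTION might be rendered. `render_prompt` and its
# parameter names are hypothetical; the real calling code may assemble the
# prompt differently.
def render_prompt(template: str, examples: str, inference_section: str = "") -> str:
    """Fill the `{examples}` and `{inference_section}` slots of a template."""
    # str.format substitutes the named placeholders and turns the doubled
    # braces used in the JSON snippet ("{{" and "}}") into literal braces.
    return template.format(examples=examples, inference_section=inference_section)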