import dspy


class InitialResourceSummarySignature(dspy.Signature):
    """
    You are an AI Resource Analyzer. Analyze the provided learning resource excerpts (a JSON
    object with filenames as keys and text content as values). For EACH resource, identify its
    primary subject, main topics, and key concepts/information. Optionally infer the content
    type/style. Present your analysis for each resource separately in a single text block.
    Your goal is to allow a Conversation Manager to quickly grasp the nature and content of
    each resource.

    Example of output format:

    **Resource: 'filename.txt'**
    This excerpt appears to be...
    * Main Topics: ...
    * Key Information: ...
    ---
    **Resource: 'another_file.pdf'**
    ...
    """

    # Input: a JSON string mapping resource identifiers to their excerpt text.
    resource_excerpts_json = dspy.InputField(desc="A JSON string representing a dictionary where keys are resource identifiers (e.g., filenames) and values are the truncated text content of that resource.")
    summary_report = dspy.OutputField(desc="A formatted text report summarizing each resource excerpt as per the main instruction.")


class DynamicSummarizationSignature(dspy.Signature):
    """
    You are an AI Resource Analyzer. Process the provided 'learning_material_excerpt' in the
    context of the 'conversation_history' and its 'resource_identifier'. Extract the key
    information MOST RELEVANT to the ongoing conversation. Pay special attention to the Table
    of Contents, chapter overviews, or introductions. The summary should help create a
    structured learning syllabus addressing the user's current focus.
    **The resource won't be passed to any other agent.**

    Output your analysis as a SINGLE JSON object string with the following keys:
    - "resource_identifier": (String, use the provided identifier)
    - "primary_topics_relevant_to_conversation": (List of strings)
    - "core_concepts_relevant_to_conversation": (List of strings)
    - "structure_or_progression_notes": (String)
    - "keywords_highlighted_by_conversation": (List of strings)
    - "inferred_learning_objectives_for_current_focus": (List of strings)
    - "contextual_notes_for_syllabus": (String)

    Ensure the output is ONLY the valid JSON object string.
    """

    conversation_history_str = dspy.InputField(desc="The ongoing conversation history as a formatted string.")
    resource_identifier_str = dspy.InputField(desc="The identifier (e.g., filename) of the learning material.")
    learning_material_excerpt_str = dspy.InputField(desc="The textual content of the learning material excerpt to be summarized; it is provided in dict format.")
    # The LLM's direct output will be a JSON string.
    json_summary_str = dspy.OutputField(desc="A string containing a single, valid JSON object with the summarized analysis.")
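# Sketch (not part of the original pipeline): one way a caller might invoke
# DynamicSummarizationSignature and parse its JSON output. Assumes an LM has already been
# configured (e.g., via dspy.settings.configure(lm=...)); the helper name and the fallback
# behaviour below are illustrative assumptions, not requirements of the signature.
import json


def summarize_excerpt_for_conversation(history_str: str, resource_id: str, excerpt: str) -> dict:
    predictor = dspy.Predict(DynamicSummarizationSignature)
    prediction = predictor(
        conversation_history_str=history_str,
        resource_identifier_str=resource_id,
        learning_material_excerpt_str=excerpt,
    )
    try:
        # The signature instructs the LM to emit ONLY a single JSON object string.
        return json.loads(prediction.json_summary_str)
    except json.JSONDecodeError:
        # Fall back to an empty-but-well-formed summary if the LM output is not valid JSON.
        return {
            "resource_identifier": resource_id,
            "primary_topics_relevant_to_conversation": [],
            "core_concepts_relevant_to_conversation": [],
            "structure_or_progression_notes": "",
            "keywords_highlighted_by_conversation": [],
            "inferred_learning_objectives_for_current_focus": [],
            "contextual_notes_for_syllabus": "",
        }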
class SyllabusNoResourcesSignature(dspy.Signature):
    """
    **You are an expert AI Syllabus Creator.** Your **sole task** is to generate or modify a
    learning syllabus based **exclusively** on the provided 'learning_conversation' history.
    **No external resources, documents, or summaries are provided for this specific task, nor
    should any be assumed or hallucinated.** You must work only with the conversational context.

    **Your Goal:** Produce a well-structured, practical, and coherent syllabus XML.

    **Mode of Operation (Infer from 'learning_conversation'):**
    1. **Modification:** If the 'learning_conversation' contains a previously presented syllabus
       (typically in `<syllabus>...</syllabus>` tags from an 'assistant' or 'model' role) AND
       subsequent user messages clearly provide feedback or request changes to THAT specific
       syllabus, your primary goal is to **modify that most recent relevant syllabus**.
       Accurately incorporate all user feedback.
    2. **Generation:** If the 'learning_conversation' indicates a new learning topic, or if no
       prior syllabus for the current topic is evident, or if the user explicitly requests a
       fresh start, your goal is to **generate a new syllabus from scratch** based on the
       user's stated goals, experience level, and desired topic derived from the conversation.

    **Syllabus Structure Requirements:**
    * Organize into 2 to 5 distinct learning phases.
    * Each phase must contain 2 to 4 specific lessons or topics.
    * Arrange phases and lessons in a logical, progressive order, building complexity incrementally.

    **Lesson Detail Requirements (for each lesson):**
    * `Phase`: 3-4 phases, based on requirements.
    * `Topic`: A clear, concise title.
    * `Keywords`: A list of 3-5 key terms or concepts.
    * `Objective`: 1-2 sentences describing what the learner should understand/do post-lesson.
    * `Focus`: 1-2 sentences on the main emphasis or key takeaways.
    Based on requirements, you may increase the number of topics and phases.

    **Output Format: CRITICAL - Output ONLY the complete syllabus XML structure enclosed within
    `<syllabus>` and `</syllabus>` tags. Do not include any other conversational text,
    explanations, or apologies before or after the XML block. Follow the Lesson Detail
    Requirements structure strictly.**
    """

    learning_conversation = dspy.InputField(desc="The complete, ordered conversation history. This is your ONLY source of information for user needs, previous syllabi (if any, enclosed in <syllabus> tags), and feedback.")
    syllabus_xml = dspy.OutputField(desc="The complete generated or modified syllabus as a single XML string, starting with <syllabus> and ending with </syllabus>.")
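# Sketch: how the conversation-only syllabus generator might be called. The "role: content"
# history formatting and the tag-preserving strip() are assumptions for illustration; the
# signature itself only requires a single formatted history string.
def generate_syllabus_from_conversation(turns: list) -> str:
    # Each turn is assumed to be a dict like {"role": "user", "content": "..."}.
    history_str = "\n".join(f"{turn['role']}: {turn['content']}" for turn in turns)
    predictor = dspy.Predict(SyllabusNoResourcesSignature)
    prediction = predictor(learning_conversation=history_str)
    # Expected to be a single <syllabus>...</syllabus> XML string.
    return prediction.syllabus_xml.strip()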
# --- Signature for LIGHT/RAW Text Resources (Revised & Detailed) ---
class SyllabusWithRawTextSignature(dspy.Signature):
    """
    **You are an expert AI Syllabus Creator.** Your **sole task** is to generate or modify a
    learning syllabus using the 'learning_conversation' history AND the provided
    'raw_resource_excerpts_json'.

    **Crucial Context: The 'raw_resource_excerpts_json' you receive contains snippets of actual
    learning materials. This detailed content is exclusive to you for syllabus creation; no
    other AI agent has processed or summarized it for this purpose. Your thorough analysis and
    direct integration of this raw text are paramount.**

    **Your Goal:** Produce a well-structured syllabus XML that is deeply informed by both the
    user's needs (from the conversation) and the specific content of the raw text excerpts.

    **'raw_resource_excerpts_json' Input:** This is a JSON string representing an object. Keys
    are resource identifiers (e.g., filenames), and values are the corresponding short raw text
    excerpts.

    **Mode of Operation (Infer from 'learning_conversation', integrate 'raw_resource_excerpts_json'):**
    1. **Modification:** If the 'learning_conversation' contains a prior syllabus (in
       `<syllabus>` tags) and user feedback, **modify that syllabus**. Directly integrate
       relevant information, concepts, definitions, and examples from the
       'raw_resource_excerpts_json' to address the feedback and enrich the syllabus.
    2. **Generation:** If generating anew, **use both the 'learning_conversation' and the
       'raw_resource_excerpts_json' from scratch**. The raw text should heavily influence the
       topics, lesson objectives, keywords, and focus points. For instance, if an excerpt
       details three key steps for a process, that could become a lesson or part of one.

    **Lesson Detail Requirements (for each lesson):**
    * `Phase`: 3-4 phases, based on requirements.
    * `Topic`: A clear, concise title.
    * `Keywords`: A list of 3-5 key terms or concepts.
    * `Objective`: 1-2 sentences describing what the learner should understand/do post-lesson.
    * `Focus`: 1-2 sentences on the main emphasis or key takeaways.
    Based on requirements, you may increase the number of topics, phases, and keywords.

    **Output Format: CRITICAL - Output ONLY the complete syllabus XML structure enclosed within
    `<syllabus>` and `</syllabus>` tags. No other text. Follow the Lesson Detail Requirements
    structure strictly.**
    """

    learning_conversation = dspy.InputField(desc="Complete conversation history. May contain prior syllabi (in <syllabus> tags) and feedback. This defines user needs.")
    raw_resource_excerpts_json = dspy.InputField(desc="A JSON string: an object mapping resource IDs to their raw text snippets. This is your primary source for detailed content.")
    syllabus_xml = dspy.OutputField(desc="The complete syllabus XML, reflecting deep integration of raw text excerpts.")
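# Sketch: assembling the 'raw_resource_excerpts_json' input for SyllabusWithRawTextSignature.
# The 2,000-character truncation limit and helper names are illustrative assumptions; the
# signature only requires a JSON object mapping resource identifiers to short raw excerpts.
def build_raw_excerpts_json(resources: dict, max_chars: int = 2000) -> str:
    excerpts = {name: text[:max_chars] for name, text in resources.items()}
    return json.dumps(excerpts)


def generate_syllabus_from_raw_text(history_str: str, resources: dict) -> str:
    predictor = dspy.Predict(SyllabusWithRawTextSignature)
    prediction = predictor(
        learning_conversation=history_str,
        raw_resource_excerpts_json=build_raw_excerpts_json(resources),
    )
    return prediction.syllabus_xml.strip()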
class SyllabusFeedbackRequestSignature(dspy.Signature):
    """
    A syllabus has just been presented to the user (it is the last 'assistant' or 'model'
    message in the 'conversation_history_with_syllabus'). Your task is to generate a natural,
    concise, and engaging question to ask the user for their feedback on this newly presented
    syllabus. Keep it brief and open-ended.

    Example Output:
    "Here's the syllabus draft I've prepared. What are your thoughts on it?"
    "I've put together a syllabus based on our discussion. How does this look to you?"
    "Please take a look at the syllabus. Does it cover what you were expecting?"
    """

    conversation_history_with_syllabus = dspy.InputField(desc="The conversation history, where the most recent relevant 'assistant' or 'model_artifact' message contains the syllabus that was just presented.")
    feedback_query_to_user = dspy.OutputField(desc="The question to ask the user for feedback on the syllabus.")


class SyllabusWithSummariesSignature(dspy.Signature):
    """
    **You are an expert AI Syllabus Creator.** Your **sole task** is to generate or modify a
    learning syllabus using the 'learning_conversation' history AND the provided
    'resource_summaries_json'.

    **Crucial Context: The 'resource_summaries_json' you receive contains structured analytical
    summaries of larger learning materials (e.g., identifying relevant topics, core concepts,
    structural notes). This summarized information is exclusive to you for syllabus creation.
    Your task is to synthesize these expert summaries with the user's conversational needs.**

    **Your Goal:** Produce a well-structured syllabus XML that effectively translates the
    insights from the resource summaries into a practical learning plan aligned with user goals.

    **'resource_summaries_json' Input:** This is a JSON string representing an object. Keys are
    resource identifiers, and values are individual JSON summary objects for each resource
    (each containing keys like 'primary_topics_relevant_to_conversation',
    'core_concepts_relevant_to_conversation', 'contextual_notes_for_syllabus', etc.).

    **Mode of Operation (Infer from 'learning_conversation', integrate 'resource_summaries_json'):**
    1. **Modification:** If 'learning_conversation' shows a prior syllabus (in `<syllabus>`
       tags) and user feedback, **modify that syllabus**. Intelligently weave the topics,
       concepts, and contextual notes from the 'resource_summaries_json' to address feedback
       and improve the syllabus.
    2. **Generation:** If generating anew, **use both 'learning_conversation' and
       'resource_summaries_json' from scratch**. The summaries (especially
       'primary_topics_relevant', 'core_concepts_relevant', 'contextual_notes_for_syllabus')
       should guide the choice of phases, lesson topics, keywords, objectives, and focus.

    **Lesson Detail Requirements (for each lesson):**
    * `Phase`: 3-4 phases, based on requirements.
    * `Topic`: A clear, concise title.
    * `Keywords`: A list of 3-5 key terms or concepts.
    * `Objective`: 1-2 sentences describing what the learner should understand/do post-lesson.
    * `Focus`: 1-2 sentences on the main emphasis or key takeaways.
    Based on requirements, you may increase the number of topics and phases.
    * **Ensure lesson content is strongly guided by the insights presented in the 'resource_summaries_json'.**

    **Output Format: CRITICAL - Output ONLY the complete syllabus XML structure enclosed within
    `<syllabus>` and `</syllabus>` tags. No other text. Follow the Lesson Detail Requirements
    structure strictly.**
    """

    learning_conversation = dspy.InputField(desc="Complete conversation history. Defines user needs and may contain prior syllabi/feedback.")
    resource_summaries_json = dspy.InputField(desc="A JSON string: an object mapping resource IDs to their structured summary objects. This provides high-level insights and content pointers.")
    syllabus_xml = dspy.OutputField(desc="The complete syllabus XML, reflecting effective use of resource summaries.")
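# Sketch: combining per-resource summaries (e.g., the dicts produced by
# summarize_excerpt_for_conversation above) into the 'resource_summaries_json' input expected
# by SyllabusWithSummariesSignature. The orchestration shown is an assumption about how the
# pieces could be wired together, not the project's actual pipeline.
def generate_syllabus_from_summaries(history_str: str, summaries_by_id: dict) -> str:
    predictor = dspy.Predict(SyllabusWithSummariesSignature)
    prediction = predictor(
        learning_conversation=history_str,
        resource_summaries_json=json.dumps(summaries_by_id),
    )
    return prediction.syllabus_xml.strip()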
class SyllabusNegotiationSignature(dspy.Signature):
    """
    **You are an expert AI Conversation Manager.** Your primary role is to facilitate a
    conversation to define requirements for a learning syllabus by analyzing the inputs and
    determining the next system action.

    **Inputs to Analyze:**
    1. `conversation_history_str`: The full record of previous turns.
    2. `current_syllabus_xml`: The latest syllabus draft (XML string or "None").
    3. `user_input`: The most recent message from the user.

    **Your Task:** Based on the inputs, determine the single most appropriate `action_code`
    from the list below. Additionally, if the action is purely conversational (`CONVERSE`),
    provide the `display_text` for the user. For all other action codes (`GENERATE`, `MODIFY`,
    `FINALIZE`, `PERSONA`), the `display_text` **MUST be an empty string or a placeholder like
    "[NO_DISPLAY_TEXT]"** as the system will handle the next step non-conversationally or with
    a dedicated prompter.

    **Action Codes & Conditions:**
    * `GENERATE`: Output this if sufficient initial information (topic, experience, goals) has
      been gathered from the conversation to request the *very first* syllabus draft.
    * `MODIFY`: Output this if a syllabus exists (indicated by a non-"None"
      `current_syllabus_xml` or visible in `conversation_history_str`) AND the `user_input`
      (or recent history) provides clear feedback or requests changes to that existing syllabus.
    * `FINALIZE`: Output this if the `user_input` (or recent history) explicitly confirms that
      the user is satisfied with the *most recent* syllabus presented and no further changes
      are needed.
    * `PERSONA`: Output this if the conversation indicates the user has just provided their
      preferred learning style (this action signals readiness for the system to generate the
      tutor's persona prompt). `display_text` can be a very brief acknowledgment like
      "Got it, thanks!" or empty.
    * `CONVERSE`: Output this for all other situations. This includes asking clarifying
      questions, acknowledging user statements, providing general responses, or when a previous
      action (like syllabus generation) has just completed and you need to prompt the user for
      feedback on that artifact (which would be visible in the updated
      `conversation_history_str`).

    **Output Field Rules:**
    - `action_code`: MUST be one of the specified codes.
    - `display_text`:
      - For `CONVERSE`: Provide the natural language response to the user.
      - For `GENERATE`, `MODIFY`, `FINALIZE`: MUST be empty or "[NO_DISPLAY_TEXT]".
      - For `PERSONA`: Can be empty, "[NO_DISPLAY_TEXT]", or a very brief acknowledgment.
    """

    conversation_history_str = dspy.InputField(desc="Previous turns in the conversation, formatted as a multi-line string. This may contain previously presented syllabi.")
    current_syllabus_xml = dspy.InputField(desc="The current draft syllabus XML (<syllabus>...</syllabus>), or the string 'None' if no syllabus has been successfully generated or focused on yet.")
    user_input = dspy.InputField(desc="The user's latest message that needs processing.")
    # resource_summary = dspy.InputField(desc="A brief summary/overview of user-provided learning resources, or 'None' if no resources are relevant or provided.")
    action_code = dspy.OutputField(desc="One of: GENERATE, MODIFY, FINALIZE, PERSONA, CONVERSE.")
    display_text = dspy.OutputField(desc="The conversational text response for the user. MUST be empty or '[NO_DISPLAY_TEXT]' if action_code is GENERATE, MODIFY, or FINALIZE. Can be brief for PERSONA.")
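# Sketch: dispatching on the negotiation manager's action_code. Only the action codes and the
# display_text convention come from the signature above; the helper name and fallback handling
# are illustrative assumptions.
def route_manager_action(history_str: str, current_syllabus_xml: str, user_input: str) -> tuple:
    """Return (action_code, display_text); pass the string 'None' when no syllabus exists yet."""
    manager = dspy.Predict(SyllabusNegotiationSignature)
    decision = manager(
        conversation_history_str=history_str,
        current_syllabus_xml=current_syllabus_xml or "None",
        user_input=user_input,
    )
    code = decision.action_code.strip().upper()
    if code not in {"GENERATE", "MODIFY", "FINALIZE", "PERSONA", "CONVERSE"}:
        code = "CONVERSE"  # conservative fallback for an unexpected code
    # Per the signature, display_text is only meaningful for CONVERSE (and, briefly, PERSONA).
    text = decision.display_text if code in {"CONVERSE", "PERSONA"} else ""
    return code, text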
class LearningStyleSignature(dspy.Signature):
    """
    You are an AI assistant. The user has just finalized a learning syllabus. Your goal is to
    formulate a concise and engaging question to prompt the user about their preferred learning
    style and the kind of AI tutor personality they'd find most effective for the subject
    matter (discernible from the history). Encourage specific details beyond generic answers
    (e.g., interaction style, content format like examples/theory/analogies, pace, feedback
    type). Output ONLY the question itself.
    """

    conversation_history_with_final_syllabus = dspy.InputField(desc="Full conversation history, including the finalized syllabus (which might be the last model_artifact turn).")
    question_to_user = dspy.OutputField(desc="The single, clear question to ask the user about their learning preferences.")


class PersonaPromptBodyPredictSignature(dspy.Signature):
    """
    **You are an AI Persona Architect.** Your goal is to generate the main body of a system
    prompt for an AI Tutor. This prompt body should accurately reflect the user's desired
    teaching style, personality, depth preferences, and subject matter, all derived from the
    provided 'conversation_history_with_style_and_syllabus_context'.

    **The prompt body MUST include:**
    1. **Clear Persona Definition:** (e.g., the AI Tutor's name like 'Synapse', its subject
       specialization, and its core mission).
    2. **Core Principles Section:** (Detail the tutor's personality, teaching philosophy,
       desired traits, inspirational figures and how to emulate them, and key emphasis areas.
       Use bullet points for clarity.)
    3. **Teaching Approach / Methodology Section:** (Outline specific methods:
       clarity/explanation style, interaction style, handling depth, practical elements, and
       the balance of guidance vs. direct answers.)
    4. **Overall Goal Statement:** (A sentence summarizing the ultimate aim, e.g., "Your goal
       is to foster deep understanding...").

    **CRITICAL: The generated text should be ONLY the prompt body itself, ready to have the
    syllabus appended to it externally. DO NOT include phrases like "Here is the syllabus..."
    or the {{SYLLABUS_SECTION}} placeholder.** Focus solely on crafting the persona and
    teaching instructions for the tutor.
    """

    conversation_history_with_style_and_syllabus_context = dspy.InputField(
        desc="Full conversation history, including the finalized syllabus context (to understand the subject) and the user's stated learning style preferences (to inform persona and teaching approach)."
    )
    # Only one output field: the prompt body text itself.
    prompt_body_text = dspy.OutputField(
        desc="The complete system prompt body for the AI Tutor, ending just before where the syllabus would be introduced by the calling system."
    )


class GenericInteractionSignature(dspy.Signature):
    """
    Act as the AI Tutor defined by the persona and instructions within the provided
    `system_instructions`. Your role, persona, and task are dictated entirely by these dynamic
    instructions. Respond to the user's query based on these instructions and the conversation
    history.

    **Formatting Rules:**
    - Use standard Markdown for formatting like lists, bold (`**text**`), and italics (`*text*`).
    - **For mathematical equations and expressions, YOU MUST use KaTeX format:**
      - For inline mathematics, enclose the expression in single dollar signs. For example:
        `The variable is $x$.`
      - For block (display) mathematics, enclose the expression in double dollar signs. For
        example: `$$E = mc^2$$`
    """

    system_instructions = dspy.InputField(desc="The full system prompt defining your current role, persona, how to interact, and often the learning material (like a syllabus).")
    history = dspy.InputField(desc="Recent conversation history relevant to the current interaction.")
    user_query = dspy.InputField(desc="The user's current question or statement.")
    response = dspy.OutputField(desc="Your response, adhering to the system_instructions.")
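# Sketch: combining the persona prompt body with the finalized syllabus into the
# 'system_instructions' consumed by GenericInteractionSignature. The "## Syllabus" header text
# is an illustrative assumption; the persona signature only guarantees a prompt body to which
# the caller appends the syllabus externally.
def build_tutor_instructions(prompt_body: str, syllabus_xml: str) -> str:
    return f"{prompt_body}\n\n## Syllabus\n{syllabus_xml}"


def tutor_reply(system_instructions: str, history_str: str, user_query: str) -> str:
    tutor = dspy.Predict(GenericInteractionSignature)
    return tutor(
        system_instructions=system_instructions,
        history=history_str,
        user_query=user_query,
    ).response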
class FormatSyllabusXMLToMarkdown(dspy.Signature):
    """
    You are an expert data transformation AI. Your sole function is to convert learning
    syllabus content, which may be provided in XML format or as pre-formatted text, into clean,
    hierarchically structured, and well-formatted Markdown. Given the detailed examples and
    structure, a low temperature (e.g., 0.3) is appropriate for consistent output.

    **Key Conversion Rules:**

    1. **Output Format:** The output MUST be exclusively Markdown. Do not include any
       introductory text, explanations, or conversational wrappers around the Markdown content.
    2. **Maximum Heading Level:** The highest-level heading in the Markdown output (typically
       for "Phases" or main sections) MUST be `##`. Do NOT use `#` (H1) for any titles.
    3. **Hierarchical Structure & Title Formatting:**
       * **Main Sections (e.g., Phases, Modules) - Output as `## Title`:**
         * **From XML:** If a main-section element's title is `Actual Title`, convert to `## Actual Title`.
           * Example: a phase titled `Foundations of AI` becomes `## Foundations of AI`.
           * Example: a phase titled `Phase 1 - Getting Started` becomes `## Phase 1 - Getting Started`.
         * **From Text Input (e.g., lines starting with `# Phase:`, `## Module:`, or just `# Title`):**
           * If the input line is like `# Phase: Advanced Topics` (where "Advanced Topics" is
             descriptive and not just "Phase X"), convert to `## Phase: Advanced Topics`. The
             prefix ("Phase:", "Module:", etc.) is kept.
           * If the input line is like `# Module 1 Concepts` (where there's no explicit
             "Module:" prefix but the title itself indicates it), convert to `## Module 1 Concepts`.
         * **Avoid Redundancy:** If the input line is like `# Phase: Phase 1` or
           `## Module: Module A`, simplify the output to `## Phase 1` or `## Module A`
           respectively. The goal is to avoid `## Phase: Phase 1`.
         * **Implicitly Defined Sections (by `---` separator without title):** Number them
           sequentially, e.g., `## Phase 1`, `## Phase 2`.
       * **Sub-Sections (e.g., Lessons, Topics, Units) - Output as `### Title`:** (Apply similar logic)
         * **From XML:** a topic element containing `Actual Topic` becomes `### Actual Topic`.
         * **From Text Input (e.g., `## Lesson:`, `### Unit:`, or just `## Topic Name`):**
           * If input is `## Lesson: Understanding Functions` (descriptive title), convert to
             `### Lesson: Understanding Functions`.
           * If input is `### Unit Details` (no prefix), convert to `### Unit Details`.
         * **Avoid Redundancy:** If input is `## Lesson: Lesson 3` or `### Topic: Topic B`,
           simplify to `### Lesson 3` or `### Topic B`.
         * **Initial Items:** If lesson-like items appear at the very beginning of a text input
           *before* any main section (`##`) or separator, list them directly using `###` headings.
    4. **Lesson/Topic Content (e.g., Keywords, Objective, Focus, Key Ideas, Goal, Core Content):**
       * For XML elements like `<keywords>...</keywords>`, `<objective>...</objective>`,
         `<focus>...</focus>`, format as:
         `**Keywords:** [Content of keywords tag]`
         `**Objective:** [Content of objective tag]`
         `**Focus:** [Content of focus tag]`
       * For text input lines like `- **Keywords:** ...` or `**Objective:** ...`, preserve the
         bolded label and its content:
         `**Keywords:** [Content following label]`
         `**Objective:** [Content following label]`
       * **If a specific content element (like keywords, objective, etc.) is NOT provided for a
         lesson/topic, completely omit that line from the Markdown output for that
         lesson/topic.** For example, if there's no `<keywords>` tag or "Keywords:" line, do
         not print "**Keywords:**" at all.
    5. **Error and Extraneous Text Handling:** Ignore any error messages (e.g., "Error
       displaying syllabus...", "BEEPBOOPFIZZ!"), comments, or other non-syllabus text present
       in the input. Focus solely on extracting and formatting the syllabus structure.
    6. **Adaptability to Naming:** Be prepared for variations in XML tag names (e.g., a
       `module` element instead of a `phase` element, or a `unit` element instead of a `topic`
       element) or text labels (e.g., "Module:", "Unit:", "Key Ideas:", "Goal:"). Apply the
       hierarchical and content formatting rules consistently based on the detected structure.
       Maintain logical grouping as presented in the input.

    Your task is to process the provided input and generate only the corresponding Markdown.

    Input (Default Expected Format):

    Phase 1: Foundations and Introduction

    Introduction to Neural Networks and Representation Learning
    Representation Learning, NLP, Neural Networks, Sequence Models
    Understand the role of representation learning in NLP and the need for advanced sequence models beyond traditional methods.
    Review fundamental concepts of learning representations for sequential data, particularly in the context of natural language processing.

    Mathematical and Computational Building Blocks
    One-Hot Encoding, Dot Product, Matrix Multiplication, Embeddings
    Become familiar with essential mathematical operations and data encoding techniques fundamental to modern neural network architectures.
    Focus on how basic linear algebra and encoding schemes like one-hot and embeddings are used to process data for sequence models.
    Comparing Transformers to Previous Architectures
    Transformers, RNNs, CNNs, Sequential Processing, Parallel Processing
    Identify the limitations of sequential models (RNNs) and convolutional models (CNNs) when handling long-range dependencies, motivating the need for the Transformer architecture.
    Understand the core architectural differences and processing paradigms (sequential vs. parallel) that distinguish Transformers from RNNs and CNNs across various tasks (Language, Vision, Multimodal).

    Phase 2: The Attention Mechanism

    Understanding Attention in Sequence Models
    Attention, Sequence Modeling, Masking, Dependencies
    Grasp the fundamental concept of attention and how it allows models to weigh the importance of different parts of the input sequence.
    Explore how attention mechanisms address the limitations of fixed-context models and capture dependencies, including the role of masking.

    Self-Attention: The Core Idea
    Self-Attention, Queries, Keys, Values, Scaled Dot-Product Attention
    Learn the mechanics of self-attention, including the roles of queries, keys, and values, and how it enables parallel processing and capturing internal dependencies within a single sequence.
    Deep dive into the scaled dot-product self-attention calculation and its significance for the Transformer architecture.

    Multi-Head and Cross-Attention
    Multi-Head Attention, Cross-Attention, Attention Heads
    Understand how Multi-Head Attention enhances the model's ability to capture diverse relationships and how Cross-Attention facilitates interaction between different sequences (e.g., encoder-decoder).
    Examine the benefits of using multiple attention heads and the application of cross-attention in sequence-to-sequence tasks.

    Phase 3: Transformer Architecture and Applications

    Positional Encoding and Embeddings
    Positional Encoding, Embeddings, Sequence Order
    Understand how embeddings represent tokens and how positional encoding injects information about the position of tokens in the sequence, crucial since attention is permutation-invariant.
    Explore different methods for positional encoding and their integration with token embeddings.

    The Encoder-Decoder Structure
    Encoder-Decoder, Transformer Architecture, Feed-Forward Networks, Residual Connections, Layer Normalization
    Learn the overall architecture of the Transformer model, including the stack of encoder and decoder layers and their internal components (attention, feed-forward networks, skip connections, normalization).
    Examine the flow of information through the complete Encoder-Decoder pipeline in a Transformer.

    Implementation Details and Model Variations
    Tokenization, Implementation, GPT, BERT, Parameter Distribution
    Gain insight into practical aspects like tokenization and understand how the Transformer architecture is adapted in prominent models like BERT and GPT, including parameter distribution.
    Discuss practical considerations for implementing and working with Transformer models and analyze variations like encoder-only (BERT) and decoder-only (GPT) architectures.

    Phase 4: Analysis and Future Directions

    Analysis of Transformer Properties
    Advantages, Disadvantages, Time Complexity, Parallelism, Long-Range Dependencies
    Evaluate the key advantages (parallelism, handling long dependencies) and disadvantages (computational cost, memory) of the Transformer architecture.
    Analyze the computational complexity of Transformers compared to RNNs and CNNs, focusing on efficiency gains and limitations.
    Applications and Future Trends
    NLP Applications, Vision Applications, Multimodal, Future Directions, Efficiency
    Explore the wide range of applications where Transformers have been successful and discuss current research directions and potential future developments.
    Survey the impact of Transformers beyond NLP and look at efforts to improve their efficiency and capabilities.

    Output:

    ## Phase 1: Foundations and Introduction

    ### Topic: Introduction to Neural Networks and Representation Learning
    **Keywords:** Representation Learning, NLP, Neural Networks, Sequence Models
    **Objective:** Understand the role of representation learning in NLP and the need for advanced sequence models beyond traditional methods.
    **Focus:** Review fundamental concepts of learning representations for sequential data, particularly in the context of natural language processing.

    ### Topic: Mathematical and Computational Building Blocks
    **Keywords:** One-Hot Encoding, Dot Product, Matrix Multiplication, Embeddings
    **Objective:** Become familiar with essential mathematical operations and data encoding techniques fundamental to modern neural network architectures.
    **Focus:** Focus on how basic linear algebra and encoding schemes like one-hot and embeddings are used to process data for sequence models.

    ### Topic: Comparing Transformers to Previous Architectures
    **Keywords:** Transformers, RNNs, CNNs, Sequential Processing, Parallel Processing
    **Objective:** Identify the limitations of sequential models (RNNs) and convolutional models (CNNs) when handling long-range dependencies, motivating the need for the Transformer architecture.
    **Focus:** Understand the core architectural differences and processing paradigms (sequential vs. parallel) that distinguish Transformers from RNNs and CNNs across various tasks (Language, Vision, Multimodal).

    ## Phase 2: The Attention Mechanism

    ### Topic: Understanding Attention in Sequence Models
    **Keywords:** Attention, Sequence Modeling, Masking, Dependencies
    **Objective:** Grasp the fundamental concept of attention and how it allows models to weigh the importance of different parts of the input sequence.
    **Focus:** Explore how attention mechanisms address the limitations of fixed-context models and capture dependencies, including the role of masking.

    ### Topic: Self-Attention: The Core Idea
    **Keywords:** Self-Attention, Queries, Keys, Values, Scaled Dot-Product Attention
    **Objective:** Learn the mechanics of self-attention, including the roles of queries, keys, and values, and how it enables parallel processing and capturing internal dependencies within a single sequence.
    **Focus:** Deep dive into the scaled dot-product self-attention calculation and its significance for the Transformer architecture.

    ### Topic: Multi-Head and Cross-Attention
    **Keywords:** Multi-Head Attention, Cross-Attention, Attention Heads
    **Objective:** Understand how Multi-Head Attention enhances the model's ability to capture diverse relationships and how Cross-Attention facilitates interaction between different sequences (e.g., encoder-decoder).
    **Focus:** Examine the benefits of using multiple attention heads and the application of cross-attention in sequence-to-sequence tasks.
    ## Phase 3: Transformer Architecture and Applications

    ### Topic: Positional Encoding and Embeddings
    **Keywords:** Positional Encoding, Embeddings, Sequence Order
    **Objective:** Understand how embeddings represent tokens and how positional encoding injects information about the position of tokens in the sequence, crucial since attention is permutation-invariant.
    **Focus:** Explore different methods for positional encoding and their integration with token embeddings.

    ### Topic: The Encoder-Decoder Structure
    **Keywords:** Encoder-Decoder, Transformer Architecture, Feed-Forward Networks, Residual Connections, Layer Normalization
    **Objective:** Learn the overall architecture of the Transformer model, including the stack of encoder and decoder layers and their internal components (attention, feed-forward networks, skip connections, normalization).
    **Focus:** Examine the flow of information through the complete Encoder-Decoder pipeline in a Transformer.

    ### Topic: Implementation Details and Model Variations
    **Keywords:** Tokenization, Implementation, GPT, BERT, Parameter Distribution
    **Objective:** Gain insight into practical aspects like tokenization and understand how the Transformer architecture is adapted in prominent models like BERT and GPT, including parameter distribution.
    **Focus:** Discuss practical considerations for implementing and working with Transformer models and analyze variations like encoder-only (BERT) and decoder-only (GPT) architectures.

    ## Phase 4: Analysis and Future Directions

    ### Topic: Analysis of Transformer Properties
    **Keywords:** Advantages, Disadvantages, Time Complexity, Parallelism, Long-Range Dependencies
    **Objective:** Evaluate the key advantages (parallelism, handling long dependencies) and disadvantages (computational cost, memory) of the Transformer architecture.
    **Focus:** Analyze the computational complexity of Transformers compared to RNNs and CNNs, focusing on efficiency gains and limitations.

    ### Topic: Applications and Future Trends
    **Keywords:** NLP Applications, Vision Applications, Multimodal, Future Directions, Efficiency
    **Objective:** Explore the wide range of applications where Transformers have been successful and discuss current research directions and potential future developments.
    **Focus:** Survey the impact of Transformers beyond NLP and look at efforts to improve their efficiency and capabilities.

    Input (Description: there won't be Phases or anything here, but based on the hierarchy create the Markdown properly, and also omit errors if any):

    Error displaying syllabus: This page contains the following errors:error on line 1 at column 1: Start tag expected, '<' not foundBelow is a rendering of the page up to the first error.

    # Phase: Introduction to Sequence Transduction and the Transformer

    ## Lesson: Limitations of Recurrent and Convolutional Models
    - **Keywords:** Sequence Transduction, RNN, CNN, Recurrence, Convolution, Parallelization, Sequential Computation
    - **Objective:** Understand the challenges faced by traditional sequence transduction models like RNNs and CNNs, particularly regarding sequential computation and parallelization.
    - **Focus:** This lesson focuses on the inherent sequential nature of recurrent models and the path length issues in convolutional models that limit their efficiency and ability to learn long-range dependencies, motivating the need for a new architecture.
    ## Lesson: Introducing the Transformer Architecture
    - **Keywords:** Transformer, Attention Mechanism, Self-Attention, Sequence Transduction, Parallelization
    - **Objective:** Learn about the Transformer, a novel network architecture that replaces recurrence and convolutions entirely with attention mechanisms.
    - **Focus:** This lesson introduces the core idea of the Transformer: relying solely on attention to draw global dependencies, enabling significantly more parallelization and faster training compared to previous models.

    ---

    # Phase: Core Components of the Transformer

    ## Lesson: Encoder-Decoder Structure
    - **Keywords:** Encoder, Decoder, Stack, Sub-layer, Residual Connection, Layer Normalization
    - **Objective:** Describe the overall encoder-decoder structure of the Transformer and the composition of its stacked layers.
    - **Focus:** This lesson details how the Transformer utilizes an encoder-decoder framework with stacks of identical layers, each featuring sub-layers, residual connections, and layer normalization, as depicted in Figure 1.

    ## Lesson: Attention Function and Scaled Dot-Product Attention
    - **Keywords:** Attention Function, Query, Key, Value, Weighted Sum, Scaled Dot-Product Attention, Softmax, Compatibility Function
    - **Objective:** Explain the fundamental concept of an attention function and the specific implementation used in the Transformer: Scaled Dot-Product Attention.
    - **Focus:** This lesson covers the definition of attention as mapping queries and key-value pairs to a weighted sum of values, focusing on the Scaled Dot-Product Attention formula and the role of scaling by 1/sqrt(dk).

    ## Lesson: Multi-Head Attention
    - **Keywords:** Multi-Head Attention, Linear Projection, Parallel Attention, Representation Subspaces, Concatenation
    - **Objective:** Understand how Multi-Head Attention enhances the model's ability to attend to information from different representation subspaces.
    - **Focus:** This lesson explains the mechanism of Multi-Head Attention, involving projecting queries, keys, and values multiple times in parallel and concatenating the results, highlighting its benefit over single-head attention.

    ## Lesson: Applications of Attention within the Transformer
    - **Keywords:** Encoder-Decoder Attention, Encoder Self-Attention, Decoder Self-Attention, Masking, Auto-regressive
    - **Objective:** Identify the three distinct ways Multi-Head Attention is utilized in the Transformer's encoder and decoder stacks.
    - **Focus:** This lesson details the specific applications: encoder-decoder attention for connecting encoder/decoder outputs, encoder self-attention for processing input sequence dependencies, and masked decoder self-attention for preserving the auto-regressive property.

    ---

    # Phase: Supporting Mechanisms and Training

    ## Lesson: Position-wise Feed-Forward Networks, Embeddings, and Softmax
    - **Keywords:** Feed-Forward Network, ReLU, Embeddings, Softmax, Shared Weights
    - **Objective:** Describe the role of the position-wise feed-forward networks and how input/output tokens are processed using embeddings and a final softmax layer.
    - **Focus:** This lesson covers the independent feed-forward network applied to each position and the standard use of learned embeddings for token representation, including the sharing of weights between embedding layers and the pre-softmax transformation.
    ## Lesson: Positional Encoding
    - **Keywords:** Positional Encoding, Sine, Cosine, Sequence Order, Relative Position, Learned Embeddings
    - **Objective:** Explain why positional encoding is necessary in the Transformer and how the sinusoidal positional encoding is implemented.
    - **Focus:** This lesson focuses on the need to inject positional information due to the absence of recurrence/convolution and describes the use of sine and cosine functions at different frequencies, discussing its potential for extrapolating to longer sequences.

    ## Lesson: Efficiency Analysis: Why Self-Attention?
    - **Keywords:** Computational Complexity, Sequential Operations, Maximum Path Length, Parallelization, Long-Range Dependencies
    - **Objective:** Compare the efficiency of self-attention layers against recurrent and convolutional layers based on computational complexity, sequential operations, and path length.
    - **Focus:** This lesson uses the analysis from the paper (Table 1) to highlight the advantages of self-attention, particularly its constant number of sequential operations and path length, making it highly parallelizable and effective for capturing long-range dependencies.

    ---

    # Phase: Implementation, Results, and Impact

    ## Lesson: Training Setup and Regularization
    - **Keywords:** Training Data, Batching, Optimizer, Learning Rate Schedule, Warmup Steps, Residual Dropout, Label Smoothing
    - **Objective:** Summarize the training methodology, including data preparation, optimizer configuration, learning rate scheduling, and regularization techniques used.
    - **Focus:** This lesson details the practical aspects of training the Transformer, covering the datasets used, batching strategy, the Adam optimizer with a specific learning rate schedule and warmup, and the application of dropout and label smoothing.

    ## Lesson: Machine Translation Performance
    - **Keywords:** Machine Translation, BLEU Score, WMT 2014, State-of-the-Art, Training Cost, Benchmarks
    - **Objective:** Evaluate the Transformer's performance on machine translation tasks based on reported BLEU scores and training costs compared to existing models.
    - **Focus:** This lesson presents the key results from the paper (Table 2), demonstrating that the Transformer achieves new state-of-the-art BLEU scores on WMT 2014 English-to-German and English-to-French translation tasks with significantly reduced training costs.

    ## Lesson: Model Variations and Generalization
    - **Keywords:** Model Variations, Attention Heads, Key Size, Dropout, Positional Embedding, Constituency Parsing, Generalization [##100 completed]
    - **Objective:** Discuss the impact of different architectural variations and evaluate the Transformer's ability to generalize to tasks beyond machine translation.
    - **Focus:** This lesson examines the results of experiments on model variations (Table 3) to understand component importance and demonstrates the Transformer's successful application and strong performance on English constituency parsing (Table 4), showcasing its generalization capabilities.

    Output:

    # Phase: Introduction to Sequence Transduction and the Transformer

    ## Lesson: Limitations of Recurrent and Convolutional Models
    - **Keywords:** Sequence Transduction, RNN, CNN, Recurrence, Convolution, Parallelization, Sequential Computation
    - **Objective:** Understand the challenges faced by traditional sequence transduction models like RNNs and CNNs, particularly regarding sequential computation and parallelization.
    - **Focus:** This lesson focuses on the inherent sequential nature of recurrent models and the path length issues in convolutional models that limit their efficiency and ability to learn long-range dependencies, motivating the need for a new architecture.

    ## Lesson: Introducing the Transformer Architecture
    - **Keywords:** Transformer, Attention Mechanism, Self-Attention, Sequence Transduction, Parallelization
    - **Objective:** Learn about the Transformer, a novel network architecture that replaces recurrence and convolutions entirely with attention mechanisms.
    - **Focus:** This lesson introduces the core idea of the Transformer: relying solely on attention to draw global dependencies, enabling significantly more parallelization and faster training compared to previous models.

    ---

    # Phase: Core Components of the Transformer

    ## Lesson: Encoder-Decoder Structure
    - **Keywords:** Encoder, Decoder, Stack, Sub-layer, Residual Connection, Layer Normalization
    - **Objective:** Describe the overall encoder-decoder structure of the Transformer and the composition of its stacked layers.
    - **Focus:** This lesson details how the Transformer utilizes an encoder-decoder framework with stacks of identical layers, each featuring sub-layers, residual connections, and layer normalization, as depicted in Figure 1.

    ## Lesson: Attention Function and Scaled Dot-Product Attention
    - **Keywords:** Attention Function, Query, Key, Value, Weighted Sum, Scaled Dot-Product Attention, Softmax, Compatibility Function
    - **Objective:** Explain the fundamental concept of an attention function and the specific implementation used in the Transformer: Scaled Dot-Product Attention.
    - **Focus:** This lesson covers the definition of attention as mapping queries and key-value pairs to a weighted sum of values, focusing on the Scaled Dot-Product Attention formula and the role of scaling by 1/sqrt(dk).

    ## Lesson: Multi-Head Attention
    - **Keywords:** Multi-Head Attention, Linear Projection, Parallel Attention, Representation Subspaces, Concatenation
    - **Objective:** Understand how Multi-Head Attention enhances the model's ability to attend to information from different representation subspaces.
    - **Focus:** This lesson explains the mechanism of Multi-Head Attention, involving projecting queries, keys, and values multiple times in parallel and concatenating the results, highlighting its benefit over single-head attention.

    ## Lesson: Applications of Attention within the Transformer
    - **Keywords:** Encoder-Decoder Attention, Encoder Self-Attention, Decoder Self-Attention, Masking, Auto-regressive
    - **Objective:** Identify the three distinct ways Multi-Head Attention is utilized in the Transformer's encoder and decoder stacks.
    - **Focus:** This lesson details the specific applications: encoder-decoder attention for connecting encoder/decoder outputs, encoder self-attention for processing input sequence dependencies, and masked decoder self-attention for preserving the auto-regressive property.

    ---

    # Phase: Supporting Mechanisms and Training

    ## Lesson: Position-wise Feed-Forward Networks, Embeddings, and Softmax
    - **Keywords:** Feed-Forward Network, ReLU, Embeddings, Softmax, Shared Weights
    - **Objective:** Describe the role of the position-wise feed-forward networks and how input/output tokens are processed using embeddings and a final softmax layer.
    - **Focus:** This lesson covers the independent feed-forward network applied to each position and the standard use of learned embeddings for token representation, including the sharing of weights between embedding layers and the pre-softmax transformation.

    ## Lesson: Positional Encoding
    - **Keywords:** Positional Encoding, Sine, Cosine, Sequence Order, Relative Position, Learned Embeddings
    - **Objective:** Explain why positional encoding is necessary in the Transformer and how the sinusoidal positional encoding is implemented.
    - **Focus:** This lesson focuses on the need to inject positional information due to the absence of recurrence/convolution and describes the use of sine and cosine functions at different frequencies, discussing its potential for extrapolating to longer sequences.

    ## Lesson: Efficiency Analysis: Why Self-Attention?
    - **Keywords:** Computational Complexity, Sequential Operations, Maximum Path Length, Parallelization, Long-Range Dependencies
    - **Objective:** Compare the efficiency of self-attention layers against recurrent and convolutional layers based on computational complexity, sequential operations, and path length.
    - **Focus:** This lesson uses the analysis from the paper (Table 1) to highlight the advantages of self-attention, particularly its constant number of sequential operations and path length, making it highly parallelizable and effective for capturing long-range dependencies.

    ---

    # Phase: Implementation, Results, and Impact

    ## Lesson: Training Setup and Regularization
    - **Keywords:** Training Data, Batching, Optimizer, Learning Rate Schedule, Warmup Steps, Residual Dropout, Label Smoothing
    - **Objective:** Summarize the training methodology, including data preparation, optimizer configuration, learning rate scheduling, and regularization techniques used.
    - **Focus:** This lesson details the practical aspects of training the Transformer, covering the datasets used, batching strategy, the Adam optimizer with a specific learning rate schedule and warmup, and the application of dropout and label smoothing.

    ## Lesson: Machine Translation Performance
    - **Keywords:** Machine Translation, BLEU Score, WMT 2014, State-of-the-Art, Training Cost, Benchmarks
    - **Objective:** Evaluate the Transformer's performance on machine translation tasks based on reported BLEU scores and training costs compared to existing models.
    - **Focus:** This lesson presents the key results from the paper (Table 2), demonstrating that the Transformer achieves new state-of-the-art BLEU scores on WMT 2014 English-to-German and English-to-French translation tasks with significantly reduced training costs.

    ## Lesson: Model Variations and Generalization
    - **Keywords:** Model Variations, Attention Heads, Key Size, Dropout, Positional Embedding, Constituency Parsing, Generalization
    - **Objective:** Discuss the impact of different architectural variations and evaluate the Transformer's ability to generalize to tasks beyond machine translation.
    - **Focus:** This lesson examines the results of experiments on model variations (Table 3) to understand component importance and demonstrates the Transformer's successful application and strong performance on English constituency parsing (Table 4), showcasing its generalization capabilities.
    Input (Description: if the hierarchy is jumbled, infer the correct hierarchy and produce the right Markdown for it. Sometimes there won't be any phase at all; in that case omit it in the output and just continue, or if a clear distinction is provided, write them as Phase 1, Phase 2):

    ERROR_PARSING_SYLLABUS_STREAM_0xDEADBEEF: Unrecoverable anomaly detected! BEEPBOOPFIZZ! Attempting to render partial content.

    ## Unit: Welcome and Orientation
    - **Key Ideas:** Course Overview, Learning Objectives, Tools Setup
    - **Goal:** To welcome learners and set up the learning environment.
    - **Core Content:** Introduction to the course, navigating the platform, installing necessary software.

    ## Unit: Advanced Data Manipulation // This unit belongs in the "Data Analysis" Module
    - **Key Ideas:** Data Wrangling, Feature Engineering, Complex Queries
    - **Goal:** To master techniques for transforming and preparing complex datasets.
    - **Core Content:** Advanced Pandas operations, creating new features from existing data, SQL for complex joins and aggregations.

    ---

    # Module: Introduction to Programming

    ## Unit: Basic Concepts
    - **Key Ideas:** Variables, Data Types, Operators, Control Flow
    - **Goal:** Understand the fundamental building blocks of programming.
    - **Core Content:** Lectures and exercises on basic syntax and logic.

    ## Unit: Functions and Modularity
    - **Key Ideas:** Defining Functions, Scope, Reusability, Modules
    - **Goal:** Learn to write modular and reusable code using functions.
    - **Core Content:** Practical examples of function creation and usage.

    ---

    ## Unit: Version Control with Git
    - **Key Ideas:** Repositories, Commits, Branches, Merging
    - **Goal:** To learn the basics of version control for collaborative projects.
    - **Core Content:** Introduction to Git, common commands, and basic workflows.

    ## Unit: Debugging Techniques
    - **Key Ideas:** Breakpoints, Logging, Error Interpretation, Troubleshooting
    - **Goal:** To develop effective strategies for finding and fixing code errors.
    - **Core Content:** Common debugging tools and methods.

    ---

    # Module: Data Analysis and Visualization

    ## Unit: Introduction to Data Analysis
    - **Key Ideas:** Data Collection, Cleaning, Exploratory Data Analysis (EDA)
    - **Goal:** Understand the data analysis lifecycle and initial exploration techniques.
    - **Core Content:** Methods for acquiring, cleaning, and performing initial analysis on datasets.

    ## Unit: Data Visualization Principles
    - **Key Ideas:** Chart Types, Storytelling with Data, Effective Visuals, Matplotlib, Seaborn
    - **Goal:** Learn to create effective and informative data visualizations.
    - **Core Content:** Best practices in data visualization and hands-on with plotting libraries.

    //End of partial data stream. ALERT: integrity_check_failed.

    Output:

    ### Unit: Welcome and Orientation
    **Key Ideas:** Course Overview, Learning Objectives, Tools Setup
    **Goal:** To welcome learners and set up the learning environment.
    **Core Content:** Introduction to the course, navigating the platform, installing necessary software.

    ---

    ## Phase 1: Introduction to Programming

    ### Unit: Basic Concepts
    **Key Ideas:** Variables, Data Types, Operators, Control Flow
    **Goal:** Understand the fundamental building blocks of programming.
    **Core Content:** Lectures and exercises on basic syntax and logic.

    ### Unit: Functions and Modularity
    **Key Ideas:** Defining Functions, Scope, Reusability, Modules
    **Goal:** Learn to write modular and reusable code using functions.
    **Core Content:** Practical examples of function creation and usage.
    ---

    ## Phase 2

    ### Unit: Version Control with Git
    **Key Ideas:** Repositories, Commits, Branches, Merging
    **Goal:** To learn the basics of version control for collaborative projects.
    **Core Content:** Introduction to Git, common commands, and basic workflows.

    ### Unit: Debugging Techniques
    **Key Ideas:** Breakpoints, Logging, Error Interpretation, Troubleshooting
    **Goal:** To develop effective strategies for finding and fixing code errors.
    **Core Content:** Common debugging tools and methods.

    ---

    ## Phase 3: Data Analysis and Visualization

    ### Unit: Introduction to Data Analysis
    **Key Ideas:** Data Collection, Cleaning, Exploratory Data Analysis (EDA)
    **Goal:** Understand the data analysis lifecycle and initial exploration techniques.
    **Core Content:** Methods for acquiring, cleaning, and performing initial analysis on datasets.

    ### Unit: Data Visualization Principles
    **Key Ideas:** Chart Types, Storytelling with Data, Effective Visuals, Matplotlib, Seaborn
    **Goal:** Learn to create effective and informative data visualizations.
    **Core Content:** Best practices in data visualization and hands-on with plotting libraries.

    ### Unit: Advanced Data Manipulation
    **Key Ideas:** Data Wrangling, Feature Engineering, Complex Queries
    **Goal:** To master techniques for transforming and preparing complex datasets.
    **Core Content:** Advanced Pandas operations, creating new features from existing data, SQL for complex joins and aggregations.
    """

    syllabus_xml_input = dspy.InputField(
        desc="The learning syllabus content, which may be in XML format or as pre-formatted text, potentially containing extraneous text. This XML will be processed based on the detailed instructions provided above."
    )
    cleaned_syllabus_markdown = dspy.OutputField(
        desc="The syllabus strictly formatted in clean Markdown, with unwanted artifacts removed and hierarchy preserved."
    )


class TitleGenerationSignature(dspy.Signature):
    """Generate a concise title (3-6 words, max 17 words) for a chat session based on the learning topic in the history. Output ONLY the title text."""

    chat_history_summary = dspy.InputField(desc="A summary or key excerpts of the chat history.")
    chat_title = dspy.OutputField(desc="The generated concise title.")


# TITLE_GENERATOR_PREDICTOR = dspy.Predict(TitleGenerationSignature, temperature=0.4)
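# Sketch (illustrative, not part of the original module): thin wrappers around the two
# presentation-oriented signatures above. The helper names are assumptions; only the signature
# classes and their field names come from this file.
def syllabus_to_markdown(raw_syllabus: str) -> str:
    formatter = dspy.Predict(FormatSyllabusXMLToMarkdown)
    return formatter(syllabus_xml_input=raw_syllabus).cleaned_syllabus_markdown


def generate_chat_title(history_summary: str) -> str:
    titler = dspy.Predict(TitleGenerationSignature)
    return titler(chat_history_summary=history_summary).chat_title.strip()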