def run():
    import streamlit as st
    from sklearn.metrics import confusion_matrix, classification_report
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    st.title("5. Evaluation")

    st.header("Introduction")
    st.write("""
    Model evaluation is the process of assessing the performance of a machine learning model using various metrics.
    """)

    st.header("Objectives")
    st.write("""
    - Assess model performance.
    - Compare different models.
    - Select the best model.
    """)

    st.header("Key Activities")
    st.write("""
    - Model validation.
    - Performance metrics calculation.
    - Model comparison.
    """)

    st.write("## Overview")
    st.write("Assessing model performance using appropriate evaluation metrics.")

    st.write("## Key Concepts & Explanations")

    st.markdown("### Confusion Matrix")
    st.write("""
    A confusion matrix is a table used to evaluate the performance of a classification model.
    It shows the number of true positives, true negatives, false positives, and false negatives,
    which makes it easy to see how many instances the model classifies correctly and incorrectly.
    """)

    st.markdown("### Precision, Recall, F1-Score")
    st.write("""
    - **Precision**: Measures the accuracy of the positive predictions. It is the ratio of true positives to the total predicted positives (true positives + false positives). High precision indicates a low false positive rate.
    - **Recall**: Also known as sensitivity, measures the model's ability to identify all relevant instances. It is the ratio of true positives to the total actual positives (true positives + false negatives). High recall indicates a low false negative rate.
    - **F1-Score**: The harmonic mean of precision and recall. It provides a single metric that balances the two, which is especially useful when both false positives and false negatives matter.
    """)

    st.markdown("### ROC-AUC")
    st.write("""
    - **ROC (Receiver Operating Characteristic) Curve**: A graphical representation of the model's performance across different threshold values. It plots the true positive rate (recall) against the false positive rate.
    - **AUC (Area Under the Curve)**: Summarizes the ROC curve into a single value. It represents the likelihood that the model will rank a randomly chosen positive instance higher than a randomly chosen negative one. An AUC of 1 indicates a perfect model, while an AUC of 0.5 indicates a model with no discriminative power.
    """)

    st.write("## Quiz")
    q1 = st.radio(
        "Which metric is used for evaluating a classification model?",
        ["Accuracy", "F1-Score", "All of the above"],
    )
    if q1 == "All of the above":
        st.success("✅ Correct!")
    else:
        st.error("❌ Incorrect.")

    st.write("## Code-Based Quiz")
    code_input = st.text_area(
        "Write a function to calculate the confusion matrix",
        value="def confusion_mat(y_true, y_pred):\n    return confusion_matrix(y_true, y_pred)",
    )
    if "confusion_matrix" in code_input:
        st.success("✅ Correct!")
    else:
        st.error("❌ Try again.")

    st.write("## Learning Resources")
    st.markdown("""
    - 🎓 [Evaluation Metrics in Machine Learning](https://scikit-learn.org/stable/modules/model_evaluation.html)
    """)
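

# --- Hedged example: computing the metrics described above ---
# A minimal sketch (not part of the original page) showing how the confusion
# matrix, precision, recall, F1-score, and ROC-AUC discussed in run() can be
# computed with scikit-learn. The toy arrays y_true_demo / y_pred_demo /
# y_score_demo and the helper name _demo_evaluation_metrics are illustrative
# assumptions, not values or APIs from the original app.
def _demo_evaluation_metrics():
    import numpy as np
    from sklearn.metrics import (
        confusion_matrix,
        precision_score,
        recall_score,
        f1_score,
        roc_auc_score,
    )

    # Toy binary labels: 1 = positive class, 0 = negative class.
    y_true_demo = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    # Hard class predictions from a hypothetical classifier.
    y_pred_demo = np.array([1, 0, 0, 1, 0, 1, 1, 0])
    # Predicted probabilities for the positive class (needed for ROC-AUC).
    y_score_demo = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3])

    # For binary labels ordered (0, 1), the matrix is laid out as:
    # [[TN, FP],
    #  [FN, TP]]
    cm = confusion_matrix(y_true_demo, y_pred_demo)

    # Precision = TP / (TP + FP), Recall = TP / (TP + FN),
    # F1 = harmonic mean of precision and recall.
    precision = precision_score(y_true_demo, y_pred_demo)
    recall = recall_score(y_true_demo, y_pred_demo)
    f1 = f1_score(y_true_demo, y_pred_demo)

    # ROC-AUC is computed from the probability scores, not the hard predictions.
    auc = roc_auc_score(y_true_demo, y_score_demo)

    return cm, precision, recall, f1, auc


if __name__ == "__main__":
    # Usage sketch: print the demo metrics without launching Streamlit.
    cm, precision, recall, f1, auc = _demo_evaluation_metrics()
    print("Confusion matrix:\n", cm)
    print(f"Precision={precision:.2f} Recall={recall:.2f} F1={f1:.2f} AUC={auc:.2f}")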