<!-- Use this file to provide workspace-specific custom instructions to Copilot. For more details, visit https://code.visualstudio.com/docs/copilot/copilot-customization#_use-a-githubcopilotinstructionsmd-file -->

# Web Scraper Project Instructions

This is a Python Gradio application for web scraping that:

- Scrapes text content from websites
- Formats content as markdown
- Generates sitemaps from page links
- Provides MCP (Model Context Protocol) server functionality

## Key Libraries

- gradio[mcp]: For the web interface and MCP server capabilities
- requests: For HTTP requests
- beautifulsoup4: For HTML parsing
- markdownify: For converting HTML to markdown
- urllib.parse (Python standard library, no install needed): For URL handling
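
As an illustration of the URL-handling role listed above, here is a minimal sketch of resolving raw `href` values into absolute links with `urllib.parse`. The function name `resolve_links` and its parameters are hypothetical and are not taken from the project's code. It assumes the `href` values have already been extracted from the page.

```python
from urllib.parse import urldefrag, urljoin, urlparse


def resolve_links(base_url: str, hrefs: list[str]) -> list[str]:
    """Resolve raw href values against base_url (hypothetical helper).

    Args:
        base_url: URL of the page the hrefs were found on.
        hrefs: Raw href attribute values, relative or absolute.

    Returns:
        Unique absolute http(s) URLs, in first-seen order, without fragments.
    """
    seen: set[str] = set()
    resolved: list[str] = []
    for href in hrefs:
        # Make the link absolute, then drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        # Skip non-web schemes such as mailto: or javascript:.
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            resolved.append(absolute)
    return resolved
```

Dropping fragments before de-duplicating keeps `page.html` and `page.html#section` from appearing as two separate entries.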

## Project Structure

- `app.py`: Main web interface application
- `mcp_server.py`: MCP server that exposes tools for AI integration

## MCP Tools

The MCP server exposes three main tools:

- `scrape_content`: Extract website content as markdown
- `generate_sitemap`: Create sitemap from page links
- `analyze_website`: Complete analysis with content and sitemap
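
To make the sitemap idea concrete, the following is a rough sketch of how a list of page links might be rendered as a markdown sitemap. The helper `format_sitemap`, its signature, and the internal/external grouping are assumptions for illustration only. The actual `generate_sitemap` tool may produce a different layout.

```python
from urllib.parse import urlparse


def format_sitemap(page_url: str, links: list[str]) -> str:
    """Render links found on page_url as a markdown sitemap (hypothetical helper).

    Args:
        page_url: URL of the page the links were collected from.
        links: Absolute URLs found on that page.

    Returns:
        Markdown text with internal and external links in separate sections.
    """
    site_host = urlparse(page_url).netloc
    # A link is "internal" when it points at the same host as the page.
    internal = [link for link in links if urlparse(link).netloc == site_host]
    external = [link for link in links if urlparse(link).netloc != site_host]

    lines = [f"# Sitemap for {page_url}", ""]
    for title, group in (("Internal links", internal), ("External links", external)):
        lines.append(f"## {title}")
        # Show a placeholder bullet when a section has no links.
        lines.extend([f"- {link}" for link in group] or ["- (none)"])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

A tool like `analyze_website` could then, under the same assumptions, return the scraped markdown followed by this sitemap text.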

## Code Style

- Use type hints on function parameters and return values where appropriate
- Include proper error handling for web requests
- Follow PEP 8 style guidelines
- Add docstrings for functions with clear parameter descriptions
- Give MCP functions descriptive docstrings, since the docstrings become the tool descriptions
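
The guidelines above could look roughly like this in practice. This is a sketch, not code from the project: `fetch_html` is a hypothetical name, and the standard library's `urllib.request` stands in for `requests` so the example has no third-party dependencies. With `requests`, the same shape would catch `requests.RequestException` instead.

```python
import urllib.request
from urllib.parse import urlparse


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its body as text.

    Args:
        url: Absolute http(s) URL of the page to fetch.
        timeout: Seconds to wait for the server before giving up.

    Returns:
        The decoded response body.

    Raises:
        ValueError: If url is not an absolute http(s) URL.
        RuntimeError: If the request fails or the server returns an error status.
    """
    # Validate early so bad input fails with a clear message, before any I/O.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Expected an absolute http(s) URL, got {url!r}")

    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except OSError as exc:
        # URLError, HTTPError (4xx/5xx), and socket timeouts are all OSError subclasses.
        raise RuntimeError(f"Could not fetch {url}: {exc}") from exc
```

The first docstring line is kept short and self-explanatory because, for MCP functions, that text is what a client sees as the tool description.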