transformers
datasets
gradio
minijinja
PyMuPDF
torch
beautifulsoup4