Converts a .json file and pushes it to Hugging Face (HF)
Creates question-and-answer pairs with GPT-3.5 Turbo
Splits large PDFs into 100-page sections
Takes a PDF, cleans it, and outputs a .txt file
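
The splitting and cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the actual scripts: the function names page_ranges and clean_text are hypothetical, and the specific cleaning rules (rejoining hyphenated words, collapsing whitespace) are assumptions, since the description does not say what "cleans" covers. Only the standard library is used, so reading and writing the PDF pages themselves is left out.

```python
import re


def page_ranges(total_pages, section_size=100):
    """Return (start, end) page-index pairs, end exclusive, covering every page.

    Hypothetical helper: section_size defaults to the 100 pages mentioned above.
    """
    if total_pages < 0 or section_size <= 0:
        raise ValueError("total_pages must be >= 0 and section_size must be > 0")
    return [
        (start, min(start + section_size, total_pages))
        for start in range(0, total_pages, section_size)
    ]


def clean_text(raw):
    """Apply a few assumed cleanup rules to text extracted from a PDF."""
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop single spaces left directly before or after a newline.
    text = re.sub(r" ?\n ?", "\n", text)
    # Collapse three or more newlines into one paragraph break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    # A 250-page PDF yields two full sections and one 50-page remainder.
    print(page_ranges(250))
    print(clean_text("Hello   world\n\n\n\nexam-\nple  text "))
```

Each (start, end) pair could then be handed to whichever PDF library the scripts use to write out one section per file; the last section is simply shorter when the page count is not a multiple of 100.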