fdaudens 
posted an update May 28
How can AI help us write better headlines and reach more people?

I experimented with a new approach that is both useful and fun. It can help you overcome writer’s block, find better headlines, and make your blog posts and news articles climb in search engine results. Plus, we will learn new concepts along the way!

1️⃣ First, I scraped all the blog posts written on Hugging Face to create a dataset with the headlines, texts, dates, and authors' names.

2️⃣ I filtered the dataset to remove posts that were too long and would require a model with a longer context window. This was done to keep the project simple and cost-effective (actually, free).

3️⃣ Then, I used a dataset generation workflow built by @davanstrien to generate a DPO dataset.

4️⃣ As a last step, you can collectively rate these evaluations to improve the quality of the dataset using an easy-to-use interface with Argilla. Take a look at it and rate some of them! This way, you can contribute to making this dataset useful for different newsrooms that could use it as a starting point.
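To make steps 2 and 3 more concrete, here is a minimal sketch of the length filter and of what one preference record could look like. It is not the code used for this project: the `Post` class, the `MAX_WORDS` threshold, the prompt wording, and both helper functions are hypothetical, and the `prompt`/`chosen`/`rejected` layout is simply the one commonly used for DPO training data. The fields produced by the actual workflow may differ.

```python
from dataclasses import dataclass

# Hypothetical cap, in whitespace-separated words. The post does not say
# which threshold was actually used.
MAX_WORDS = 2000


@dataclass
class Post:
    headline: str
    text: str
    date: str
    author: str


def filter_short_posts(posts, max_words=MAX_WORDS):
    """Keep only posts whose text stays under the word cap (step 2)."""
    return [p for p in posts if len(p.text.split()) <= max_words]


def to_dpo_record(post, candidate_a, candidate_b, preferred):
    """Build one preference record from two candidate headlines (step 3).

    `preferred` is "a" or "b" and says which candidate was judged better.
    """
    if preferred not in ("a", "b"):
        raise ValueError("preferred must be 'a' or 'b'")
    if preferred == "a":
        chosen, rejected = candidate_a, candidate_b
    else:
        chosen, rejected = candidate_b, candidate_a
    return {
        # Assumed prompt wording, for illustration only.
        "prompt": f"Write a headline for this article:\n\n{post.text}",
        "chosen": chosen,
        "rejected": rejected,
    }
```

A word count is only a rough stand-in for a model's context window, which is measured in tokens; counting tokens with the model's own tokenizer would be more precise.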

𝐖𝐡𝐲 𝐢𝐭 𝐦𝐚𝐭𝐭𝐞𝐫𝐬. If you look at the dataset, you can see cases where a headline is improved by adding an important keyword or an action verb.
These tweaks can have a big impact on your position in search engine results and, therefore, on your traffic. It’s also a good boost for creativity, since you can compare your initial idea with an alternative from an outside perspective.

Imagine you’re a large news organization: you could run this experiment with thousands of news articles.

With a dataset of several hundred to several thousand entries, you could fine-tune a model to suggest headlines better tailored to your needs and writing style.

👉 Take a look at it and rate the headlines: fdaudens/journalism-argilla-space
👉 Daniel's code: https://github.com/huggingface/data-is-better-together/blob/main/dpo/README.md

Great stuff, I love to see how this is evolving!