This demo repository illustrates how to use Python to scrape news articles from Google News for a given keyword. The scraped articles are processed by a GPT-3 model on Azure OpenAI Service (AOAI), which generates concise summaries of their main points. The summaries are then formatted and sent by email through the MailJet API.

easonlai/google_news_content_scrape_and_analyze_with_gpt

Google News Content Scrape and Analyze with Azure OpenAI Service (GPT-3)

This demo uses two Python libraries to scrape the latest news articles from Google and extract their full text. The first, GoogleNews, searches for news articles by keyword and returns their titles and URLs. The second, Newspaper3k, downloads each article's HTML page and parses it to extract the text content. For this demonstration, I chose to scrape news about GPT, a family of powerful natural language models developed by OpenAI. The topic is popular and hard for a human reader to keep up with, because new developments and applications of GPT appear every day.
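The scraping step described above can be sketched roughly as follows. This is a minimal illustration and not necessarily the code used in this repository: the helper names `search_news`, `fetch_article_text`, and `collect_articles`, the `period="1d"` window, and the `min_chars` threshold are all assumptions made for the example. It also assumes that GoogleNews returns result dictionaries with `title` and `link` keys. The third-party imports live inside the functions so that the filtering logic can be tried without network access.

```python
def search_news(keyword, period="1d"):
    """Search Google News for a keyword (hypothetical helper, needs network)."""
    from GoogleNews import GoogleNews  # third-party: pip install GoogleNews

    client = GoogleNews(lang="en", period=period)
    client.search(keyword)
    return client.results()  # list of dicts, one per search hit


def fetch_article_text(url):
    """Download one article and return its plain text (needs network)."""
    from newspaper import Article  # third-party: pip install newspaper3k

    article = Article(url)
    article.download()
    article.parse()
    return article.text


def collect_articles(results, fetch_text=fetch_article_text, min_chars=200):
    """Turn search results into a list of {title, url, text} records.

    Skips duplicate URLs, pages that fail to download or parse, and
    pages whose extracted text is shorter than min_chars (an assumed
    cutoff for paywalls and consent pages).
    """
    articles = []
    seen_urls = set()
    for item in results:
        url = item.get("link")
        if not url or url in seen_urls:
            continue
        seen_urls.add(url)
        try:
            text = fetch_text(url)
        except Exception:
            # Best effort: one broken page should not stop the whole run.
            continue
        if len(text) < min_chars:
            continue
        articles.append({"title": item.get("title", ""), "url": url, "text": text})
    return articles
```

With both libraries installed, `collect_articles(search_news("GPT"))` would return the records to summarize. Passing a custom `fetch_text` function makes it possible to exercise the filtering without touching the network.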

This demo also shows how to use the Natural Language Toolkit (NLTK) library to perform chunking, a technique that divides long articles into smaller segments based on linguistic cues. Chunking works around GPT-3's limit of roughly 4,000 tokens per request. That limit is counted in tokens (sub-word units), not words, and covers the prompt and the generated summary together, so each chunk must leave room for the model's output.
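To make the chunking idea concrete, here is a minimal sketch that packs whole sentences into chunks under a token budget. It is an illustration under stated assumptions, not the repository's implementation: `split_sentences` is a naive regex stand-in for NLTK's `sent_tokenize`, `estimate_tokens` uses the rough rule of thumb of about four characters per token for English text, and the names `chunk_text` and `max_tokens` are hypothetical.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumed stand-in for nltk.sent_tokenize)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def estimate_tokens(text):
    """Rough token estimate: about 4 characters per token (heuristic only)."""
    return max(1, len(text) // 4)


def chunk_text(text, max_tokens=3000, split=split_sentences, count=estimate_tokens):
    """Group whole sentences into chunks of at most max_tokens estimated tokens.

    A single sentence that exceeds the budget on its own is kept as one
    oversized chunk rather than being cut mid-sentence.
    """
    chunks = []
    current = []
    current_tokens = 0
    for sentence in split(text):
        sentence_tokens = count(sentence)
        # Close the current chunk if this sentence would overflow the budget.
        if current and current_tokens + sentence_tokens > max_tokens:
            chunks.append(" ".join(current))
            current = []
            current_tokens = 0
        current.append(sentence)
        current_tokens += sentence_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks
```

If NLTK and its sentence tokenizer data are installed, `chunk_text(text, split=nltk.sent_tokenize)` swaps in the real tokenizer. Setting `max_tokens` well below the model limit leaves room for the prompt instructions and the generated summary.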

Enjoy!
