This Discord bot combines two main features:
- Fetching and posting the latest ArXiv AI research papers
- Fetching and posting the latest news articles related to data engineering, prompt engineering, and AI
- Scrapes the latest AI research papers from ArXiv
- Formats paper information with clickable links
- Sends paper information to a Discord channel using a webhook
- Splits long output into message chunks so that each chunk contains only complete paper entries
- Fetches news articles from NewsAPI
- Filters articles by specified sources (Forbes, TechRadar, WebProNews)
- Sends the latest 5 articles to a Discord channel using a webhook
- Runs asynchronously for improved performance
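
The formatting and chunking features listed above can be sketched as follows. This is a minimal illustration, not the bot's actual code: the `Paper` fields, the `chunk_entries` helper, and the body of `format_paper_info` shown here are assumptions. The 2000-character figure is Discord's cap on message content.

```python
from dataclasses import dataclass

DISCORD_MESSAGE_LIMIT = 2000  # Discord caps message content at 2000 characters


@dataclass
class Paper:
    # Hypothetical record; the real script may store papers differently.
    title: str
    url: str
    authors: str


def format_paper_info(paper: Paper) -> str:
    """Render one paper as Discord markdown with a clickable title link."""
    return f"**[{paper.title}]({paper.url})**\n{paper.authors}"


def chunk_entries(entries: list[str], limit: int = DISCORD_MESSAGE_LIMIT) -> list[str]:
    """Group whole entries into messages so no entry is split across chunks."""
    chunks: list[str] = []
    current = ""
    for entry in entries:
        entry = entry[:limit]  # an oversized single entry is truncated to fit
        candidate = f"{current}\n\n{entry}" if current else entry
        if len(candidate) > limit:
            # Adding this entry would overflow, so close the current chunk.
            chunks.append(current)
            current = entry
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each string returned by `chunk_entries` could then be sent as the `content` of one webhook request, so a paper's title and authors never end up in different messages.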
- Python 3.11+
- A NewsAPI key (get one at https://newsapi.org/)
- A Discord webhook URL
- Clone this repository:

      git clone https://github.com/your-username/arxiv-engineering-news-discord-bot.git
      cd arxiv-engineering-news-discord-bot

- Install the required packages:

      pip install -r requirements.txt

- Create a `.env` file in the project root and add your configuration values:

      BASE_URL=https://arxiv.org/list/cs.AI/recent
      DISCORD_WEBHOOK_URL=your_discord_webhook_url_here
      NEWS_API_KEY=your_newsapi_key_here
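
One way the script might read these values at startup is sketched below. It uses only the standard library and assumes the variables are already present in the process environment, for example loaded from `.env` by a package such as python-dotenv (an assumption about the project's dependencies). The `load_settings` helper is hypothetical.

```python
import os
from collections.abc import Mapping

REQUIRED_KEYS = ("BASE_URL", "DISCORD_WEBHOOK_URL", "NEWS_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, failing early if any is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing settings in .env: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}
```

Failing at startup with the names of the missing keys tends to be easier to debug than a webhook or API error later in the run.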
Run the script with:

    python main.py
The script will:
- Fetch the latest ArXiv AI research papers, format them, and send them to the specified Discord channel.
- Fetch the latest engineering news articles, print them to the console, and send them to the specified Discord channel.
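
The two tasks above are independent, so an asynchronous entry point could run them concurrently. The following is a rough sketch that assumes `scrape_and_send` and `fetch_engineering_news` are coroutines; their bodies here are placeholders, and the real `main.py` may sequence things differently.

```python
import asyncio


async def scrape_and_send() -> str:
    # Placeholder body: the real function scrapes ArXiv and posts to Discord.
    await asyncio.sleep(0)
    return "papers sent"


async def fetch_engineering_news() -> str:
    # Placeholder body: the real function queries NewsAPI and posts to Discord.
    await asyncio.sleep(0)
    return "news sent"


async def main() -> list[str]:
    # gather() runs both coroutines concurrently and returns results in call order.
    return await asyncio.gather(scrape_and_send(), fetch_engineering_news())


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With `asyncio.gather`, a slow response from one service does not delay the request to the other.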
- To change the number of results fetched, modify the `url` variable in the `scrape_and_send` function.
- To adjust the formatting of paper information, modify the `format_paper_info` function.
- To change the news sources, modify the `ALLOWED_SOURCES` list in the script.
- To adjust the time range for fetching news, modify the `seven_days_ago` variable in the `fetch_engineering_news` function.
- To change the number of articles sent to Discord, modify the slice in the `send_discord_message` function (`articles[:5]`).
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.