forked from Jamie-Cui/paper-pulse

PingyiHu/paper-pulse

 
 

Paper Pulse

Automatically fetch, filter, and summarize research papers from arXiv & IACR — with AI-generated bilingual (Chinese/English) summaries. Deployed for free on GitHub Pages, updated daily via GitHub Actions.

Live Demo | RSS Feed


Why Paper Pulse?

  • Tired of manually checking arXiv every day? Paper Pulse fetches papers automatically.
  • Only care about specific topics? Flexible keyword filtering (AND/OR logic) keeps only what matters.
  • Need Chinese + English summaries? AI generates both — toggle per card with one click.
  • Don't want to pay for hosting? Runs entirely on GitHub Actions + Pages. Zero cost.

Features

| Feature | Description |
| --- | --- |
| Multi-source | Fetches from arXiv (configurable categories) and IACR ePrint |
| Keyword filtering | OR between lines, AND within a line, for fine-grained control |
| Bilingual AI summaries | Chinese + English via Qwen (DashScope API), toggle per card |
| Daily automation | GitHub Actions cron job, auto-commits results |
| RSS feed | Subscribe in any reader; feed.xml is auto-generated |
| BibTeX export | Single paper or bulk export |
| Email digest | Daily report with stats and token usage |
| Static site | No server needed; GitHub Pages serves everything |
| Markdown summaries | Rich formatting in AI-generated summaries |

Quick Start (5 steps, plus 1 optional)

1. Use this template / Fork

Click "Use this template" (or fork) to create your own copy.

2. Get a DashScope API key

Sign up at DashScope Console — free tier available.

3. Add your API key to GitHub Secrets

Go to Settings → Secrets and variables → Actions → New repository secret:

  • Name: MODELSCOPE_API_KEY (or DASHSCOPE_API_KEY)
  • Value: your API key

4. Enable GitHub Pages

Go to Settings → Pages:

  • Source: Deploy from a branch
  • Branch: master, Folder: / (root)

5. Run the workflow

Go to Actions → Fetch Papers → Run workflow. After it completes, visit https://<username>.github.io/<repo-name>/.

From now on, papers are fetched automatically every day at 00:00 UTC.

6. (Optional) Enable email digest

Paper Pulse can send a daily email report with statistics (new papers, failed summaries, token usage). To enable it, add three more secrets in Settings → Secrets and variables → Actions:

| Secret | Description |
| --- | --- |
| EMAIL_USERNAME | Gmail address used to send the report (e.g. you@gmail.com) |
| EMAIL_PASSWORD | Gmail App Password (not your login password) |
| EMAIL_TO | Recipient address (can be the same as EMAIL_USERNAME) |

Note: Gmail requires an App Password — you must enable 2-Step Verification on your Google account first, then generate an App Password under Security → App passwords. Regular Gmail passwords will not work.

If these secrets are not set, the workflow still runs normally — the email step is simply skipped.
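
The "skip when unset" behaviour can be sketched in Python as below. This is only an illustration under assumptions: the function names `email_settings`, `build_digest` and `send_digest` are hypothetical, the real check may live in the workflow file instead of a script, and sending through Gmail's SMTP endpoint directly is an assumption about how the digest is delivered.

```python
import os
import smtplib
from email.message import EmailMessage

# The three secret names come from the table above.
REQUIRED = ("EMAIL_USERNAME", "EMAIL_PASSWORD", "EMAIL_TO")


def email_settings(env=os.environ):
    """Return the three email settings, or None if any is missing or empty."""
    values = {name: env.get(name, "").strip() for name in REQUIRED}
    if not all(values.values()):
        return None
    return values


def build_digest(settings, new_papers, failed, tokens):
    """Build the report message (hypothetical fields: counts and token usage)."""
    msg = EmailMessage()
    msg["Subject"] = f"Paper Pulse: {new_papers} new papers"
    msg["From"] = settings["EMAIL_USERNAME"]
    msg["To"] = settings["EMAIL_TO"]
    msg.set_content(
        f"New papers: {new_papers}\nFailed summaries: {failed}\nTokens used: {tokens}\n"
    )
    return msg


def send_digest(settings, msg):
    """Send via Gmail SMTP over SSL; requires an App Password (assumed transport)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(settings["EMAIL_USERNAME"], settings["EMAIL_PASSWORD"])
        smtp.send_message(msg)


settings = email_settings()
if settings is None:
    print("Email secrets not set; skipping digest.")
else:
    print(f"Digest would be sent to {settings['EMAIL_TO']}")
```

Returning `None` when any one secret is missing keeps the rest of the run unaffected, which matches the documented behaviour of skipping the email step.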

Customization

Keywords (keywords.txt)

# Each line = OR condition. Words on same line = AND condition.
transformer              # Papers containing "transformer"
neural backdoor          # Papers containing BOTH "neural" AND "backdoor"
federated learning       # Papers containing BOTH "federated" AND "learning"

Keyword filtering can be independently toggled per source (apply_to_arxiv / apply_to_iacr in config.toml).
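
A minimal sketch of the matching rule described above (OR between lines, AND within a line). It is not the code in `filter.py`: the function names `parse_rules` and `matches` are hypothetical, and case-insensitive substring matching and `#` comment stripping are assumptions.

```python
def parse_rules(text):
    """Turn keywords.txt content into rules; each rule is a list of lowercase terms."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments (assumed syntax)
        if line:
            rules.append(line.lower().split())
    return rules


def matches(paper_text, rules):
    """True if any rule (OR) has all of its terms (AND) present in the text."""
    haystack = paper_text.lower()
    return any(all(term in haystack for term in rule) for rule in rules)


rules = parse_rules("""
# Each line = OR condition. Words on same line = AND condition.
transformer
neural backdoor
""")

print(matches("A Transformer for code search", rules))       # True
print(matches("Neural networks without poisoning", rules))   # False
```

The second paper is rejected because "neural" matches but "backdoor" does not, so the AND rule on that line fails and no other line matches. With an empty rule list this sketch matches nothing; whether the real filter passes everything through in that case is not stated here.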

Configuration (config.toml)

| Setting | Where | Default |
| --- | --- | --- |
| Paper retention | general.days_back | 30 days |
| arXiv categories | fetchers.arxiv.categories | cs.CR, cs.AI, cs.LG, cs.CL |
| AI model | summarizer.model | qwen-plus |
| RSS items | rss.max_items | 50 |
| Site URL | general.site_url | (your GitHub Pages URL) |

See CONFIG_GUIDE.md for all options.

Project Structure

paper-pulse/
├── .github/workflows/
│   └── fetch-papers.yml      # Daily automation
├── scripts/
│   ├── fetchers/
│   │   ├── arxiv.py          # arXiv API fetcher
│   │   └── iacr.py           # IACR RSS fetcher
│   ├── filter.py             # Keyword filtering engine
│   ├── summarizer.py         # Bilingual AI summarization
│   ├── rss.py                # RSS feed generator
│   └── main.py               # Pipeline orchestrator
├── data/
│   ├── papers.json           # Paper database
│   └── failed.json           # Failed summarization queue
├── config.toml               # All configuration
├── keywords.txt              # Keyword filter rules
├── index.html / app.js / styles.css  # Frontend
└── feed.xml                  # RSS feed (auto-generated)
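
As a rough idea of what the arXiv fetcher's request could look like, the sketch below builds a query URL for the public arXiv API from a list of categories. Whether `arxiv.py` builds its request this way is an assumption; `build_query_url` and its `max_results` default are hypothetical. The query parameters (`search_query`, `sortBy`, `sortOrder`, `start`, `max_results`) are those of the arXiv API.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(categories, max_results=100):
    """Build an arXiv API URL returning the newest submissions in any of the categories."""
    search = " OR ".join(f"cat:{category}" for category in categories)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


url = build_query_url(["cs.CR", "cs.AI"], max_results=10)
print(url)
```

The response is an Atom feed, so a fetcher along these lines would still need to parse entries and apply the retention window before filtering.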

How It Works

Fetch (arXiv + IACR)
  → Filter (keyword matching)
    → Summarize (Qwen AI, bilingual)
      → Merge & Deduplicate
        → Save (papers.json)
          → Generate RSS (feed.xml)
            → Commit & Push (GitHub Actions)

Failed summaries are retried automatically on the next run.
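
The merge, deduplicate and retry steps can be sketched as follows. This is an illustration only: the `id` and `title` fields, the identifier formats, and the function names `merge_papers` and `next_run_queue` are hypothetical, and the actual layout of `papers.json` and `failed.json` may differ.

```python
def merge_papers(existing, incoming):
    """Merge two lists of paper dicts, keeping one entry per id.

    On a duplicate id the stored copy wins, so an already summarized
    paper is not replaced by a freshly fetched, unsummarized one.
    """
    merged = {paper["id"]: paper for paper in existing}
    for paper in incoming:
        merged.setdefault(paper["id"], paper)
    return list(merged.values())


def next_run_queue(failed, fetched):
    """Papers to summarize this run: earlier failures first, then new papers."""
    return merge_papers(failed, fetched)


stored = [{"id": "arxiv:2401.00001", "title": "A"}]
fetched = [
    {"id": "arxiv:2401.00001", "title": "A (fetched again)"},
    {"id": "iacr:2024/001", "title": "B"},
]

print([p["title"] for p in merge_papers(stored, fetched)])  # ['A', 'B']
```

Putting the failed queue ahead of new papers in `next_run_queue` is one way to get the "retried automatically on the next run" behaviour without summarizing the same paper twice.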

License

GPL-3.0
