RSS feeds for search queries #37

Open
bnewbold opened this issue Jan 22, 2021 · 6 comments
Labels
enhancement New feature or request

Comments

@bnewbold
Contributor

This feature would allow creation of RSS feed endpoints for any search query. The feed would allow users to "subscribe" to new search hits.

Some implementation thoughts:

  • if the query is embedded in the feed URL, there is no need to retain any server-side state
  • utility of this may depend on having decent subject/categorization metadata? Or maybe not, if keywords are used.
  • feed could be sorted/filtered by 1) release date of works, 2) index document update time, or 3) index document creation time, or maybe some combination? Index document creation time will not be a stable long-term metadata field (e.g., all document creation times get bumped on re-indexing)
  • the assumption is that this would not generate many actual queries or much search engine load, as RSS feeds are usually only fetched... daily? If load did become a problem, results could be cached in elasticsearch itself (e.g., hash the query string, check for a cached result from the past N hours, and only run the query and update the cache if it is stale; store results in a separate index)
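
The caching idea in the last bullet could look roughly like the sketch below. It is only an illustration: an in-memory dict stands in for the separate elasticsearch index, and the names `cached_search`, `query_key`, and `run_query`, as well as the 6-hour value for N, are hypothetical.

```python
import hashlib
import time
from typing import Callable, Dict, List, Optional, Tuple

# "past N hours" from the bullet above; N = 6 is an arbitrary assumption.
CACHE_TTL_SECONDS = 6 * 60 * 60

# Stand-in for the separate cache index: query hash -> (stored_at, results).
_cache: Dict[str, Tuple[float, List[dict]]] = {}


def query_key(query: str) -> str:
    """Hash the query string so it can serve as a cache document id."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def cached_search(
    query: str,
    run_query: Callable[[str], List[dict]],
    now: Optional[float] = None,
) -> List[dict]:
    """Return cached results if still fresh; otherwise run the query and refresh the cache."""
    now = time.time() if now is None else now
    key = query_key(query)
    entry = _cache.get(key)
    if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]
    results = run_query(query)  # hypothetical callable wrapping the real search
    _cache[key] = (now, results)
    return results
```

With this shape, a feed reader polling the same URL several times a day would only trigger one real search per TTL window.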
@bnewbold bnewbold added the enhancement New feature or request label Jan 22, 2021
@sckott

sckott commented Jan 22, 2021

I'd be happy to test if implemented

@bnewbold
Contributor Author

Re-indexing has finally caught up, and the "papers from the past week" type of query should work, so I'm starting to think about this. And I'm excited!

This library looks like a simple way to implement an RSS feed in FastAPI, though maybe Atom is preferred? https://pypi.org/project/fastapi-rss/

Presumably would have a small jinja2 template to render a summary with the existing macros into HTML, and inject that into the items.

Would probably be two new endpoints: a form to help craft a query, with "feed-specific" query parameters, and an RSS endpoint itself (XML).

I think the default parameters should be:

  • filter to doc_type:work, so no digitized pages (which are a hack)
  • filter to the time range "today to 90 days ago", sort by recency (date), and then sort by _doc (so there is a stable sort order among works with the same date)
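
As a rough sketch, these defaults could translate into an Elasticsearch request body like the one below. The `date` field name and the `default_feed_query` helper are assumptions for illustration; only `doc_type:work`, the 90-day window, and the `_doc` tiebreaker come from the list above.

```python
from typing import Any, Dict


def default_feed_query(user_query: str = "*", days: int = 90) -> Dict[str, Any]:
    """Build a search body with the proposed feed defaults (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                # Free-form query box, same syntax as the regular search.
                "must": [{"query_string": {"query": user_query}}],
                "filter": [
                    # Works only, so no digitized pages.
                    {"term": {"doc_type": "work"}},
                    # "today to 90 days ago"; `date` is an assumed field name.
                    {"range": {"date": {"gte": f"now-{days}d/d", "lte": "now/d"}}},
                ],
            }
        },
        # Most recent first, then _doc as a tiebreaker within the same date.
        "sort": [{"date": {"order": "desc"}}, "_doc"],
    }
```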

Then in the generation page have a form for other filters, and a free-form query box (same as the regular search).

Run the query with the same current routine, take the results, transform, and return as the feed.
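
The "transform and return as the feed" step could be sketched with only the standard library, as below. This is not the fastapi-rss API; the hit keys (`title`, `url`, `summary_html`, `date`) and the `results_to_rss` name are hypothetical, and `summary_html` stands in for whatever the jinja2 template would render.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Any, Iterable, Mapping


def results_to_rss(
    title: str,
    link: str,
    description: str,
    hits: Iterable[Mapping[str, Any]],
) -> str:
    """Render search hits as an RSS 2.0 document (hit keys are assumed names)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for hit in hits:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = str(hit["title"])
        ET.SubElement(item, "link").text = str(hit["url"])
        # ElementTree escapes the HTML summary when serializing.
        ET.SubElement(item, "description").text = str(hit["summary_html"])
        # RSS 2.0 expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(hit["date"])
    return ET.tostring(rss, encoding="unicode")


example = results_to_rss(
    "Search: test",
    "https://example.org/search?q=test",
    "Feed of new search hits",
    [{
        "title": "A & B",
        "url": "https://example.org/work/1",
        "summary_html": "<p>hi</p>",
        "date": datetime(2022, 4, 7, tzinfo=timezone.utc),
    }],
)
```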

@sckott

sckott commented Feb 15, 2021

I don't have a preference between RSS and atom.

Plan sounds great to me

@bnewbold
Contributor Author

bnewbold commented Apr 7, 2022

I just pushed a minimal version of this. On search result pages, there is an "RSS Feed" link under the search box, which goes straight to an RSS 2.0 file with the search parameters.
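
For anyone who wants to build feed links by hand, here is a small sketch of how a search query maps onto a feed URL. The `/feed/rss` path matches the feed URL quoted later in this thread; the `feed_url` helper and the idea of passing extra parameters through are assumptions.

```python
from urllib.parse import urlencode

FEED_BASE = "https://scholar.archive.org/feed/rss"


def feed_url(query: str, **extra_params: str) -> str:
    """Embed the search query in the feed URL, so no server-side state is needed."""
    params = {"q": query, **extra_params}
    return f"{FEED_BASE}?{urlencode(params)}"


print(feed_url("journal:Catena"))
```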

I tested with my feed reader and it seems to be working over the past week. Any feedback welcome!

@sckott

sckott commented Apr 7, 2022

Awesome! Thanks for getting this working. My original use case is gone with changing jobs, but this will be useful for tracking different paper topics in Feedbin

@lstrtstr

lstrtstr commented May 5, 2023

> Any feedback welcome!

Many RSS feeds of journal searches seem to have no entries.
Is that on purpose or a bug?

For example, if I search for journal:Catena, I get results as recent as 2022 (https://scholar.archive.org/search?q=journal%3ACatena&sort_order=time_desc). The corresponding RSS feed https://scholar.archive.org/feed/rss?q=journal%3ACatena has no entries, though.
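
One possible explanation, not confirmed anywhere in this thread: if the deployed feed applies the "today to 90 days ago" default proposed earlier, then a journal whose newest indexed work dates from 2022 would yield an empty feed when checked in May 2023. The sketch below just does that date arithmetic; `in_feed_window` is a hypothetical helper and the 2022-12-31 release date is an illustrative upper bound.

```python
from datetime import date, timedelta


def in_feed_window(release: date, today: date, days: int = 90) -> bool:
    """True if a release date falls inside the assumed default feed window."""
    return today - timedelta(days=days) <= release <= today


today = date(2023, 5, 5)  # date of this comment
print(in_feed_window(date(2022, 12, 31), today))  # latest possible 2022 release
print(in_feed_window(date(2023, 4, 1), today))
```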


3 participants