Suggestion: consider article:author meta tag as a source of author name metadata #938
Comments
Another example of that field containing a non-URL: https://www.newyorker.com/magazine/2024/12/16/president-emmanuel-macron-has-plunged-france-into-chaos
<meta property="article:author" content="Lauren Collins">
<meta property="article:author" content="Stella Kim">
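For illustration, here is a minimal sketch of how name-valued `article:author` tags like these might be collected while URL values are skipped. The helper names `looksLikeUrl` and `authorFromMetaTags` are hypothetical and not part of Readability's API, the `{ property, content }` input shape is an assumption, and joining multiple names with ", " is just one possible choice. This is not necessarily how the fix referenced in this thread works.

```javascript
// Hypothetical helper (not part of Readability): returns true when the
// value parses as an absolute http(s) URL.
function looksLikeUrl(value) {
  try {
    const url = new URL(value);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch (e) {
    // new URL() throws on strings that are not absolute URLs,
    // which is the case for plain names such as "Lauren Collins".
    return false;
  }
}

// Hypothetical helper: metaTags is an array of { property, content }
// objects, e.g. built from document.querySelectorAll("meta[property]").
// Returns the non-URL article:author values joined with ", " (an
// assumption), or null when no usable name is found.
function authorFromMetaTags(metaTags) {
  const names = metaTags
    .filter((tag) => tag.property === "article:author")
    .map((tag) => (tag.content || "").trim())
    .filter((content) => content !== "" && !looksLikeUrl(content));
  return names.length > 0 ? names.join(", ") : null;
}

// Usage with the two tags from the example above.
console.log(
  authorFromMetaTags([
    { property: "article:author", content: "Lauren Collins" },
    { property: "article:author", content: "Stella Kim" },
  ])
); // prints "Lauren Collins, Stella Kim"
```

Returning null for URL-only values leaves room for a caller to fall back to other sources of the byline, such as the DOM.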
danielnixon added a commit to danielnixon/readability that referenced this issue Jan 1, 2025
PR: #942
gijsk added a commit that referenced this issue Jan 2, 2025
* Handle article:author meta tag. Fixes #938
* Add newly found BBC byline, revert apparently unnecessarily regex change.

Co-authored-by: Gijs Kruitbosch <[email protected]>
The article:author meta tag is "meant" to contain a URL (see https://developers.facebook.com/blog/post/2013/06/19/platform-updates--new-open-graph-tags-for-media-publishers-and-more/). On many sites it does seem to contain a URL, but on a number of sites I've tested it contains the author's name.
One example is https://www.atlasobscura.com/articles/the-deck-of-cards-that-made-tarot-a-global-phenomenon
On that site, we have:
On that site, there are no other better sources of author name, so Readability consults the DOM and arrives at an unfortunate author string of "Laura June Topolsky July 10, 2015".
My suggestion: