Eprints2archives does behave somewhat like a web crawler, so maybe it should pay attention to robots.txt.
Here are some resources to start looking at what needs to be done:
https://developers.google.com/search/reference/robots_txt
https://tools.ietf.org/html/draft-koster-rep-00
https://yoast.com/ultimate-guide-robots-txt/
https://en.wikipedia.org/wiki/Robots_exclusion_standard
https://opensource.googleblog.com/2019/07/googles-robotstxt-parser-is-now-open.html
https://github.com/google/robotstxt
https://www.scrapehero.com/how-to-prevent-getting-blacklisted-while-scraping/
https://medium.com/ub-women-data-scholars/let-the-robot-do-your-work-web-scraping-with-python-9c147fb7690f
https://www.promptcloud.com/blog/how-to-read-and-respect-robots-file/
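
As a starting point, here is a minimal sketch of what honoring robots.txt could look like, using only Python's standard `urllib.robotparser` module. This is not how eprints2archives currently works. The user-agent string, the sample rules, and the helper names `make_robots_checker` and `filter_allowed` are assumptions made up for illustration. The rules are parsed from a string so the example runs without network access; in practice one would probably call `set_url()` and `read()` on the parser to fetch the server's real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent token; the name the tool would actually announce
# is not specified in this issue.
USER_AGENT = "eprints2archives"

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi/
Crawl-delay: 2
"""


def make_robots_checker(robots_txt):
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def filter_allowed(parser, urls, user_agent=USER_AGENT):
    """Return only the URLs that the rules permit this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    parser = make_robots_checker(SAMPLE_ROBOTS_TXT)
    candidates = [
        "https://example.org/123/",
        "https://example.org/cgi/export/123",
    ]
    for url in filter_allowed(parser, candidates):
        print("would fetch:", url)
    # crawl_delay() returns None when no Crawl-delay rule applies.
    print("requested delay between requests:", parser.crawl_delay(USER_AGENT))
```

One open design question is whether a disallowed URL should be skipped silently or reported to the user, since the tool is usually run by someone with a legitimate interest in the server's content.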