RedShelf Downloader

Downloads the content of each page of a RedShelf textbook and converts it to a PDF file.

Each page is stored in its own folder, so you can compile the textbook in a different way if you want to.
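As a rough sketch of compiling the pages yourself, the snippet below collects the per-page PDFs in page order and concatenates them with PyMuPDF. The folder layout is an assumption, not something this README specifies: it supposes one numerically named folder per page (1, 2, 3, ...) under the download path, each holding a PDF. The function names ordered_page_pdfs and merge_pdfs are hypothetical and are not part of this tool.

```python
from pathlib import Path


def ordered_page_pdfs(download_path):
    """Return per-page PDF paths sorted by numeric folder name.

    Assumes (hypothetically) one folder per page, named with the page number.
    """
    root = Path(download_path)
    page_dirs = sorted(
        (d for d in root.iterdir() if d.is_dir() and d.name.isdigit()),
        key=lambda d: int(d.name),  # numeric sort, so 10 comes after 2
    )
    pdfs = []
    for page_dir in page_dirs:
        pdfs.extend(sorted(page_dir.glob("*.pdf")))
    return pdfs


def merge_pdfs(pdf_paths, output_path):
    """Concatenate the given PDFs into a single file using PyMuPDF."""
    import fitz  # PyMuPDF; imported here so the ordering helper works without it

    merged = fitz.open()  # new, empty PDF
    for path in pdf_paths:
        with fitz.open(str(path)) as src:
            merged.insert_pdf(src)  # append all pages of src
    merged.save(str(output_path))
    merged.close()
```

For example, merge_pdfs(ordered_page_pdfs("pages"), "textbook.pdf") would write one combined file, provided the layout matches the assumption above.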

Dependencies

requests
pdfkit
pymupdf
wkhtmltopdf (not from pip)

pip install requests pdfkit pymupdf

Install wkhtmltopdf using your package manager or from its website. Make sure that its install location is added to your PATH environment variable as well.
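To confirm that wkhtmltopdf is reachable on PATH before running the tool, a quick check along these lines may help. It is a minimal sketch using only the standard library; the helper name find_wkhtmltopdf is hypothetical and not part of this tool.

```python
import shutil


def find_wkhtmltopdf(name="wkhtmltopdf"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_wkhtmltopdf()
    if path is None:
        print("wkhtmltopdf was not found on PATH")
    else:
        print(f"wkhtmltopdf found at {path}")
```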

Usage

python scrape.py

Configure

Before using this tool, you must configure the config.json file. If the file is not present, run the program to automatically generate it.

It should look something like this:

{
  "num_threads": 1,
  "num_pages": 1,
  "download_path": "pages",
  "book_id": "",
  "cookies": {
    "csrftoken": "",
    "session_id": ""
  }
}
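Before starting a download, it can be useful to check that the placeholders have been filled in. The sketch below is one possible check, not the tool's own validation; the key names come from the example above, while check_config and load_config are hypothetical helpers.

```python
import json

REQUIRED_KEYS = {"num_threads", "num_pages", "download_path", "book_id", "cookies"}
REQUIRED_COOKIES = {"csrftoken", "session_id"}


def check_config(config):
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - config.keys())]
    # An empty string means the generated placeholder was never filled in.
    if "book_id" in config and not config["book_id"]:
        problems.append("book_id is empty")
    cookies = config.get("cookies") or {}
    for name in sorted(REQUIRED_COOKIES):
        if not cookies.get(name):
            problems.append(f"cookie {name} is empty")
    for key in ("num_threads", "num_pages"):
        if key in config:
            value = config[key]
            if not isinstance(value, int) or value < 1:
                problems.append(f"{key} must be a positive integer")
    return problems


def load_config(path="config.json"):
    """Read the JSON config file and return it as a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Calling check_config(load_config()) returns an empty list when every field is filled in.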

To get the values for cookies, num_pages, and book_id, open the textbook in the browser.

Cookies

Open the browser's devtools and inspect a network request for the page (in the file column, it should be a single number). Click on the request and copy the csrftoken and session_id cookie values into the cookies section of config.json.

Num Pages / Book ID

Go all the way to the end of the textbook and look at the URL.

The URL should be formatted like so: https://platform.virdocs.com/read/book_id/page_number

Set book_id to the book ID in the URL. Set num_pages to the page number in the URL, not the one shown in the UI of the website.
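If you prefer not to pick the values out by hand, the URL can be split programmatically. This is a small sketch that assumes the URL follows exactly the format shown above; parse_reader_url is a hypothetical helper, and the ID and page number in the usage note are made-up values.

```python
from urllib.parse import urlparse


def parse_reader_url(url):
    """Split a reader URL of the form .../read/<book_id>/<page_number>."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "read":
        raise ValueError(f"unexpected URL format: {url}")
    book_id = parts[1]
    page_number = int(parts[2])  # raises ValueError if not a number
    return book_id, page_number
```

For example, parse_reader_url("https://platform.virdocs.com/read/123456/250") gives ("123456", 250), which would go into book_id and num_pages respectively.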

Num Threads

The number of threads used to download the book. More threads download the book faster, but higher values make it more likely that you will run into rate limits.
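To illustrate the trade-off, here is a minimal sketch of how a thread count can bound concurrent page downloads using the standard library. It is not the tool's actual implementation: download_all and the fetch_page callback are hypothetical, and numbering pages from 1 is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(num_pages, num_threads, fetch_page):
    """Call fetch_page(page) for pages 1..num_pages with num_threads workers.

    fetch_page is a caller-supplied function (hypothetical here) that
    downloads a single page and returns some result for it.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields results in page order even though the calls overlap.
        return list(pool.map(fetch_page, range(1, num_pages + 1)))
```

With num_threads set to 1 the pages are fetched one at a time, which is the gentlest setting if rate limits are a concern.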
