Update README and add script comment
jxu committed Aug 31, 2024
1 parent a559382 commit 8fa8535
Showing 2 changed files with 8 additions and 3 deletions.
8 changes: 6 additions & 2 deletions README.md
@@ -22,7 +22,7 @@ Example Usage

mkdir render
cd render
-../download.sh 1 761
+./download.bash


Known Bugs/Inconveniences
@@ -37,6 +37,10 @@ History
This simple download-and-combine script has been written several times as exercises in different tools and in response to Project Euler layout changes.

1. The first version used PhantomJS as a headless browser to render the problems, then a separate python script using BeautifulSoup4 to search and download extra files (text and GIF), Pillow to check for animated GIFs, and PyPDF2 to combine the PDFs of all problems into one PDF.

2. Later I discovered Project Euler had a convenient special URL to show all problems on one page (https://projecteuler.net/show=all). This let me simply use Firefox to print a smaller PDF (in pages and size) that did not rely on the discontinued PhantomJS. The python script to download extra files remained the same.
-3. Sometime after summer 2020 the convenient URL functionality disappeared, so I had to go back to downloading problems individually and combining them. This time I decided to forgo python and use only shell tools as an exercise (and to produce a smaller script). Chromium in headless mode had a convenient option to print to PDF, pup handled searching the HTML for extra files, Ghostscript combined the PDFs to a set print quality, and ImageMagick identified animated GIFs.
+
+3. In summer 2022, the convenient show all functionality disappeared, so I had to go back to downloading problems individually and combining them. This time I decided to forgo python and use only shell tools as an exercise (and to produce a smaller script). Chromium conveniently printed to PDF in headless mode, pup handled searching the HTML for extra files, Ghostscript combined the PDFs to a set print quality, and ImageMagick identified animated GIFs.
+
+4. In summer 2024, I noticed the site had a link to display a group of 50 archived problems on the same page, with published date, solved by, and difficulty rating info. I considered the idea of just distributing PDFs for each group of 50. Unfortunately, this page is only available to logged-in users, probably for bandwidth reasons as alluded to in the News update. I could print all of them manually from Firefox, but I didn't feel like doing that, so the problems PDF is the same.
+Also, for each page there was a new button for the minimal HTML, which gave me the silly idea to create a single giant raw HTML page. Then I could use Pandoc to turn this into more readable Markdown, for those who really insisted on a text file, like the original Local Euler. I don't think anyone will actually use this, but I also don't have any evidence people used the regular PDF either. Anyhow, it shows off how flexible Pandoc's conversion abilities are.
3 changes: 2 additions & 1 deletion download.bash
@@ -12,7 +12,8 @@ pupcurl () {
}

# loop through page numbers
-for i in {1..903}; do
+# could be done in parallel, but wouldn't be nice for their servers
+for i in {001..903}; do
problem_url="https://projecteuler.net/problem=$i"
tmp_html=tmp.html
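
Note on the loop change above: the switch from `{1..903}` to `{001..903}` relies on Bash's zero-padded brace expansion. The sketch below is not part of the commit; it only illustrates the difference on a short range, assuming Bash 4.0 or newer. The variable names are made up for illustration. Whether projecteuler.net treats `problem=001` the same as `problem=1` is not verified here.

```bash
#!/usr/bin/env bash
# Illustration only (not from the commit). Needs Bash 4.0+ for padded ranges.

# Unpadded range: each value is written without leading zeros.
plain=$(echo {1..3})

# Padded range: a leading zero in an endpoint pads every value to equal width.
padded=$(echo {001..003})

echo "plain:  $plain"    # plain:  1 2 3
echo "padded: $padded"   # padded: 001 002 003

# If a padded value is later used in arithmetic, force base 10 with 10#,
# otherwise a value such as 008 is rejected as invalid octal.
i=008
echo "number: $((10#$i))"   # number: 8
```

Padded values sort correctly as file names (for example `001.pdf` before `010.pdf`), which is one plausible reason for the change, though the commit itself does not state the motivation.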
