Commit
Refactor project to use single problems page
Remove PhantomJS / javascript code
Simplify and rename python script
Update README
Showing 4 changed files with 61 additions and 157 deletions.
@@ -1,26 +1,27 @@
  Project Euler Offline
  =====================
- All Project Euler problems, with MathJax and images, as a single PDF. Additional text files are provided. Get the releases [here](https://github.com/wxv/project-euler-offline/releases).
+ All Project Euler problems, with MathJax and images, as a single PDF. Additional text files are provided. [Get the releases here.](https://github.com/wxv/project-euler-offline/releases)

  Please report any inaccuracies or give feedback. Thanks.

  Inspired by [Kyle Keen's original Local Euler](http://kmkeen.com/local-euler/2008-07-16-07-33-00.html).

  Installation and Usage
  ----------------------

+ Note: previously, PhantomJS was used to download each problem individually as a PDF, and PyPDF2 was used to combine all the problems together.
+
+ Now, use "Print to File" on https://projecteuler.net/show=all in Firefox (with no headers and footers in the print options). This is simpler, produces a smaller PDF, and does not rely on the discontinued PhantomJS. The Python script that downloads the extra files is functionally unchanged.
+
  Requirements:
- - PhantomJS (`apt install phantomjs`)
- - Node modules system, webpage (`npm install system webpage`)
- - Python 3 and PyPDF2, BeautifulSoup, lxml, Pillow (`pip install beautifulsoup4 lxml pypdf2 pillow`)
+ - Python 3 and BeautifulSoup, lxml, Pillow (`pip install beautifulsoup4 lxml pillow`)
  - BeautifulSoup and Pillow are only required for downloading extra text and images (animated GIFs only).

- My usage process (replace 1 and 628 with whatever range you like):
+ My usage process:

-     phantomjs capture.js 1 628
-     python3 combine.py 1 628
-     // Optional: download solutions from https://github.com/luckytoilet/projecteuler-solutions
+     mkdir render
+     # Save render/problems.pdf with Firefox as above
      python3 download_extra.py
      cd render
-     zip problems problems.pdf *.txt *.gif
+     zip problems.zip problems.pdf *.txt *.gif

- Since each page is independent, it is possible to run multiple processes of
- `capture.js` at once, each downloading a certain range.
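To make the note about the removed workflow concrete, here is a minimal sketch of how per-problem PDFs could be merged with PyPDF2. It is an illustration only, not the deleted combine.py; the per-problem file names and the 1–628 range are assumptions.

    # Hypothetical sketch only; assumes per-problem PDFs 1.pdf ... 628.pdf in render/
    from PyPDF2 import PdfFileMerger

    merger = PdfFileMerger()
    for n in range(1, 629):
        merger.append("render/{}.pdf".format(n))  # append each problem's PDF in order

    with open("render/problems.pdf", "wb") as f:
        merger.write(f)  # write the combined document
    merger.close()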
This file was deleted.
This file was deleted.
@@ -0,0 +1,49 @@
import sys
from os import sep
# Not async for now, to keep the rate of requests low
from bs4 import BeautifulSoup
import requests
from os.path import basename
from PIL import Image
from io import BytesIO


RENDER_DIR = "render"
SITE_MAIN = "https://projecteuler.net/"


def download_extra(url):
    """Find any .txt attachments and animated .gif images linked from url,
    and download them to RENDER_DIR.
    """
    content = requests.get(url).content
    soup = BeautifulSoup(content, "lxml")
    for a in soup.find_all('a', href=True):
        href = a["href"]
        if href.endswith(".txt"):
            print("Writing", href)
            # Attachment links are relative, so prepend the site root
            r = requests.get(SITE_MAIN + href)
            with open(RENDER_DIR + sep + basename(href), 'wb') as f:
                f.write(r.content)

    for img in soup.find_all("img"):
        img_src = img["src"]

        # Skip non-GIFs and spacer.gif
        if not img_src.endswith(".gif") or img_src.endswith("spacer.gif"):
            continue

        r = requests.get(SITE_MAIN + img_src)
        # Only write animated GIFs
        if Image.open(BytesIO(r.content)).is_animated:
            print("Writing", img_src)
            with open(RENDER_DIR + sep + basename(img_src), 'wb') as f:
                f.write(r.content)


def main():
    download_extra("https://projecteuler.net/show=all")


if __name__ == "__main__":
    main()
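Usage follows the README: create the render directory, then run `python3 download_extra.py`. As a purely hypothetical extension, not something the script does today, the same helper could presumably be reused for a single problem page, assuming individual problem pages link their attachments the same way as the show=all page.

    # Hypothetical reuse of the helper for one problem page. The URL format is
    # projecteuler.net's, but whether attachments are linked identically to the
    # show=all page is an assumption; the render/ directory must already exist.
    from download_extra import download_extra

    download_extra("https://projecteuler.net/problem=22")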