cleared up some code
yash1337 committed Mar 12, 2017
1 parent 7162226 commit 8e49015
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions src/Scraper.py
@@ -31,16 +31,17 @@
 for link in soup.find_all('a'):
     small_link.append(link['href']) #getting the link without the base URL
     links.append(urljoin(baseURL,link['href'])) #joining the base and the relative URL
-
+
+
 #getting the links that we are intrested in i.e. .txt .pdf and .cpp
 correctRelativeURL=[]
 correctCompletedURL=[]
 for item in small_link:
-    if (".txt" in item or ".pdf" in item or ".cpp" in item):
+    if ".txt" in item or ".pdf" in item or ".cpp" in item:
         correctRelativeURL.append(item)
 
 for item in links:
-    if (".txt" in item or ".pdf" in item or ".cpp" in item or ".h" in item):
+    if ".txt" in item or ".pdf" in item or ".cpp" in item or ".h" in item:
         correctCompletedURL.append(item)
 
 #looping through the links, downloading files
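
A note on the filter in this hunk: the checks are substring tests, so `".h" in item` also matches `index.html`, and `".txt" in item` matches a name such as `notes.txt.bak`. The following is a minimal sketch of one possible alternative, not the author's code, that compares the extension of the URL path instead. The names `WANTED_EXTENSIONS`, `has_wanted_extension` and `filter_links` are hypothetical and used only for illustration, and the case-insensitive match is an assumption about the desired behavior.

```python
from posixpath import splitext
from urllib.parse import urljoin, urlparse

# Hypothetical constant: the extensions named in the diff above.
WANTED_EXTENSIONS = {".txt", ".pdf", ".cpp", ".h"}


def has_wanted_extension(url):
    """Return True if the URL's path ends in one of WANTED_EXTENSIONS."""
    path = urlparse(url).path      # ignores any query string or fragment
    _, ext = splitext(path)        # "/dir/a.txt" -> ("/dir/a", ".txt")
    return ext.lower() in WANTED_EXTENSIONS


def filter_links(base_url, hrefs):
    """Resolve each href against base_url and keep only the wanted file types."""
    completed = (urljoin(base_url, href) for href in hrefs)
    return [url for url in completed if has_wanted_extension(url)]
```

Because the same predicate is applied once to the completed URLs, the relative and completed lists cannot drift apart the way the two conditions in the hunk do (only the second one checks `.h`).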