This library crawls Sina Weibo directly, because the official Sina API is extremely unfriendly to work with.
At present, pycrawler_weibo only supports crawling the search results for given keywords.
- [Python 2.7](https://www.python.org/downloads/)
- [Beautifulsoup4](http://www.crummy.com/software/BeautifulSoup/bs4/)
pip install beautifulsoup4
- [MySQL-python](http://mysql-python.sourceforge.net/) (optional)
pip install mysql-python
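Before running the crawler, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of pycrawler_weibo: the helper name `check_dependencies` is hypothetical, and it relies only on the fact that beautifulsoup4 installs the `bs4` module and mysql-python installs the `MySQLdb` module.

```python
import importlib


def check_dependencies(modules):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # bs4 is required; MySQLdb is only needed when MySQL storage is used.
    for name, ok in sorted(check_dependencies(["bs4", "MySQLdb"]).items()):
        print("%s: %s" % (name, "found" if ok else "missing"))
```

If `MySQLdb` is reported as missing, the crawler can presumably still be used without MySQL, since that dependency is marked optional.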
- Open test.py and
    - edit the login information and the topic/mention
    - set up MySQL (optional)
- Go to the working directory in a terminal
cd ~/...
- Run test.py
python test.py
- class WeiboCrawler(isConnectMySQL=True, htmlOutputDir='')
- def search(keyword, pages=range(1, 51))
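    - param keyword: (str/list) a keyword, or a list of keywords, to search for
    - param pages: (int/list) a page number, or a list of page numbers, to crawl; defaults to pages 1 through 50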
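Since `search` accepts either a single value or a list for both `keyword` and `pages`, the arguments have to be brought into a uniform shape at some point. The sketch below illustrates one way that could be done. It is an assumption for illustration only, not the library's actual code, and the helper name `normalize_search_args` is hypothetical.

```python
try:
    string_types = (basestring,)  # Python 2
except NameError:
    string_types = (str,)  # Python 3


def normalize_search_args(keyword, pages=range(1, 51)):
    """Return (keywords, pages) as two lists, whatever form was passed in."""
    # A single keyword string becomes a one-element list.
    if isinstance(keyword, string_types):
        keywords = [keyword]
    else:
        keywords = list(keyword)
    # A single page number becomes a one-element list.
    if isinstance(pages, int):
        page_list = [pages]
    else:
        page_list = list(pages)
    return keywords, page_list


if __name__ == "__main__":
    keywords, page_list = normalize_search_args("python", 3)
    print(keywords, page_list)
```

With the default `pages=range(1, 51)`, the normalized page list covers pages 1 through 50, because the upper bound of `range` is exclusive.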