1. Inspect element issue

When checking the source of a webpage, using Inspect Element directly on a value visible on the page isn't always helpful.

Thanks to

  • Realization:
  1. The user (client) sends a request from the browser.
  2. The request reaches the server.
  3. Server-side page creation: the server fetches the data, injects it into a template, and sends the finished page back to the client as the response.
  4. Client-side web app: the server sends a static HTML page, and JavaScript in that response fetches the data from an API and uses it to build the page in the browser.

To get such data, look under Developer tools->Network->XHR (instead of the normal HTML).

#html_doc = urllib2.urlopen("").read()

When you make an HTTP request and read the response using urllib, the body comes back as a string. So even though the content is formatted as JSON, indexing into it raises errors like

  1. TypeError: string indices must be integers, not str
  2. AttributeError str-object-has-no-attribute

Hence parse the body as JSON first (e.g. json.loads), or use a library such as requests that can decode the response as JSON.
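The difference between the raw string and the parsed JSON can be reproduced without any network call. This is a minimal sketch written for Python 3; the body and its field names are hypothetical and not taken from the real API.

```python
import json

# Hypothetical response body; the real API fields may differ.
body = '{"first_name": "Declan", "last_name": "Rice"}'

try:
    body["first_name"]        # body is still a plain string here
except TypeError as exc:
    print("TypeError:", exc)  # string indices must be integers

data = json.loads(body)       # parse the string into a dict
print(data["first_name"])
```

Once the body is parsed, normal dictionary access works.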

2. Attribute error

AttributeError: 'unicode' object has no attribute 'str'

Solution -


3. Unicode Encode Error

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)


s = full_name.encode('ascii', 'ignore').decode('ascii')

Disadvantage - this drops every non-ASCII character, so accented and other international characters are lost.
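The effect of the line above, and one possible way to soften it, can be sketched as follows. The second variant is not from the original notes: it first decomposes accented letters with unicodedata.normalize so that the base letter survives. The sample name is hypothetical.

```python
import unicodedata

full_name = u"Jos\xe9 Fonte"  # hypothetical sample containing u'\xe9'

# Approach from the notes: non-ASCII characters are simply dropped.
stripped = full_name.encode("ascii", "ignore").decode("ascii")

# Alternative (an assumption, not the original fix): decompose first,
# so the accent is dropped but the base letter is kept.
decomposed = unicodedata.normalize("NFKD", full_name)
folded = decomposed.encode("ascii", "ignore").decode("ascii")

print(stripped)
print(folded)
```

Characters with no ASCII base letter are still removed by both variants.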

4. Unicode - String

', u'ElliotEmbleton', u'DeclanRice']

To get rid of the u' prefix, convert each Unicode string to a normal ASCII string.


5. Type error

response = requests.get(""+main_user_id+"/event/"+gameweek+"/picks")
TypeError: cannot concatenate 'str' and 'int' objects
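The traceback in section 10 shows the same URL built with str() around the numeric parts, which avoids this error. A minimal sketch, where base_url is a placeholder because the original URL is elided, and the two numbers are the ones visible in that traceback:

```python
# Placeholder base URL; the real one is not shown in these notes.
base_url = "https://example.invalid/entry/"

main_user_id = 197156  # int, as in the section 10 traceback
gameweek = 1           # int

# Convert the ints to str before concatenating.
url = base_url + str(main_user_id) + "/event/" + str(gameweek) + "/picks"

# Equivalent, and arguably easier to read:
url_formatted = "{}{}/event/{}/picks".format(base_url, main_user_id, gameweek)

print(url)
```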



6. Name error

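if player["is_captain"]==true:
NameError: name 'true' is not defined

  if player["is_captain"]===true:
                           ^
SyntaxError: invalid syntax

Solution - Python spells the boolean as True and has no === operator. Since the value is already a boolean, test it directly:

if player["is_captain"]: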
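The check works because the JSON parser already turns JSON true/false into Python True/False. A small sketch with a made-up picks list; "element" is a hypothetical field name, only "is_captain" appears in the notes:

```python
import json

# Made-up sample data in the shape the check above expects.
raw = '[{"element": 1, "is_captain": false}, {"element": 2, "is_captain": true}]'
picks = json.loads(raw)

# JSON true/false arrive as Python True/False, so no comparison is needed.
captains = [player["element"] for player in picks if player["is_captain"]]

print(captains)
```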

7. Type Error - NoneType

TypeError: 'NoneType' object has no attribute '__getitem__'

At times some data depends on the user being logged in. Such data is not returned in the JSON response seen in Postman or in the response to a plain HTTP request, so indexing into the missing field fails with the error above.

However, it is visible in Browser->Network->XHR->Response. To see what data was sent while making such a request, right-click the request and copy it as a cURL command, e.g. for -

Actual URL was

curl '' -H 'Host:' -H 'User-Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'X-Requested-With: XMLHttpRequest' -H 'Referer:' -H 'Cookie: _ga=GA1.2.309538033.1473872047; __gads=ID=1e26f8a1f937797f:T=1476526048:S=ALNI_MYwwjDM8utxJfBY-D7A1smdFhSnIA; csrftoken=GcIusuHLJOCxMHiZ1wbI8IsqkcybWq2t; _ga=GA1.3.309538033.1473872047; pl_profile="eyJzIjogIld6SXNNemc1TkRjME5GMDoxY05MRlY6TFE0aE0yWTI4WFhOaTdqeW04c1lqdXU2Nk4wIiwgInUiOiB7ImxuIjogIkJhcGF0IiwgImZjIjogOCwgImlkIjogMzg5NDc0NCwgImZuIjogIkNoYWl0YW55YSJ9fQ=="; sessionid=".eJyrVkpPzE2NT85PSVWyUirISSvIUdJRik8sLcmILy1OLYpPSkzOTs1LAUsmVqYW6UEFivUCwHwnqDyKpkyg-mhDHWMLSxNzE5PYWgBVoiN7:1cNcq5:gyTfZ4HaGrHPDIPnYJh0O6NGLKs"; _gat=1; _dc_gtm_UA-33785302-1=1' -H 'Connection: keep-alive'

Here the actual session cookie, parameters, and tokens were passed along with the request.
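Until the session data is sent along, the script can at least avoid the crash by checking for the missing field before indexing into it. A minimal sketch; the field names "entry" and "name" are hypothetical and not taken from the actual API:

```python
def team_name(json_data):
    """Return the team name, or None when the login-only part is missing."""
    entry = json_data.get("entry")  # hypothetical field name
    if entry is None:
        # Not logged in: the server left this part out of the response.
        return None
    return entry.get("name")        # hypothetical field name


print(team_name({"entry": {"name": "Some Team"}}))
print(team_name({"entry": None}))
```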

8. Value Error

Traceback (most recent call last):
  File "", line 23, in <module>
    json_data = json.loads(response.text)
  File "/usr/lib64/python2.7/json/", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
  • Solution - Handle the exception. It means the response body was not valid JSON (basically a bad request).
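Handling the exception could look like the sketch below. The helper name parse_json_or_none is made up for illustration. Catching ValueError covers Python 2.7 as in the traceback, and also Python 3, where json.JSONDecodeError is a subclass of ValueError.

```python
import json


def parse_json_or_none(text):
    """Return the decoded JSON, or None if the body is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # Typically an HTML error page or an empty body from a bad request.
        return None


print(parse_json_or_none('{"ok": 1}'))
print(parse_json_or_none("<html>Bad Request</html>"))
```

The caller can then skip or log the ids for which None comes back.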

9. URL issue

File "", line 36, in <module>
    response = requests.get(""+str(counter))
  File "/usr/lib/python2.7/site-packages/requests/", line 71, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/", line 585, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/", line 403, in send
  File "/usr/lib/python2.7/site-packages/requests/packages/urllib3/", line 578, in urlopen
  File "/usr/lib/python2.7/site-packages/requests/packages/urllib3/", line 385, in _make_request
    httplib_response = conn.getresponse(buffering=True)
  File "/usr/lib64/python2.7/", line 1136, in getresponse
  File "/usr/lib64/python2.7/", line 453, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/", line 409, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/", line 480, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/", line 756, in recv
  File "/usr/lib64/python2.7/", line 643, in read
    v =

10. Connection Error

Traceback (most recent call last):
  File "", line 20, in <module>
    response = requests.get(""+str(main_user_id)+"/event/"+str(gameweek)+"/picks")
  File "/usr/lib/python2.7/dist-packages/requests/", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/", line 335, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/", line 438, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/", line 327, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='', port=443): Max retries exceeded with url: /drf/entry/197156/event/1/picks (Caused by <class 'socket.error'>: [Errno 104] Connection reset by peer)

Solution: Check the response, and if the request fails with an error, query again after a 5 second sleep.

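import time  # for the sleep below; json and requests are imported earlier

response = ''
while response == '':
    try:
        response = requests.get(url)
    except requests.exceptions.ConnectionError:
        print("Connection refused")
        time.sleep(5)  # wait 5 seconds before querying again
json_data = json.loads(response.text)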
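A loop like this retries forever if the server keeps resetting the connection. One possible variation, not from the original script, is to cap the number of attempts. The sketch below takes the request as a callable so it can be tried without a network; fetch_with_retry and its parameters are hypothetical names. It catches IOError, which requests' exceptions derive from.

```python
import time


def fetch_with_retry(fetch, attempts=5, delay=5, sleep=time.sleep):
    """Call fetch() until it succeeds, at most `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except IOError as exc:  # connection-level failure
            last_error = exc
            sleep(delay)        # back off before the next attempt
    raise last_error            # give up after the last attempt


# Stand-in for requests.get(url): fails twice, then succeeds.
calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise IOError("Connection reset by peer")
    return "200"

result = fetch_with_retry(flaky, sleep=lambda seconds: None)
print(result, len(calls))
```

In the real script the callable would be something like lambda: requests.get(url).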

11. Package error

mongoimport -d fpl_users -c fpl_user_data --type csv --file data.csv --headerline
bash: mongoimport: command not found...
Packages providing this file are:
dnf install mongodb-org-tools
Failed to synchronize cache for repo 'dockerrepo', disabling.
Last metadata expiration check: 0:27:04 ago on Thu Jan  5 11:01:05 2017.
Error: package mongodb-org-tools-3.2.0-1.el7.x86_64 conflicts with mongodb-server provided by mongodb-server-3.2.8-2.fc25.x86_64
(try to add '--allowerasing' to command line to replace conflicting packages)
  • Solution
dnf install mongo-tools

12. Operation failed

  • Problem -
Error: error: {
   "waitedMS" : NumberLong(0),
   "ok" : 0,
   "errmsg" : "Executor error during find command: OperationFailed: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.",
   "code" : 96
}
{ "was" : 33554432, "ok" : 1 }
> db.fpl_user_data.find().sort({KEY:1})
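The error message itself suggests adding an index on the sort key. From Python this could be sketched as below, assuming a pymongo-style collection object (create_index, find, and cursor sort are the pymongo calls relied on). The helper name find_sorted is hypothetical, and the sketch takes the collection as an argument so that it does not need a running server.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def find_sorted(collection, key):
    """Index `key`, then return the documents sorted by it."""
    # With an index on the sort key the server can walk the index
    # instead of sorting every document in memory.
    collection.create_index([(key, ASCENDING)])
    return collection.find().sort(key, ASCENDING)
```

With pymongo installed, this might be called as find_sorted(MongoClient().fpl_users.fpl_user_data, "main_user_id"), using the database and collection names from the mongoimport command in section 11.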

13. Unsorted output

Error: error: {
	"waitedMS" : NumberLong(0),
	"ok" : 0,
	"errmsg" : "bad sort specification",
	"code" : 2
}
  • Issue - The output was not returned in sorted order

  • Solution


db.fpl_user_data.find().sort( { main_user_id: 1 } ).pretty()

14. Access Mongo using Python

  • Problem - Accessing MongoDB using Python

15. Aggregation

16. Data inconsistency

  1. Incomplete
  • count - 193212
  • last element id - 198181
  2. Empty data fields (ASCII / Unicode)
  • first name, last name
  • main user team name
  3. Duplicate data
  • Solution
  • key = attribute name / field / column name
  • value = corresponding value
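The gap between the count and the last id, and the duplicates, can be quantified before cleaning up. A rough sketch, under the assumption (not stated in the notes) that ids are meant to run from 1 up to the last id; audit_ids is a made-up helper name.

```python
from collections import Counter


def audit_ids(ids, last_id):
    """Return (duplicate ids, missing ids) for an expected range 1..last_id."""
    counts = Counter(ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    missing = sorted(set(range(1, last_id + 1)) - set(counts))
    return duplicates, missing


# Tiny made-up sample: id 2 appears twice, ids 3 and 5 are absent.
duplicates, missing = audit_ids([1, 2, 2, 4], last_id=5)
print(duplicates, missing)
```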

17. Performance Increase

  • Trying to improve the request rate from 1 request per second to 10 or more
  • Tried these Python libraries:
  1. Twisted
  2. Grequests
  • Grequests performed better but gave the following errors
('Connection aborted.', error(22, 'Invalid argument'))
HTTPSConnectionPool(host='', port=443): Max retries exceeded with url: /drf/entry/2365060 (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f9b6dcf1e50>: Failed to establish a new connection: [Errno 110] Connection timed out',))
No JSON object could be decoded
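Errors like these may come from opening too many connections at once, although the notes do not confirm the cause. One alternative that was not tried above is a thread pool with a fixed number of workers, which caps the number of requests in flight. The sketch uses concurrent.futures from the Python 3 standard library; fetch_all is a hypothetical name and the fetch function is passed in so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch, ids, max_workers=10):
    """Run fetch(id) for every id, with at most max_workers in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the results in the same order as ids.
        return list(pool.map(fetch, ids))


# Stand-in for a function that requests and parses one entry.
results = fetch_all(lambda user_id: {"id": user_id}, [1, 2, 3])
print(results)
```

Combining this with the exception handling from sections 8 and 10 inside the fetch function would keep one failed id from aborting the whole batch.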