Failed to cache the new benchmark returns #1947

Closed
ntindicator opened this issue Sep 15, 2017 · 5 comments

Comments

@ntindicator

Dear Zipline Maintainers,

Before I tell you about my issue, let me describe my environment:

Environment

  • Operating System: macOS Sierra 10.12.4
  • Python Version: 3.5.1
  • Python Bitness: 64
  • How did you install Zipline: Other - using Pycharm Interpreter to virtualenv
  • Python packages:
    alembic==0.9.3
    appnope==0.1.0
    bcolz==0.12.1
    Bottleneck==1.2.1
    cachetools==1.1.6
    certifi==2017.4.17
    chardet==3.0.4
    click==6.7
    contextlib2==0.5.5
    cycler==0.10.0
    cyordereddict==1.0.0
    Cython==0.25.2
    decorator==4.0.11
    empyrical==0.3.0
    entrypoints==0.2.2
    idna==2.5
    intervaltree==2.1.0
    ipykernel==4.5.0
    ipython==5.1.0
    ipython-genutils==0.1.0
    ipywidgets==5.2.2
    Jinja2==2.8
    jsonschema==2.5.1
    jupyter==1.0.0
    jupyter-client==4.4.0
    jupyter-console==5.0.0
    jupyter-core==4.2.0
    Logbook==1.1.0
    lru-dict==1.1.6
    Mako==1.0.7
    MarkupSafe==1.0
    matplotlib==2.0.2
    mistune==0.7.3
    multipledispatch==0.4.9
    multitasking==0.0.4
    nbconvert==4.2.0
    nbformat==4.1.0
    networkx==1.11
    notebook==4.2.3
    numexpr==2.6.2
    numpy==1.13.1
    pandas==0.18.1
    pandas-datareader==0.4.0
    patsy==0.4.1
    pexpect==4.2.1
    pickleshare==0.7.4
    prompt-toolkit==1.0.7
    ptyprocess==0.5.1
    Pygments==2.1.3
    pyparsing==2.2.0
    python-dateutil==2.6.1
    python-editor==1.0.3
    pytz==2017.2
    pyzmq==15.4.0
    qtconsole==4.2.1
    requests==2.18.1
    requests-file==1.4.2
    requests-ftp==0.3.1
    scipy==0.19.1
    simplegeneric==0.8.1
    six==1.10.0
    sortedcontainers==1.5.7
    SQLAlchemy==1.1.11
    statsmodels==0.8.0
    tables==3.4.2
    terminado==0.6
    toolz==0.8.2
    tornado==4.4.1
    traitlets==4.3.0
    urllib3==1.21.1
    wcwidth==0.1.7
    widgetsnbextension==1.2.6
    zipline==1.1.1

Now that you know a little about me, let me tell you about the issue I am
having:

When I try to execute the example script, it generates an error.

  • What did you expect to happen?
    something like this:
    AAPL
    [2015-11-04 22:45:32.820166] INFO: Performance: Simulated 3521 trading days out of 3521.
    [2015-11-04 22:45:32.820314] INFO: Performance: first open: 2000-01-03 14:31:00+00:00
    [2015-11-04 22:45:32.820401] INFO: Performance: last close: 2013-12-31 21:00:00+00:00

  • What happened instead?
    [2017-09-15 02:02:50.333834] INFO: Loader: Downloading benchmark data for 'SPY' from 1989-12-29 00:00:00+00:00 to 2017-09-13 00:00:00+00:00
    [2017-09-15 02:02:50.724867] ERROR: Loader: Failed to cache the new benchmark returns
    urllib.error.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>
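
The `[Errno 8] nodename nor servname provided, or not known` message above is what macOS reports when a hostname cannot be resolved. A minimal sketch for checking name resolution on its own, outside Zipline, might look like the following. The helper name `can_resolve` and its `resolver` parameter are hypothetical, and which host the loader actually contacts is not stated in the log, so the host you test is an assumption.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address.

    `resolver` is injectable (a hypothetical parameter) so the check
    can be exercised without network access.
    """
    try:
        # Port 80 is arbitrary here; only the name lookup matters.
        return len(resolver(hostname, 80)) > 0
    except socket.gaierror:
        # Name-resolution failures surface as socket.gaierror, which
        # urllib then wraps in urllib.error.URLError.
        return False
```

For example, `can_resolve("www.google.com")` returning `False` would point at a DNS or connectivity problem on the machine rather than at Zipline itself.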

Here is how you can reproduce this issue on your machine:

Reproduction Steps

  1. Install Zipline with Pycharm project interpreter to virtualenv
  2. zipline ingest
  3. copy buyapple.py to project folder
  4. zipline run -f buyapple.py --start 2000-1-1 --end 2014-1-1 -o buyapple_out.pickle
    ...

What steps have you taken to resolve this already?

I looked on Stack Overflow and here for similar issues.
...

Anything else?

Tried using different dates
...

@Gillu13

Gillu13 commented Sep 16, 2017

Hello,

It's funny, because I also have an issue with the zipline examples in roughly the same place in the code. The funny part is that the loader worked perfectly well a couple of weeks ago and no longer works now.

My guess is that it has to do either with Google finance or with the pandas_datareader module. Indeed, it looks like your run fails here.

What happens if you try to run the following piece of script?

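import datetime

import pandas_datareader.data as pd_reader

# Request SPY prices from Google Finance for a fixed historical window.
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2013, 1, 27)
f = pd_reader.DataReader("SPY", 'google', start, end)

# Look up one date inside the requested window; this fails if that date was not returned.
f.ix['2010-01-04']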

On my machine, it does not fail, but it downloads data from a completely different time range (between 2016-09-19 and now).
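
A quick way to detect this kind of silent mismatch is to compare the dates that came back against the dates that were requested. The sketch below is only an illustration: the helper name `covers_requested_range` and the 7-day tolerance (to allow for weekends and holidays at the edges) are assumptions, not part of pandas_datareader or Zipline.

```python
import datetime


def covers_requested_range(dates, start, end, tolerance_days=7):
    """Return True if the returned dates reach close to both requested bounds.

    `dates` is any iterable of datetime-like values, for example the
    index of the downloaded DataFrame (assumed to be timezone-naive).
    """
    dates = list(dates)
    if not dates:
        return False
    tolerance = datetime.timedelta(days=tolerance_days)
    first, last = min(dates), max(dates)
    # The data must start no later than `start + tolerance`
    # and end no earlier than `end - tolerance`.
    return (first - start) <= tolerance and (end - last) <= tolerance
```

With the snippet above, `covers_requested_range(f.index, start, end)` would be expected to return `False` when the data only begins in 2016 although 2010 was requested.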

@Gillu13

Gillu13 commented Sep 16, 2017

OK, it looks like this is a new, known issue.

@Gillu13

Gillu13 commented Sep 16, 2017

I am not 100% sure that this is the root cause of your issue, but I am pretty sure that even if your run had passed this step, it would have failed at the next one because of this. Apparently, Google recently stopped providing historical data older than one year.

@ntindicator
Author

Thanks for replying. The code you provided works when I look up the first date that was actually downloaded:
f.ix['2016-09-19']
Out[8]:
Open           214.13
High           214.88
Low            213.03
Close          213.41
Volume    80250490.00
Name: 2016-09-19 00:00:00, dtype: float64

I did a bit of research, and the data is still there on Google. They've probably added a cookie, like Yahoo did. If you visit this link and change the dates, it loads the data.
http://www.google.com/finance/historical?q=NASDAQ:GOOGL

@freddiev4
Contributor

freddiev4 commented Oct 3, 2017

The reason is that Google now limits users to about 251 days' worth of data per request, so you can't run backtests over more than a year. There is a fix currently being worked on.

There are duplicates of this issue, so I'm just going to direct everyone to this issue: #1965. I'll comment there when there is a fix on master.
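
To illustrate what a per-request limit implies for a multi-year backtest, here is a minimal sketch that splits a long date range into consecutive windows short enough to request one at a time. It is not the fix referred to above. The helper name `split_date_range` and the default of 250 calendar days are assumptions; the thread does not say whether "251 days" means calendar or trading days, so calendar days are used as the more conservative reading.

```python
import datetime


def split_date_range(start, end, max_days=250):
    """Yield (chunk_start, chunk_end) pairs that cover [start, end].

    Both bounds of each pair are inclusive, and each pair spans at
    most `max_days` calendar days.
    """
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    step = datetime.timedelta(days=max_days - 1)
    one_day = datetime.timedelta(days=1)
    chunk_start = start
    while chunk_start <= end:
        # Clamp the final window to the requested end date.
        chunk_end = min(chunk_start + step, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end + one_day
```

Each window could then be requested separately and the results concatenated, assuming the data source honors the requested dates for windows of that length.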
