Disruption of equities data :: pandas_datareader dependency on Yahoo and Google Finance API #7
Comments
Reports of Google limiting data length to merely one year |
What means:
Will there be no new quotes after 2017-09-05 or no data before 2017-09-05? I don't understand it. I want to mention, that the new sources are not suitable for all users and do not cover all fields of interest :( |
@paintdog tail() retrieves the tail end of a dataframe. All our sources are suitable for all users in the field of financial economics. |
StackOverflow: DataReader google finance date not working points to change in URL at Google as cause for data disruption. But unfortunately, for the PR to modify said URL, pydata/pandas-datareader@eac67a4 , |
Google Finance website currently displays a Yellow Warning Banner which reads as follows:
Their notice at https://support.google.com/finance indicates:
Why? Their reasoning: "to make Google Finance more accessible and user-friendly for a wider audience." Yeah, right 👎 See also Quora: Why is the Google Finance portfolio feature disabled? Thus any tentative fixes now for data retrieval may go to waste after renovation is completed:
Noteworthy is the secure https redirect from the "www" to "finance" subdomain for google.com. |
The problem of upstream fixes failing the tests in the Travis CI build has been resolved: pydata/pandas-datareader#404
This especially pertains to:
Noteworthy snippet for Google Finance URL fix: |
What does
mean? At the moment I can still retrieve data from Google! |
@paintdog That line is used to avoid testing designated parts of the As you have observed, the current code is still operational, That way upstream, we can later run: |
Alternative: Morningstar
Some work has begun: pydata/pandas-datareader#411
At fecon235, we may integrate partial code directly from:
On 2018-07-31, pydata/pandas-datareader#557 (comment) @hubbins wrote:
|
See next to previous comment for details |
Alternative: Tiingo
Tiingo has a REST and Real-Time Data API which includes support for
At fecon235, we may integrate partial code directly from:
From our Gitter, https://gitter.im/rsvp/fecon235 on 2017-10-24:
The free API permits up to 20,000 requests per day on over 56,000 securities globally. On 2017-10-25, Rishi responds:
|
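To put the quoted Tiingo free-tier figures in perspective, here is a small arithmetic sketch. It assumes one request per symbol, which is an assumption for illustration and not something stated in the Tiingo documentation; the two constants are the numbers quoted above.

```python
# Figures quoted in this thread for Tiingo's free tier (as of 2017).
REQUESTS_PER_DAY = 20_000
SECURITIES = 56_000
SECONDS_PER_DAY = 24 * 60 * 60

# Average spacing between calls that stays within the daily cap.
min_interval = SECONDS_PER_DAY / REQUESTS_PER_DAY

# Days needed to touch every security once, ASSUMING one request per symbol.
days_for_full_sweep = SECURITIES / REQUESTS_PER_DAY

print(round(min_interval, 2), "seconds between requests")
print(round(days_for_full_sweep, 1), "days for one full sweep")
```

So a script that polls evenly through the day would wait a little over four seconds between calls, and a full sweep of the universe would span roughly three days under that assumption.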
Alternative: Alpha Vantage
Documentation https://www.alphavantage.co/documentation
Hence his Python module to get stock data and cryptocurrencies: 2018 development at pandas-datareader: We would like to invite more discussion comparing the data vendors |
Alternative, not supported: Bloomberg API
Documentation https://www.bloomberg.com/professional/support/api-library/
For Python pandas interface, we refer to the repo by @matthewgilbert |
pandas-datareader RemoteDataError
For 0.5.0, failures reported for both
Alternative: Barchart API
Documentation: https://www.barchart.com/ondemand/api
# JSON gist
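import json
import requests
tic = 'SPY'  # example ticker; substitute your own symbol
# Build the URL in one piece instead of continuing a string across lines.
url = ('http://marketdata.websol.barchart.com/getQuote.json'
       '?apikey=<api_key>&symbols={}'.format(tic))
# The first element of 'results' holds the quote for the requested symbol.
quote = json.loads(requests.get(url).text)['results'][0]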
# Replace <api_key> with your own key.
Thanks to @liuyigh !
@BlackArbsCEO has provided a working gist:
@femtotrader a pandas-datareader contributor,
|
This is too bad - is there no hope that Google will deliver data again in the near future? |
Scraping by URL
Notes on Alpha Vantage
Comments last updated: 2017-11-07 🔢 REQUEST: Use emoji on the "Alternatives" above to express your reactions, or please kindly write out your full opinion here regarding your preferences. |
Yahoo cookie / crumb : other Python solutions
Crumb is just part of the cookie, and here are some Python URL scraping solutions:
fix-yahoo-finance by Ran Aroussi
@ranaroussi claims his fix also works independently of pandas_datareader:
For details, see https://github.com/ranaroussi/fix-yahoo-finance (> 130 stars)
The fragility of using a non-API solution is illustrated by:
Corey Goldberg @cgoldberg started ystockquote five years ago: Thanks all ! And be sure to PR upstream. |
Alternatives: Misc. ETC.
@wilsonfreitas provides an extensive listing of data sources:
Any preferences therein which are reliable for equities data?
URL shortcut to awesome-quant page: https://git.io/eqdata |
Alternative: IEX API
Main page: https://iextrading.com/developer
Thanks to @iexg and @lockefox
Thanks to @addisonlynch who notes historical datasets are available
|
Some interim Google functionality: November 2017
import pandas_datareader.data as web
import datetime
start = datetime.datetime(2017, 1, 1)
end = datetime.date.today()
google = False
if google:
f = web.DataReader("ETR:SIE", 'google', start, end)
else:
f = web.DataReader("SIE.DE", 'yahoo', start, end)
print(f.Close) |
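The manual `google = False` switch above could be generalized into an ordered fallback. The following is only a sketch under stated assumptions: `first_available`, `fake_google` and `fake_yahoo` are hypothetical names, and the stand-in fetchers return made-up data so the example runs without network access. Note that each vendor keeps its own symbol, as with "ETR:SIE" versus "SIE.DE".

```python
def first_available(sources):
    """Try each (name, fetch, symbol) in order; return (name, data) from the first success."""
    errors = {}
    for name, fetch, symbol in sources:
        try:
            data = fetch(symbol)
        except Exception as exc:  # vendor failures vary, so catch broadly
            errors[name] = repr(exc)
            continue
        if data is not None and len(data) > 0:
            return name, data
        errors[name] = "empty result"
    raise RuntimeError("all sources failed: {}".format(errors))


# Stand-in fetchers, for illustration only. Real ones might wrap
# web.DataReader(symbol, 'google', start, end) and its 'yahoo' counterpart.
def fake_google(symbol):
    raise IOError("simulated outage for " + symbol)


def fake_yahoo(symbol):
    return [("2017-11-30", 100.0)]  # made-up row, not a real quote


name, data = first_available([
    ("google", fake_google, "ETR:SIE"),
    ("yahoo", fake_yahoo, "SIE.DE"),
])
print(name, data)
```

Collecting the per-vendor errors before raising makes it easier to see which source failed and why when every source is down.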
I hope that the info from @VicTangg will be used to repair pandas datareader. It seems that Google is still delivering data in an acceptable quality. |
A new issue has surfaced in the last few days regarding google pulls with
However, the API itself still seems intact:
import datetime
import requests
from io import StringIO
# This is just a wrapper importing the compatible version of
# urllib's urlencode--see pandas docs
from pandas.io.common import urlencode
import pandas as pd
BASE = 'http://finance.google.com/finance/historical'
# There seems to be confusion over whether the date api has changed.
# https://github.com/pydata/pandas-datareader/pull/425
# Both formats seem to work, but I'll use the "newer" one here to be safe
def get_params(symbol, start, end):
    params = {
        'q': symbol,
        'startdate': start.strftime('%Y/%m/%d'),
        'enddate': end.strftime('%Y/%m/%d'),
        'output': "csv"
    }
    return params
def build_url(symbol, start, end):
    params = get_params(symbol, start, end)
    return BASE + '?' + urlencode(params)
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime.today() # made around 10:30 am EST
sym = 'SPY'
url = build_url(sym, start, end)
data = requests.get(url).text
data = pd.read_csv(StringIO(data), index_col='Date', parse_dates=True)
print(data.head())
# Open High Low Close Volume
# Date
# 2017-11-30 263.76 266.05 263.67 265.01 127894389
# 2017-11-29 263.02 263.63 262.20 262.71 77512102
# 2017-11-28 260.76 262.90 260.66 262.87 98971719
# 2017-11-27 260.41 260.75 260.00 260.23 52274922
# 2017-11-24 260.32 260.48 260.16 260.36 27856514 |
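Since `pandas.io.common.urlencode` is described above as just a compatibility wrapper, the URL construction can also be sketched with the Python 3 standard library alone. This reuses the same base URL and query keys as the snippet above; whether the endpoint still answers is not something this sketch can show, as it only builds the string.

```python
from datetime import date
from urllib.parse import urlencode  # Python 3 standard library

BASE = 'http://finance.google.com/finance/historical'


def build_url(symbol, start, end):
    """Compose the historical-quotes URL with the query keys used in this thread."""
    params = [
        ('q', symbol),
        ('startdate', start.strftime('%Y/%m/%d')),
        ('enddate', end.strftime('%Y/%m/%d')),
        ('output', 'csv'),
    ]
    return BASE + '?' + urlencode(params)


url = build_url('SPY', date(2010, 1, 1), date(2017, 11, 30))
print(url)
# http://finance.google.com/finance/historical?q=SPY&startdate=2010%2F01%2F01&enddate=2017%2F11%2F30&output=csv
```

Passing the parameters as a list of pairs keeps the query-string order fixed, which makes the resulting URL easy to compare against a known-good one.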
RE: UnicodeDecodeError: 'utf-8' codec can't decode byte: invalid start byte: Decoding issue was also reported at Reddit In regards to your proposals for Google URL date format Brad, your fixes are very much appreciated. Thank you! |
@rsvp I'm still not sure if bytes v. string is the issue here, though, the more I look into it. Reading in bytes is explicitly covered/addressed. Instead, I'm starting to think it's just that the GitHub code is not reflected in PyPI, despite both ostensibly being version 0.5.0. I.e.:
from datetime import date
import pandas as pd
from pandas.compat import StringIO, bytes_to_str
from pandas.io.common import urlencode
import requests
BASE = 'http://finance.google.com/finance/historical'
def _get_params(symbol, start, end):
    params = {
        'q': symbol,
        'startdate': start.strftime('%Y/%m/%d'),
        'enddate': end.strftime('%Y/%m/%d'),
        'output': "csv"
    }
    return params
def build_url(symbol, start, end, form='new'):
    params = _get_params(symbol, start, end)
    return BASE + '?' + urlencode(params)
sym = 'AAPL'
start = date(2010, 1, 1)
end = date.today()
url = build_url(sym, start, end)
# http://finance.google.com/finance/historical?q=AAPL&startdate=Jan+01%2C+2010&enddate=Dec+05%2C+2017&output=csv
session = requests.Session()
byts = session.get(url).content
out = StringIO()
out.write(bytes_to_str(byts))
out.seek(0)
data = pd.read_csv(out, index_col=0, parse_dates=True).sort_index()
data
# Open High Low Close Volume
# Date
# 2010-01-04 30.49 30.64 30.34 30.57 123432050
# 2010-01-05 30.66 30.80 30.46 30.63 150476004
# 2010-01-06 30.63 30.75 30.11 30.14 138039594
# 2010-01-07 30.25 30.29 29.86 30.08 119282324
# 2010-01-08 30.04 30.29 29.87 30.28 111969081
# 2010-01-11 30.40 30.43 29.78 30.02 115557365
# 2010-01-12 29.88 29.97 29.49 29.67 148614774
# 2010-01-13 29.70 30.13 29.16 30.09 151472335
# 2010-01-14 30.02 30.07 29.86 29.92 108288411
# 2010-01-15 30.13 30.23 29.41 29.42 148584065
# 2010-01-19 29.76 30.74 29.61 30.72 182501620
# 2010-01-20 30.70 30.79 29.93 30.25 153037892
# ... ... ... ... ...
# 2017-11-16 171.18 171.87 170.30 171.10 23637484
# 2017-11-17 171.04 171.39 169.64 170.15 21899544
# 2017-11-20 170.29 170.56 169.56 169.98 16262447
# 2017-11-21 170.78 173.70 170.78 173.14 25131295
# 2017-11-22 173.36 175.00 173.05 174.96 25588925
# 2017-11-24 175.10 175.50 174.65 174.97 14026673
# 2017-11-27 175.05 175.08 173.34 174.09 20716802
# 2017-11-28 174.30 174.87 171.86 173.07 26428802
# 2017-11-29 172.63 172.92 167.16 169.48 41666364
# 2017-11-30 170.43 172.14 168.44 171.85 41527218
# 2017-12-01 169.95 171.67 168.50 171.05 39759288
# 2017-12-04 172.48 172.62 169.63 169.80 32542385 |
@bsolomon1124 hi Brad, interesting detective work there with PyPI. The question of code replication would then extend to those who have |
Some fixes on forthcoming pandas_datareader 0.6.0
Thanks to @davidastephens
Yahoo: cryptocurrency quotes
When people are taking out mortgages on their homes to make bets...
Bitcoin futures
CME BTC quotes and charts: http://www.cmegroup.com/trading/equity-index/us-index/bitcoin.html
Cboe XBT quotes: http://cfe.cboe.com/cfe-products/xbt-cboe-bitcoin-futures
The notional value of a CME contract is 5 times that of a Cboe contract. |
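The 5-to-1 relation follows from the contract units: one CME BTC contract covers 5 bitcoin, while one Cboe XBT contract covers 1 bitcoin. A short worked example, where the price is a hypothetical round number chosen only for illustration:

```python
CME_BTC_MULTIPLIER = 5   # bitcoin per CME BTC contract
CBOE_XBT_MULTIPLIER = 1  # bitcoin per Cboe XBT contract


def notional(price_usd, multiplier, contracts=1):
    """Notional value in USD = futures price x contract unit x number of contracts."""
    return price_usd * multiplier * contracts


price = 10_000.0  # hypothetical futures price in USD, not a real quote
cme = notional(price, CME_BTC_MULTIPLIER)
cboe = notional(price, CBOE_XBT_MULTIPLIER)
print(cme, cboe, cme / cboe)
```

At the same futures price, five Cboe contracts would carry the same notional exposure as one CME contract.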
UPGRADE to development version of pandas_datareader
Is there a way to mitigate interim disruptions with Yahoo
Anaconda distribution
It is possible to
@bashtage notes:
Direct from source using git
# To get master HEAD:
git clone https://github.com/pydata/pandas-datareader
cd pandas-datareader
python setup.py install
It is an open question whether setup.py will properly
Installation using pip
|
pandas_datareader Deprecations
As of 2018-01-18, leading up to their 0.6.0 release,
Fallback vendor(s) for reliable equities data have not been clarified |
Alternative: Robinhood
Robinhood Markets Inc. is a commission-free, online securities brokerage. Snippet for current quotes: Snippet for historical data: |
REFERENCES for new Alternatives |
Alternative: Interactive Brokers
Official: https://interactivebrokers.github.io
The official Interactive Brokers Python API has a few design choices which make it run slowly. Specifically: excessive debug logging, and an overly cautious lock on the socket connection. ibapi-grease provides monkey patches that eliminate these bottlenecks by turning off the logging and removing the locks. -- @quantrocket-llc
IB's Zipline only supports backtesting, while IB's Catalyst supports backtesting and live trading.
Zipline and Catalyst support a separate data ingestion step which downloads the data once: http://www.zipline.io/bundles.html
Above, thanks to @westurner |
pandas_datareader v0.6.0 Release
@bashtage released this 3 hours ago -- tremendous work!
Warning: Yahoo!, Google Options, Google Quotes and EDGAR have been immediately deprecated.
But Google finance is still functioning for historical price data, although there are frequent reports of failures. Google failure is frequently encountered when bulk downloading historical price data.
Highlights include:
Documentation: https://pandas-datareader.readthedocs.io/en/latest/remote_data.html
Example, given pandas_datareader v0.6.0
Let's suppose we want quotes for the S&P500 ETF called "SPY"
>>> spy = get("s4spy")
/home/yaya/net/anaconda/lib/python2.7/site-packages/pandas_datareader/google/daily.py:40:
UnstableAPIWarning: The Google Finance API has not been stable since late 2017.
Requests seem to fail at random. Failure is especially common when bulk downloading.
warnings.warn(UNSTABLE_WARNING, UnstableAPIWarning)
 :: Retrieved from Google Finance: SPY
We successfully got a pandas DataFrame in a variable called spy,
[pandas_datareader v0.7.0 Release]
But wait, Yahoo Finance to be reintegrated in 0.7.0, see |
2018-03-20 @jfunction in pydata/pandas-datareader#502 (comment)
That specific Google Help page: https://support.google.com/websearch/answer/86640 "Unusual traffic from your computer network"
In other words, it seems that Google Finance is now only intended for human eyes, |
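If Google answers an automated request with a help page instead of quotes, parsing that page as CSV will fail in confusing ways. A rough heuristic check is sketched below. The function name `looks_like_quote_csv` is hypothetical, the exact form of the block response is an assumption, and the sample payloads are hand-written (the CSV row reuses the SPY figures posted earlier in this thread).

```python
def looks_like_quote_csv(text, expected_first_column="Date"):
    """Heuristic: True when the payload starts with a CSV header row, not HTML."""
    # Drop a possible byte-order mark and leading whitespace before inspecting.
    stripped = text.lstrip("\ufeff \r\n")
    lines = stripped.splitlines()
    if not lines:
        return False
    if "<html" in stripped[:500].lower():
        return False
    return lines[0].startswith(expected_first_column + ",")


# Hand-written sample payloads, for illustration only.
csv_payload = ("Date,Open,High,Low,Close,Volume\n"
               "2017-11-30,263.76,266.05,263.67,265.01,127894389\n")
html_payload = ("<html><body>Unusual traffic from your "
                "computer network</body></html>")

print(looks_like_quote_csv(csv_payload))   # True
print(looks_like_quote_csv(html_payload))  # False
```

Running such a check before `pd.read_csv` lets a script report "blocked or unexpected response" rather than a parser error.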
CHANGELOG 2018-06-23 (tag: v6.18.0623)
Major version change for fecon235 from v5 to v6
Henceforth, fecon235 becomes a repository solely of Jupyter notebooks.
Revise docs/fecon235-00-README.ipynb to introduce fecon236.
NOTICE of MOVE
This issue and its remedies have moved to: MathSci/fecon236#2
Your review and feedback there would be greatly appreciated. Thank you! |
The possibility of using QUANDL is likely nil for freely accessible data... since Nasdaq acquired Quandl on or about December 4, 2018,
For full details, see follow-up on "Disruption of Quandl data, esp. futures" |
Description of specific issue
We are expecting major disruption in getting data on
equities, mutual funds, and ETFs via pandas_datareader
due to its dependency on the APIs of both Yahoo and Google Finance.
yi_stock module.
Observed behavior
Yahoo employee has confirmed that the free End-Of-Day data has been terminated, 2017
presumably due to acquisition by Verizon.
/r/algotrading on Google JSON termination
presumably due to cost-cutting by new product manager.
Our yi_stock module may appear to be working, but
please tail() your dataframe to verify whether quote retrieval
goes beyond 2017-09-05 [silent fail to get() current stock quotes].
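The tail() check can also be automated. Below is a minimal sketch, assuming the dataframe has a DatetimeIndex as in the quote frames shown in this thread; `is_stale` is a hypothetical helper name, and the two toy frames contain made-up values standing in for what get() might return.

```python
import pandas as pd

CUTOFF = pd.Timestamp("2017-09-05")


def is_stale(df, cutoff=CUTOFF):
    """True when the newest row is dated on or before the cutoff (or there are no rows)."""
    if df.empty:
        return True
    return df.index.max() <= cutoff


# Toy frames with made-up prices, for illustration only.
stale = pd.DataFrame({"Close": [1.0, 2.0]},
                     index=pd.to_datetime(["2017-09-01", "2017-09-05"]))
fresh = pd.DataFrame({"Close": [1.0, 2.0]},
                     index=pd.to_datetime(["2017-09-05", "2017-11-30"]))

print(stale.tail(1))
print(is_stale(stale), is_stale(fresh))
```

A check like this turns the silent failure into an explicit one, since a frame that stops at the cutoff date is flagged instead of being used as if it were current.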
Alternatives to enhance behavior
Switch over to Quandl, using our lib/yi_quandl.py module
Alpha Vantage: https://www.alphavantage.co/documentation - Requires user to get free API key
tiingo: https://api.tiingo.com/docs/general/overview - Requires user to get free API key
$$$ EODhistoricaldata: https://eodhistoricaldata.com
Scrape Google Finance pages: https://github.com/CNuge/general_use_functions/blob/master/international_stock_scraper.py (currently also works for international stocks) -- which is not a robust solution (relative to API code) since the page layout is subject to UI changes: International market data - functionality addition request pydata/pandas-datareader#408
Why would the improvement be useful to most users?
Information from the equities markets is vital for financial economics.
Before releasing our own independent solution, we expect to
make a pull request to the pandas_datareader repository.
Please kindly propose alternative solutions
below, or at https://gitter.im/rsvp/fecon235
... and look into pandas_datareader issues
Check for revisions upstream
It is possible that a solution has been merged into pandas_datareader
and all that is necessary is an update of the package, see its CHANGELOG:
https://pandas-datareader.readthedocs.io/en/latest/whatsnew.html
and make sure the update is compatible with the latest fecon235.
Note: Anaconda distribution uses hyphen, not underscore:
Additional helpful details for bugs
[/] Problem started recently
[/] Problem can be reliably reproduced
fecon235 version: v5.17.0603
pandas version: 0.19.2
pandas_datareader version: 0.2.1
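When reporting versions like those above, the installed ones can be read programmatically. This is a small sketch assuming Python 3.8 or later, where `importlib.metadata` is in the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or 'not installed'."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return "not installed"


# The distribution name uses a hyphen, while the import name
# (pandas_datareader) uses an underscore.
for dist in ("pandas", "pandas-datareader"):
    print(dist, installed_version(dist))
```

Comparing this output before and after an upgrade is a quick way to confirm that the environment picked up the new package.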