Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TableTraits.jl integration #1

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

davidanthoff
Copy link

This gives you full integration with the Queryverse.jl. So you get file IO (CSV, Feather, Excel, SAS, SPSS, STATA, Parquet), plotting (VegaLite.jl, StatsPlots.jl, Gadfly.jl, DataVoyager.jl), conversion to and from other tables (pretty much all table types, DataFrames, IndexedTables, TimeSeries, Temporal, Pandas) and it works with Query.jl, and I probably forgot something ;)

In case you are worried about taking on additional dependencies, here is the landscape on that: IteratorInterfaceExtensions.jl and TableTraits.jl are super stable. I haven't changed anything (apart from the julia 0.7 update) about these in about 1.5 years, and I plan to make zero changes going forward. They are also super minimal (take a look), so I think there is no risk in taking a dependency on these (and I think I have a track record at this point for not breaking stuff, ask e.g. Chris Rackauckas what his experience has been with taking the same dependency for his ecosystem). TableTraitsUtils.jl is getting changes as I improve the implementation there, but I am also not breaking packages that use it. It is also not really necessary to play in the Queryverse, it really is just a shortcut to make it a bit easier to implement the interface. So if this dependency was not welcome, one could remove it down the road (but then would need a bit more code here in the repo).

The whole TableTraits.jl story is documented here. There is one twist, and that is that isiterabletable can now return missing. The code in this PR doesn't handle that yet. At this point there are no sources that lead to this, so it shouldn't be an issue, and I would follow up and fix this later.

Should also say that I like everything I see here A LOT! I think there is also room for a much deeper integration between Query.jl and this stuff here, with a custom Query.jl backend that makes use of the acceleration stuff you have. But we could try to figure that out in a second step.

function Table(source)
iit = TableTraits.isiterabletable(source)
if iit===true
cols, names = TableTraitsUtils.create_columns_from_iterabletable(source, na_representation=:datavalue)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more comment about this: you could also call this with na_representation=:missing, and then any column with missing would be a Vector{Union{T,Missing}}. With the current code version any column with missing you would get as a DataValueVector{T}.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant