Better names for duckplyr_df_from_csv() and duckplyr_df_from_parquet() #210

Open
hadley opened this issue Jul 25, 2024 · 11 comments · Fixed by #396

@hadley
Member

hadley commented Jul 25, 2024

I was thinking that read_csv_lazy() and read_parquet_lazy() might more clearly convey their usage.

@hadley
Member Author

hadley commented Sep 4, 2024

Or maybe read_csv_duckplyr() or similar.

I find the current selection of function names to be quite confusing.

hadley added this to the 1.0.0 milestone Sep 25, 2024
@DavisVaughan
Member

My vote is for read_csv_duckplyr(); just "lazy" as the suffix doesn't feel quite right (too ambiguous?).

But keeping read_csv_ as the prefix is quite nice for muscle memory with readr and autocomplete.

@hannes
Contributor

hannes commented Sep 25, 2024

We should also have some functions for output, e.g. write_parquet_duckplyr() or something like that.

@krlmlr

This comment was marked as duplicate.

@hannes
Contributor

hannes commented Oct 16, 2024

happy with read_*_duckplyr

@krlmlr
Member

krlmlr commented Oct 17, 2024

DoD:

  • Review duckdbfs API (very similar scope as these functions)
  • Find names and agree
  • Add deprecation warnings
  • Implement with new names
  • Add an argument to disable automatic materialization
  • Release to CRAN

@krlmlr
Member

krlmlr commented Nov 10, 2024

Let's go with the full lifecycle changes. We now also have the infrastructure to disable automatic materialization; this seems most useful when reading directly from a file, and will provide an incentive to use the new functions. This needs #255.

I wonder if the new functions should turn automatic materialization on or off by default.

@krlmlr
Member

krlmlr commented Dec 16, 2024

I decided to go with duckparquet(), duckcsv(), duckjson() and duckfile(). I really like how this blends into the new ducktbl() constructor, and what it looks like in the reference index: https://duckplyr.tidyverse.org/dev/reference/index.html

That said, I can easily change to duckparquet_read() or even read_parquet_duckplyr() if needed.

@etiennebacher

etiennebacher commented Dec 18, 2024

I mentioned this on Mastodon and it had some support (just in terms of likes), so I'm putting it here as well so that it's not completely lost (sorry for the duplication):

I think having read_*_duckplyr() or read_*_duckdb() would be better names. There are already many functions starting with read_ (arrow::read_csv_arrow(), readr::read_csv(), nanoparquet::read_parquet(), tidypolars::read_*_polars(), etc.) so typing read_ and then seeing all the options proposed by autocomplete would be very helpful.

IMO duckparquet() is not clear on whether it does writing or reading, and duckparquet_read() means that all the advantages of muscle memory and autocompletion mentioned above are discarded.

@krlmlr
Member

krlmlr commented Dec 18, 2024

Thanks, Etienne, point taken.

By now I'm really convinced of the duck_ prefix. The only remaining question is whether it's duck_tbl() or duck_tibble(); I'm slightly gravitating towards the latter, to leave the duck_tbl() name for accessing a table/view in a DuckDB database.

https://duckplyr.tidyverse.org/dev/reference/

On the upside, the duck_ prefix is much easier to type with 10 fingers on a QWERTY keyboard than the read_ prefix. It's also consistent with the idea of having a common prefix for functions from a package. (I deliberately don't use the duck_ prefix for all functions, just for the central ones.)

@hadley
Member Author

hadley commented Dec 30, 2024

As much as I love the duck prefix, given tidypolars::read_*_polars() and friends I do think it's worthwhile to think this through a bit more before release.

hadley reopened this Dec 30, 2024
5 participants