Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional safety for custom copy sql #41

Open
jfrost opened this issue Feb 16, 2022 · 2 comments
Open

Add additional safety for custom copy sql #41

jfrost opened this issue Feb 16, 2022 · 2 comments
Labels
enhancement New feature or request p1

Comments

@jfrost
Copy link
Collaborator

jfrost commented Feb 16, 2022

We can add an additional safety measure against the custom SQL destroying all or most of someone's table. Simply compare the results from SELECT reltuples FROM pg_class WHERE relname = <old table> against the same query for the new table after the ANALYZE has been run, but before the tables are swapped. Since this is an estimate, we should probably use a comparison that looks something like >= 0.95 * old_tuples. We should also add a flag like --copy_percentage that lets you set a lower threshold for that comparison for use cases where the user is purposely deleting much of the table data.

This would help guard against a user not understanding the documentation and using something unfortunate like:

-- file: /src/query.sql
INSERT INTO %{shadow_table}(foo, bar, baz, rental_id, tenant_id)
SELECT 1,1,1,1,1
@shayonj shayonj added enhancement New feature or request p1 labels Feb 21, 2022
@shayonj
Copy link
Owner

shayonj commented Feb 22, 2022

Just FYI: I am eyeing this for 0.5.0/0.6.0 release. Hashing out any other edge cases in the meantime.

@shayonj
Copy link
Owner

shayonj commented Feb 26, 2022

I was trying to pseudo-code this and it looks like we may be maintaining some very custom logic against an open use case.

Since, even on the happy path (where users are aware of data deletion), I wonder if we should just keep it simple and have users acknowledge that they are aware of the impact by passing an additional flag? Like --copy-statement /tmp/foo.sql --ack-copy-statement. If they pass copy-statement but not ack-copy-statement, the program will just terminate early.

Thinking out loud, lmk what you think (no rush).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request p1
Projects
None yet
Development

No branches or pull requests

2 participants