Delete data and anonymize the remaining records #69
Comments
Maybe @johnorourke might have an idea about this, since he built the delete part?
@peterjaap The original design for that was "you can either delete or anonymize, not both", but this is a good idea. We have several possible requirements:
So for maximum flexibility maybe we need to just allow different 'where' statements for anonymisation and deletion. However, perhaps this approach:
@IvanChepurnyi I can see your work on the DataProvider system, so it would be good to get your input on this. Should we avoid backward compatibility and go for a generic "actions" config, instead of using implicit actions? It's a balance between easy config with "sensible defaults", the learning curve for new users, and reducing unexpected behaviour.
@johnorourke i'm currently using Adding both |
@johnorourke I like your approach, and if There is probably an opportunity to hide this logic behind the |
Watching this, as I'm also interested in this feature. Until then, is it possible to run masquerade twice with two different configs? I'm thinking I can run the anon, then export for a full anon backup. The only problem is that I would need two different config file setups for this, correct? I guess I could run two different phars, each with its own config, but that doesn't seem very elegant.
@SAN1TAR1UM the |
I read everything and I have a question: why not just have this in the YAML file:

    customer_grid_flat:
      provider:
        delete: true
        where: "`created_at` < now() - interval 30 day"

and this below:

    customer_grid_flat:
      columns:
        name:
          formatter: name
        email:
          formatter: email
          unique: true
          nullColumnBeforeRun: true
        dob:
          formatter: dateTimeThisCentury
          optional: true
        billing_full:
          ...

The first block would clean, and the second one would anonymize?
@mehdichaouch I think multiple configs for the same table are ignored - the latest one wins, due to using |
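The comment above is cut off before it names the mechanism, so the following is only an illustration of what "the latest one wins" would mean, assuming the config blocks are combined with a shallow merge keyed by table name. The helper names `load_shallow` and `load_deep` are hypothetical and are not part of Masquerade.

```python
# Illustration only: plain dicts stand in for parsed YAML blocks.
delete_block = {
    "customer_grid_flat": {
        "provider": {
            "delete": True,
            "where": "`created_at` < now() - interval 30 day",
        }
    }
}
anonymize_block = {
    "customer_grid_flat": {
        "columns": {"email": {"formatter": "email", "unique": True}}
    }
}


def load_shallow(blocks):
    """Shallow merge: a later entry for a table replaces the earlier one."""
    merged = {}
    for block in blocks:
        merged.update(block)
    return merged


def load_deep(blocks):
    """Per-table merge: settings for the same table are combined."""
    merged = {}
    for block in blocks:
        for table, settings in block.items():
            merged.setdefault(table, {}).update(settings)
    return merged


shallow = load_shallow([delete_block, anonymize_block])
deep = load_deep([delete_block, anonymize_block])

# With a shallow merge the 'provider' section from the first block is lost.
print(sorted(shallow["customer_grid_flat"]))
# With a per-table merge both sections survive.
print(sorted(deep["customer_grid_flat"]))
```

Under that assumption, writing two separate blocks for `customer_grid_flat` would silently drop the delete settings, which is why the two-block idea would not work without a change to how configs are combined.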
The idea is that we will delete all of the older customer data (for example, customers that were created more than 30 days ago), so that the DB dump is a lot smaller and Masquerade runs faster. The remaining data should be anonymized so we can use it anywhere.
Example config:
Currently, Masquerade just executes the delete, then it moves on to the next table, leaving the remaining records in the table un-anonymized. Very logical, but it would be nice to have the possibility to delete AND anonymize.
What would be the best place to implement this feature?
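To make the requested behaviour concrete, here is a minimal sketch of "delete, then anonymize what remains" expressed directly in SQL against an in-memory SQLite database. It is not Masquerade code: the `customer` table, its columns, the sample rows, and the function `delete_then_anonymize` are all assumptions made for illustration, and the fixed replacement values stand in for whatever formatters a real config would use.

```python
import sqlite3


def delete_then_anonymize(conn, cutoff):
    """Delete rows created before `cutoff`, then mask the rows that remain."""
    cur = conn.cursor()
    # Step 1: remove the old records (the part that already exists today).
    cur.execute("DELETE FROM customer WHERE created_at < ?", (cutoff,))
    deleted = cur.rowcount
    # Step 2: anonymize every surviving row (the part this issue asks for).
    cur.execute(
        "UPDATE customer SET "
        "name = 'Customer ' || id, "
        "email = 'customer' || id || '@example.com'"
    )
    anonymized = cur.rowcount
    conn.commit()
    return deleted, anonymized


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        "id INTEGER PRIMARY KEY, name TEXT, email TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer (name, email, created_at) VALUES (?, ?, ?)",
        [
            ("Alice Old", "alice@shop.test", "2023-05-01"),
            ("Bob New", "bob@shop.test", "2024-03-10"),
            ("Carol New", "carol@shop.test", "2024-04-02"),
        ],
    )
    counts = delete_then_anonymize(conn, "2024-01-01")
    rows = conn.execute(
        "SELECT id, name, email FROM customer ORDER BY id"
    ).fetchall()
    return counts, rows


if __name__ == "__main__":
    print(demo())
```

Running the delete first means the anonymization step only touches the rows that will end up in the dump, which is where the saving in execution time would come from.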