Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use attr_encrypted or symmetric_encryption for protecting sensitive data at rest? #413

Open
colinxfleming opened this issue Jun 24, 2016 · 22 comments

Comments

@colinxfleming
Copy link
Member

colinxfleming commented Jun 24, 2016

(updated 2022-July)

We'd like to have our very sensitive / PII shaped data encrypted at rest, especially now that this is built into Rails 7, which we're running now.

As a potential limiting factor, we also want to preserve a reasonable search functionality.

https://guides.rubyonrails.org/v7.0/active_record_encryption.html

@colinxfleming
Copy link
Member Author

Marking this hacktoberfest to increase visibility.

As you can imagine, there's a lot of sensitive information in this system -- I'm interested in whether or not this is an effective strategy for safely storing information in such a way that some jerk can't just run the mongodb equivalent of select * from patients. Unfortunately, the PII we most want to keep under wraps (names and contact info) are also what we search by, which means we can't just plug and play.

If it's not an effective strategy, I'm pretty interested in what other people's experiences have been.

@colinxfleming
Copy link
Member Author

@Kevin-Wei I've been thinking about this a lot lately. Would love to pick your brain on this.

@Kevin-Wei
Copy link
Contributor

@Kevin-Wei
Copy link
Contributor

even with heroku configuration http://rocketjob.github.io/symmetric-encryption/heroku.html

@colinxfleming
Copy link
Member Author

@Kevin-Wei this looks awesome.

I think we have two requirements here:

  • encrypt our existing data (if encrypting the entire document is the path of least resistance I am cool as hell with that)
  • make sure new data gets encrypted as it's saved and updated etc

If you have any code in that direction can you whip up a WIP PR with it, so we have it handy?

@colinxfleming
Copy link
Member Author

colinxfleming commented Aug 16, 2017

  • symmetric-encryption seems straightforward to run but entails some additional setup
  • attr_encrypted also seems straightforward, but also has some additional setup (
  • last ditch, this gem which the maintainer is looking to shove off (RIP mongodb): https://github.com/KoanHealth/mongoid-encrypted-fields

Maybe see https://stackoverflow.com/questions/4343996/rails-storing-encrypted-data-in-database

@colinxfleming colinxfleming changed the title Use attr_encrypted for protecting sensitive data at rest? Use attr_encrypted or symmetric_encryption for protecting sensitive data at rest? Aug 24, 2017
@colinxfleming colinxfleming added this to the Active development! milestone Aug 24, 2017
@colinxfleming
Copy link
Member Author

I'm promoting this to active, but it's definitely complicated and tough, so please.

symmetric_encryption seems to be slightly better documented, but migrating is going to be difficult. Let's start with a proof of concept of adding an encrypted field on the model and go from there.

@colinxfleming
Copy link
Member Author

Tagging this as hacktoberfest -- if you have set this up in MongoDB before I would love to talk to you!

@colinxfleming
Copy link
Member Author

Setting this up has proven to be such an unbelievable pain that I am inclined to mark this futuretech for the time being.

@tingaloo
Copy link
Contributor

@colinxfleming interested in bumping this?

@colinxfleming
Copy link
Member Author

@tingaloo I'd like to keep this in futuretech for the time being I think, at least until we go thru the survey process. I'd rather focus your energy on the budget bar stuff for rn.

@colinxfleming
Copy link
Member Author

Adding this to our DARIA Independence milestone.

I think what we've learned from our attempts at this so far is that a) doing this on the fly is hard; b) plugging it into mongoid is also really hard. (Maybe the ecosystem has changed a bit.)

Before we start on this, we should probably make a decision on whether or not to move to postgres one way or the other.

@colinxfleming
Copy link
Member Author

I'd like to revisit this:

  • I've generally assumed that 'encryption at rest' would mean implementing something like this. I think there's some disagreement on that - drive encryption (like macs have) counts under some standards. By that standard we have a lot more wiggle room, as heroku postgres above a certain tier + mongo atlas both have that.
  • I think the fact that we've been meaning to do this for basically the lifetime of the project and haven't yet is probably a sign of how much it would suck to do retroactively.

I wonder if the root problem here is secretly closeable / a problem we should just throw money at, and we shouldn't bother doing an app based fix for it.

@colinxfleming
Copy link
Member Author

Worth noting that this rolled out in rails: https://edgeguides.rubyonrails.org/active_record_encryption.html

@colinxfleming
Copy link
Member Author

and, uh, it looks like it might plausibly work really well - seems to even solve lookup problems, though not clear whether it would cover our approach of using database ilikes? This is in rails 7, which is probably ~6m a way, but reads as an extremely viable option to me in a way that attr_encrypted/symmetric_encryption really don't. TLDR: keep our powder dry and revisit this when we're on rails 7.

@tingaloo
Copy link
Contributor

@colinxfleming glad you're still on top of things. My experience is to wait also. at work we got this white source thing that's catching gems with security vulnerabilties.

@colinxfleming
Copy link
Member Author

@lomky the thrust of this issue is still the same and yr updated description is pretty much right - column-encrypted fields for our PII, and we'll have to preserve the fields we search on (basically ilike of name, phone, alt contact name, alt contact phone iirc - should be in the searchable concern). Mayyyy be worth clarifying that the fields we let search on are I believe all on the patients table, which hopefully simplifies things a bit engineeringwise? and we can produce a list of what's considered sensitive pretty easily.

@mercedesb
Copy link
Contributor

@colinxfleming is this related at all to #2586? Now that we have ActiveRecord encryption set up, is there anything else we want to go ahead and encrypt?

@colinxfleming
Copy link
Member Author

yeah actually - I think that it does kind of behoove us to get more aggressive about data obfuscation, now that we can do it more or less without inventing something from scratch. Would you be so kind as to do the honors here?

This is a list of fields in tables that fit the following criteria:

  • not directly queried (e.g. no find_by or other sql filters)
  • not an archive concern
  • have semi-sensitive data in varchars or varchar-ish things
event.cm_name
event.event_type
event.patient_name
event.pledge_amount

clinic.name
clinic.street_address
clinic.city
clinic.state
clinic.zip
clinic.phone
clinic.fax
clinic.gestational_limit
clinic.coordinates
clinic.email_for_pledges

I have similar lists for external pledges and fulfillments, but those tables are part of an archiving process so let's get the reps down on Clinic and Event first. Let's start with Event as its own PR to get the pattern down, if that's cool?

@mercedesb
Copy link
Contributor

@colinxfleming the enum event.event_type is stored in the db as an integer. This actually makes it so we can't use Rails encryption out-of-the-box. We could probably write something custom (I spent some time this afternoon looking into it), but it could become a pain to maintain. Do you feel like the integers in the db are sensitive enough that we should continue to explore this?

@colinxfleming
Copy link
Member Author

@mercedesb sorry for the delay here. I think the sensitive-est things here are CM name and Patient Name, which I believe are stored as strings; if event type or pledge amount require extra work I'm perfectly fine with not scrambling those.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants