-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazy loaded (and evictable) dictionaries #18
Conversation
…keep dictionary data in memory this also matches expectations from other ruby ports of zxcvbn
def self.zxcvbn(password, user_inputs = []) | ||
start = (Time.now.to_f * 1000).to_i | ||
matches = Matching.omnimatch(password, user_inputs) | ||
result = Scoring.most_guessable_match_sequence(password, matches) | ||
result["calc_time"] = (Time.now.to_f * 1000).to_i - start | ||
attack_times = TimeEstimates.estimate_attack_times(result["guesses"]) | ||
attack_times.each do |prop, val| | ||
result[prop] = val | ||
end | ||
result["feedback"] = Feedback.get_feedback(result["score"], result["sequence"]) | ||
result | ||
Tester.new.zxcvbn(password, user_inputs) | ||
end | ||
|
||
def self.test(password, user_inputs = []) | ||
Result.new(Zxcvbn.zxcvbn(password, user_inputs)) | ||
Tester.new.test(password, user_inputs) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about making both of these use a shared, memoized Tester instance? That'd still lazy load the first time, but avoid reloading all the time.
To unload / reload on every use, an app could just instantiate a new Tester each time.
@formigarafa, I can see how this allows for lazy loading and unloading, and prepares the path for custom dictionaries. Pretty good to enable all of that through a single PR! For our use case here, I have some hesitation about unloading / reloading the data every time. Those are two-fold: Do we have a measurement of how long a single load of the dictionaries takes? I might suggest making the default API memoize the Tester. Then it'd lazy load just once and stay in memory. That's more backwards compatible with the existing design and seems like a better default to me. I could also see the potential benefit to an API to force eager loading. A sketch of both memoizing by default, and an eager loading option (called simply with module Zxcvbn
def self.zxcvbn(...)
default_tester.zxcvbn(...)
end
def self.test(...)
default_tester.test(...)
end
def self.default_tester(eager_load: false)
@default_tester ||= Tester.new(eager_load:)
end
class Tester
def initialize(eager_load: false)
matching if eager_load
end
end
end |
I have just implemented what has been pending since the beginning. I still need to update the README to include this as is present in the other ports. |
Hello @zarqman, I had a read on your concerns and I decided to not move forward without giving them a good consideration. But then the end of the last year had been a bit rough (it kind of still is) and despite my previous answer I've been thinking about your points and I think they are important. I feel that Item b (multiple instances) is of a greater concern than item a (delay added). In terms of multiple instances, I think this is a more concerning scenario but memoizing should have that sorted, and likely solve the other item, too. I've been chasing after improving performance, memory usage and garbage collectability for a while. IMHO I don't think it was sensible to keep that much password test related data in memory while there is an app trying to run a business (unless of course you are on a business of password checking, of course). I think I put myself in this position because I ported the javascript code which came with their own assumptions about memory and process management from a webpage in the browser. Back then I've decided to solve these problems later when I had everything else under control and I could match the other ports expectations. If I had the capability to implement that from the start, as I wanted, I would not even need to be thinking about how to accomodate these features, now. My problem is that from the point of view of a lib developer I feel that makes more sense to give the power to the developer to handle the lib, more than creating a tool to handle the lib which would increase the api surface of the lib and create more possibilities to conflict and require maintenance. From what I could infer about your perspective, I think you could create a wrapper class to initialize Zxcvbn and keep it in memory according to your requirements.
|
With these changes dictionary files are no longer loaded with the gem. It will wait until it is used. (solves #15)
The code as is won't persist the dictionary in memory either allowing them to be evicted from memory if garbage collected (no references).
You can also keep an instance of
Zxcvbn::Tester
loaded to prevent reloading the dictionaries on every call. (addresses some concerns of #9)This allow a drastic reduction on the number of retained objects in memory.
This approach also eases the path to implement configurable dictionaries without sacrificing any other features.