Will write()s larger than PIPE_BUF_MAX be interleaved? #1
Comments
I looked around and found this, which suggests that this is only an issue when writing to a pipe. Can anyone with a longer unix beard than me comment on this? Why else would Logger need a mutex?
My unix beard is certainly not longer than yours, but from my reading, it appears that POSIX only guarantees atomicity when dealing with pipes or FIFOs, and does not say anything about regular files. It would seem there are no guarantees wrt atomicity when writing to files, and even if there were, there are a lot of filesystems out there that do not fully conform to POSIX anyway.
@jacknagel My research pointed in the opposite direction. It seems that MonoLogger (and Logger, for that matter) are doing the right thing wrt atomically appending to files. I'll close this for now until something more concrete comes up on the other side.
The syscall itself may be atomic, but I don't think that has anything to do with whether the write of the data is actually atomic. write(2) can be interrupted by a signal, in which case it returns the number of bytes written, which may or may not be the number of bytes we tried to write. One must then call write again with the correct number of bytes and offset into the data in order to complete the operation, and there are ample opportunities in this sequence for interleaving to occur. I'd love to be wrong about this, but it seems that any stronger guarantees are getting into ACID territory.

Really, as long as this library is using IO#write, discussing this may be pointless. There are potentially two layers of buffering here (Ruby's internal buffering, and the libc functions that Ruby wraps). I think one has to use IO#syswrite to get anywhere close to any guarantees that write(2) may or may not provide. By doing so you lose the benefits of buffering, of course, so perhaps that is why the stdlib logger was using a mutex in the first place.
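The retry sequence described above can be sketched in Ruby. The method name `write_all` and its shape are illustrative, not MonoLogger's API; the point is that each `syswrite` after a partial write is a separate write(2) call, and another thread's output can land between them:

```ruby
# Sketch: complete a write that write(2) may have only partially performed.
# Each loop iteration is an independent syscall, so without a mutex another
# thread's data can be interleaved between iterations.
def write_all(io, data)
  remaining = data.b                       # work on raw bytes
  until remaining.empty?
    written = io.syswrite(remaining)       # may write fewer bytes than asked
    remaining = remaining.byteslice(written..-1)
  end
end
```

This is exactly why completing a partial write is not atomic as a whole, even if each individual syscall is.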
I forgot about the signal interaction; that could definitely cause issues. This alone warrants the use of a mutex. Ruby's buffering is out of the question here: IO#sync is set to true.
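The effect of IO#sync can be seen directly: with `sync = true`, `IO#write` bypasses Ruby's internal buffer and the data is pushed to the operating system on every call. A minimal demonstration (the tempfile is just for illustration):

```ruby
require 'tempfile'

f = Tempfile.new('sync-demo')
f.sync = true                      # disable Ruby-level buffering
f.write("logged immediately\n")    # no explicit flush needed

# The data is already visible to other readers of the file:
File.read(f.path)  # => "logged immediately\n"
```

Note that `sync = true` only removes Ruby's buffering layer; it says nothing about atomicity at the write(2) level.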
How the hell am I not watching my own repository? So sorry about this: because I never got an email, I never saw that this discussion was happening. So, basically, yes, I just took this from OpsCode, who took it from the standard library. I'm unsure whether it's a 100% perfect solution, but I'm very interested in making it one. We're using it in Resque and nobody has reported any problems so far.
I've found mono_logger to not be thread-safe when used under heavy load with Sidekiq but could not figure out why. See #2 |
Eric Wong got in touch after he saw my article and tried to set the record straight. I posted his messages on the usp.ruby mailing list at http://librelist.com/browser//usp.ruby/2013/6/5/o-append-atomicity/. Relevant excerpts:
I wanted to chime in that STDOUT has sync set to false.

```
% ruby -e 'puts STDOUT.sync.inspect'
false
% ruby -e 'puts STDERR.sync.inspect'
true
```

In fact, OpsCode has committed a change[1] to ensure that all log devices have sync set to true. Perhaps this discussion is related?
This implementation is actually identical to Ruby's Logger, but without the mutex. So I wondered why Logger used a mutex there.
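The stdlib-Logger-style answer to that question is to serialize every write through a mutex, so that no two log calls can interleave regardless of message size. A minimal sketch (the class name `LockedLogDevice` is hypothetical, chosen here for illustration; Mutex is a core class in modern Ruby):

```ruby
# Sketch: serialize writes with a Mutex, as the stdlib Logger does.
# Whatever atomicity write(2) does or doesn't provide, only one thread
# can be inside the write call at a time.
class LockedLogDevice
  def initialize(io)
    @io    = io
    @mutex = Mutex.new
  end

  def write(message)
    @mutex.synchronize { @io.write(message) }
  end
end
```

The trade-off is that the mutex itself can become a contention point under heavy multithreaded logging, which is presumably what mono_logger was trying to avoid by dropping it.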
A call to write() is atomic. But, AFAIK, only for writes up to PIPE_BUF_MAX. I couldn't reproduce it, but technically if you're logging something bigger than this, it could interleave with what other threads are writing.
On my system this value is 512 bytes.
```
$ ulimit -a | grep pipe
pipe size            (512 bytes, -p) 1
```
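A sketch of the scenario in question: two threads writing payloads far larger than PIPE_BUF to the same pipe. POSIX only guarantees atomicity for pipe writes of PIPE_BUF bytes or fewer, so the reader may observe the payloads interleaved (the exact outcome is scheduler-dependent and not reproducible on demand, as noted above):

```ruby
# Sketch: concurrent writes larger than PIPE_BUF to one pipe.
rd, wr = IO.pipe
wr.sync = true                 # bypass Ruby's buffer; exercise write(2) directly

big_a = "A" * 100_000          # both payloads vastly exceed PIPE_BUF
big_b = "B" * 100_000

reader  = Thread.new { rd.read(200_000) }
writers = [big_a, big_b].map { |payload| Thread.new { wr.write(payload) } }
writers.each(&:join)
wr.close

data = reader.value
# With fully atomic writes the stream would be all A's followed by all B's
# (or vice versa); mixed runs of A and B would indicate interleaving.
```

No bytes are lost either way; the question is only whether the two records can end up intermixed.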