Commit df3cee6c authored by Raphael 'kena' Poss's avatar Raphael 'kena' Poss

util/log: clean up and document the handling of stderr

Previous work in this area of the code introduced a confusion between
two orthogonal concepts:

- each logger might copy its log entries to the process' external
  stderr stream (e.g. the terminal during interactive use), as
  set by its "stderrThreshold" variable.

- direct writes by Go code to process-wide file descriptor 2 (such
  as done by the Go runtime)  or `os.Stderr` (such as done
  by 3rd party packages when doing their own ad-hoc logging)
  can be redirected to a logger's output file.

The confusion (mostly mine - @knz) was to mix the two kinds of
"stderr" and mistakenly conflating "entries submitted to this logger
via API calls" and "direct writes to fd 2 by other go code without
looking at the logging API". These are actually completely separate
and independent concepts/mechanisms.

The code clarifies the situation as follows:

- the process' external stderr stream is always available via
  `log.OrigStderr` and this is meant to never change throughout
  execution.

- the external stderr stream is the sink for `Shout()` API calls
  and also the copy of log entries whose severity
  exceed the "stderrThreshold" variable.

- separately, *at most one logger* may redirect internal writes to fd
  2 and os.Stderr to its log file. This is determined by
  its variable "noRedirectInternalStderrWrites" (previously
  named "noRedirectStderr").

Beyond this, this commit fixes 3 bugs.

1. the code was intending to both redirect the standard stderr file
   descriptor (fd 2 on unix, error handle on windows) and also
   `os.Stderr` separately, but failing to do so on unix build targets.
   It was done correctly for windows. This is now corrected so that
   `os.Stderr` gets assigned on all targets.

   (The separate assignment of `os.Stderr` is necessary because
   although Go intializes this to be equivalent to the standard
   file descriptor upon process startup, other Go code can
   assign `os.Stderr` after initialization.)

2. upon encountering a write error to its output file, a logger
   would previously report that write error twice to the
   process' external stderr. This has been corrected
   so that the write error is only reported once.

3. previously, upon a log.Fatal performed while configuration would not
   otherwise cause a copy the F entry go to the process external stderr,
   the code would override the configuration and force a copy of the
   entry to the external stderr stream.

   There is no good reason for this - either the user wants F entries
   on the external stderr and signal this desire via
   `--logtostderr=FATAL` or lower, or they don't want F entries there
   at all via `--logtostderr=NONE`. There is no reason for an
   override and the code should honor the configuration.

   (I introduced the bug together with the confusion mentioned
   at the beginning.)

Release note: None
parent e0ad962a
......@@ -30,6 +30,13 @@ import (
// mainLog is the primary logger instance.
var mainLog loggerT
// stderrLog is the logger where writes performed directly
// to the stderr file descriptor (such as that performed
// by the go runtime) *may* be redirected.
// NB: whether they are actually redirected is determined
// by stderrLog.redirectInternalStderrWrites().
var stderrLog = &mainLog
// logging is the global state of the logging setup.
var logging loggingT
......@@ -37,9 +44,6 @@ var logging loggingT
//
// TODO(knz): better separate global state and per-logger state.
type loggingT struct {
// Level flag for output to stderr. Handled atomically.
stderrThreshold Severity
// pool for entry formatting buffers.
bufPool sync.Pool
......@@ -82,13 +86,36 @@ type loggerT struct {
// Name prefix for log files.
prefix string
// Level flag for output to files.
// Level beyond which entries submitted to this logger are written
// to the output file. This acts as a filter between the log entry
// producers and the file sink.
fileThreshold Severity
// noStderrRedirect, when set, disables redirecting this logger's
// log entries to stderr (even when the shared stderr threshold is
// matched).
noStderrRedirect bool
// Level beyond which entries submitted to this logger are written
// to the process' external standard error stream (OrigStderr).
// This acts as a filter between the log entry producers and the
// stderr sink.
stderrThreshold Severity
// noRedirectInternalStderrWrites, when UNset (set to false), causes
// this logger to capture writes to system-wide file descriptor 2
// (the standard error stream) and os.Stderr and redirect them to this
// logger's output file.
// Users of the logging package should ensure that at most one
// logger has this flag set to redirect the system-wide stderr.
//
// Note that this mechanism redirects file descriptor 2, and does
// not only assign a different *os.File reference to
// os.Stderr. This is because the Go runtime hardcodes stderr writes
// as writes to file descriptor 2 and disregards the value of
// os.Stderr entirely.
//
// Callers are encouraged to use the redirectInternalStderrWrites()
// accessor method for clarity.
//
// TODO(knz): The double negative is somewhat inconvenient. We could
// consider flipping the meaning for enhanced clarity.
noRedirectInternalStderrWrites bool
// notify GC daemon that a new log file was created
gcNotify chan struct{}
......@@ -109,12 +136,18 @@ type loggerT struct {
func init() {
logging.bufPool.New = newBuffer
logging.mu.fatalCh = make(chan struct{})
// Default stderrThreshold and fileThreshold to log everything.
mainLog.prefix = program
// Default stderrThreshold and fileThreshold to log everything
// both to the output file and to the process' external stderr
// (OrigStderr).
// This will be the default in tests unless overridden; the CLI
// commands set their default separately in cli/flags.go.
logging.stderrThreshold = Severity_INFO
mainLog.prefix = program
mainLog.stderrThreshold = Severity_INFO
mainLog.fileThreshold = Severity_INFO
// In addition, we want all writes performed directly to the
// internal stderr file descriptor (fd 2) to go to the stderr log
// file.
stderrLog.noRedirectInternalStderrWrites = false
}
// FatalChan is closed when Fatal is called. This can be used to make
......@@ -171,6 +204,12 @@ func (l *loggerT) ensureFile() error {
return nil
}
// redirectInternalStderrWrites is the positive accessor
// for the noRedirectInternalStderrWrites flag.
func (l *loggerT) redirectInternalStderrWrites() bool {
return !l.noRedirectInternalStderrWrites
}
// writeToFile writes to the file and applies the synchronization policy.
// Assumes that l.mu is held by the caller.
func (l *loggerT) writeToFile(data []byte) error {
......@@ -262,18 +301,30 @@ func (l *loggerT) outputLogEntry(s Severity, file string, line int, msg string)
}()
}
if s >= logging.stderrThreshold.get() || (s == Severity_FATAL && l.stderrRedirected()) {
// We force-copy FATAL messages to stderr, because the process is bound
// to terminate and the user will want to know why.
l.outputToStderr(entry, stacks)
if s >= l.stderrThreshold.get() {
if err := l.outputToStderr(entry, stacks); err != nil {
// The external stderr log is unavailable. However, stderr was
// chosen by the stderrThreshold configuration, so abandoning
// the stderr write would be a contract violation.
//
// We definitely do not like to lose log entries, so we stop
// here. Note that exitLocked() shouts the error to both stderr
// and the log file, so even though stderr is not available any
// more, we'll keep a trace of the error in the file.
l.exitLocked(err)
l.mu.Unlock() // unreachable except in tests
return // unreachable except in tests
}
}
if l.logDir.IsSet() && s >= l.fileThreshold.get() {
if err := l.ensureFile(); err != nil {
// Make sure the message appears somewhere.
l.outputToStderr(entry, stacks)
// We definitely do not like to lose log entries, so we stop
// here. Note that exitLocked() shouts the error to both stderr
// and the log file, so even though the file is not available
// any more, we'll likely keep a trace of the error in stderr.
l.exitLocked(err)
l.mu.Unlock()
return
l.mu.Unlock() // unreachable except in tests
return // unreachable except in tests
}
buf := logging.processForFile(entry, stacks)
......@@ -281,9 +332,9 @@ func (l *loggerT) outputLogEntry(s Severity, file string, line int, msg string)
if err := l.writeToFile(data); err != nil {
l.exitLocked(err)
l.mu.Unlock()
putBuffer(buf)
return
l.mu.Unlock() // unreachable except in tests
putBuffer(buf) // unreachable except in tests
return // unreachable except in tests
}
putBuffer(buf)
......@@ -313,12 +364,15 @@ func DumpStacks(ctx context.Context) {
}
// printPanicToFile copies the panic details to the log file. This is
// useful when the standard error is not redirected to the log file
// (!stderrRedirected), as the go runtime will only print panics to
// stderr.
// useful when the internal stderr writes are not redirected to the
// log file (!stderrRedirected), as the go runtime hardcodes
// its writes to file descriptor 2.
//
// TODO(knz): Create a log entry header for the panic details, to
// get a timestamp and later a redactable marker.
func (l *loggerT) printPanicToFile(r interface{}) {
if !l.logDir.IsSet() {
// There's no log file. Nothing to do.
// There's no log file. Can't do anything.
return
}
......@@ -326,24 +380,27 @@ func (l *loggerT) printPanicToFile(r interface{}) {
defer l.mu.Unlock()
if err := l.ensureFile(); err != nil {
fmt.Fprintf(OrigStderr, "log: %v", err)
// We're already exiting; no need to pile an error upon an
// error. Simply report the logging error and continue.
l.reportErrorEverywhereLocked(err)
return
}
panicBytes := []byte(fmt.Sprintf("%v\n\n%s\n", r, debug.Stack()))
if err := l.writeToFile(panicBytes); err != nil {
fmt.Fprintf(OrigStderr, "log: %v", err)
// Ditto; report the error but continue. We're terminating anyway.
l.reportErrorEverywhereLocked(err)
return
}
}
func (l *loggerT) outputToStderr(entry Entry, stacks []byte) {
// outputToStderr writes the provided entry and potential stack
// trace(s) to the process' external stderr stream.
func (l *loggerT) outputToStderr(entry Entry, stacks []byte) error {
buf := logging.processForStderr(entry, stacks)
_, err := OrigStderr.Write(buf.Bytes())
putBuffer(buf)
if err != nil {
l.exitLocked(err)
}
return err
}
const fatalErrorPostamble = `
......
......@@ -90,8 +90,11 @@ func setFlags() {
ResetExitFunc()
mainLog.mu.Lock()
defer mainLog.mu.Unlock()
mainLog.noStderrRedirect = false
logging.stderrThreshold = Severity_ERROR
// Make the internal stderr writes go to the stderr log file.
stderrLog.noRedirectInternalStderrWrites = false
// Make all logged errors go to the external stderr, in addition to
// the log file.
mainLog.stderrThreshold = Severity_ERROR
}
// Test that Info works as advertised.
......@@ -568,7 +571,7 @@ func TestFatalStacktraceStderr(t *testing.T) {
defer s.Close(t)
setFlags()
logging.stderrThreshold = Severity_NONE
mainLog.stderrThreshold = Severity_NONE
SetExitFunc(false /* hideStack */, func(int) {})
defer setFlags()
......@@ -606,7 +609,7 @@ func TestRedirectStderr(t *testing.T) {
defer s.Close(t)
setFlags()
logging.stderrThreshold = Severity_NONE
mainLog.stderrThreshold = Severity_NONE
Infof(context.Background(), "test")
......
......@@ -108,20 +108,31 @@ var Safe = errors.Safe
func ReportPanic(ctx context.Context, sv *settings.Values, r interface{}, depth int) {
Shout(ctx, Severity_ERROR, "a panic has occurred!")
if mainLog.stderrRedirected() {
// We do not use Shout() to print the panic details here, because
// if stderr is not redirected (e.g. when logging to file is
// disabled) Shout() would copy its argument to stderr
// unconditionally, and we don't want that: Go's runtime system
// already unconditonally copies the panic details to stderr.
// Instead, we copy manually the details to stderr, only when stderr
// is redirected to a file otherwise.
fmt.Fprintf(OrigStderr, "%v\n\n%s\n", r, debug.Stack())
if stderrLog.redirectInternalStderrWrites() {
// If we decided that the internal stderr writes performed by the
// Go runtime are going to our file, that's also where the panic
// details will go automatically when the runtime processes the
// uncaught panic.
// This means the panic details won't get copied to
// the process' external stderr stream automatically
// any more.
// Now, if the user asked that fatal errors be copied
// to the external stderr as well, we'll take that
// as an indication they also want to see panic details
// there. Do it here.
if LoggingToStderr(Severity_FATAL) {
// TODO(knz): Produce a log entry header to get a timestamp
// here.
fmt.Fprintf(OrigStderr, "%v\n\n%s\n", r, debug.Stack())
}
} else {
// If stderr is not redirected, then Go's runtime will only print
// out the panic details to the original stderr, and we'll miss a copy
// in the log file. Produce it here.
mainLog.printPanicToFile(r)
// If we are not redirecting internal stderr writes, then the
// details printed by the Go runtime won't automatically reach our
// log file and instead go to the process' external stderr.
//
// However, we actually want to persist these details. So print
// them in the log file ourselves.
stderrLog.printPanicToFile(r)
}
sendCrashReport(ctx, sv, PanicAsError(depth+1, r), ReportTypePanic)
......
......@@ -56,6 +56,28 @@ func ResetExitFunc() {
func (l *loggerT) exitLocked(err error) {
l.mu.AssertHeld()
l.reportErrorEverywhereLocked(err)
logging.mu.Lock()
f := logging.mu.exitOverride.f
logging.mu.Unlock()
if f != nil {
// Avoid conflicting lock order between l.mu and locks in f.
l.mu.Unlock()
f(2, err)
// Avoid double unlock on l.mu.
l.mu.Lock()
} else {
os.Exit(2)
}
}
// reportErrorEverywhereLocked writes the error details to both the
// process' original stderr and the log file if configured.
//
// TODO(knz): Create a log entry header for this error, to get a
// timestamp and later a redactable marker.
func (l *loggerT) reportErrorEverywhereLocked(err error) {
// Either stderr or our log file is broken. Try writing the error to both
// streams in the hope that one still works or else the user will have no idea
// why we crashed.
......@@ -75,16 +97,4 @@ func (l *loggerT) exitLocked(err error) {
fmt.Fprintf(w, "log: exiting because of error: %s\n", err)
}
l.flushAndSync(true /*doSync*/)
logging.mu.Lock()
f := logging.mu.exitOverride.f
logging.mu.Unlock()
if f != nil {
// Avoid conflicting lock order between l.mu and locks in f.
l.mu.Unlock()
f(2, err)
// Avoid double unlock on l.mu.
l.mu.Lock()
} else {
os.Exit(2)
}
}
......@@ -18,14 +18,14 @@ import (
func init() {
logflags.InitFlags(
&mainLog.noStderrRedirect,
&stderrLog.noRedirectInternalStderrWrites,
&mainLog.logDir, &showLogs, &noColor,
&logging.vmoduleConfig.mu.vmodule,
&LogFileMaxSize, &LogFilesCombinedMaxSize,
)
// We define these flags here because they have the type Severity
// which we can't pass to logflags without creating an import cycle.
flag.Var(&logging.stderrThreshold,
flag.Var(&mainLog.stderrThreshold,
logflags.LogToStderrName, "logs at or above this threshold go to stderr")
flag.Var(&mainLog.fileThreshold,
logflags.LogFileVerbosityThresholdName, "minimum verbosity of messages written to the log file")
......
......@@ -60,7 +60,10 @@ func Shoutf(ctx context.Context, sev Severity, format string, args ...interface{
})
defer t.Stop()
}
if mainLog.stderrRedirected() {
if !LoggingToStderr(sev) {
// The logging call below would not otherwise appear on stderr;
// however this is what the Shout() contract guarantees, so we do
// it here.
fmt.Fprintf(OrigStderr, "*\n* %s: %s\n*\n", sev.String(),
strings.Replace(MakeMessage(ctx, format, args), "\n", "\n* ", -1))
}
......
......@@ -75,9 +75,13 @@ func testLogGC(
const newLogFiles = 20
// Prevent writes to stderr from being sent to log files which would screw up
// the expected number of log file calculation below.
mainLog.noStderrRedirect = true
// Prevent direct writes to fd 2 from being sent to log
// files which would screw up the expected number of log file
// calculation below.
//
// TODO(knz): I suspect this is not needed. Investigate and perhaps
// remove.
stderrLog.noRedirectInternalStderrWrites = true
// Ensure the main log has at least one entry. This serves two
// purposes. One is to serve as baseline for the file size. The
......
......@@ -66,11 +66,12 @@ func NewSecondaryLogger(
}
l := &SecondaryLogger{
logger: loggerT{
logDir: DirName{name: dir},
prefix: program + "-" + fileNamePrefix,
fileThreshold: Severity_INFO,
noStderrRedirect: true,
gcNotify: make(chan struct{}, 1),
logDir: DirName{name: dir},
prefix: program + "-" + fileNamePrefix,
fileThreshold: Severity_INFO,
stderrThreshold: mainLog.stderrThreshold.get(),
noRedirectInternalStderrWrites: true,
gcNotify: make(chan struct{}, 1),
},
forceSyncWrites: forceSyncWrites,
enableMsgCount: enableMsgCount,
......
......@@ -79,7 +79,7 @@ func TestRedirectStderrWithSecondaryLoggersActive(t *testing.T) {
defer s.Close(t)
setFlags()
logging.stderrThreshold = Severity_NONE
mainLog.stderrThreshold = Severity_NONE
// Ensure that the main log is initialized. This should take over
// stderr.
......
......@@ -23,23 +23,19 @@ var OrigStderr = func() *os.File {
}()
// LoggingToStderr returns true if log messages of the given severity
// sent to the main logger are also visible on stderr.
// sent to the main logger are also visible on stderr. This is used
// e.g. by the startup code to announce server details both on the
// external stderr and to the log file.
//
// This is also the logic used by Shout calls.
func LoggingToStderr(s Severity) bool {
return s >= logging.stderrThreshold.get()
}
// stderrRedirected returns true if and only if logging to this logger
// captures stderr output to the log file. This is used e.g. by
// Shout() to determine whether to report to standard error in
// addition to logs.
func (l *loggerT) stderrRedirected() bool {
return logging.stderrThreshold > Severity_INFO && !l.noStderrRedirect
return s >= mainLog.stderrThreshold.get()
}
// hijackStderr replaces stderr with the given file descriptor.
//
// A client that wishes to use the original stderr must use
// OrigStderr defined above.
// A client that wishes to use the original stderr (the process'
// external stderr stream) must use OrigStderr defined above.
func hijackStderr(f *os.File) error {
return redirectStderr(f)
}
......
......@@ -18,6 +18,7 @@ import (
"golang.org/x/sys/unix"
)
// dupFD is used to initialize OrigStderr (see stderr_redirect.go).
func dupFD(fd uintptr) (uintptr, error) {
// Warning: failing to set FD_CLOEXEC causes the duplicated file descriptor
// to leak into subprocesses created by exec.Command. If the file descriptor
......@@ -38,6 +39,17 @@ func dupFD(fd uintptr) (uintptr, error) {
return nfd, nil
}
// redirectStderr is used to redirect internal writes to fd 2 to the
// specified file. This is needed to ensure that harcoded writes to fd
// 2 by e.g. the Go runtime are redirected to a log file of our
// choosing.
//
// We also override os.Stderr for those other parts of Go which use
// that and not fd 2 directly.
func redirectStderr(f *os.File) error {
return unix.Dup2(int(f.Fd()), unix.Stderr)
if err := unix.Dup2(int(f.Fd()), unix.Stderr); err != nil {
return err
}
os.Stderr = f
return nil
}
......@@ -16,6 +16,7 @@ import (
"golang.org/x/sys/windows"
)
// dupFD is used to initialize OrigStderr (see stderr_redirect.go).
func dupFD(fd uintptr) (uintptr, error) {
// Adapted from https://github.com/golang/go/blob/go1.8/src/syscall/exec_windows.go#L303.
p, err := windows.GetCurrentProcess()
......@@ -26,7 +27,17 @@ func dupFD(fd uintptr) (uintptr, error) {
return uintptr(h), windows.DuplicateHandle(p, windows.Handle(fd), p, &h, 0, true, windows.DUPLICATE_SAME_ACCESS)
}
// redirectStderr is used to redirect internal writes to the error
// handle to the specified file. This is needed to ensure that
// harcoded writes to the error handle by e.g. the Go runtime are
// redirected to a log file of our choosing.
//
// We also override os.Stderr for those other parts of Go which use
// that and not fd 2 directly.
func redirectStderr(f *os.File) error {
if err := windows.SetStdHandle(windows.STD_ERROR_HANDLE, windows.Handle(f.Fd())); err != nil {
return err
}
os.Stderr = f
return windows.SetStdHandle(windows.STD_ERROR_HANDLE, windows.Handle(f.Fd()))
return nil
}
......@@ -92,11 +92,12 @@ func (sb *syncBuffer) rotateFile(now time.Time) error {
return err
}
// Redirect stderr to the current INFO log file in order to capture panic
// stack traces that are written by the Go runtime to stderr. Note that if
// --logtostderr is true we'll never enter this code path and panic stack
// traces will go to the original stderr as you would expect.
if sb.logger.stderrRedirected() {
// If this logger is responsible for capturing direct writes to the
// process' file descriptor 2, then do it here.
//
// This captures e.g. all writes performed by internal
// assertions in the Go runtime.
if sb.logger.redirectInternalStderrWrites() {
// NB: any concurrent output to stderr may straddle the old and new
// files. This doesn't apply to log messages as we won't reach this code
// unless we're not logging to stderr.
......
......@@ -83,17 +83,20 @@ func ScopeWithoutShowLogs(t tShim) *TestLogScope {
func enableLogFileOutput(dir string, stderrSeverity Severity) (func(), error) {
mainLog.mu.Lock()
defer mainLog.mu.Unlock()
oldStderrThreshold := logging.stderrThreshold.get()
oldNoStderrRedirect := mainLog.noStderrRedirect
oldStderrThreshold := mainLog.stderrThreshold.get()
oldNoStderrRedirect := stderrLog.noRedirectInternalStderrWrites
undo := func() {
mainLog.mu.Lock()
defer mainLog.mu.Unlock()
logging.stderrThreshold.set(oldStderrThreshold)
mainLog.noStderrRedirect = oldNoStderrRedirect
mainLog.stderrThreshold.set(oldStderrThreshold)
stderrLog.noRedirectInternalStderrWrites = oldNoStderrRedirect
}
logging.stderrThreshold.set(stderrSeverity)
mainLog.noStderrRedirect = true
mainLog.stderrThreshold.set(stderrSeverity)
stderrLog.noRedirectInternalStderrWrites = true
// TODO(knz): if/when stderrLog becomes different from mainLog,
// ensure that the stderrLog file output gets re-opened
// and os.Stderr gets redirected at this point.
return undo, mainLog.logDir.Set(dir)
}
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment