-
Notifications
You must be signed in to change notification settings - Fork 7.5k
[RFC] logging: Default to redirecting printk() through it #10715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] logging: Default to redirecting printk() through it #10715
Conversation
@@ -185,6 +185,7 @@ config LOG_MAX_LEVEL | |||
|
|||
config LOG_PRINTK | |||
bool "Enable processing of printk messages." | |||
default y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will ugly clash with shell when boot banner is enabled :/.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it shouldn't. That has been fixed with #10510
I didn't yet even looked into details of new logging (i.e. not enabled it "for real" to debug some issue), but already see artifacts due to its buffering nature: lack of output in case of crash and:
this, yes. But for me this "solution" looks like a workaround of deficiencies of new logging (disclaimer: not trying to diminish "efficiencies" of it, just again, I haven't had a chance to witness them yet in real life). Instead, there should be an easy to access switch to make logging synchronous and it should be spammily advertized, so no developer had a chance to miss it. So, informally I'm -1, and happy to move this into formal review. (Just trying to avoid knee-jerk reactions re: new things introduced). At the very least, it's too early for such changes yet, IMHO, there should be more time left to let people "feel" thru it. |
Codecov Report
@@ Coverage Diff @@
## master #10715 +/- ##
==========================================
+ Coverage 53.23% 53.25% +0.01%
==========================================
Files 214 214
Lines 25831 25831
Branches 5697 5697
==========================================
+ Hits 13752 13757 +5
+ Misses 9762 9759 -3
+ Partials 2317 2315 -2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, as there're too many +1's now, let me call for more consideration (== time) on this one.
@pfalcon please note that with #10098 logger will flush all buffered logs in case of captured error so there are few cases when you won't see the log:
|
Please show us screen captures of samples/hello_world before and after the change. (And may I say that such information should go into the commit message (of course, if there's interest in the quicker turnaround on the matter, otherwise can indeed wait until 10-20 people try themselves and form an opinion)). |
Without this PR
With this PR
|
Wait, do you want to say that "redirecting thru logging" doesn't mean that usual logging message envelope will be added to it? Because for me, that's what phrase "redirect thru logging" means, and then again, the commit message could be improved to avoid confusion. |
Default to redirecting the printk() calls through the logging subsystem using the special logginng log_printk() function that encapsulates raw strings into buffers and prints them as-is, without any of the logger additional metadata. This has 2 advantages: * Removes potential race conditions between printk() and the logger backend usage of the UART or any other backend * Printsk the printk() statements in-order with the rest of the logs, since now by default printk() messages are deferred Disadvantages: * printk() statements are limited in length * Additional memory consumption by default Signed-off-by: Carles Cufi <[email protected]>
f777831
to
d32dc5a
Compare
That's right, I edited the commit message to clarify this. |
@aescolar the |
@carlescufi : regarding tests/posix/common see #10508 |
@carlescufi : regarding tests/kernel/common it seems it is failing like in all the other platforms. |
Since the printk test is verifying the standard printk functionality, ensure that printk is not being redirected to the logging subsystem. Signed-off-by: Carles Cufi <[email protected]>
ACK, thanks!
Oops, my mistake. Pushed a fix. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds awful to me. Given that the logging subsystem is up and available now, the ONLY remaining use should be for synchronous debug logging, right?
And making printk asynchronous and forcing it to depend on another rather large subsystem is absolutely a hardship for debugging.
I mean, I need to get printk output from interrupt handlers. I need it to work in early boot. I need to see it synchronously so I can debug SMP ordering issues. If that doesn't work by default then I'll be constantly turning this feature off and complaining when support for "traditional" printk bitrots, as it surely will.
I guess the meat of my question is: who are the printk() users who will actually benefit from this change?
The only remaining use of printk() is synchronous debug logging and also providing a simple way of outputting strings to a serial backend without additional code (the logging subsystem requires defining a module AFAIK).
That is not what we are doing though. We are only making it asynchronous when the users enable
Most samples use |
@nordic-krch @jarz-nordic @andyross @nashif @jhedberg That said, @andyross has a point, so there could be an alternative solution. Instead of doing this, we could do the following:
Thoughts? |
does this work for fatal CPU exceptions? |
@andyross logger is intended to be used in interrupt handlers and actually it's much more convenient there since it does not block interrupt for significant amount of time (like synchronous printk will do). It takes <5us to log a message while it will take >200us when synchronous approach is used. Logger is ready from the start (initiliazed as early as possible in Logger supports multiple backends and each backend can have different filter settings so in future we can add something like dmesg where messages from kernel would be buffered in ram and flushed on shell command. Another thing is that when printk goes through logger, output is multiplexed on message basis. If printk goes directly to the output it will compete with the logger, or shell or any other user. @andrewboie |
Having fast logging doesn't do any good if I never see the message. All I'm saying is that to do my job I absolutely need a trivially easy synchronous output routine I can use anywhere. If that's not going to be printk() then what is it? If it's a non-default mode of the logging subsystem, are we testing it to make sure it always works? |
Since there is no consensus here I will instead send PRs to switch the samples to using |
@carlescufi : Well, it's hard to refrain from comment that road to far and roundabout is paved with intentions. We've got so advanced, so interstellar logging ... that now we have to try hard to find a way to work it around.
... And while doing so, we're trying to hog the notion of "console", first by shell, now by logging, round and round. Dare I say what I've said before #8168: the only thing we needed in is interrupt-driven, not thread-driven logging, with clear way to make it fully blocking, as it was before This is in turn addressed by #10765. But surely, let's keep working it around. I don't see that converge to anything though, until logging is scaled down to work in down-to-earth mode. |
@pfalcon not sure I follow the irony-charged statements. The logger might not be ideal, but it's functional and it certainly works better than what we had before. |
Logger is way too good for task which most people have at hand. You know rsyslog? Like, there was normal syslogd, but then they wrote rsyslog. You probably can't run 1024 CPU cluster without it. Not sure anybody need it at home. Or some advanced NoSQL database - not sure about Redis, but MongoDB just must have it. But embedded system? Well, nice niche feature which may be needed sometimes.
Yes, that's why we use deferred logging. But turns out, deferred logging doesn't play well with non-deferred output. Our solution? Defer output! Just imagine, what is printed as normal output by one thread, will be output by another thread. Nobody else has that, but Zephyr will. There's no irony here - it's plain-word description of the situation we find ourselves in. |
Back to the contents of this PR - I'm debugging a case when on driver initialization it gets bombarded with spurious IRQs. Bug in QEMU, or it's dawning on me that maybe IRQ polarity is wrong. ISR has printk() to have insight into that stuff. As expected, I get absolutely no output with this patch - an unacknowledged interrupt on ARM is effectively an infinite loop. The solution would seem simple with CONFIG_LOG_PRINTK=n, but who knows about it? The commit message in this PR doesn't even mention it, but to keep everyone in the know, that's far from enough. I picture monthly automated FAQ posting to the mailing list along the lines of: Q: I'm pulling my hair for 2nd day - my stuff just locks up with no output. But seriously, let's indeed keep things how they were with printk. I appreciate this issue being closed. |
Have you been experiencing any issues with new deferred logging? We've been using same architecture for some time and it worked well. There is CONFIG_LOG_INPLACE_PROCESS option which handles logs synchronously. Currently i see following cases when you will not see latest logs:
For each case there is countermeasure you can apply (except sudden death). |
I must disagree. When we want to log in interrupts, log to different backends with independent filtering, log from multicores and we have kb of rom and ram and mHz not gHz we need something different than simple printk. You are mainly focusing on the cost of deferred logging forgetting benefits. I heard that using sys_log couldn't be used in usb, i've seen that net ping was decreased hundred times by switching to new logger (#8816 (comment)) and i have seen how we benefited in nordic sdk by switching to deferred logging. I think we should focus on closing those gaps where possible (like logging in initial phase). |
Default to redirecting the printk() calls through the logging subsystem.
This has 2 advantages:
backend usage of the UART or any other backend
logs, since now by default printk() messages are deferred
Disadvantages:
Signed-off-by: Carles Cufi [email protected]