uart: Problems with interrupt-driven UART in QEMU and some hw boards #8869

Closed
nniranjhana opened this issue Jul 12, 2018 · 27 comments
Labels
bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@nniranjhana

The echo test in zephyr/samples/subsys/console/ does not print back characters to console.

Platforms: qemu_x86, quark_se_c1000_devboard

In the test, console_getchar reads input from the user, and console_putchar should print the received characters back to the console. It doesn't: the character is echoed only if I call console_putchar twice instead of once.

@nashif nashif added the bug The issue is a bug, or the PR is fixing a bug label Jul 12, 2018
@nashif nashif added the priority: medium Medium impact/importance bug label Jul 12, 2018
@pfalcon
Collaborator

pfalcon commented Jul 13, 2018

I cannot reproduce this. For me, samples/subsys/console/echo works as expected for BOARD=qemu_x86: it echoes each character immediately after the key press, without any changes, and has always worked like that. My setup: Zephyr rev 3161617, Zephyr SDK 0.9.3. I don't have a quark_se_c1000_devboard, so I can't test with it.

@nniranjhana, feel free to provide details of your setup, build steps, and any other info which may help shed some light on what may be wrong on your side.

@pfalcon pfalcon removed the priority: medium Medium impact/importance bug label Jul 13, 2018
@pfalcon
Collaborator

pfalcon commented Jul 13, 2018

How does samples/subsys/console/getchar work for you on qemu_x86?

@nashif
Member

nashif commented Jul 13, 2018

echo sample does not work for me in qemu.

@nashif nashif added the priority: medium Medium impact/importance bug label Jul 13, 2018
@nashif
Member

nashif commented Jul 13, 2018

please do not remove priorities, you can argue to lower or raise it, but do not remove it.

@pfalcon
Collaborator

pfalcon commented Nov 8, 2018

Ok, so I have two pieces of news, good and bad, and actually I don't know which of them is which:

  1. I still can't reproduce this for qemu_x86 (qemu from SDK 0.9.5, master 93bd263).
  2. But now I can see "something" with mps2_an385 in QEMU (which I have never used before, and for which I don't have a real board either).

So, this "something" behaves in the following way: a) I do the usual "make run"; b) I start typing "1", "2", "3", etc. in sequence. On typing "1", nothing happens. On typing "2", the "1" is shown on the screen; on typing "3", the "2" is shown, and so on. I.e. the output lags behind the input by one character.
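The symptom can be pictured with a small host-side toy model. This is only an illustration of the observed lag, not a diagnosis: the thread does not establish the root cause, and the LaggingTx class and echo_session helper below are hypothetical names that correspond to nothing in Zephyr or QEMU. The model assumes a transmitter in which each written character becomes visible only when the next write pushes it out.

```python
class LaggingTx:
    """Toy transmitter: a written char is shown only when the next one arrives."""

    def __init__(self):
        self.pending = None   # last char written, not yet visible
        self.screen = []      # chars that actually reached the terminal

    def putchar(self, ch):
        # The previous char is released by the current write (assumed behavior).
        if self.pending is not None:
            self.screen.append(self.pending)
        self.pending = ch


def echo_session(keys):
    """Echo each key once and record what newly appears after every key press."""
    tx = LaggingTx()
    shown_after_each_key = []
    for key in keys:
        before = len(tx.screen)
        tx.putchar(key)
        shown_after_each_key.append("".join(tx.screen[before:]))
    return shown_after_each_key


print(echo_session("123"))  # ['', '1', '2']
```

Typing "1", "2", "3" into this model shows nothing, then "1", then "2", which matches the one-character lag reported for mps2_an385.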

For reference, qemu_cortex_m3 works w/o issue for me, the same as qemu_x86.

I'm getting back to this because after #10923, this issue (or issues) is going to affect the Zephyr shell too. Actually, it already does for me on mps2_an385, and the behavior with the shell seems to be even more erratic than described above.

@jarz-nordic, @nordic-krch: Please join the fun of testing this!

@nordic-krch
Collaborator

You can disable the interrupt-driven UART driver for this board (this is possible thanks to imply being used in the shell Kconfig). It seems the uart_cmsdk_apb driver used here does not work well with interrupts. We might need to disable interrupt-driven UART for that board until this is resolved.

Please join the fun of testing this!

And again, it's not our fault that bugs are discovered. It's rather the opposite: it's fame and glory. It was the same story with the complaints about the shell not working well with STM devices. We ended up getting those boards, and @jarz-nordic found a bug in the ST UART implementation (#11039).

@pfalcon pfalcon changed the title samples: console: echo test does not echo uart: Problems with interrupt-driven UART in QEMU and some hw boards Nov 9, 2018
@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

Continuing from #10923 (comment):

@nordic-krch :

Well, i think that we have uart.h api that i should be able to rely on.

So, now you know that you can't. Actually I thought you guys are well aware of that, and that's why there's e.g. #10820. So, just to avoid any future confusion, let me share another best kept secret: there're many parts in Zephyr which can't be relied upon. That's because Zephyr is so far developed in extensive (vs intensive) growth, and is effectively a frankenstein system collected from kinda-randomly put together parts. We need to mend its design, replace bad parts with better parts, and actually make sure that they are better. That's why you see a lot of scrutiny like in #10820.

If specific driver implements interrupt version then it's not my fault if it is faulty :).

Well, nobody says it's your fault. If there're any implications you can make out of it all, it would be:

  1. Now you know (one of the reasons) why subsys/console, introduced 1.5 years ago, develops so slowly. Soon after it was introduced, I hit these issues with qemu, and as nobody gave a damn about it all, well, it just stayed there.
  2. Now you should also understand why I don't rush at all with replacing tty's ad hoc ring buffers with your refactored ring_buffer: that implementation was tested on a few real hardware boards, and ISRs are a very subtle matter, where any minor change can lead to regressions. So, I'm putting that off until I'm able to retest on a bunch of hardware again. (Got many other active tasks, you know.)

If qemu interrupt driven uart is lousy then it should be improved or disabled or we should have big banner somewhere do not trust drivers, test all before any change.

Right. So:

Step 0: Let's join the forces, as finally there's more than just me playing well with uart in interrupt mode, but you too.

Step 1: Can you please test the scenarios with samples/subsys/console/echo as described above, and share your results, so we better understand how inconsistent the patterns we see are?

Step 2: Likewise, we should test and describe the behavior of the Z shell as of the current master.

Feel free to swap steps 1 and 2 ;-).

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

it's fame and glory

Absolutely. I for one was locked in that tower waiting for you guys to come and help me fight the dragon. ;-). So, let's get to it.

It seems like uart_cmsdk_apb driver which is used here is not working well with interrupts.

Again, don't discount bugs in QEMU UART emulation. We likely have both bugs in Zephyr drivers, and bugs in QEMU emulation, in arbitrary combinations for a particular target.

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

For the shell, it apparently makes sense to standardize on samples/subsys/shell/shell_module for testing. Here's the behavior I see with ... BOARD=mps2_an385 run:

  1. On start, the usual uart:~$ prompt is not printed.
  2. Now I press "1", get two linefeeds.
  3. "2" - ua gets printed.
  4. "3" - rt:~$ gets printed
  5. "4" - 123 gets printed.
  6. "5" - 4
  7. "6" - 5, and following, it converges to the behavior I described previously for samples/subsys/console/echo, where on a key press, previous char gets printed, i.e. output lags by 1 char.
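A quick host-side check of the list above suggests that nothing is lost or reordered, only delayed. The sketch below concatenates the chunks reported per key press and compares them with the stream one would expect from the shell. It rests on two assumptions: the prompt is "uart:~$ " with a trailing space, and the shell echoes each typed character exactly once. The lag helper is a hypothetical name introduced only for this illustration.

```python
# Output reported after each key press on mps2_an385 (taken from the list above).
observed = {
    "1": "\n\n",
    "2": "ua",
    "3": "rt:~$ ",
    "4": "123",
    "5": "4",
    "6": "5",
}

PROMPT = "uart:~$ "  # trailing space is an assumption

typed = "".join(observed)               # keys in the order pressed: "123456"
expected = "\n\n" + PROMPT + typed      # two linefeeds, prompt, then the echo
seen = "".join(observed.values())       # everything that reached the screen


def lag(expected, seen):
    """Return how many chars are still pending, or None if the streams diverge."""
    if not expected.startswith(seen):
        return None
    return len(expected) - len(seen)


print(lag(expected, seen))  # 1
```

Under these assumptions the screen content is a strict prefix of the expected stream and is short by exactly one character, which is consistent with the one-character lag seen with samples/subsys/console/echo.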

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

And the hypothesis we follow here is "the shell may be broken for some users in qemu_x86". The logic here is:

@nniranjhana, @nashif previously reported that samples/subsys/console/echo didn't work for them in qemu -> Z shell now uses about the same interrupt-driven UART access method as console/echo -> hence, there's a possibility that shell won't work for the users who were initially affected.

So, @nniranjhana, @nashif, can you please report how samples/subsys/shell/shell_module works for you in qemu_x86?

Also, can you retest samples/subsys/console/echo with the current version of SDK's QEMU too? (Please provide as much as possible additional info about your environment, if the problem persists.)

Thanks.

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

It seems like uart_cmsdk_apb driver which is used here is not working well with interrupts.

@galak, do you have that physical MPS2 board? If so, can you help us decide whether it's a Zephyr-side driver problem or a QEMU problem? Just run samples/subsys/shell/shell_module on the board and see whether you get anything like #8869 (comment) or not. Thanks!

@jakub-uC
Collaborator

jakub-uC commented Nov 9, 2018

On start, the usual uart:~$ prompt is not printed.
Now I press "1", get two linefeeds.

The two line feeds are the first characters sent by the shell before the prompt is printed.

"2" - ua gets printed.

ua looks like the beginning of the prompt :)

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

ua looks like the beginning of the prompt :)

Absolutely. Again, to get to the bottom of it, we first need to compare exactly what behavior each of us sees. I described mine. Now it's your call, guys. Do you see ua at that step, or maybe uar, or maybe something else completely?

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

@nordic-krch :

you can disable interrupt uart driver (thanks to imply being used in shell kconfig) for this board.

Well, that's not what I think. I think we should have a switch to disable interrupt-based UART support at the shell level. Let me know if you disagree. (And I would again point out that if we already had a common console layer, we'd need to apply that in one place; now we need to apply it in two.)

In that regard, CONFIG_SHELL_BACKEND_SERIAL_FORCE_INTERRUPTS=n in prj.conf doesn't give the expected effect (it stays at y in .config).

@jakub-uC
Collaborator

jakub-uC commented Nov 9, 2018

I have tested:

  • nrf52840_pca10056
  • frdm_k64f
  • disco_l475_iot1
  • qemu_x86

I did not observe any problems using example:
samples/subsys/shell/shell_module

@nordic-krch
Collaborator

Well, that's not what I think. I think we should have a switch to disable interrupt-based UART support on shell level.

I disagree, if we have board/platform which has bugs in interrupt version of uart driver (or qemu error) we should disable that on that platform and not in shell. We don't want other modules to use uart interrupts there only to find out again that it does not work.

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

@jarz-nordic, Cool, thanks, so what about BOARD=mps2_an385?

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

@nordic-krch

I disagree, if we have board/platform which has bugs in interrupt version of uart driver (or qemu error)

Then I have bad news for you, e.g. #8187. Per my hypothesis, QEMU serial emulation is always broken; the question is how much, i.e. what the error rate is. It may be 1 char per million, or 1 char per thousand, or maybe every other char. You say that we should not use QEMU serial at all (== not use QEMU at all), and I say that we should be more flexible than that, and devise a set of workarounds for particular cases of adverse behavior, until we understand it better and fix it (be it at the QEMU level or otherwise).

@nordic-krch
Collaborator

i'm only saying to not use serial with interrupts on qemu (SERIAL_SUPPORT_INTERRUPT=n). shell_uart handles that case and i assume that other modules using uart should also support handling that case since it's not guaranteed that uart will support interrupts.

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

@nordic-krch : And I'm saying that interrupts should work on qemu, and SERIAL_SUPPORT_INTERRUPT=n is a way to pretend that they shouldn't even work. We don't try to resolve the problem, we try to pretend it doesn't exist this way.

So, instead, IMHO we should put up warning signposts like SHELL_DISABLE_UART_INTERRUPT, so that it is clear to everyone that this is an ugly workaround until a better solution is found.

@jakub-uC
Collaborator

jakub-uC commented Nov 9, 2018

@pfalcon
-DBOARD=qemu_cortex_m3 works just fine
-DBOARD=mps2_an385 works as you described :/

@pfalcon
Collaborator

pfalcon commented Nov 9, 2018

@jarz-nordic : Thanks for confirming!

Oh, btw: #11202

@jakub-uC
Collaborator

jakub-uC commented Nov 9, 2018

Yes I saw that. It looks like a problem with serial interrupt API.

@nniranjhana
Author

@pfalcon

So, @nniranjhana, @nashif, can you please report how samples/subsys/shell/shell_module works for you in qemu_x86?

The shell module works well for me on QEMU; I am able to use all the modules that are enabled.

Also, can you retest samples/subsys/console/echo with the current version of SDK's QEMU too? (Please provide as much as possible additional info about your environment, if the problem persists.)

Yes, it does persist with the latest commits on Zephyr and with the newest version of Zephyr SDK too. Again, if I add another console_putchar(c), it echoes back, else it doesn't.

Some info on my environment, and console log:

SeaBIOS (version rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org)
Booting from ROM..***** Booting Zephyr OS zephyr-v1.13.0-1788-ge3506832a4 *****
Start typing characters to see them echoed back
still not echoing
nope

Zephyr SDK version 0.9.5
Linux version 4.16.11-100.fc26.x86_64
gcc version 7.3.1 20180130 (Red Hat 7.3.1-2) (GCC))

On /usr/bin/qemu-system-x86_64 --version, I get

QEMU emulator version 2.9.1(qemu-2.9.1-2.fc26)

Let me know if you need any further information

@pfalcon
Collaborator

pfalcon commented Nov 15, 2018

@nniranjhana: Thanks for re-testing and additional info!

QEMU emulator version 2.9.1(qemu-2.9.1-2.fc26)

Well, sorry, if you use that, then it's not a supported setup for Zephyr. You should use the qemu from Zephyr SDK 0.9.5, which is:

$ /home/pfalcon/opt/zephyr-sdk-0.9.5/sysroots/x86_64-pokysdk-linux/usr/bin/qemu-system-i386 --version
QEMU emulator version 3.0.50 (v3.0.0-614-g19b599f766-dirty)

Let me know if you need any further information

In #11314, I modify the console/echo sample to provide additional output that hints at what may be wrong. When it's merged, I would ask you to re-run it.

@nashif
Member

nashif commented Jan 30, 2019

status on this?

@nashif
Member

nashif commented Mar 3, 2019

not able to reproduce.

@nashif nashif closed this as completed Mar 3, 2019