Describe the bug
I am running Zephyr sample samples/net/sockets/echo_async on stm32f769i_disco.
I can telnet into the target and it echoes back as expected.
But when I do a port scan with nmap on the IP address of the target, I get a bus fault (see logs below).
This happens both on Zephyr v3.2.0 and on main branch.
To Reproduce
Steps to reproduce the behavior:
west build samples/net/sockets/echo_async --board=stm32f769i_disco -p
west -v flash -d build/ -r jlink
telnet 192.0.2.1 4242 -> it works
nmap 192.0.2.1 -> nmap fails. Note: don't run it with sudo (see explanation below).
telnet 192.0.2.1 4242 -> FAILS
This fails without any modifications other than changing the IP address in prj.conf.
Add CONFIG_LOG=y and CONFIG_LOG_MODE_DEFERRED=y to see the fault in the log output.
Expected behavior
nmap should report the open port (4242).
Application should continue to work regardless of nmap activity.
Impact
Hard-faulting when subjected to a port scan is a critical failure.
This prevents me from upgrading my application from Zephyr v3.0.0 to v3.2.0
(this didn't happen with Zephyr v3.0.0).
Logs and console output
This is the output when I've set CONFIG_LOG=y and CONFIG_LOG_MODE_DEFERRED=y.
*** Booting Zephyr OS build zephyr-v3.2.0-3990-g06d53b1343ba ***
Connection #0 from 10.42.68.123 fd=2
Connection fd=2 closed
Connection #1 from 10.42.68.123 fd=2
[00:00:22.904,000] <err> os: ***** BUS FAULT *****
[00:00:22.904,000] <err> os: Precise data bus error
[00:00:22.904,000] <err> os: BFAR Address: 0x5dddfe44
[00:00:22.904,000] <err> os: r0/a1: 0x00000000 r1/a2: 0x5dddfe40 r2/a3: 0x00000001
[00:00:22.904,000] <err> os: r3/a4: 0x3dce0000 r12/ip: 0x00000025 r14/lr: 0x080148bd
[00:00:22.904,000] <err> os: xpsr: 0x210e2c00
[00:00:22.904,000] <err> os: Faulting instruction address (r15/pc): 0x08014604
[00:00:22.904,000] <err> os: >>> ZEPHYR FATAL ERROR 25: Unknown error on CPU 0
[00:00:22.904,000] <err> os: Current thread: 0x20022138 (unknown)
[00:00:22.966,000] <err> os: Halting system
Looking up in the map file, thread 0x20022138 turns out to be z_main_thread (which runs the socket code).
Environment
OS: Linux
Toolchain: zephyr-sdk-0.15.2
Commit SHA or Version used: v3.2.0 and latest commit on main branch (06d53b1)
Additional context
I first saw this in my own application running on custom hardware (stm32f777ni)
This started happening when I upgraded from Zephyr v3.0.0 to v3.2.0
Note: if you run nmap as a privileged user (sudo), the application doesn't fail, because port scanning behaves differently in unprivileged mode. The nmap manual has a lot to say about this, for example:
On Unix boxes, only the privileged user root is generally able to send and receive raw TCP packets.
For unprivileged users, a workaround is automatically employed whereby the connect system call is
initiated against each target port. This has the effect of sending a SYN packet to the target host, in
an attempt to establish a connection.
This should provide some hints as to why it fails. I've tried different nmap scan types and briefly looked at Wireshark logs, but I haven't been able to pinpoint exactly why this happens.
So isn't this just a memory buffer issue because essentially you're trying to open too many sockets on the board? Have you tried increasing buffer/stack sizes?
@nordicjm, I didn't try this on the sample application, but on my application I have plenty of buffers and generous buffer sizes, and I also tried tripling the stack size.
With the sample application I tried to create more than five connections through telnet, and the application behaves as expected: when trying to establish the sixth connection, telnet stalls until one of the first five connections is closed, then it successfully establishes the new connection. So I don't think it is related to that.
What nmap does in unprivileged mode is send SYN packets to consecutive ports. If it succeeds in opening a TCP connection, it immediately sends a RST packet to abort it. This did not play well with the over-optimised net_context structure.
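For anyone who wants to reproduce the traffic pattern without nmap, here is a minimal sketch of a connect-then-abort probe for a single port, using plain POSIX sockets on the host. The helper name `probe_port` is hypothetical, and using `SO_LINGER` with a zero timeout is just one common way to make `close()` send a RST instead of a FIN; it is not taken from the nmap sources.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a TCP connection to ip:port could be
 * opened, 0 if it could not, and -1 on a local error.
 */
int probe_port(const char *ip, uint16_t port)
{
	struct sockaddr_in addr;
	struct linger abort_on_close = { .l_onoff = 1, .l_linger = 0 };
	int fd;
	int is_open;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
		return -1;
	}

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0) {
		return -1;
	}

	/* connect() makes the host kernel send the SYN and, if the port is
	 * open, complete the three-way handshake.
	 */
	is_open = (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0);

	/* With linger enabled and a zero timeout, close() aborts the
	 * connection, so the peer sees a RST rather than a FIN.
	 */
	(void)setsockopt(fd, SOL_SOCKET, SO_LINGER,
			 &abort_on_close, sizeof(abort_on_close));
	close(fd);

	return is_open;
}
```

Calling `probe_port("192.0.2.1", 4242)` from the host should open a connection to the echo server and reset it straight away, which is the sequence described above.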
The net_context assumed that it could safely share memory between the FIFO reserved space and the user data. The nmap case proved that this was not always true. TCP uses the user data to notify upper layers of errors, and receiving a RST packet is considered an error condition, so we ended up in a situation where the user data pointer was overwritten while the net_context could still be waiting on the accept queue (which is simply a FIFO). This damaged the FIFO reserved memory, leading to a crash.
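To illustrate the failure mode, here is a simplified, self-contained model. It is not the actual Zephyr code: `toy_fifo`, `ctx_shared` and `ctx_separate` are hypothetical stand-ins, and the only assumption carried over is that the FIFO keeps its link in the first word of each queued item. Writing the user data pointer while the item is still queued clobbers that link when the two share storage, and leaves it intact when they are separate fields.

```c
#include <stddef.h>

/* Toy intrusive FIFO: the "next" link lives in the first word of each
 * queued item, so the item itself must reserve that word.
 */
struct toy_fifo {
	void *head;
	void *tail;
};

static void toy_fifo_put(struct toy_fifo *f, void *item)
{
	*(void **)item = NULL;
	if (f->tail != NULL) {
		*(void **)f->tail = item;
	} else {
		f->head = item;
	}
	f->tail = item;
}

static void *toy_fifo_next(void *item)
{
	return *(void **)item;
}

/* Layout that shares one word between the FIFO link and the user data. */
struct ctx_shared {
	union {
		void *fifo_reserved;
		void *user_data;
	} u;
	int id;
};

/* Layout that gives each purpose its own word. */
struct ctx_separate {
	void *fifo_reserved;
	void *user_data;
	int id;
};

/* Returns 1 if the link a -> b is still intact after user_data is written
 * on a while a is queued, 0 if the link was clobbered.
 */
int shared_link_survives(void)
{
	struct toy_fifo q = { NULL, NULL };
	struct ctx_shared a = { .id = 1 }, b = { .id = 2 };
	int err = -1; /* stand-in for an error reported through user data */

	toy_fifo_put(&q, &a);
	toy_fifo_put(&q, &b);
	a.u.user_data = &err; /* overwrites the queue link of a */

	return toy_fifo_next(&a) == (void *)&b;
}

int separate_link_survives(void)
{
	struct toy_fifo q = { NULL, NULL };
	struct ctx_separate a = { .id = 1 }, b = { .id = 2 };
	int err = -1;

	toy_fifo_put(&q, &a);
	toy_fifo_put(&q, &b);
	a.user_data = &err; /* the queue link of a is untouched */

	return toy_fifo_next(&a) == (void *)&b;
}
```

In this model `shared_link_survives()` returns 0 and `separate_link_survives()` returns 1. With the shared layout, the next consumer of the queue would follow a link that now points at unrelated data, which is the same kind of corruption as the bus fault on a bogus address in the log.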