
TCP: fix deadlock on slow peer #14944


Closed
wants to merge 1 commit

Conversation

@dgloeck (Contributor) commented Mar 27, 2019

Unlocks the mutex needed by the code for processing ACKs while waiting for TX packets to become available.

Before the mutex can be unlocked, it must be locked by tcp_syn_rcvd(), tcp_synack_received(), and backlog_ack_timeout(), which in turn requires us to unlock it in net_tcp_connect() while waiting for tcp_synack_received() to be called.

Fixes #14571
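
In outline, the fix applies the usual pattern of dropping the lock around the blocking wait and re-taking it afterwards. The sketch below is an illustration only, not the actual tcp.c code: ctx_lock and tx_pkts_sem are placeholder names, and the wait is modelled as a semaphore although the real code blocks on packet allocation.

#include <zephyr.h>

K_MUTEX_DEFINE(ctx_lock);        /* stands in for the per-context lock */
K_SEM_DEFINE(tx_pkts_sem, 0, 1); /* signalled when a TX packet is freed */

static void wait_for_tx_packet(void)
{
	/* The caller holds ctx_lock. The ACK-processing path needs the same
	 * lock in order to free TX packets, so drop it while blocking and
	 * re-take it before touching the context again.
	 */
	k_mutex_unlock(&ctx_lock);
	k_sem_take(&tx_pkts_sem, K_FOREVER);
	k_mutex_lock(&ctx_lock, K_FOREVER);
}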

@tbursztyka (Collaborator) left a comment

Thanks!

@codecov-io commented Mar 27, 2019

Codecov Report

Merging #14944 into master will increase coverage by 0.01%.
The diff coverage is 70.83%.


@@            Coverage Diff             @@
##           master   #14944      +/-   ##
==========================================
+ Coverage   52.87%   52.88%   +0.01%     
==========================================
  Files         309      309              
  Lines       45267    45288      +21     
  Branches    10451    10453       +2     
==========================================
+ Hits        23934    23951      +17     
- Misses      16558    16560       +2     
- Partials     4775     4777       +2
Impacted Files                Coverage          Δ
subsys/net/ip/net_context.c   59.07% <100%>     (+0.18%) ⬆️
subsys/net/ip/tcp.c           58.43% <61.11%>   (+0.35%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1eead93...ab78f57.

@@ -379,6 +379,8 @@ static int prepare_segment(struct net_tcp *tcp,
struct net_context *context = tcp->context;
struct net_buf *tail = NULL;
struct net_tcp_hdr *tcp_hdr;
struct net_if *interface;
Collaborator:

Why are these introduced here if they are used only in the else {} block (and thus could very well be declared there)?

@@ -2099,7 +2109,9 @@ NET_CONN_CB(tcp_synack_received)
struct net_tcp_hdr *tcp_hdr = proto_hdr->tcp;
int ret;

NET_ASSERT(context && context->tcp);
NET_ASSERT(context);
k_mutex_lock(&context->lock, K_FOREVER);
Collaborator:

Why is the lock stuffed between the two asserts here?

Contributor (author):

Because context->tcp can in theory change if someone else is holding the lock.

Collaborator:

> Because context->tcp can in theory change if someone else is holding the lock.

Ok, ack. I bet I'd personally add a comment on that.
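
For example, the two added lines visible in the hunk above could carry a comment along these lines (a sketch only; the surrounding function body is elided):

	NET_ASSERT(context);

	/* Take the lock before dereferencing context->tcp: another thread
	 * that currently holds context->lock may still be changing it.
	 */
	k_mutex_lock(&context->lock, K_FOREVER);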

@pfalcon (Collaborator) left a comment

Thanks for the updated patch, and glad to see that @tbursztyka converged on it too. But I think there's room to polish this patch; please see the previous inline comments.

@jukkar (Member) left a comment

Minor nitpick about the commit subject. Needs some more testing, which is probably the trickiest part here.

@jukkar added the "area: Networking" and "bug" labels on Mar 27, 2019
@jukkar added this to the v1.14.0 milestone on Mar 27, 2019
@jukkar (Member) commented Mar 29, 2019

I did some testing with samples/net/sockets/echo_server using the default prj.conf plus these additional configs:

CONFIG_NET_DEBUG_NET_PKT_ALLOC=y
CONFIG_NET_BUF_POOL_USAGE=y
CONFIG_LOG_IMMEDIATE=y

I used native_posix, so on the Linux side I ran net-setup.sh from the net-tools project so that the zeth interface gets created.

After Zephyr is started (the command I used is zephyr/zephyr.exe -attach_uart), I started echo-client on the host like this: ./echo-client 2001:db8::1 -t

The Zephyr console prints this:

[00:02:09.420,000] <inf> net_echo_server_sample: Waiting for TCP connection on port 4242 (IPv6)...
[00:02:12.120,000] <err> net_pkt: *** ERROR *** pkt 0x0807c7a4 is freed already by _tcp_syn_rcvd():2451 (processing_data():154)
[00:02:12.120,000] <inf> net_echo_server_sample: TCP (IPv6): Accepted connection
[00:02:12.180,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 1 bytes
[00:02:12.300,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 6 bytes
[00:02:12.360,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 4 bytes
[00:02:12.420,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 26 bytes
[00:02:12.480,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 640 bytes
[00:02:12.540,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 592 bytes
[00:02:12.720,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 1 bytes
[00:02:12.780,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 256 bytes
[00:02:12.840,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 640 bytes
[00:02:12.900,000] <dbg> net_echo_server_sample.process_tcp: TCP (IPv6): Received and replied with 640 bytes
[00:02:13.140,000] <inf> net_echo_server_sample: TCP (IPv6): Connection closed

The double-free message is worrying, even though it does no harm here because the code bails out when it notices the issue. Without this PR, I am not seeing this error.

@dgloeck (Contributor, author) commented Mar 29, 2019

That is indeed strange. If we pass through line 2451 in _tcp_syn_rcvd(), every function in the chain (processing_data()->process_data()->net_ipv4_input()->net_conn_input()->tcp_syn_rcvd()) should return NET_OK on the way back to processing_data(), where the returned NET_OK should cause the net_pkt_unref() call to be skipped.
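
For reference, the dispatch in processing_data() works roughly like the sketch below (reconstructed from memory, not copied from the source); the point is that a NET_OK verdict means the packet was consumed further down the chain and must not be unref'd again:

static void processing_data(struct net_pkt *pkt, bool is_loopback)
{
	switch (process_data(pkt, is_loopback)) {
	case NET_OK:
		/* Ownership was taken somewhere down the call chain. */
		break;
	case NET_DROP:
	default:
		net_pkt_unref(pkt); /* the reported double unref happens here */
		break;
	}
}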

@pfalcon self-requested a review on March 29, 2019 16:17
@pfalcon dismissed their stale review on March 29, 2019 16:19

My comments are mostly addressed

@pfalcon (Collaborator) commented Mar 29, 2019

@dgloeck, there are CI failures that look legitimate (i.e. they don't look random):

native_posix:tests/net/tcp/net.tcp
starting test - test IPv6 TCP reset packet creation
ASSERTION FAIL [mutex->lock_count > 0U] @ /home/buildslave/src/github.com/zephyrproject-rtos/zephyr/

Same + crash.

Can you look into those?

@pfalcon (Collaborator) commented Mar 29, 2019

Confirmed that it's reproducible locally:

tests/net/tcp:

SeaBIOS (version rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org)
Booting from ROM..***** Booting Zephyr OS v1.14.0-rc1-1584-geadef21dab58 *****
starting test - test TCP init
passed
starting test - test TCP register/unregister port cb
passed
starting test - test TCP context init
passed
starting test - test IPv6 TCP reset packet creation
ASSERTION FAIL [mutex->lock_count > 0U] @ /home/pfalcon/projects-3rdparty/Embedded/Zephyr/zephyr/kernel/mutex.c:212
	
[00:00:02.530,000] <err> net_conn: Address families different
***** Kernel Panic! *****
Current thread ID = 0x00403500
eax: 0x00019328, ebx: 0x00400148, ecx: 0x000003f8, edx: 0x004126e0
esi: 0x00400140, edi: 0x00412750, ebp: 0x004126ec, esp: 0x004126d0
eflags: 0x00000206 cs: 0x0008
call trace:
eip: 0x00004617
     0x0000d1a3 (0x400148)
     0x0000d3d8 (0x4127bc)
     0x0000ddf9 (0x41279c)
     0x00003140 (0x402500)
     0x00003316 (0x0)
     0x0000f353 (0x4127e4)
     0x00003f91 (0x0)
Terminate emulator due to fatal kernel error

I guess the crash dump just accompanies the ASSERT failure.

@dgloeck (Contributor, author) commented Mar 29, 2019

I know, will fix that this weekend. The tests call internal functions without locking the mutex that is now expected to be locked.
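
A hedged sketch of the kind of test-side fix implied (v6_ctx and helper_under_test are placeholders, not real symbols from tests/net/tcp):

#include <ztest.h>

static void test_reset_packet_creation(void)
{
	/* Take the context lock around calls into internal TCP helpers
	 * that now expect it to be held; otherwise k_mutex_unlock()
	 * asserts on lock_count, as seen in the CI failure above.
	 */
	k_mutex_lock(&v6_ctx->lock, K_FOREVER);
	int ret = helper_under_test(v6_ctx);
	k_mutex_unlock(&v6_ctx->lock);

	zassert_equal(ret, 0, "reset packet creation failed");
}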

@pfalcon (Collaborator) commented Mar 29, 2019

(Edit: this report was wrong; see the next comment.)

Ok, I tested master 5049694 + this patch against qemu_x86 + e1000 ethernet (i.e. with overlay-e1000.conf) with the dumb_http_server sample, because my testing in #13943 showed that it can sometimes hang after several hundred requests.

So, with this patch, I get a 100% reproducible situation (across a dozen runs) where the sample locks up after exactly 14 requests served:

...

@pfalcon (Collaborator) commented Mar 29, 2019

> Ok, I tested master 5049694 + this patch against qemu_x86 + e1000 ethernet ...

Oh fun! It's the same with pristine master 5049694. So, since 9722214, which I last tested in #13943, we've managed to break the stack. Fun, fun!

@jukkar (Member) commented Mar 30, 2019

> Ok, I tested master 5049694 + this patch against qemu_x86 + e1000 ethernet (i.e. with overlay-e1000.conf) with the dumb_http_server sample, because my testing in #13943 showed that it can sometimes hang after several hundred requests.
>
> So, with this patch, I get a 100% reproducible situation (across a dozen runs) where the sample locks up after exactly 14 requests served:

The "hang" issues is probably fixed by #15045 (regression)

@dgloeck (Contributor, author) commented Apr 1, 2019

The tests/net/tcp asserts should be resolved now. @jukkar's double unref with echo-server still needs to be debugged.

@jukkar (Member) left a comment

I spotted the "already freed" issue (see comment); after fixing that, I did not see any errors.

Unlocks the mutex needed by the code for processing ACKs while waiting
for TX packets to become available.

Before the mutex can be unlocked, it must be locked by tcp_syn_rcvd(),
tcp_synack_received(), and backlog_ack_timeout(), which in turn requires
us to unlock it in net_tcp_connect() while waiting for
tcp_synack_received() to be called.

Fixes zephyrproject-rtos#14571

Signed-off-by: Daniel Glöckner <[email protected]>
@jukkar (Member) commented Apr 4, 2019

Just tested this with native_posix board and valgrind:

  • samples/net/sockets/echo-server running in Zephyr, echo-client on the host. I did not spot any issues; the device recovers from TCP flooding (cmd: ./echo-client 2001:db8::1 -t -F).
  • samples/net/sockets/dumb_http_server: the device runs out of memory when running ab -n 50 http://192.0.2.1:8080/ from the host. Even after stopping the Apache Bench program, the memory situation does not get better, so basically we leak net_bufs. The net allocs command says the allocations are done in 0x08067228/1 used TX prepare_segment():414
  • samples/net/zperf, I did not spot any memory / TCP related issues.

For the testing I used the overlay-e1000.conf file in all cases.

Edit: the memory leak that I saw in dumb_http_server is not present when running the same command on master, so it is related to the changes in this PR.

@nashif modified the milestones: v1.14.0 → future, Apr 12, 2019
@pfalcon (Collaborator) commented May 7, 2019

Ping @dgloeck on the status/progress of this. Thanks.

@jhedberg (Member)

Closing since there hasn't been any response from the author

@jhedberg closed this on Jan 28, 2020
Labels
area: Networking; bug (The issue is a bug, or the PR is fixing a bug)

Development

Successfully merging this pull request may close these issues.

TCP: sending lots of data deadlocks with slow peer
7 participants