Skip to content

Regression for net_offload API in net_if.c? #14479

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tautologyclub opened this issue Mar 13, 2019 · 8 comments
Closed

Regression for net_offload API in net_if.c? #14479

tautologyclub opened this issue Mar 13, 2019 · 8 comments
Assignees
Labels
area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@tautologyclub
Copy link
Contributor

In commit 9464ec3 the following was introduced:

-		status = api->send(iface, pkt);
+		status = net_if_l2(iface)->send(iface, pkt);

Which feels sort of bogus when using offloading devices. As I understand it, it is more or less L2 that is offloaded. Our driver has a NULL l2 field, which I at least THOUGHT was super standard when using net_offload. So what happens is that when we reach this part of the function, a null pointer gets executed and the device resets without warning. Took a little while to track down, but this must be a bug right?

I sort of get the impression that net_offload has taken a lot of hits with the recent versions. @andrewboie @andyross are listed as the code owners. Maybe you guys can chime in on the current state of net_offload? There are no tests for net offloading from what I can see, which is a bit strange to me since a software-emulated device could be implemented quite easily.

@tautologyclub tautologyclub added the bug The issue is a bug, or the PR is fixing a bug label Mar 13, 2019
@tbursztyka
Copy link
Collaborator

When a device is an offloaded one and exposes a net_offload api implementation, the packets are not going through net_send_data() and thus not through net_if_send_data(), but instead directly through net_offload functions. So the commit you noted did not change anything.

It's not L2 which is offloaded, is basically the full ip stack, so L3. net_offload API is used directly from net_context which in turn is used by socket.

That said, offload is indeed in between right now due to net_context API changes. See #13618
It's going to be ready for 1.14.

@tautologyclub
Copy link
Contributor Author

@tbursztyka Well at least for our (offloading) device it definitely gets called, here's the backtrace:

0  net_tc_submit_to_tx_queue (tc=0 '\000', pkt=pkt@entry=0x20017598 <_k_mem_slab_buf_tx_pkts+240>) at /..../subsys/net/ip/net_tc.c:37
#1  0x0003bb44 in net_if_queue_tx (iface=iface@entry=0x2001df80 <__net_if_modem_ca2ut_0>, pkt=pkt@entry=0x20017598 <_k_mem_slab_buf_tx_pkts+240>) at /..../subsys/net/ip/net_if.c:234
#2  0x0003c284 in net_if_send_data (iface=0x2001df80 <__net_if_modem_ca2ut_0>, pkt=pkt@entry=0x20017598 <_k_mem_slab_buf_tx_pkts+240>) at /..../subsys/net/ip/net_if.c:317
#3  0x0003b970 in net_send_data (pkt=pkt@entry=0x20017598 <_k_mem_slab_buf_tx_pkts+240>) at /..../subsys/net/ip/net_core.c:327
#4  0x0003aff6 in context_sendto_new (context=context@entry=0x2000e234 <contexts>, buf=buf@entry=0x20008d98 <_mbedtls_heap+1972>, len=len@entry=113, dst_addr=dst_addr@entry=0x20007d20 <tls_contexts+112>, addrlen=addrlen@entry=8, cb=cb@entry=0x0 <sys_pm_dump_debug_info>, timeout=timeout@entry=-1, token=token@entry=0x0 <sys_pm_dump_debug_info>, user_data=user_data@entry=0x0 <sys_pm_dump_debug_info>) at /..../subsys/net/ip/net_context.c:1678
#5  0x0003b65c in net_context_sendto_new (context=context@entry=0x2000e234 <contexts>, buf=buf@entry=0x20008d98 <_mbedtls_heap+1972>, len=len@entry=113, dst_addr=dst_addr@entry=0x20007d20 <tls_contexts+112>, addrlen=addrlen@entry=8, cb=cb@entry=0x0 <sys_pm_dump_debug_info>, timeout=timeout@entry=-1, token=token@entry=0x0 <sys_pm_dump_debug_info>, user_data=0x0 <sys_pm_dump_debug_info>) at /..../subsys/net/ip/net_context.c:1790
#6  0x0000c674 in zsock_sendto_ctx (ctx=0x2000e234 <contexts>, buf=0x20008d98 <_mbedtls_heap+1972>, len=113, flags=<optimized out>, dest_addr=dest_addr@entry=0x20007d20 <tls_contexts+112>, addrlen=addrlen@entry=8) at /..../subsys/net/lib/sockets/sockets.c:448
#7  0x0000c6cc in sock_sendto_vmeth (obj=<optimized out>, buf=<optimized out>, len=<optimized out>, flags=<optimized out>, dest_addr=0x20007d20 <tls_contexts+112>, addrlen=8) at /..../subsys/net/lib/sockets/sockets.c:1189
#8  0x0000cda4 in dtls_tx (ctx=<optimized out>, buf=<optimized out>, len=<optimized out>) at /..../subsys/net/lib/sockets/sockets_tls.c:464
#9  0x0001db1e in mbedtls_ssl_flush_output (ssl=ssl@entry=0x20007d30 <tls_contexts+128>) at /..../ext/lib/crypto/mbedtls/library/ssl_tls.c:2777
#10 0x0001ed02 in mbedtls_ssl_flight_transmit (ssl=ssl@entry=0x20007d30 <tls_contexts+128>) at /..../ext/lib/crypto/mbedtls/library/ssl_tls.c:3086
#11 0x0001ad9c in ssl_write_client_hello (ssl=ssl@entry=0x20007d30 <tls_contexts+128>) at /..../ext/lib/crypto/mbedtls/library/ssl_cli.c:1098
#12 0x0001b7c4 in mbedtls_ssl_handshake_client_step (ssl=0x20007d30 <tls_contexts+128>) at /..../ext/lib/crypto/mbedtls/library/ssl_cli.c:3544
#13 0x0001e14c in mbedtls_ssl_handshake_step (ssl=ssl@entry=0x20007d30 <tls_contexts+128>) at /..../ext/lib/crypto/mbedtls/library/ssl_tls.c:8064
#14 0x0001e180 in mbedtls_ssl_handshake (ssl=0x20007d30 <tls_contexts+128>) at /..../ext/lib/crypto/mbedtls/library/ssl_tls.c:8088
#15 0x0000d15e in tls_mbedtls_handshake (context=context@entry=0x2000e234 <contexts>, block=block@entry=true) at /..../subsys/net/lib/sockets/sockets_tls.c:780
#16 0x0000d86e in sendto_dtls_client (ctx=ctx@entry=0x2000e234 <contexts>, buf=buf@entry=0x20001168 <sock>, len=len@entry=4, flags=flags@entry=0, dest_addr=dest_addr@entry=0x0 <sys_pm_dump_debug_info>, addrlen=addrlen@entry=0) at /..../subsys/net/lib/sockets/sockets_tls.c:1391
#17 0x0000dea2 in ztls_sendto_ctx (ctx=0x2000e234 <contexts>, buf=0x20001168 <sock>, len=4, flags=0, dest_addr=dest_addr@entry=0x0 <sys_pm_dump_debug_info>, addrlen=addrlen@entry=0) at /..../subsys/net/lib/sockets/sockets_tls.c:1451
#18 0x0000df08 in tls_sock_sendto_vmeth (obj=<optimized out>, buf=<optimized out>, len=<optimized out>, flags=<optimized out>, dest_addr=0x0 <sys_pm_dump_debug_info>, addrlen=0) at /..../subsys/net/lib/sockets/sockets_tls.c:1981
#19 0x0000c70a in _impl_zsock_sendto (sock=<optimized out>, buf=buf@entry=0x20001168 <sock>, len=len@entry=4, flags=flags@entry=0, dest_addr=dest_addr@entry=0x0 <sys_pm_dump_debug_info>, addrlen=addrlen@entry=0) at /..../subsys/net/lib/sockets/sockets.c:467
#20 0x00002282 in zsock_sendto (addrlen=0, dest_addr=0x0 <sys_pm_dump_debug_info>, flags=0, len=4, buf=0x20001168 <sock>, sock=<optimized out>) at /..../projects/aquaboard/build/zephyr/include/generated/syscalls/socket.h:27
#21 zsock_send (flags=0, len=4, buf=0x20001168 <sock>, sock=<optimized out>) at /..../include/net/socket.h:160
#22 send (flags=0, len=4, buf=0x20001168 <sock>, sock=<optimized out>) at /..../include/net/socket.h:276
#23 tx_manager_init () at /..../projects/aquaboard/src/tx-manager.c:751
#24 0x000010d6 in main () at /..../projects/aquaboard/src/main.c:49

it submits process_tx_packet to work queue:

Breakpoint 19, net_if_tx (iface=0x2001df80 <__net_if_modem_ca2ut_0>, pkt=0x20017598 <_k_mem_slab_buf_tx_pkts+240>) at /..../subsys/net/ip/net_if.c:157
157	
(gdb) bt
#0  net_if_tx (iface=0x2001df80 <__net_if_modem_ca2ut_0>, pkt=0x20017598 <_k_mem_slab_buf_tx_pkts+240>) at /..../subsys/net/ip/net_if.c:157
#1  0x0003c1d2 in process_tx_packet (work=<optimized out>) at /..../subsys/net/ip/net_if.c:212
#2  0x0000732e in z_work_q_main (work_q_ptr=0x2000e938 <tx_classes>, p2=<optimized out>, p3=<optimized out>) at /..../lib/os/work_q.c:32
#3  0x0000731c in _thread_entry (entry=0x7325 <z_work_q_main>, p1=<optimized out>, p2=<optimized out>, p3=<optimized out>) at /..../lib/os/thread_entry.c:29
#4  0x2000e938 in rx_classes ()

Pretty sure that status = api->send(iface, pkt); was a call to offload_send before the commit I mentioned. Where, in your opinion, SHOULD the offload API have kicked in in my backtrace?

@tbursztyka
Copy link
Collaborator

You are using the master as I can see, and net_offload does not work anymore yet there. See #13618, but on master net_context_sendto_new (which is then renamed to net_context_send in the mentioned PR) is not calling net_offload API thus why it ends up using the core stack path (net_send_data and all), which obviously does not work.

It's going to be fixed by tomorrow, I just have myself to get back to my hw stash to verify it.

@rljordan-zz rljordan-zz assigned jukkar and tbursztyka and unassigned jukkar Mar 15, 2019
@nashif nashif added the priority: low Low impact/importance bug label Mar 19, 2019
@tautologyclub
Copy link
Contributor Author

Is it fixed? @nashif I certainly wouldn't classify this as low priority. The entire networking stack is currently unusable for us, and presumably for everyone else who relies on the net_offload API.

@jukkar
Copy link
Member

jukkar commented Mar 20, 2019

@tautologyclub can you re-try with latest master?

@tbursztyka
Copy link
Collaborator

@jukkar PR #13618 has not been merged yet, so it is still not working

@tbursztyka
Copy link
Collaborator

#13618 got merged. closing

@tautologyclub
Copy link
Contributor Author

Confirmed to work, bangup job @tbursztyka 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
Development

No branches or pull requests

5 participants