mimxrt1050_evb board: Can't get Ethernet to work #11586

Closed
pfalcon opened this issue Nov 21, 2018 · 87 comments
Labels
area: Boards area: Ethernet bug The issue is a bug, or the PR is fixing a bug platform: NXP NXP priority: low Low impact/importance bug

Comments

@pfalcon
Collaborator

pfalcon commented Nov 21, 2018

Describe the bug
This is a continuation of the discussion at (merged/closed) #10875 (comment) .

I cannot get an Ethernet connection to BOARD=mimxrt1050_evb. When I connect a patch cord between the mimxrt1050_evb and my laptop, the "link active" LED on the laptop side doesn't light up, i.e. it behaves as if the cable weren't connected. Accordingly, the network interface in Linux doesn't show "RUNNING" status in ifconfig. As discussed at the link above, on the mimxrt1050_evb both Ethernet jack LEDs are lit and never blink.

This same setup works as expected with frdm_k64f. I.e. if I just switch mimxrt1050_evb with frdm_k64f, leaving the same USB and Ethernet cables, it works, switching back it doesn't, again frdm_k64f - works, back to mimxrt1050_evb - doesn't.

To Reproduce

I'm using dumb_http_server as a reference sample to run.

Environment (please complete the following information):
My laptop is Thinkpad X230, with e1000e driver for Ethernet, Ubuntu 18.04:

Linux x230 4.15.0-39-generic #42-Ubuntu SMP Tue Oct 23 15:48:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

@agansari, can you please describe your test setup in detail, i.e. what is connected where, etc.?

@pfalcon
Collaborator Author

pfalcon commented Nov 21, 2018

As additional info, I tried to vary different things. At one point e1000e device was actually hosed, so switching back to frdm_k64f it didn't work, and there was a weird error in dmesg. That caused me some confusion, but after rebooting the laptop, I have a clear reproducible picture that connected frdm_k64f works, while mimxrt1050_evb - doesn't.

@pfalcon
Collaborator Author

pfalcon commented Nov 21, 2018

I tried another Eth cable too ;-).

@agansari
Collaborator

Hello @pfalcon I'm using a VM to develop and test. I have a USB-Ethernet adapter that is assigned to the VM as eth1 (eth0 - NAT to host). I'm using the adapter to connect the board to the VM.
I'm testing dumb_http_server and it does not work. It looks like the Ethernet driver is not correctly initialized on i.MX RT in the current baseline.
I'll fix the bug and test dumb_http_server (I need to set up DNS and NAT on eth1).

@pfalcon
Collaborator Author

pfalcon commented Nov 22, 2018

@agansari: Thanks for the info and confirmation, looking forward to a fix.

I'll fix the bug and test dumb_http_server (i need to setup DNS and NAT on eth1).

Note that dumb_http_server doesn't access the Internet in any way, so DNS or NAT should not be needed. dumb_http_server is, well, a simple web server. If everything runs OK, you can just access it from a desktop browser using http://192.0.2.1:8080/ . I use the Apache Bench (ab) tool on Linux, because I usually test that a large number of HTTP requests can be handled without anything going wrong (e.g. 1000 requests in a row).

@galak added the "bug" and "priority: low" labels on Nov 22, 2018

@agansari
Collaborator

Hello, today I rebased onto the latest firmware and Ethernet works as expected: initialization goes okay, LEDs blink as they should, and I also ran the benchmark:


This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.0.2.1 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        
Server Hostname:        192.0.2.1
Server Port:            8080

Document Path:          /
Document Length:        2122 bytes

Concurrency Level:      1
Time taken for tests:   5.996 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      2181000 bytes
HTML transferred:       2122000 bytes
Requests per second:    166.77 [#/sec] (mean)
Time per request:       5.996 [ms] (mean)
Time per request:       5.996 [ms] (mean, across all concurrent requests)
Transfer rate:          355.21 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.2      1       3
Processing:     4    5   0.4      5       8
Waiting:        3    5   0.4      5       8
Total:          4    6   0.3      6       9

Percentage of the requests served within a certain time (ms)
  50%      6
  66%      6
  75%      6
  80%      6
  90%      6
  95%      6
  98%      7
  99%      7
 100%      9 (longest request)

Can you have a go @pfalcon ? Latest commit I tested on is: 2d6a226

@pfalcon
Collaborator Author

pfalcon commented Nov 23, 2018

@agansari: Thanks for the update and detailed info. Not sure I'll get to it today, but will test it, thanks for the exact rev to use, to be on the same line.

@jeremy-e-mills

@agansari: I'm afraid it also does not work for me. I based my build on commit 2d6a226 and built dumb_http_server; there is no Ethernet activity, and both LEDs are permanently on with no blinking. The board does not generate any response to ARP broadcasts for its IP address.
If I build and run the echo_server sample, it shows the same behaviour; however, it sends a broadcast ARP request when I ping out from the board using its UART net console (it ignores the response from the target, though).

After some debugging, it seems that ENET_ReceiveIRQHandler() is never called from eth_mcux_dispacher_isr() in drivers/ethernet/eth_mcux.c, although when I ping out, ENET_TransmitIRQHandler() is called. So it looks as if the common EMAC interrupt is set up OK but is not being activated for received frames.

Not sure where to go from here, finding it a bit difficult to fully understand the code.

I have confirmed that the hardware is OK by loading a MCUXPresso UDP echo server demo project which works fine.

@agansari
Collaborator

@jeremy-e-mills have you cleaned the build folder workspace after rebase? Sounds like the old behavior.

cd ~/zephyr/samples/net/sockets/dumb_http_server/build
cd .. && rm -rf build && mkdir build && cd build
cmake -GNinja -DBOARD=mimxrt1050_evk ..
ninja debug

It works on my setup; if there are more issues, there may be something specific to my setup.

@jeremy-e-mills

@agansari: Yes, I cloned a fresh repository and am also building out of the zephyr tree, as I am using Eclipse for debug builds. I have deleted my build directories a few times and started again during the investigation. The same behaviour is observed with normal and Eclipse project builds, running XIP from hyperflash and from SRAM.

If it's my setup then I'm currently out of ideas. The samples run fine on the K64 board.

@agansari
Collaborator

@jeremy-e-mills I only tested with code in TCM memory, don't think XIP works at all. Does your code reach: mimxrt1050_evk_init() ?

@jeremy-e-mills

@agansari: Yes, it does and I have stepped through the pinmux set up for the EMAC connections. Sorry, should not perhaps have confused things by mentioning XIP. When I want this I cheat by building for XIP in hyperflash with a 0x2000 text section offset. I then replace the first 0x2000 bytes with the contents of another file containing an XIP header binary file that I created by stripping the first 0x2000 bytes from an XIP MCUXPresso project.
Had to do this to run the echo server demo with additional debug enabled as when these are enabled it won't fit into the default 128KB code optimised ram, my actual debugging has been from RAM using Eclipse.
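
For readers following along, the splicing step described above can be sketched in C on in-memory buffers. This is only an illustration: the helper name `splice_xip_header` and the buffer handling are hypothetical, and only the 0x2000 offset is taken from the comment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of the region reserved at the start of the image (0x2000, per the comment above). */
#define XIP_HEADER_SIZE 0x2000u

/*
 * Hypothetical helper: overwrite the first XIP_HEADER_SIZE bytes of `image`
 * with the first XIP_HEADER_SIZE bytes of `header`.
 * Returns 0 on success, -1 if either buffer is shorter than the header region.
 */
int splice_xip_header(uint8_t *image, size_t image_len,
                      const uint8_t *header, size_t header_len)
{
    if (image_len < XIP_HEADER_SIZE || header_len < XIP_HEADER_SIZE) {
        return -1;
    }
    memcpy(image, header, XIP_HEADER_SIZE);
    return 0;
}
```

Everything from offset 0x2000 onwards is left untouched, which is why the application has to be linked with a matching text section offset.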

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

@jeremy-e-mills: Just wanted to mention that I find your comments insightful; thanks for mentioning XIP and otherwise giving detailed info on the steps you take. I'll try to join the debugging fun as soon as I can. I feel a bit tired today, but hope to get to it tomorrow.

@jeremy-e-mills

@pfalcon: Hi, glad to have someone else looking into this. I should have contributed earlier but have been a bit reluctant as I'm a newbie to zephyr.
It's all very strange. From debug output the PHY looks happy, it negotiates 100M duplex and can see the Ethernet cable removed and replaced and can send ARP packets, but just does not see any incoming traffic, with no led activity. It's so mad I have been thinking about my build environment but have ruled every thing out (I think!). Will get onto it again tomorrow.

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

Latest commit I tested on is: 2d6a226

Ok, tested this now, and get the same picture as I described above: my laptop's Ethernet link active LED doesn't light up, i.e. it doesn't think there's a connection carrier.

@jeremy-e-mills : What about you in this case? Can you confirm that the peer sees cable connection between itself and the board? And can you describe your test setup?

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

So, what I'm doing is testing using samples/net/echo_server with ipv6 disabled. I then enabled:

+CONFIG_ETHERNET_LOG_LEVEL_DBG=y
+CONFIG_ETH_MCUX_PHY_EXTRA_DEBUG=y

And I saw log output below. And then suddenly I noticed that link LED on laptop is up! Still no pings though.

uart:~$ ***** Booting Zephyr OS zephyr-v1.13.0-2149-g2d6a226b2e *****
[00:00:00.031,856] <dbg> eth_mcux.eth_0_init: MAC 00:04:9f:49:38:be
[00:00:00.031,856] <dbg> eth_mcux.eth_mcux_phy_start: phy_state=initial
[00:00:00.031,909] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=initial
[00:00:00.031,960] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=reset
[00:00:00.032,013] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=autoneg
[00:00:00.032,066] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=restart
[00:00:00.032,117] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:01.101,444] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:01.101,495] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:02.110,005] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:02.110,056] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:03.120,006] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:03.120,057] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:04.130,005] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:04.130,058] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:04.130,111] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-duplex
[00:00:04.130,111] <inf> eth_mcux.eth_mcux_phy_event: Enabled 100M full-duplex mode.
[00:00:05.140,006] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:05.140,059] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:06.150,007] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait
[00:00:06.150,060] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:00.001,723] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

Btw, this is the output I get:

[00:00:06.150,060] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=read-status
[00:00:00.001,723] <dbg> eth_mcux.eth_mcux_phy_event: phy_state=wait

I would imagine there's a timing source bug somewhere, can you look into that, @agansari ?
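
A small check like the following could flag such jumps automatically when scanning a captured log. It is a sketch that assumes the timestamp format is `[hh:mm:ss.mmm,uuu]` as in the lines above; the function names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Parse a log timestamp of the assumed form "[hh:mm:ss.mmm,uuu]" at the
 * start of `line` into microseconds. Returns -1 if the pattern is absent.
 */
int64_t log_ts_to_us(const char *line)
{
    unsigned int h, m, s, ms, us;

    if (sscanf(line, "[%u:%u:%u.%u,%u]", &h, &m, &s, &ms, &us) != 5) {
        return -1;
    }
    return ((((int64_t)h * 60 + m) * 60 + s) * 1000 + ms) * 1000 + us;
}

/* Returns 1 if the timestamp in `next` is earlier than the one in `prev`, else 0. */
int log_ts_went_backwards(const char *prev, const char *next)
{
    int64_t a = log_ts_to_us(prev);
    int64_t b = log_ts_to_us(next);

    return a >= 0 && b >= 0 && b < a;
}
```

For the two lines quoted above, the first parses to 6150060 us and the second to 1723 us, so the pair is reported as going backwards.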

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

@agansari : Btw, this else-if ladder doesn't look too right to me, what if there're multiple interrupts to serve, why make it call ISR again instead of serving all in one go?

        if (EIR & (kENET_RxBufferInterrupt | kENET_RxFrameInterrupt)) {
                ENET_ReceiveIRQHandler(ENET, &context->enet_handle);
        } else if (EIR & (kENET_TxBufferInterrupt | kENET_TxFrameInterrupt)) {
                ENET_TransmitIRQHandler(ENET, &context->enet_handle);
        } else if (EIR & ENET_EIR_MII_MASK) {
                k_work_submit(&context->phy_work);
                ENET_ClearInterruptStatus(ENET, kENET_MiiInterrupt);
        } else if (EIR) {
                ENET_ClearInterruptStatus(ENET, 0xFFFFFFFF);
        }
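
To illustrate the alternative, here is a minimal self-contained sketch that serves every pending source in one invocation by using independent `if` statements. The bit assignments and counters are hypothetical stand-ins, not the SDK's values or calls; it is not a proposed patch for eth_mcux.c.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only (not the SDK's values). */
#define EV_RX  (1u << 0)
#define EV_TX  (1u << 1)
#define EV_MII (1u << 2)

/* Counters standing in for the real handler calls. */
struct isr_counts {
    unsigned int rx, tx, mii, other;
};

/* Handle every pending source found in `eir` during a single call. */
void dispatch_all(uint32_t eir, struct isr_counts *c)
{
    if (eir & EV_RX) {
        c->rx++;    /* stand-in for ENET_ReceiveIRQHandler() */
    }
    if (eir & EV_TX) {
        c->tx++;    /* stand-in for ENET_TransmitIRQHandler() */
    }
    if (eir & EV_MII) {
        c->mii++;   /* stand-in for submitting the PHY work item */
    }
    if (eir & ~(EV_RX | EV_TX | EV_MII)) {
        c->other++; /* stand-in for clearing any remaining flags */
    }
}
```

With RX and TX both pending, one call bumps both counters, whereas the ladder above would handle only RX on that entry.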

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

eth_mcux_dispacher_isr() in drivers/ethernet/eth_mcux.c, although when I ping out, ENET_TransmitIRQHandler() is called, so it looks as if the common EMAC interrupt is set up OK but not being activated for received frames.

I confirm that I see this behavior too.

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

I confirm that I see this behavior too.

And I confirm that I see the ARP request from imx in Wireshark on the host side, but imx doesn't see the reply:

84	1037.105739249	Freescal_49:38:be	Broadcast	ARP	60		Who has 192.0.2.2? Tell 192.0.2.1
85	1037.105757409	WistronI_c8:94:35	Freescal_49:38:be	ARP	42		192.0.2.2 is at 3c:97:0e:c8:94:35

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

So, link status goes up and down erratically for me - during startup of new debugging session, and not too often. For example, just got it down and can't recover so far.

@pfalcon
Collaborator Author

pfalcon commented Nov 27, 2018

Was able to recover. So again, just works erratically w/o too clear pattern.

@jeremy-e-mills

The problem is with the PHY setup. I disabled the PHY reset in mimxrt1050_evk_init() and rebuilt. I then flashed a working Ethernet MCUXpresso project into hyperflash, and confirmed LED activity and that it responded to pings. I then debugged the echo server sample project (which now no longer resets the PHY at start). The Ethernet now functions correctly: LED activity, ARP responses OK, etc.

So, two things:-
The configuration of the PHY at startup does not break it if it was previously working.
The configuration of the PHY from its reset state is not working.

@agansari: Why do you set GPIO1.10 to be an output in mimxrt1050_evk_init()? It's the INTRP output from the PHY.

@agansari
Collaborator

@jeremy-e-mills
GPIO1.10 - ENET_INT from PHY to MCU (GPIO_AD_B0_10)
GPIO1.9 - ENET_RST from MCU to PHY (GPIO_AD_B0_09)

Let me understand: you don't pull the reset now, and the PHY works if it was previously correctly enabled by another firmware? If you run a demo that does not enable the PHY before running the sample project, does it still work?
I'll try this out as well: use MCUXpresso demos that do enable/disable the PHY and then run the dumb_http_server sample.

@agansari
Collaborator

agansari commented Dec 18, 2018

I just noticed that on iMX RT side, the orange LED in Ethernet jack blinks all the time, with the same frequency (~5Hz, to a naked eye)

@pfalcon yes, that isn't the behavior I'm observing: unless there's an actual Rx packet I don't see any LED activity, and it's also in sync with the network adapter's LEDs.

@pfalcon
Collaborator Author

pfalcon commented Dec 18, 2018

Not fully fixed by #11882, reopening.

@agansari: Thanks for describing the LED pattern you see, definitely something strange happens on my side. I also tried to plug iMX RT into my home router instead of my laptop, and the blinking pattern is the same.

@agansari
Collaborator

Hello @pfalcon I've made some changes to i.MX RT ethernet driver after some SDK updates. Can you test #12465 pull request?
Do you still observe any issues?

@pfalcon
Collaborator Author

pfalcon commented Jan 18, 2019

@agansari:

I've made some changes to i.MX RT ethernet driver after some SDK updates. Can you test #12465 pull request?
Do you still observe any issues?

Yeah, I didn't respond earlier as I was away from the board during the winter holidays. I'm now back in its vicinity, and going to test new PRs.

@mubes
Contributor

mubes commented Jan 18, 2019

I'm not sure if this is useful, but in NuttX I saw an issue with early IMXRT1050 boards where the PHY goes into NANDTree mode. On the EVKA it can be set by pin strapping on pin 21 of the PHY, and there are configuration resistors to set the default config (p. 10 of the circuit diagram). Where it gets odd is that R152 (to GND) and R309 (to ENET_3V3) are both populated on my board, so unless I'm misunderstanding something, pin 21 will be at (ENET_3V3 / 2) during boot: sometimes it'll come up in NANDTree, sometimes sensibly. That would mean the problem could be fixed by removing R152 (which should have been DNP, but is on the board), but a soft fix is more portable: just make sure the NANDTree bit is clear in the PHY config.
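
As a rough check of the divider argument, assuming R152 and R309 have equal values (the actual resistances are not stated here), that ENET_3V3 is a 3.3 V rail as its name suggests, and that nothing else loads the pin:

```latex
V_{\text{pin21}}
  = V_{\text{ENET\_3V3}} \cdot \frac{R_{152}}{R_{152} + R_{309}}
  = 3.3\,\text{V} \cdot \frac{1}{2}
  = 1.65\,\text{V}
```

A level that close to mid-rail is neither a clean logic high nor a clean logic low, which would fit the strap being sampled differently from one boot to the next.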

@agansari
Collaborator

@mubes I have not observed this on EVKB boards, there were some power issues in EVKA boards and revision 0 of the chip (see https://www.nxp.com/docs/en/nxp/application-notes/AN12146.pdf).
To prevent PHY entering NANDTree we pull up the ENET_INT/NANDTree pin at boot time.

GPIO_PinInit(GPIO1, 10, &enet_gpio_config);

/* pull up the ENET_INT before RESET. */
GPIO_WritePinOutput(GPIO1, 10, 1);

@pfalcon
Collaborator Author

pfalcon commented Jan 21, 2019

Ok, testing with pristine master 26c9c74 . The situation is the same - TX packets from board seen in Wireshark, but the board itself doesn't see RX.

@pfalcon reopened this on Jan 21, 2019

@pfalcon
Collaborator Author

pfalcon commented Jan 21, 2019

With #12465, situation ain't much better: #12465 (comment)

@pfalcon
Collaborator Author

pfalcon commented Jan 21, 2019

Now going into "random poking" mode:

To prevent PHY entering NANDTree we pull up the ENET_INT/NANDTree pin at boot time.
/* pull up the ENET_INT before RESET. */
GPIO_WritePinOutput(GPIO1, 10, 1);

@agansari, Remember me complaining that I always get weird ~5Hz blinking of Eth jack's orange LED? if I change that to:

GPIO_WritePinOutput(GPIO1, 10, 0);

- the blinking is gone! I get ~1/3 bright both LEDs instead (which may mean 100+Hz blinking of course), and everything else works the same! I.e., TX packets reach host, RX don't reach board.

@pfalcon
Collaborator Author

pfalcon commented Jan 21, 2019

@agansari, @jeremy-e-mills: I wonder if we can approach this mystery in the following manner: can you send me a known working binary with Ethernet support built for another RTOS (as mentioned above by @jeremy-e-mills), so I can try it on my board?

@mubes
Contributor

mubes commented Jan 21, 2019

1/3 bright LEDs is NANDTree mode, so it sounds like this is a red herring, sorry. However, you can just check the state of the NANDTree bit in the Operation Mode Strap Override Register (Reg 16 bit 5) and reset it if needed. Here is the patch that was made in NuttX for this case:

ret = imxrt_readmii(priv, phyaddr, MII_KSZ8081_OMSO, &phydata);
if (ret < 0)
{
     nerr("ERROR: Failed to read MII_KSZ8081_OMSO\n");
     return ret;
}
imxrt_writemii(priv, phyaddr, MII_KSZ8081_OMSO,   (phydata&(~(1 << 5))));
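
Stripped of the MII access calls, the read-modify-write above boils down to clearing one bit. A minimal sketch, assuming the NANDTree flag is bit 5 as stated in this comment; the macro and function names are made up for illustration and are not NuttX or Zephyr APIs:

```c
#include <stdint.h>

/* NANDTree bit position in the override register (bit 5, per the comment above). */
#define OMSO_NANDTREE_BIT (1u << 5)

/* Returns nonzero if the NANDTree bit is set in the given register value. */
int omso_nandtree_set(uint16_t omso)
{
    return (omso & OMSO_NANDTREE_BIT) != 0;
}

/* Returns the value to write back: the register content with NANDTree cleared. */
uint16_t omso_clear_nandtree(uint16_t omso)
{
    return (uint16_t)(omso & ~OMSO_NANDTREE_BIT);
}
```

All other bits of the value read from the PHY are preserved, so writing the result back only changes the NANDTree setting.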

@AntoineZen
Contributor

@mubes I have tried removing R152 as you suggested here on my EVKA (Rev A5) board, but this does not solve the issue. The LEDs remain in the same state, and the board is able to Tx but not Rx (confirmed with Wireshark).

I have also tried PR #12465 (the last commit should prevent NAND-Tree), with no better success. With this PR, the link does not even come up!

@agansari
Collaborator

@pfalcon I've pushed another commit on pull request #12465 with a configuration similar to what @mubes has done to prevent PHY entering NAND Tree. I have not observed this issue on my board.

@agansari
Collaborator

@AntoineZen did you get the latest push? You may have gotten the first commit that did not work.

@AntoineZen
Contributor

AntoineZen commented Jan 22, 2019

@agansari I have seen that the commit from PR #12465 has changed! (How is that possible? Did you force-push?) I have tested again and now Ethernet is working! Great work! (It seems that our comments have crossed ;-) )

@agansari
Collaborator

Yes, I've force pushed after testing the patch :) Thank you for testing.

@pfalcon
Collaborator Author

pfalcon commented Jan 22, 2019

I've pushed another commit on pull request #12465 with a configuration similar to what @mubes

Thanks @mubes and @agansari, but no luck for me with the latest version of #12465 still.

@pfalcon
Collaborator Author

pfalcon commented Jan 22, 2019

To rule out any possibility, I tried the following patch:

--- a/drivers/ethernet/eth_mcux.c
+++ b/drivers/ethernet/eth_mcux.c
@@ -394,11 +394,13 @@ static void eth_mcux_phy_setup(void)
                kENET_MiiReadValidFrame);
        status = ENET_ReadSMIData(ENET);
 
-       if (status & PHY_OMS_NANDTREE_MASK) {
+printk("nandtree: %x\n", status & PHY_OMS_NANDTREE_MASK);
+
+//     if (status & PHY_OMS_NANDTREE_MASK) {
                status &= ~PHY_OMS_NANDTREE_MASK;
                ENET_StartSMIWrite(ENET, phy_addr, PHY_OMS_OVERRIDE_REG,
                        kENET_MiiWriteValidFrame, status);
-       }
+//     }

Prints 0. But I noticed that @mubes's code above reads and writes the same reg, while @agansari, you read STATUS and write OVERRIDE. Is that intended?

@agansari
Collaborator

@pfalcon both should work fine, OVERRIDE is also RW, STATUS is RO, did you check it out reading OVERRIDE? Status 0 means it's not the case on your side, you can also try to write without the if statement.

Also I see an issue if board is powered up and phy's not yet stabilized, ETH may not work. Resetting the board makes it work because phy was stable for long enough.

Attached a bare-metal example binary.
evkbimxrt1050_enet_txrx_transfer.zip

@pfalcon
Collaborator Author

pfalcon commented Jan 22, 2019

did you check it out reading OVERRIDE?

I did.

Status 0 means it's not the case on your side, you can also try to write without the if statement.

I did, per the patch above.

Also I see an issue if board is powered up and phy's not yet stabilized, ETH may not work. Resetting the board makes it work because phy was stable for long enough.

If you mean resetting by Reset button, I did.

Attached a bare-metal example binary.

Thanks! Will look into it a bit later.

@MaureenHelm
Member

@pfalcon are you still seeing this issue?

@pfalcon
Collaborator Author

pfalcon commented Feb 22, 2019

@MaureenHelm, well, yeah, nothing changes on my side. I actually didn't look into it lately, was preempted with "make sockets work on a real-world 3rd-party app" work.

@galak
Collaborator

galak commented Mar 7, 2019

@pfalcon any update on this?

@pfalcon
Collaborator Author

pfalcon commented Apr 2, 2019

OK, so @agansari and I got together at Linaro Connect to look into this. After we made sure that we had the same hardware setup (jumpers, etc.) and Zephyr version, and after I upgraded to the latest west tool, it worked for me. We did note that there's a chance that Ethernet bootstrap might fail on startup, but I definitely had it working. Thanks for your patience here.

@pfalcon closed this as completed on Apr 2, 2019