I2S transfers causes exception/crash in xtensa/Intel S1000 #13223
Comments
@dcpleung can you please take a look. I still have no jtag access. |
I can reproduce the exception with
|
I looked into it a bit more. The exception seems to be caused by any DMA transfers. When I commented out the Also, when I disabled ASM2, turned on CONFIG_DEBUG and CONFIG_NO_OPTIMIZATIONS, From these observations, I don't believe it is an issue with memory slab. Using |
I disabled the DMA test and found out that, even without the |
Assigning this back to driver and test owners. |
Any updates? |
@dcpleung @sathishkuttan This is really weird, but I am seeing exactly the opposite behaviour with the HEAD at 766edd4. The static K_MEM_SLAB_DEFINE works whereas the dynamic k_mem_slab doesn't. k_mem_slab_alloc fails on the very first call (line 621 in drivers/i2s/i2s_cavs.c), citing "insufficient memory". I didn't see any exception though. Can you guys please confirm this behaviour at your end? PS: I made 2 changes to the current HEAD.
|
This is getting trickier. Looks like there are some timing related issues in the i2s test/driver. All my analysis below is for K_MEM_SLAB_DEFINE. |
I just tested it again on my setup.
From what I see here, the Also I haven't seen any exception or crashes. |
When |
Yes, verified your observations to be correct. Also, the stack check errors don't go away after doubling both the i2s_test thread's stack size and the interrupt stack size to 4096.
This, to me, looks like something timing dependent. Maybe it messes up some sequence in the code, which then results in some kind of stack corruption. |
I don't have your printk changes so I cannot reproduce on my side. Here is what I have been trying today @ https://github.com/dcpleung/zephyr/tree/sue_creek_i2s_test. I had to comment out the SPI flash test because the SPI flash on my board is bad (can't even talk to it via dediprog). With or without CONFIG_STACK_SENTINEL and CONFIG_LOG (4 cases), the app worked without exceptions. |
I have uploaded my branch at https://github.com/rgundi/zephyr/tree/i2s_crash. The various results are in the attached file. I did not see any silent failures/exceptions with these tests. All I saw was Stack failures and/or mem_alloc failures. |
The two |
@rgundi Could you try https://github.com/dcpleung/zephyr/commits/sue_creek_i2s_test again? I have updated the branch with a few new patches. I tried the different scenarios in your xlsx file and they are all passing with my changes. |
@rgundi printk should not be used once the interface has started because it can delay the reloading of DMA, resulting in transfer errors. @rgundi @dcpleung based on some of your observations, my hypothesis is that the crashes/exceptions are a result of unintended eviction of data from the cache. Since the entire internal memory is configured with the writeback data cache attribute, some data structures used by the program may share a cache line with the mem_slab buffers. When the data cache is invalidated, any of those contents that have not yet been written back to memory are discarded and replaced by the contents of memory, causing non-deterministic behavior. |
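The hypothesis above can be illustrated with a toy model. This is only a sketch, not driver code: it models a single cache line in plain C, assumes a 64-byte line, and every name in it (`line_model`, `cpu_write`, `dma_write`, `cache_invalidate`, `neighbour_after_invalidate`) is hypothetical. A write-through variant is included for contrast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64u /* assumed cache line size for this model */

enum cache_policy { WRITE_BACK, WRITE_THROUGH };

/* One cache line sitting in front of one line-sized chunk of "RAM". */
struct line_model {
    uint8_t ram[LINE_SIZE];    /* what DMA and memory see */
    uint8_t cached[LINE_SIZE]; /* what the CPU sees while the line is valid */
    bool valid;
    bool dirty;
    enum cache_policy policy;
};

/* Bring the line into the cache if it is not there yet. */
static void model_fill(struct line_model *m)
{
    if (!m->valid) {
        memcpy(m->cached, m->ram, LINE_SIZE);
        m->valid = true;
        m->dirty = false;
    }
}

/* CPU store: always hits the cache, reaches RAM only with write-through. */
static void cpu_write(struct line_model *m, unsigned int off, uint8_t val)
{
    assert(off < LINE_SIZE);
    model_fill(m);
    m->cached[off] = val;
    if (m->policy == WRITE_THROUGH) {
        m->ram[off] = val;
    } else {
        m->dirty = true;
    }
}

static uint8_t cpu_read(struct line_model *m, unsigned int off)
{
    assert(off < LINE_SIZE);
    model_fill(m);
    return m->cached[off];
}

/* DMA store: bypasses the cache and writes RAM directly. */
static void dma_write(struct line_model *m, unsigned int off, uint8_t val)
{
    assert(off < LINE_SIZE);
    m->ram[off] = val;
}

/* Invalidate without write-back: any dirty bytes in the line are lost. */
static void cache_invalidate(struct line_model *m)
{
    m->valid = false;
    m->dirty = false;
}

/*
 * Scenario: bytes 0..31 of the line belong to a DMA receive buffer and
 * byte 32 is an unrelated variable (for example part of a thread stack).
 * Returns what the CPU reads from the unrelated variable after the line
 * has been invalidated to pick up the DMA data.
 */
uint8_t neighbour_after_invalidate(enum cache_policy policy)
{
    struct line_model m;

    memset(&m, 0, sizeof(m));
    m.policy = policy;

    cpu_write(&m, 32, 0xAB);          /* CPU updates the neighbour */
    dma_write(&m, 0, 0x11);           /* DMA fills the buffer in RAM */
    cache_invalidate(&m);             /* invalidate before reading */
    assert(cpu_read(&m, 0) == 0x11);  /* DMA data is visible either way */
    return cpu_read(&m, 32);
}
```

In this model the neighbour's value is lost under write-back, because it only ever lived in the discarded line. It survives under write-through, because every CPU store also reached RAM.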
Tested with your branch and I concur with your findings. All the scenarios passed for me as well. I had to change your latest patch a bit though, as I was not able to get JTAG working reliably with that patch.
I see that tests specific to DMA were failing but the I2S bidirectional transfer was successful. This testing was with JTAG connected. |
I ran 2 experiments.
|
So, my inference is that there are/were 3 separate problems.
|
This is easy to reproduce. Further characterization in progress.
There's an exception on commit 69b6cc8 (with the workaround for #13710) even after enabling XTENSA_ASM2. That means we don't have an answer to why no exception is seen on the latest HEAD.
We realized xthal_dcache_region_writeback_inv was not the right API to use and debugged further. Looks like aligning i2s_mem_slab to XCHAL_DCACHE_LINESIZE solves the issue. Below are the map file snippets around i2s_mem_slab for both the cases. With alignment: Without alignment: When there's no alignment to XCHAL_DCACHE_LINESIZE, invalidating the i2s_mem_slab buffer may also end up invalidating the thread stacks surrounding this object (the dma_thread and i2s_thread stacks). The next time these threads access their stacks, garbage data will be pulled from memory, resulting in non-deterministic behavior. So, wherever we have a buffer which would get invalidated (mostly the Rx buffers for dma, i2s etc), we would need to align it to XCHAL_DCACHE_LINESIZE. |
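A minimal sketch of the alignment idea in portable C11, not the actual test code. `DCACHE_LINE_SIZE` stands in for XCHAL_DCACHE_LINESIZE and is assumed to be 64 here. The block size and count are hypothetical, as are `rx_blocks`, `rx_block` and `covers_whole_lines`. The sketch also rounds each block up to a whole number of lines, because a start-aligned buffer can still share its last line with whatever follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte data cache line. On the target this value would
 * come from XCHAL_DCACHE_LINESIZE instead. */
#define DCACHE_LINE_SIZE 64u

#define BLOCK_SIZE_BYTES 100u /* hypothetical block size */
#define NUM_BLOCKS 4u         /* hypothetical block count */

/* Round a size up to a whole number of cache lines. */
#define ROUND_UP_TO_LINE(n) \
    ((((n) + DCACHE_LINE_SIZE - 1u) / DCACHE_LINE_SIZE) * DCACHE_LINE_SIZE)

/* The array starts on a line boundary and every block occupies whole
 * lines, so no block shares a cache line with a neighbouring object. */
static _Alignas(DCACHE_LINE_SIZE) uint8_t
    rx_blocks[NUM_BLOCKS][ROUND_UP_TO_LINE(BLOCK_SIZE_BYTES)];

/* True when [addr, addr + len) starts and ends on cache line boundaries. */
int covers_whole_lines(const void *addr, size_t len)
{
    return ((uintptr_t)addr % DCACHE_LINE_SIZE == 0u) &&
           (len % DCACHE_LINE_SIZE == 0u);
}

/* Return the start of block i. */
uint8_t *rx_block(size_t i)
{
    assert(i < NUM_BLOCKS);
    return rx_blocks[i];
}
```

In the reproduction from the issue body, the last argument to K_MEM_SLAB_DEFINE is the slab alignment, and it is set to 4. That is smaller than a cache line, so the slab buffer can start in the middle of a line.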
That was not the realization. It's the right API if the problem is due to valid data eviction from the cache upon invalidation.
In order to root cause, can you post the memory addresses of the buffers as well as the k_mem_slab structure to see if they were in the same cache "line"? |
With K_MEM_SLAB_DEFINE onto top of 7352d2b:
Without
Where the |
Good info.
For the record, can you update the comment with the start & end address of audio_buffers for the case "With |
There is no |
OK, I just wanted the address of the buffers and didn't mean to say specifically the
Agree, this shows that the 1st 32 bytes of a cache line will be that of the buffer and the 2nd 32 bytes of the same cache line will be that of the stack.
It's not aligned to cache line, though. |
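The check discussed in this exchange, whether a buffer and a neighbouring object fall in the same cache line, can be sketched as below. The function names are hypothetical. The 64-byte line size passed in the usage note follows from the "1st 32 bytes / 2nd 32 bytes" observation above, and the addresses there are made-up examples, not values from the map files in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the cache line containing addr; line_size must be a power of two. */
static uintptr_t line_index(uintptr_t addr, size_t line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    return addr / line_size;
}

/* True when two non-empty objects touch at least one common cache line. */
bool shares_cache_line(uintptr_t a_start, size_t a_len,
                       uintptr_t b_start, size_t b_len, size_t line_size)
{
    assert(a_len > 0 && b_len > 0);

    uintptr_t a_first = line_index(a_start, line_size);
    uintptr_t a_last = line_index(a_start + a_len - 1, line_size);
    uintptr_t b_first = line_index(b_start, line_size);
    uintptr_t b_last = line_index(b_start + b_len - 1, line_size);

    /* The two inclusive line ranges overlap. */
    return a_first <= b_last && b_first <= a_last;
}
```

For example, with a 64-byte line, a 0x60-byte buffer at the hypothetical address 0x1000 ends halfway through the line at 0x1040. A stack placed right behind it at 0x1060 shares that line. If the buffer is 0x80 bytes long and the stack starts at 0x1080, the two no longer share a line.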
This would explain why turning on
With Note that as long as |
Also, with 7352d2b and |
Upon configuring the memmap_cache_attr to 0x4212FFF2, I do not see the issue anymore with both static K_MEM_SLAB_DEFINE and the dynamic k_mem_slab (latest HEAD a942fcc). The cache attribute is set to "write through" for the memory addresses in this case. |
The i2s_cavs.c driver manipulates cache lines before commencing any DMA transfers. With write-back cache, if the DMA receive buffer is not aligned to the cache lines, the data around the buffer will be invalidated and may never be written to memory. Since the driver takes an external memory slab as buffer and there is no easy way to force cache line alignment on the application side, set the cached region to write-through to avoid potential issues. Fixes zephyrproject-rtos#13223 Signed-off-by: Daniel Leung <[email protected]>
Describe the bug
When a memory slab is defined using the K_MEM_SLAB_DEFINE macro, exceptions/crashes are seen on an Intel S1000 board.
To Reproduce
Steps to reproduce the behavior:
tests/boards/intel_s1000_crb/src/i2s_test.c
i2s_mem_slab and audio_buffers definitions
K_MEM_SLAB_DEFINE(i2s_mem_slab, BLOCK_SIZE_BYTES, NUM_I2S_BUFFERS, 4);
PRIORITY, 0, K_MSEC(100));
CONFIG_LOG=n in tests/boards/intel_s1000_crb/prj.conf
Expected behavior
LEDs blink, DMA transfers successful, I2S transfers successful.
Screenshots or console output
Environment (please complete the following information):
Additional context
If an explicit k_mem_slab object is defined and initialized using k_mem_slab_init, there is no exception/crash.