
Bluetooth: Mesh: 2nd time commissioning configuration details (APP Key) not get saved on SoC flash #12574


Closed
vikrant8052 opened this issue Jan 18, 2019 · 22 comments · Fixed by #14735
Assignees
Labels
area: Bluetooth area: Settings Settings subsystem bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug

Comments

@vikrant8052
Contributor

Board used for testing : nRF52840_PCA10056
App : samples/boards/nrf52/mesh/onoff_level_lighting_vnd_app

Description:

Flash the firmware using the following commands:

nrfjprog --eraseall -f nrf52
nrfjprog --program zephyr.hex -f nrf52
nrfjprog --reset -f nrf52

After this, if we provision and configure the Bluetooth Mesh device,
everything works fine. Even after a reset or power cycle, the firmware retrieves the Mesh
configuration details from persistent storage.

But if we then unprovision the node, either with the smartphone app or
by calling bt_mesh_reset(), then from the next time onward, after the device is
newly provisioned and configured, it does not retrieve all configuration parameters
from SoC persistent storage after a reset.

(Note: After the second provisioning and configuration, everything works normally until the device is reset.)

After a power reset, we have to reconnect to every node with the provisioner app and reassign the App key
to the models. Sometimes we even have to reassign the subscription group address.

@vikrant8052 vikrant8052 added the bug The issue is a bug, or the PR is fixing a bug label Jan 18, 2019
@jhedberg
Member

I tested many cycles of what you describe when I implemented storage (settings) support for mesh, and it was working fine then. I wonder if this is a regression, a new use case that wasn't previously tested, or some bug in your application.

@jhedberg jhedberg self-assigned this Jan 19, 2019
@jhedberg
Member

If you have the chance, please enable CONFIG_BT_DEBUG_SETTINGS=y and CONFIG_BT_MESH_DEBUG_SETTINGS=y and try to see if you spot anything strange.

@vikrant8052
Contributor Author

vikrant8052 commented Jan 21, 2019

log.txt

@jhedberg
For your reference, I've attached debug log file.

@vikrant8052
Contributor Author

@jhedberg
Did you find the previously attached debug log useful?

FYI, I can't check the mentioned scenario with samples/boards/nrf52/mesh/onoff-app because of a BUS FAULT:

***** Booting Zephyr OS zephyr-v1.13.0-3474-gdcad256106 *****
Initializing...
***** BUS FAULT *****
Instruction bus error
***** Hardware exception *****
Current thread ID = 0x20001cf4
Faulting instruction address = 0x11f38020
Fatal fault in essential thread! Spinning...

@jhedberg
Member

@Vikrant8051 have you tried reproducing with gdb and getting a backtrace? Also, have you verified that the usual suspect, i.e. stack overflow, isn't the cause?

@vikrant8052
Contributor Author

@jhedberg

I have now tested this multiple times and would conclude that the issue is only with the
App key. After the second provisioning and a reset, the App key is either not
properly bound to the models or not retrieved from persistent storage.

If there were a bug in the app itself, why would it affect only
the App key?

@vikrant8052
Contributor Author

@jhedberg

have you tried reproducing with gdb and getting a backtrace?

Sorry, I don't know how to do that.

Also, have you verified that the usual suspect, i.e. stack overflow, isn't the cause?

I am trying now.

@vikrant8052
Contributor Author

vikrant8052 commented Jan 24, 2019

@jhedberg

Debug log after enabling CONFIG_INIT_STACKS=y ...

log2.txt

I didn't find any stack overflow event.

@vikrant8052
Contributor Author

vikrant8052 commented Jan 24, 2019

@jhedberg
The nRF Mesh app lets us select one App key out of three.

We face the issue only if we use the first of the available App keys every time.
In other words, if the App key index is changed for every new provisioning,
there is no need to reassign the App key to the models after a reset.

@jhedberg
Member

jhedberg commented Jan 24, 2019

Debug log after enabling CONFIG_INIT_STACKS=y ...
I didn't find any stack overflow event.

That only shows the Bluetooth stacks, but not e.g. the system workqueue stack, which we use a lot from the Bluetooth code. The best way I've discovered to get full stack info is by enabling the kernel shell module and then using the "kernel stacks" shell command. This requires the following Kconfig options:

CONFIG_THREAD_STACK_INFO=y
CONFIG_THREAD_MONITOR=y
CONFIG_KERNEL_SHELL=y
CONFIG_INIT_STACKS=y
CONFIG_THREAD_NAME=y

If you don't have the possibility of running a shell, you could look into what C API the "kernel stacks" shell command uses, and add that call somewhere to your code.
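As background on what such a stack check measures: CONFIG_INIT_STACKS pre-fills each stack with a known byte pattern, and the unused space is the run of bytes that still hold that pattern. The following is a minimal host-side sketch of that technique only. The helper names and the buffer are hypothetical and this is not Zephyr's API; the 0xaa fill value is assumed here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_PATTERN 0xaa /* fill byte assumed for this sketch */

/* Fill a stack buffer with the pattern before the thread starts using it. */
void stack_init_pattern(uint8_t *stack, size_t size)
{
	memset(stack, STACK_FILL_PATTERN, size);
}

/*
 * Count the bytes at the low end of the buffer that still hold the pattern.
 * This assumes a stack that grows downward, so the lowest addresses are
 * the last ones to be written.
 */
size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
	size_t unused = 0;

	while (unused < size && stack[unused] == STACK_FILL_PATTERN) {
		unused++;
	}
	return unused;
}
```

A result of 0 unused bytes would suggest the stack has been used up to (or past) its limit, which is the condition worth ruling out here.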

As for gdb, isn't that just a matter of running make/ninja debug (or was it debugserver?)

@vikrant8052
Contributor Author

After enabling the above-mentioned Kconfig options, I intermittently get the following Kernel OOPS:

***** Kernel OOPS! *****
Current thread ID = 0x2000098c
Faulting instruction address = 0x17e12
Fatal fault in ISR! Spinning...

@vikrant8052
Contributor Author

@jhedberg no overflow detected.

@vikrant8052
Contributor Author

vikrant8052 commented Jan 24, 2019

@jhedberg
I have now edited the samples/bluetooth/mesh app, adding only the following:

struct foo {
	u8_t b:1;
};

struct foo foo;

#define BT_MESH_MODEL_OP_GEN_ONOFF_STATUS	BT_MESH_MODEL_OP_2(0x82, 0x04)

static void gen_onoff_get(struct bt_mesh_model *model,
			  struct bt_mesh_msg_ctx *ctx,
			  struct net_buf_simple *buf)
{
	struct net_buf_simple *msg = NET_BUF_SIMPLE(2 + 3 + 4);

	bt_mesh_model_msg_init(msg, BT_MESH_MODEL_OP_GEN_ONOFF_STATUS);
	net_buf_simple_add_u8(msg, foo.b++);

	if (bt_mesh_model_send(model, ctx, msg, NULL, NULL)) {
		printk("Unable to send GEN_ONOFF_SRV Status response\n");
	}
}

Here too I face the same issue after the second provisioning and a reset.
We have to reassign the App key to the Gen. OnOff Server to get a response from it.

So I think it is now safe to assume that it is not an app-level issue.
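For readers following the snippet above: the status value comes from a one-bit field, so each GET should be answered with alternating 0 and 1. A minimal host-side sketch of that behaviour follows. The function name next_status is hypothetical, and the field is declared as unsigned int rather than u8_t so that it compiles with any C compiler.

```c
#include <stdint.h>

struct foo {
	unsigned int b : 1; /* one-bit field: can only hold 0 or 1 */
};

/*
 * Return the current bit, then increment it. Because the field is one
 * bit wide, the increment wraps: 0 -> 1 -> 0 -> 1 ...
 */
uint8_t next_status(struct foo *f)
{
	return (uint8_t)f->b++;
}
```

Starting from a zeroed struct, successive calls return 0, 1, 0, and so on, which makes it easy to see on the provisioner side whether the server is responding at all.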

@vikrant8052
Contributor Author

@jhedberg have you been able to replicate the issue with samples/bluetooth/mesh?

@jhedberg
Member

@jhedberg have you been able to replicate the issue with samples/bluetooth/mesh?

@Vikrant8051 no. I'm planning to use tests/bluetooth/mesh_shell once I find the time (the shell is usually the easiest way to reproduce this kind of issue).

@galak galak added priority: medium Medium impact/importance bug area: Bluetooth labels Jan 29, 2019
@vikrant8052
Contributor Author

@jhedberg
Do you have any update on this?

@jhedberg
Member

jhedberg commented Feb 7, 2019

@Vikrant8051 are you able to reproduce this with mesh_shell? I tried quickly with it a few days ago, but I wasn't able to reproduce. That said, it's a bit hard to track from the issue description and various comments what exactly is the info that doesn't get stored in flash after a reset + reprovision & reconfigure. Is it deterministic? Is it always the same piece of info? I tried basic provisioning + app key add, and at least the key was always correctly stored and restored, even after several mesh & power reset cycles.

@vikrant8052
Contributor Author

vikrant8052 commented Feb 7, 2019

@jhedberg
I'm not comfortable with mesh_shell, so the answer is no.

#12574 (comment)

For the sake of simplicity, please add the code from the link above to the samples/bluetooth/mesh app. Ideally, after every reset (if the device is provisioned and the App key is bound to the OnOff Server) we should receive a GET response from the server.

Fresh firmware -> provision + configure -> test1 -> Reset1 -> test2 -> unprovision -> provision + configure -> test3 -> Reset2.

(Configuration includes binding the App key to the Gen. OnOff Server model.)

After test1, Reset1, test2 and test3 everything works normally, meaning we receive a GET response from the server.

But after Reset2, I do not get any response from the OnOff server. We have to reassign the App key to the Gen. OnOff Server, after which we start receiving GET responses again. But after a power reset we have to assign the App key again, and so on.
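The expected outcome of this sequence can be written down as a small host-side model. This is only a sketch of the expectation: every type and function name below is hypothetical, nothing here is Zephyr code, and the model assumes persistence works correctly. The behaviour reported in this issue is that the real node fails the check after Reset2.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a node's configuration; not Zephyr's structures. */
struct node_cfg {
	bool provisioned;
	bool app_key_bound; /* App key bound to the Gen. OnOff Server */
};

struct node {
	struct node_cfg ram;   /* live state, lost on reset */
	struct node_cfg flash; /* persisted state, survives reset */
};

/* Provision and bind the App key; every change is expected to be persisted. */
void node_provision_and_configure(struct node *n)
{
	n->ram.provisioned = true;
	n->ram.app_key_bound = true;
	n->flash = n->ram;
}

/* Unprovision: clear the live state and the persisted copy. */
void node_unprovision(struct node *n)
{
	memset(&n->ram, 0, sizeof(n->ram));
	n->flash = n->ram;
}

/* Reset: the live state is restored from persistent storage. */
void node_reset(struct node *n)
{
	n->ram = n->flash;
}

/* The server answers a GET only when provisioned and bound to an App key. */
bool node_answers_get(const struct node *n)
{
	return n->ram.provisioned && n->ram.app_key_bound;
}
```

Running the sequence from the comment (provision + configure, Reset1, unprovision, provision + configure, Reset2) against this model, node_answers_get() holds at test1, test2, test3 and after Reset2.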

@vikrant8052
Contributor Author

vikrant8052 commented Feb 7, 2019

Let's assume everything works normally with the shell. But why does it not work with the actual
app?

Please let me know your observations once you flash samples/bluetooth/mesh (with the suggested changes) on any board.

From my point of view this is the better option, because it can be tested with any available board.

I tried it with nrf52840_pca10056 as well as nrf51_pca10028 and observed the same result in both cases.

@vikrant8052 vikrant8052 changed the title Bluetooth: Mesh: 2nd time configuration details not get saved on SoC flash Bluetooth: Mesh: 2nd time commissioning configuration details (APP Key) not get saved on SoC flash Feb 13, 2019
@carlescufi
Member

@Vikrant8051 @jhedberg is this still an issue?

@jhedberg
Member

jhedberg commented Mar 20, 2019

@Vikrant8051 have you found out the cause of this yet?

I think I'm able to reproduce it with mesh_shell now, however I'm still trying to get more info by fine-tuning the logs.

Question: was this with the FCB or NFFS settings backend? Have you tried both? At least one indication I have looks like a possible bug with settings+fcb since I can see a value reported as stored but I don't see it getting retrieved after a power cycle.

@jhedberg jhedberg added the area: Settings Settings subsystem label Mar 20, 2019
@jhedberg
Member

@Vikrant8051 ok, I think I've got this figured out. There were several bugs I found. PR coming soon.


4 participants