CouchbaseLite-iOS, _changes?longpoll with SG channels management #2078
Comments
I built test scripts for the above and I can now reproduce the issue. I need to dig further to verify that I am not being misled by the scripts and that there is no mistake, but I get roughly 2 errors out of 20 requests.
Using the sync function from #1865. You start everything by running ./testSGlongpoll
I think this highlights a gap in the solution that was provided for #1865. Currently a longpoll (or continuous) changes feed is doing the following, when coming out of a waiting state:
The gap here is for changes that are indexed/cached between steps 2 and 3 - they can result in missed channel backfills in the scenario you've outlined. It may be sufficient to reverse the order of 2 and 3. I need to do some work to verify there aren't any problematic scenarios with those two steps reversed - and whether that's a better solution to the problem described in #2068.
Thanks Adam, for your feedback!
@Crapulax I've pushed an initial version of the fix to feature/issue_1865, that ensures the cached sequence check happens before the user is reloaded. I think this should be sufficient to ensure that the user context is always up to date with what's been cached, and we don't have a situation where we're returning data that's raced ahead of our user context. The reverse situation is now possible - the user context may be ahead of what we've cached - but I don't think that's problematic based on an additional update to the existing backfill handling to avoid duplicate backfill.
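The effect of checking the cached sequence before reloading the user can be illustrated with a toy model. This is only a sketch and not Sync Gateway's implementation: the thread does not list the actual steps, so `poll_once`, `make_state`, and `grant_and_write` are hypothetical names, and the race is simulated by a callback that runs between the two steps.

```python
def make_state():
    # Toy server state: one cached doc, user only sees the "public" channel.
    return {
        "cached_seq": 4,
        "user_channels": {"public"},
        "docs": [{"id": "a", "seq": 4, "channel": "public"}],
    }


def grant_and_write(state):
    # Simulated race: a channel grant and a doc in that channel get cached.
    state["user_channels"].add("invite")
    state["docs"].append({"id": "b", "seq": 6, "channel": "invite"})
    state["cached_seq"] = 6


def poll_once(state, since, check_cache_first, race):
    """One wake-up of a toy changes feed; `race` runs between the two steps."""
    if check_cache_first:
        upto = state["cached_seq"]                # read cache high-water mark
        race(state)
        channels = set(state["user_channels"])    # then reload the user
    else:
        channels = set(state["user_channels"])    # reload the user first
        race(state)
        upto = state["cached_seq"]                # cache may have raced ahead
    sent = [d["id"] for d in state["docs"]
            if since < d["seq"] <= upto and d["channel"] in channels]
    return sent, upto


# User reloaded first: the feed advances to 6 without ever sending doc "b".
print(poll_once(make_state(), 3, False, grant_and_write))

# Cache checked first: the feed stops at 4, and "b" arrives on the next poll.
state = make_state()
sent, upto = poll_once(state, 3, True, grant_and_write)
print(sent, upto, poll_once(state, upto, True, lambda s: None))
```

In the second ordering the user context can be ahead of the cache, which matches the "reverse situation" mentioned above; in this toy model that only delays delivery and does not lose a document.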
Hello Adam, I will check the result tomorrow and let you know. Steve
Hello Adam, There is no concrete change with the latest commit. In some cases I also get more documents than expected. I do not recall such behavior with the previous version. Just to make sure what I do is correct: when using longpoll and I get an answer of the form Seq:lowseq ("last_seq":"289367:289338"), what should my next longpoll since be? 289367 or 289338? If I use the lowseq (289338) I enter an infinite loop, so I guess it is correct to use Seq (289367), isn't it? Let me know how we can progress further. Thanks,
You should be using the full last_seq as the since value (e.g. since="289367:289338"). That sequence value represents a changes entry that's midway through a backfill.
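A minimal sketch of that advice for a test script, treating last_seq as an opaque token and passing it back whole. The helper name `next_longpoll_url`, the host, and the database name are placeholders, not taken from the thread.

```python
from urllib.parse import urlencode


def next_longpoll_url(base_url, db, changes_response):
    """Build the next _changes longpoll URL from the previous response."""
    # Do not split a compound value such as "289367:289338";
    # send it back exactly as received.
    since = str(changes_response["last_seq"])
    query = urlencode({"feed": "longpoll", "since": since})
    return f"{base_url}/{db}/_changes?{query}"


# Placeholder host and database name, for illustration only.
previous = {"results": [], "last_seq": "289367:289338"}
print(next_longpoll_url("http://localhost:4984", "db", previous))
```

`urlencode` percent-encodes the colon, so the compound value survives as a single query parameter.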
Oh, OK. I have made the change in my test scripts and unfortunately the results are similar: many errors.
I think there's a gap in the notification handling that I didn't catch yesterday, because your grant docs aren't visible to the user. When the changes feed is prevented from running ahead of the cached sequence post-userdoc notification, there's no trigger to wake it up from the subsequent waiting state. Looking into possibilities now. As a side note - I appreciate that you've identified these issues, and am working to get them closed out - they are definitely valid issues. However, I suspect that for your particular use case you could avoid these race conditions with minor changes to the way you're granting channel access. For example, if you did your channel grant based on a user update to the invite document, I think you'd avoid:
I'm thinking something like this in your config:
We are on the same page. What you suggest looks fully feasible. I have left the office, but will run some tests with this change first thing tomorrow. Thanks, I'll keep you updated.
I've pushed one more update to feature/issue_1865, to handle the notification handling scenario I mentioned above. The current approach feels like a bit of a temporary workaround, but it might be all that I can do in the short term - the longer term solution may need to wait until 2.0.
First results look promising! In tests without your last fix, and with the grant doc added to a user channel, I have no lost documents (but some duplicates).
I am going to do two things now (with the grant doc added to the user's channel and using SG without your latest fix): I guess that getting some duplicate documents is not an issue for Couchbase Lite iOS/Android. Brgds,
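For counting lost and duplicate documents in a test run, something like the following could be used. It is a sketch, not the actual test script: `audit_changes` is a hypothetical helper, and it assumes each row of a _changes response has an "id" and a "changes" list whose first entry carries the "rev".

```python
from collections import Counter


def audit_changes(expected_ids, results):
    """Return (missing doc ids, duplicated (id, rev) pairs) for a run."""
    seen = Counter()
    for row in results:
        changes = row.get("changes") or []
        rev = changes[0]["rev"] if changes else ""
        seen[(row["id"], rev)] += 1
    received_ids = {doc_id for doc_id, _ in seen}
    missing = sorted(set(expected_ids) - received_ids)
    duplicates = [key for key, count in seen.items() if count > 1]
    return missing, duplicates


# Rows accumulated from all longpoll responses of one run (made-up data).
rows = [
    {"seq": "5", "id": "doc1", "changes": [{"rev": "1-a"}]},
    {"seq": "7:5", "id": "doc1", "changes": [{"rev": "1-a"}]},
    {"seq": "6", "id": "doc2", "changes": [{"rev": "1-b"}]},
]
print(audit_changes(["doc1", "doc2", "doc3"], rows))
```

Keying on the (id, rev) pair means a later revision of the same document is not reported as a duplicate.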
With additional testing I had a few occurrences of missing docs. I will analyze the logs later to see in which cases.
Hello, We had the issue twice again today in the Alpha environment with the new invite mechanism. I will spend time this weekend fixing it. Brgds,
I think I have a fix. I need to massage it a bit (you are not going to like it as it is :)), but I am no longer losing docs during my tests. I have more or less validated the effect of the fix by:
I would like to remove the duplicates now.
I do not see how to avoid duplicates in the longpoll case because the low sequence is not linked to a channel. And by the way, all docs higher than this sequence value need to be sent. Am I missing something?
See #2117
Hello,
The origin of this work, around issues #1865 and #2074, was to reproduce an issue occurring from time to time in our alpha environment. Today, with the fix for #1865, all logs activated, and additional traces, we had the issue again. Looking at the logs, I realised that Couchbase Lite is using longpoll, meaning that a since value is provided in the requests, which close very quickly (after two docs, let's say).
Before digging further into the logs and providing whatever is necessary for your investigation, I have the strange feeling that the case below is not handled. Am I right?
And here are some log extracts: