XCMP and DMP improvements #8860
Conversation
Generally looking good. Though, left comments on a lot of things :)
Generally looking good. There are still a lot of missing docs for public methods.
```diff
@@ -64,7 +66,7 @@ pub mod v0 {
 	PartialEq,
 	TypeInfo,
 )]
-pub struct ParachainInherentData {
+pub struct RawParachainInherentData {
```
I don't like the name 🙈

Also while thinking about this, all the new stuff you introduced here does not need to be in this crate. All of that can just live in the `parachain-system` crate. The `ParachainInherentData` (`RawParachainInherentData` right now) is the only thing that needs to stay here, because it is the thing that is exchanged between the node and the runtime.
Ok. Changed the names back:

- `RawParachainInherentData` -> `ParachainInherentData`
- `ParachainInherentData` -> `BasicParachainInherentData`

Also moved all the new structures to `parachain-system`, except `HashedMessage`, because it simplifies some interactions with `MessageQueueChain`.
```rust
match new.hrmp_inbound.valid_watermarks.binary_search(&hrmp_watermark.watermark()) {
	Ok(pos) => {
		// Exact match, so this is OK in all cases.
		let _ = new.hrmp_inbound.valid_watermarks.drain(..pos + 1);
```
The only change we need here is that we drop the `+ 1`, or not?
Aka we don't need your new code? :D
I think we also need an adjustment for the case:

```rust
Err(pos) => match hrmp_watermark {
	HrmpWatermarkUpdate::Head(_) => {
		// Updates to Head are always OK.
		let _ = new.hrmp_inbound.valid_watermarks.drain(..pos);
	},
}
```

Here we should also retain the current watermark:

```rust
// The current watermark will remain valid until updated.
if new.hrmp_inbound.valid_watermarks.first() != Some(&hrmp_watermark) {
	new.hrmp_inbound.valid_watermarks.insert(0, hrmp_watermark)
}
```
With my changes I tried to avoid having multiple branches and to do this in a more generic way. I can change it back if you prefer.
> Here we should also retain the current watermark:

Sounds reasonable, but it would just mean that we need to add the `insert` to this branch? (I would keep the original code)

Also it would be nice to test these corner cases.
Ok, adjusted the original code.

There are the `constraints_disallowed_trunk_watermark` and `constraints_always_allow_head_watermark` tests that cover these cases.
> The only change we need here is that we drop the + 1 or not?

It looks the same to me.

Why do we need to insert anything if we don't find the Head watermark? `valid_watermarks` should contain these watermarks coming from the relay chain runtime (for real HRMP messages). The Head watermark is always valid, regardless of whether we have any queued messages at that block. And if at one candidate we reach the Head (the current relay chain block number) without finding it in the `valid_watermarks`, this means we don't have any more messages to process. Therefore, the next candidate has no good reason not to advance the HRMP watermark to its new relay parent (if it has some messages, the relay chain will contain this valid HRMP watermark). Am I missing something?
You're right, good point! I missed this. Removed the additional `insert`.
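For reference, here is a small self-contained example (with made-up watermark numbers) of how the `binary_search` result interacts with `drain` on a sorted watermark list; it is only an illustration of the two branches discussed above, not the actual pallet code:

```rust
fn main() {
	// Sorted valid watermarks, as they would come from the relay chain.
	let mut valid_watermarks: Vec<u32> = vec![2, 5, 9];

	match valid_watermarks.binary_search(&5) {
		// Exact match: drop everything up to and including the match.
		Ok(pos) => {
			let _ = valid_watermarks.drain(..pos + 1);
		},
		// No exact match (e.g. a Head update): drop everything below it.
		Err(pos) => {
			let _ = valid_watermarks.drain(..pos);
		},
	}

	assert_eq!(valid_watermarks, vec![9]);
}
```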
```rust
if sent_at < last_processed_msg.sent_at {
	last_processed_msg_idx = Some(idx);
	break;
}
```
That should be impossible to hit?
Yes, in theory. But I think it's better to check.
```rust
mqc_heads.entry(*sender).or_default().extend_with_hashed_msg(msg);
// ...
if msg.sent_at == last_processed_msg.sent_at {
	last_processed_msg.reverse_idx += 1;
```
We are not going in reverse, but that does not matter, right? The meaning of `reverse_idx` is more like `unprocessed_count`.
We are not going in reverse here when counting, but it's actually a reverse index inside the group of messages with the provided `sent_at`. I added some examples to the `InboundMessageId` docs:

```rust
/// For the collection
/// `msgs = [{sent_at: 0}, {sent_at: 1}, {sent_at: 1}, {sent_at: 1}, {sent_at: 1}, {sent_at: 3}]`
///
/// `InboundMessageId {sent_at: 1, reverse_idx: 0}` points to `msgs[4]`
/// `InboundMessageId {sent_at: 1, reverse_idx: 3}` points to `msgs[1]`
/// `InboundMessageId {sent_at: 1, reverse_idx: 4}` points to `msgs[0]`
```
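To make the indexing concrete, here is a minimal sketch of how such a reverse index could be resolved to a position in the collection; `InboundMessageId`, `Msg`, and `resolve` here are simplified stand-ins for illustration, not the actual implementation:

```rust
struct InboundMessageId {
	sent_at: u32,
	reverse_idx: usize,
}

struct Msg {
	sent_at: u32,
}

/// `reverse_idx` counts backwards starting from the last message whose
/// `sent_at` matches `id.sent_at`.
fn resolve(msgs: &[Msg], id: &InboundMessageId) -> Option<usize> {
	// Position of the last message in the matching `sent_at` group.
	let last = msgs.iter().rposition(|m| m.sent_at == id.sent_at)?;
	last.checked_sub(id.reverse_idx)
}

fn main() {
	let msgs: Vec<Msg> =
		[0, 1, 1, 1, 1, 3].into_iter().map(|sent_at| Msg { sent_at }).collect();
	// Matches the doc examples above.
	assert_eq!(resolve(&msgs, &InboundMessageId { sent_at: 1, reverse_idx: 0 }), Some(4));
	assert_eq!(resolve(&msgs, &InboundMessageId { sent_at: 1, reverse_idx: 3 }), Some(1));
	assert_eq!(resolve(&msgs, &InboundMessageId { sent_at: 1, reverse_idx: 4 }), Some(0));
}
```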
```rust
/// A malicious collator could provide an invalid collection. This function checks that the
/// current collection respects the advancement rule. Specifically, it checks that the current
/// collection contains as many full messages as possible.
pub fn check_advancement_rule(&self, collection_name: &str, max_size: usize) {
```
```diff
-pub fn check_advancement_rule(&self, collection_name: &str, max_size: usize) {
+pub fn check_all_messages_included(&self, collection_name: &str, max_size: usize) {
```

? Since it's not really the advancement rule, more like censorship resistance.
Yes, you're right, it's not really the advancement rule. Renamed it to `check_enough_messages_included`.
Is panicking in this case better than just accepting this block? Is the reasoning that the collator will not end up getting a block reward? The most they can do is delay the message processing anyway.
Yes, I think we need to have this for censorship resistance.
```rust
for msg in &self.full_messages {
	size = size.saturating_add(msg.data().len());
}
let min_size = ((max_size as f64) * 0.75) as usize;
```
Please add a sanity check to this pallet that we do not shoot ourselves in the foot here. There is no guarantee that a parachain won't have a max message size configured of 500 KiB or something stupid... actually, not sure if we can even check this since it's in the host config.

Maybe we can read the abridged host config? Not sure how complicated that is.
Let's add a comment for the reasoning behind this 0.75 coefficient.
The abridged host config doesn't contain the max downward message size. This value is only present on the relay chain, from what I understand.

Also, a sanity check would only check this on code changes, right? But I think the config can be changed dynamically on-chain? So it wouldn't cover all possible problems.

I think the safest thing would be to check only if there is at least 1 full message in the collection. Or to have `min_size` = 50% of `max_size`. @ggwpez @bkchr WDYT?
As I said here: #8860 (comment), I'm wondering if it is worth panicking for this?
Yes, I think we should have at least a basic check here for censorship resistance. If we want to be super safe we can just check that `self.full_messages.len() > 1`. For the moment I'll change the coefficient to 0.5.

> add a comment

Done
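For context, a minimal sketch of the kind of check being discussed, using the 0.5 coefficient settled on here; the function signature, the `more_messages_pending` parameter, and the panic message are all illustrative, not the actual implementation:

```rust
/// Illustrative only. If more messages were pending but the included ones
/// total less than half of the size budget, the collator may be censoring
/// messages, so the block is rejected.
fn check_enough_messages_included(
	full_messages: &[Vec<u8>],
	more_messages_pending: bool,
	collection_name: &str,
	max_size: usize,
) {
	let mut size: usize = 0;
	for msg in full_messages {
		size = size.saturating_add(msg.len());
	}
	// 0.5 coefficient: leave headroom so that a single large message that did
	// not fit does not invalidate an honest collator's block.
	let min_size = max_size / 2;
	if more_messages_pending && size < min_size {
		panic!(
			"{} check failed: included {} bytes, expected at least {}",
			collection_name, size, min_size
		);
	}
}
```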
Overall LGTM, nice work!
I assume it was tested end-to-end with zombienet?
```rust
.collect();
// ...
// The current watermark will remain valid until updated.
if let Some(last_watermark) = HrmpWatermarks::<T>::get(&recipient) {
```
Same as in the `inclusion_emulator`: do we really need this? This is only really used in the inclusion emulator.
Yes, it's needed. If the HRMP watermark remains the same for a couple of blocks, `check_modifications` will always fail.

I tried without this, and either the parachain was stalling or the parachain block time was increasing for a while. Don't remember exactly, but anyway, it's needed.
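A sketch of what retaining the current watermark could look like, continuing the snippet above; `retain_current_watermark` is a hypothetical helper and `u32` stands in for the relay chain block number type:

```rust
/// Hypothetical continuation of the snippet above: keep the last stored
/// watermark at the front of the valid set, so a candidate that does not
/// advance the watermark still passes `check_modifications`.
fn retain_current_watermark(valid_watermarks: &mut Vec<u32>, last_watermark: u32) {
	if valid_watermarks.first() != Some(&last_watermark) {
		valid_watermarks.insert(0, last_watermark);
	}
}
```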
> Overall LGTM, nice work!
> I assume it was tested end-to-end with zombienet?

@alindima Thanks for the review! Yes, it was tested end-to-end with zombienet.
Related to #489

This PR changes the parachain receiving logic for XCMP and DMP by adding some offchain processing before forwarding the messages to the parachain's `set_validation_data` inherent. This enables us to relax the advancement rule.