-
Notifications
You must be signed in to change notification settings - Fork 472
adapter: move timestamp oracle impl to it's own module, put behind trait object #22112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adapter: move timestamp oracle impl to it's own module, put behind trait object #22112
Conversation
This PR has moderate risk. Make sure to carefully review the file hotspots. It's still a good idea to have this change reviewed, ensuring enough test coverage, and observability. What's This?
Buggy File Hotspots:
|
/// | ||
/// All subsequent values of `self.read_ts()` will be greater or equal to | ||
/// `write_ts`. | ||
async fn apply_write(&mut self, lower_bound: T); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should lower_bound
be write_ts
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
/// Returns the current system time while protecting against backwards time | ||
/// jumps. | ||
/// | ||
/// The caller is responsible for providing the previously recorded system time | ||
/// via the `previous_now` parameter. | ||
/// | ||
/// If `previous_now` is more than `TIMESTAMP_INTERVAL_UPPER_BOUND * | ||
/// TIMESTAMP_PERSIST_INTERVAL` milliseconds ahead of the current system time | ||
/// (i.e., due to a backwards time jump), this function will block until the | ||
/// system time advances. | ||
/// | ||
/// The returned time is guaranteed to be greater than or equal to | ||
/// `previous_now`. | ||
// TODO(aljoscha): These internal details of the oracle are leaking through to | ||
// multiple places in the coordinator. | ||
pub(crate) fn monotonic_now(now: NowFn, previous_now: mz_repr::Timestamp) -> mz_repr::Timestamp { | ||
let mut now_ts = now(); | ||
let monotonic_now = cmp::max(previous_now, now_ts.into()); | ||
let mut upper_bound = catalog_oracle::upper_bound(&mz_repr::Timestamp::from(now_ts)); | ||
while monotonic_now > upper_bound { | ||
// Cap retry time to 1s. In cases where the system clock has retreated | ||
// by some large amount of time, this prevents against then waiting for | ||
// that large amount of time in case the system clock then advances back | ||
// to near what it was. | ||
let remaining_ms = cmp::min(monotonic_now.saturating_sub(upper_bound), 1_000.into()); | ||
error!( | ||
"Coordinator tried to start with initial timestamp of \ | ||
{monotonic_now}, which is more than \ | ||
{TIMESTAMP_INTERVAL_UPPER_BOUND} intervals of size {} larger than \ | ||
now, {now_ts}. Sleeping for {remaining_ms} ms.", | ||
*TIMESTAMP_PERSIST_INTERVAL | ||
); | ||
thread::sleep(Duration::from_millis(remaining_ms.into())); | ||
now_ts = now(); | ||
upper_bound = catalog_oracle::upper_bound(&mz_repr::Timestamp::from(now_ts)); | ||
} | ||
monotonic_now | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function only really makes sense when used with the EpochMillis
timeline. That timeline isn't mentioned anywhere in this module and feels like it's an implementation detail of other parts of the adapter. For those reasons it feels a little out of place in this module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I also didn't know what to do about this one. In the end I moved it here with the oracle code because it uses internals of the oracle like upper_bound
and the TIMESTAMP_PERSIST_INTERVAL
and TIMESTAMP_INTERVAL_UPPER_BOUND
. Plus, once we remove this oracle we have more of the code that we need to remove localized in this one file.
But I'm happy to move that back to timeline.rs
. And I probably have to add more code here that does a monotonic_now
for both oracles, in the interim period where we have both oracles/where we switch over. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok to leave here if we plan on removing this later.
Coverage looks good here as well. 👍 |
501e64a
to
4cb3a77
Compare
It was not used outside that struct (anymore?) and made the code slightly less readable/introduced one more hop to go through when looking at code/introduced one additional thing that has to be named.
We need that name free for the future TimestampOracle trait.
We do this because we want to provide a more decoupled trait in future commits and the persist_fn leaks the underlying persistence/durability mechanism. We introduce a TimestampPersistence trait that we give to DurableTimestamp which encapsulates the persistence functionality.
…ct in usage sites Further preparation for eventually hooking up a different implementation of TimestampOracle.
4cb3a77
to
2fc78eb
Compare
Builds on #22091, which is not yet merged, but I wanted to keep the momentum going.
Part of MaterializeInc/database-issues#6635
Tips for reviewer
#21671 has this commit and all the follow-up commits for full MaterializeInc/database-issues#6635, it might be good to look at that for context.
Commits have good messages that explain the rationale. But the gist is that this prepares us for introducing the
PostgresTimestampOracle
behind that trait, next to the existing oracle. And the code that uses it doesn't have to know.Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.