pathfinding: probability for bimodal distribution #6815
Conversation
A lot of changes to the current probability estimator. It's not easy to see what the actual effect on payment success will be. Left a few comments, but for me it is hard to be certain that the change set is an improvement.
An alternative approach would be to take a much smaller incremental step and use most of the existing algorithm, with the addition of penalizing amounts close to the channel capacity more heavily.
routing/probability_bimodal.go
Outdated
// value determines how quickly the bimodal distribution drops off from
// the edges of a channel. A larger value means, that the drop off is
// slow.
DefaultScaleMsat = lnwire.MilliSatoshi(300000000)
Isn't there a way around this constant, for example by making it relative to the channel capacity? Or would that be contrary to empirical findings?
I haven't found a good way to get rid of it. Making it capacity-dependent tends to favor large channels, which is why I thought keeping this value constant would be better, but I can certainly have another look.
Thank you a lot for the feedback! Just to clarify, the approach would be to provide an alternative opt-in implementation. Right, without a benchmark we won't know if there's an improvement; however, there are some limits in the math that make me confident there will be improvements (for example, if the algo learns that there's a success for a 100k sat payment, it will also favor that pair for 200k sat). Next, I'll focus on establishing a benchmark method.
I was thinking about that and wanted to do that after this PR, but I agree, it would be a quicker way to benefit users if we split this PR and introduce such a penalty first. I would take the approach of the first four commits and build on top of that. Passing the capacity as a parameter is due to performance reasons (haven't compared to looking up capacity from the graph). In the long run, I think we may need to restructure mission control to provide the capacity and other graph-derived data efficiently (maintained by
Yeah FWIW doing something like that isn't mutually exclusive with this PR (can even be done concurrently with just those first few commits). In fact, the work in this PR makes adding modified estimators much easier.
Force-pushed from 3ca81b7 to 1935544.
Added a suggestion for how to handle configuration. I added subgroups for the different estimators, which is a breaking change for the apriori parameters. Shares initial commits with #6857, so if questions come up, would appreciate review comments in that PR concerning these commits.
Initial pass through to catch up with the latest state of the PR, will review your comments on the tracking issue and those prior preso slides to re-build my context.
Force-pushed from 7a5926e to 3aaac1e.
Thanks for the reviews! Rebased on #6857. Probability estimators are now swappable at run time to allow for custom A/B testing. I probably need to remove
One approach would be to make a benchmark, then expose profile capture during the benchmark. This way, we'd be able to see exactly where things are slower and if we need any memoization or w/e. You could also just try to do path finding attempts locally on the CLI, then capture a profile right before you execute the command to see where things are slowing down.
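For reference, a minimal way to do this with stock Go tooling (standard `go test`/`pprof` flags; the benchmark name below is just an example):

```
$ go test -run=^$ -bench=PairProbability -cpuprofile=cpu.out ./routing
$ go tool pprof -top cpu.out
```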
Can be rebased now that the dependent PR has been merged!
Force-pushed from 3aaac1e to 7496139.
routing/probability_bimodal.go
Outdated
results NodeResults, toNode route.Vertex) float64 {

	// TODO(bitromortac): think more about this case.
	return 1.0
IIRC all this needs to tell us is whether the channel is online or offline. We want to skip things entirely here, since we made the decision not to do any sort of decay for the local channels, as we could otherwise end up reducing them all to zero probability, which means no payments attempted at all.
I think those channels should already be discarded beforehand by the payment constraints. I think the issue here is that we know in principle that a payment should work over a local channel, but it may fail for reasons other than liquidity issues. This would lead to an infinite loop if we kept a constant high probability. I added a mechanism that avoids a failing peer for some time.
I am half done with the review, will continue tomorrow, but I will already leave some comments 👍
routing/probability_bimodal.go
Outdated
// DefaultBimodalNodeWeight defines how strongly other channels of the
// node to route from should be taken into account. A number between 0
// and 1.
nit: ... between 0 and 1, both values included.
Even so, we can just say here that this is the default value for X and add that explanation where this value is used (BimodalCfg -> BimodalNodeWeight).
Good point, I updated the docs of default values and of the configuration.
routing/probability_bimodal.go
Outdated
// getLiquidityProbability computes the probability to reach a node based
// on the liquidity distribution in the LN.
func (p *probabilityBimodal) getLiquidityProbability(now time.Time,
q: Do we need to pass `now` to `getLiquidityProbability` and `getPairProbability`? Can we get the time inside the function and have a cleaner API? Is there any problem if we do not pass it at all and use `time.Now()` wherever the `now time.Time` is?
I understand that passing it as a parameter "freezes" it and we use the same value in all the flow, but I am not sure if we gain anything from it.
P.S.: I also do not know why we have the `now() time.Time` private method in `MissionControl`; do you know if this was added for testing?
I think this is due to performance issues, `time.Now` seems to be costly, and it's also easier for testing, but I need to have another look.
Yeah, better to have `time.Time` be passed in, which can also help to facilitate unit tests.
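As a side note, a tiny sketch of the time-injection idea discussed here (a hypothetical helper, not lnd's actual code): the caller fixes one timestamp for the whole run, and a unit test can supply an arbitrary time instead of relying on `time.Now()`.

```go
package timesketch

import (
	"math"
	"time"
)

// decayedWeight is a hypothetical helper illustrating time injection: the
// caller passes `now`, so a unit test can use a fixed timestamp and get a
// deterministic result instead of depending on time.Now().
func decayedWeight(now, lastResult time.Time, decay time.Duration) float64 {
	age := now.Sub(lastResult)
	return math.Exp(-float64(age) / float64(decay))
}
```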
Unless the entries in the top two lines were inadvertently swapped, if I'm reading this correctly, the bimodal is actually faster than the apriori?
LGTM 🐲
routing/missioncontrol.go
Outdated
// calculations.
ProbabilityEstimatorCfg
// AprioriConfig is the config we will use for probability calculations.
AprioriConfig
+1 for the rename
// AprioriEstimator returns node and pair probabilities based on historical
// payment results.
// payment results. It uses a preconfigured success probability value for
👍 re the extra docs here
toNode route.Vertex) float64

// Config returns the estimator's configuration.
Config() estimatorConfig
Only used internally, so this doesn't actually need to be public?
It's used in the `routerrpc` package.
routing/probability_bimodal.go
Outdated
// DefaultBimodalHalfLife is the default value for BimodalHalfLife.
// We will forget about previous learnings about channel liquidity on
// the timescale of about a week.
DefaultBimodalHalfLife = time.Duration(7*24) * time.Hour
Interesting here re the longer timescale for forgetting compared to the apriori model, I wonder if we should also modify that default value? Either way, can be tuned further once this lands and we enter the eval phase with the virtual network simulator.
Right, I changed this from the last round, wanted to be a bit more opinionated here. I'd leave the apriori default to maintain the current behavior.
// channel is taken into account.
BimodalNodeWeight float64

// BimodalScaleMsat describes the scale over which channels
solid docs here 👍
@@ -462,6 +462,9 @@ type Config struct {

// ActiveNetParams contains parameters of the target chain.
ActiveNetParams chainreg.BitcoinNetParams

// Estimator is used to estimate routing probabilities.
Estimator routing.Estimator
👍
// results and always assume the configured a priori probability for
// untried connections. A value of zero will ignore the a priori
// probability completely and only base the probability on historical
// results, unless there are none available.
The breaking change here is worth pointing out in isolation in the release notes.
I added a `Configuration` section and mentioned the breaking changes.
"failrelax",
).Seconds())
}

// We switch between estimators and set corresponding configs. If
// estimator is not set, we ignore the values.
if ctx.IsSet("estimator") {
+1 for being able to switch between estimators on the fly!
}

// BenchmarkAprioriPairProbability benchmarks the probability calculation.
func BenchmarkAprioriPairProbability(b *testing.B) {
Thx for adding this set of benchmarks!
I think we can leave them in the tree as well, so this commit doesn't necessarily need to be temp.
toNode := route.Vertex{byte(0)}
for i := 0; i < b.N; i++ {
	estimator.PairProbability(now, results, toNode,
This should capture the return variable in a package level variable to ensure things aren't being optimized away.
See this blog post: https://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go
This might be an older Go quirk, though; can't hurt to add it in.
Ah yes, read about that somewhere, I added it, but that didn't change the numbers here.
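For context, the sink pattern being referenced, as a generic sketch with a stand-in function (not lnd's actual benchmark):

```go
package benchsketch

import (
	"math"
	"testing"
)

// benchResult is a package-level sink: assigning the benchmarked result to it
// keeps the compiler from optimizing the call away (the pattern from the
// linked post).
var benchResult float64

// probabilityStandIn is only an illustrative placeholder, not lnd's actual
// estimator code.
func probabilityStandIn(amt, capacity float64) float64 {
	return math.Exp(-amt / capacity)
}

func BenchmarkProbabilitySink(b *testing.B) {
	var r float64
	for i := 0; i < b.N; i++ {
		r = probabilityStandIn(100_000, 1_000_000)
	}
	benchResult = r
}
```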
// We only populate fields based on the current estimator.
switch v := cfg.Estimator.Config().(type) {
case routing.AprioriConfig:
Also this commit would benefit from first making all the config restructuring changes for apriori. Maybe even in the first half of the commit stack don't even mention bimodal anywhere. And then when everything is in place, add the bimodal commits.
I was on the way to restructuring the commits, but then saw that I would need to regenerate the protos twice, which made me stop, if that's ok with you. I think adding the extra bimodal cases should not add too much overhead (it's rather the reshuffling of lines for apriori that introduces some).
I would have chosen to indeed regenerate twice for the sake of getting a clean commit that adds bimodal, but of course no blocker.
AprioriWeight: routingConfig.AprioriWeight,
// We only initialize a probability estimator if there's no custom one.
var estimator routing.Estimator
if cfg.Estimator != nil {
Seems this can never happen?
Hm, why not? The idea would be to take the estimator that's passed in by a user that has it compiled in, right?
I mean in the code as it currently is.
Great work @bitromortac!
We just need to fix a couple of issues and clear out a few comments to be ready 🚀
; values are in [0, 1]. (default: 0.5)
; routerrpc.aprioriweight=0.3
; The time interval with which the MC store state is flushed to the DB.
; routerrpc.mcflushinterval=1m
default?
@@ -46,6 +46,8 @@ type pathFinder = func(g *graphParams, r *RestrictParams,
[]*channeldb.CachedEdgePolicy, float64, error)

var (
DefaultEstimator = AprioriEstimatorName
nit: DefaultEstimator is the default estimator used when xyz (these ones are always hard for me, copilot is your friend here).
* we rename the current probability estimator to be the "apriori" probability estimator to distinguish from a different implementation later
* the AprioriEstimator is exported to later be able to type switch
* getLocalPairProbability -> LocalPairProbabiltiy (later part of an exported interface)
* getPairProbability -> getPairProbabiltiy (later part of an exported interface)
Force-pushed from 5290f12 to 7b62376.
Thank you all for this round of reviews! I made the missioncontrol api backwards compatible and it retains its old behavior of not setting and overwriting defaults.
I changed the probability formula slightly, it now only evaluates `math.Exp` three times. I found that by substituting `math.Pow` with `math.Exp`, together with changing to a decay time instead of a half-life time (which uses base 2), the bimodal benchmark time could be cut roughly in half.
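To illustrate the substitution (a sketch of the math only, not the exact lnd code): a base-2 half-life decay computed with `math.Pow` is equivalent to a single `math.Exp` with a decay time of halfLife / ln 2.

```go
package decaysketch

import "math"

// halfLifeDecay computes 2^(-age/halfLife) with math.Pow (the base-2
// half-life form).
func halfLifeDecay(age, halfLife float64) float64 {
	return math.Pow(2, -age/halfLife)
}

// decayTimeDecay computes the same value with a single math.Exp call, using
// decayTime = halfLife / ln(2).
func decayTimeDecay(age, decayTime float64) float64 {
	return math.Exp(-age / decayTime)
}
```

For a one-week half-life this corresponds to a decay time of 7*24h / ln 2, roughly 242 hours; both functions return the same value for any age.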
My previous total benchmark numbers were indeed swapped 😅, the new ones are for a larger number of node pairs (18518 pairs) and full graph traversal enabled (note that this is worst-case time complexity; in reality this should be closer to ~200ms):
bimodal: Pathfinding perf metrics: nodes=6189, edges=77389, time=731.343375ms
apriori: Pathfinding perf metrics: nodes=6186, edges=77387, time=647.745178ms
In real situations the two methods can't be directly compared as run times are graph-traversal dependent (Dijkstra ends early if the source is found).
lnrpc/routerrpc/router.proto
Outdated
ProbabilityModel defines which probability estimator should be used in
pathfinding.
*/
ProbabilityMonel Model = 6;
Does this compile? (`ProbabilityMonel`)
No, wonder how that sneaked in 🙂, also saw the issue with the `sample-lnd.conf` check. Will collect these and push later if something else comes up.
We introduce a probability `Estimator` interface which is implemented by the current apriori probability estimator. A second implementation, the bimodal probability estimator, follows.
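As a rough sketch of the shape such an interface could take, assembled from the method names visible in the snippets quoted in this thread (parameter lists and types here are simplified assumptions, not the actual lnd definition):

```go
package estimatorsketch

import "time"

// nodeResults stands in for the per-node payment history type.
type nodeResults map[[33]byte]struct{}

// Estimator sketches the interface: both the apriori and the bimodal
// implementation would satisfy it, and an external implementation can be
// passed into the server config.
type Estimator interface {
	// PairProbability estimates the success probability of forwarding an
	// amount towards toNode, given previous results for that node.
	PairProbability(now time.Time, results nodeResults, toNode [33]byte,
		amtMsat uint64, capacitySat uint64) float64

	// LocalPairProbability estimates probabilities for the local node's
	// own channels, where liquidity is known.
	LocalPairProbability(now time.Time, results nodeResults,
		toNode [33]byte) float64

	// Config returns the estimator's configuration.
	Config() interface{}
}
```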
Left one comment about the proto.
I don't really have an opinion about the bimodal algorithm itself. Interested to find out how you are going to evaluate whether it is better. May be hard for users to actually see any differences if they switch over.
routing/probability_bimodal.go
Outdated
// DefaultBimodalDecayTime is the default value for BimodalDecayTime.
// We will forget about previous learnings about channel liquidity on
// the timescale of about a week.
DefaultBimodalDecayTime = time.Duration(7*24) * time.Hour
This is indeed spectacular
string("are you talking about the extra type cast")
?
Hehe, no, increasing the decay to a week.
I think it makes sense to have a longer decay time. In the apriori model, we don't forget about successes, only about failures, but here failures and successes are treated symmetrically, so it would be a pity to forget about them in a short time (we could make them asymmetric if we wanted to, as discussed earlier).
Yeah, I think it's worth giving this a shot; it's also configurable, so people can set it to other values. In retrospect, maybe using a 12 hour decay (user offline for a week and has no memory) was too conservative.
As discussed offline, it may be worth adding clear steps to the PR description explaining how users can do a real-life side-by-side comparison of both estimators with a "mission control state restore" in between.
Sounds like a good candidate for a blog post or lnd mailing list post. Either of those would be easier to consume/distribute compared to a PR description.
I've looked up:

// importSnapshot takes an existing snapshot and merges it with our current
// state if the result provided are fresher than our current results. It returns
// the number of pairs that were used.

Not sure if this is usable for a side-by-side comparison, because it merges mission control state rather than replacing it. I think it is worth spending some extra time on figuring this out and documenting it somewhere to really unlock the value of this PR for users. Otherwise it might be yet another set of pathfinding config flags that users can't really decide what to do with.
Possible to first do a reset, then a merge.
I think the value is unlocked right after the merge. We can update the PR description later in a non-blocking manner. As mentioned, a PR description isn't as searchable/indexable as a mailing list post or blog post.
Implements a new probability estimator based on a probability theory framework. The computed probability consists of:
* the direct channel probability, which is estimated based on a depleted liquidity distribution model, formulas and broader concept derived after Pickhardt et al. https://arxiv.org/abs/2103.08576
* an extension of the probability model to incorporate knowledge decay after time for previous successes and failures
* a mixed node probability taking into account successes/failures on other channels of the node (similar to the apriori approach)
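For intuition, a hedged sketch of the kind of liquidity model being described (notation mine; the exact formulas live in routing/probability_bimodal.go): liquidity is assumed to sit mostly near either channel edge, controlled by a scale parameter s (cf. DefaultScaleMsat above), and the direct success probability for an amount a is the normalized upper tail.

```latex
% Assumed bimodal liquidity density for a channel of capacity c (sketch):
P(x) \propto e^{-x/s} + e^{-(c-x)/s}, \qquad 0 \le x \le c
% Probability that an amount a can be forwarded (normalized upper tail):
P(X \ge a) = \frac{\int_a^c P(x)\,dx}{\int_0^c P(x)\,dx}
```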
We use a more general `Estimator` interface for probability estimation in missioncontrol. The estimator is created outside of `NewMissionControl`, passed in as a `MissionControlConfig` field, to facilitate usage of externally supplied estimators.
We add new lnd.conf configuration options for both probability estimators. Note that this is a breaking change for the existing apriori parameters.
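Purely illustrative of the subgroup style mentioned here, roughly mirroring the existing sample-lnd.conf excerpt quoted earlier (the exact option names and values are assumptions; consult the sample-lnd.conf generated by this PR for the authoritative list):

```
; Select the probability estimator to use (apriori or bimodal).
; routerrpc.estimator=bimodal

; Apriori options now live in their own subgroup (previously e.g.
; routerrpc.aprioriweight).
; routerrpc.apriori.weight=0.5

; Bimodal options.
; routerrpc.bimodal.scale=300000000
; routerrpc.bimodal.decaytime=168h
```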
The active probability estimator can be switched dynamically using the `Set/GetMissionControl` API, maintaining backward compatibility. The lncli commands `setmccfg` and `getmccfg` are updated around this functionality. Note that deprecated configuration parameters were removed from the commands.
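As a usage sketch of the hot-swap (the estimator flag is taken from the diff quoted above; other flags and output may differ):

```
$ lncli setmccfg --estimator=bimodal
$ lncli getmccfg
```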
$ go test -bench=PairProbability ./routing
goos: linux
goarch: amd64
pkg: github.com/lightningnetwork/lnd/routing
cpu: AMD Ryzen 5 3600 6-Core Processor
BenchmarkBimodalPairProbability-4   2050206   599.8 ns/op
BenchmarkAprioriPairProbability-4   9524289   126.7 ns/op

* math.Exp:
goos: linux
goarch: amd64
pkg: github.com/lightningnetwork/lnd/routing
cpu: AMD Ryzen 5 3600 6-Core Processor
BenchmarkExp-4   143843622   8.700 ns/op

* bimodal benchmark profile:
Showing nodes accounting for 2000ms, 76.05% of 2630ms total
Dropped 156 nodes (cum <= 13.15ms)
Showing top 10 nodes out of 141
flat    flat%   sum%    cum     cum%
1110ms  42.21%  42.21%  1110ms  42.21%  math.archExp
230ms   8.75%   50.95%  230ms   8.75%   runtime.memclrNoHeapPointers
180ms   6.84%   57.79%  930ms   35.36%  github.com/lightningnetwork/lnd/routing.(*BimodalEstimator).calculateProbability
110ms   4.18%   61.98%  130ms   4.94%   runtime.scanobject
90ms    3.42%   65.40%  90ms    3.42%   memeqbody
60ms    2.28%   67.68%  940ms   35.74%  github.com/lightningnetwork/lnd/routing.(*BimodalEstimator).directProbability
60ms    2.28%   69.96%  230ms   8.75%   github.com/lightningnetwork/lnd/routing.cannotSend
60ms    2.28%   72.24%  60ms    2.28%   runtime.madvise
50ms    1.90%   74.14%  480ms   18.25%  github.com/lightningnetwork/lnd/routing.(*BimodalEstimator).primitive
50ms    1.90%   76.05%  60ms    2.28%   runtime.mapiternext

* apriori benchmark profile:
Showing nodes accounting for 1570ms, 77.34% of 2030ms total
Dropped 138 nodes (cum <= 10.15ms)
Showing top 10 nodes out of 151
flat    flat%   sum%    cum     cum%
340ms   16.75%  16.75%  340ms   16.75%  math.archExp
290ms   14.29%  31.03%  980ms   48.28%  github.com/lightningnetwork/lnd/routing.(*AprioriEstimator).getNodeProbability
190ms   9.36%   40.39%  260ms   12.81%  runtime.mapiternext
190ms   9.36%   49.75%  190ms   9.36%   runtime.memclrNoHeapPointers
160ms   7.88%   57.64%  340ms   16.75%  github.com/lightningnetwork/lnd/routing.(*AprioriEstimator).calculateProbability
110ms   5.42%   63.05%  110ms   5.42%   aeshashbody
110ms   5.42%   68.47%  130ms   6.40%   runtime.scanobject
80ms    3.94%   72.41%  420ms   20.69%  github.com/lightningnetwork/lnd/routing.capacityFactor
60ms    2.96%   75.37%  60ms    2.96%   runtime.madvise
40ms    1.97%   77.34%  40ms    1.97%   runtime.isEmpty (inline)
Force-pushed from 7b62376 to d78e756.
Pushed some minor changes mostly related to API backward compatibility. I updated the PR description to give some more context and included a hint on how one could measure differences between the two estimators (exactly, with a reset of mc in between).
I am so excited about this 🚀
Great work @bitromortac
This PR implements an additional probability estimator ("bimodal") based on a probability theory framework using channel capacities and previous learnings from payments as inputs. The specific assumed distribution of channel liquidity is based on research in #5988.
The computed probability consists of:
* the direct channel probability, which is estimated based on a depleted liquidity distribution model (formulas and broader concept derived after Pickhardt et al., https://arxiv.org/abs/2103.08576)
* an extension of the probability model to incorporate knowledge decay over time for previous successes and failures
* a mixed node probability taking into account successes/failures on other channels of the node (similar to the apriori approach)
We introduce an `Estimator` interface for probability computation. The interface can also be implemented externally and passed into the server's config struct. With this PR, we have two alternative probability estimator implementations, the current default one (`apriori`) and a new opt-in one (`bimodal`). Each of those estimators has its own configuration parameters that can be supplied in `lnd.conf`. Additionally, estimators can be hot-swapped by using the functionality around `lncli setmccfg/getmccfg`.

The new estimator is expected to make somewhat smarter decisions with respect to previous learnings about payment failures and successes over channels. Performance-wise, we expect the bimodal estimator to be slower due to heavier usage of formulas (~20% slower in computational time, but larger differences may appear due to execution path dependence). The exact analysis of behavior and impact on payment reliability is still open. Users that would like to do their own comparison could use their preconditioned mission control and carry out probes for either estimator. For a fairer comparison it is important to start with equal inputs, which is why a chain of API calls restoring mission control state (`QueryMissionControl` -> payment -> `ResetMissionControl` -> `XImportMissionControl`) is recommended.