Description
Summary
I have opened a PR here: #80604 to demonstrate this issue. The PR is not meant to be merged; it is merely a demonstration of the issue that I'm reporting. The test code runs and demonstrates the issue well. The production code is incomplete, but shows a potential path to fixing the issue.
The problem is that DefaultHttpClientFactory doesn't gracefully handle the case where a factory throws an exception.
Typically, an HttpClientHandler will be recycled after two minutes (https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Http/src/DependencyInjection/HttpClientBuilderExtensions.cs#L499), but if building the handler throws an exception, that exception is cached indefinitely.
In the definition of a Lazy object, under "Exception caching":
> When you use factory methods, exceptions are cached. That is, if the factory method throws an exception the first time a thread tries to access the [Value](https://learn.microsoft.com/en-us/dotnet/api/system.lazy-1.value?view=net-7.0) property of the [Lazy<T>](https://learn.microsoft.com/en-us/dotnet/api/system.lazy-1?view=net-7.0) object, the same exception is thrown on every subsequent attempt.

This Lazy object is accessed in the CreateHandler method https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Http/src/DefaultHttpClientFactory.cs#L118 , and immediately after it is accessed, the code starts a timer with the StartHandlerEntry timer. So, if line 118 throws an exception, the timer is never set, no attempt is ever made to re-initialize the object, and the handler stays in a bad, unrecoverable state indefinitely.
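The ordering problem described above can be sketched in isolation. This is a simplified stand-in, not the real DefaultHttpClientFactory: `SketchFactory`, `TimersStarted`, and `Demo` are hypothetical names, the handler is modelled as a plain string, and the sketch assumes the Lazy is created with `LazyThreadSafetyMode.ExecutionAndPublication` (a mode that caches exceptions).

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for the factory's handler cache (not the real type).
public sealed class SketchFactory
{
    private readonly ConcurrentDictionary<string, Lazy<string>> _active = new();
    private readonly Func<string, string> _build;

    // Counts how often the "expiry timer" step was reached.
    public int TimersStarted { get; private set; }

    public SketchFactory(Func<string, string> build) => _build = build;

    public string CreateHandler(string name)
    {
        // Step 1: read the Lazy. If the build delegate threw once,
        // every later read rethrows the cached exception.
        string entry = _active.GetOrAdd(
            name,
            n => new Lazy<string>(() => _build(n), LazyThreadSafetyMode.ExecutionAndPublication)).Value;

        // Step 2: only reached when step 1 succeeds, so a faulted entry
        // never gets an expiry timer and is never replaced.
        TimersStarted++;
        return entry;
    }
}

public static class Demo
{
    public static (int BuildCalls, int Failures, int TimersStarted) Run()
    {
        int buildCalls = 0;
        var factory = new SketchFactory(name =>
        {
            buildCalls++;
            throw new InvalidOperationException($"cannot build handler '{name}'");
        });

        int failures = 0;
        for (int i = 0; i < 3; i++)
        {
            try { factory.CreateHandler("github"); }
            catch (InvalidOperationException) { failures++; }
        }
        return (buildCalls, failures, factory.TimersStarted);
    }

    public static void Main() => Console.WriteLine(Run());
}
```

In this sketch, three calls to `CreateHandler` yield three failures, but the build delegate runs only once and the timer step is never reached, which mirrors the stuck state described in the report.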
Example scenario
As an example of how this might occur, consider a factory which takes runtime configuration as input. The runtime configuration changes to a bad state, causing an exception in the handler. The runtime configuration is then updated to a good state, but the factory will never be called again, so the application must be restarted to clear the Lazy object from memory.
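A minimal sketch of this scenario, using a bare `Lazy<Uri>` rather than the real factory. `ConfigScenario` and `BaseAddress` are hypothetical names standing in for the application's runtime configuration; the only library behaviour relied on is that `new Lazy<T>(Func<T>)` defaults to a mode that caches exceptions and that `new Uri(...)` throws `UriFormatException` for a malformed string.

```csharp
using System;

public static class ConfigScenario
{
    // Hypothetical runtime setting read by the handler factory.
    public static string BaseAddress = "not a uri";

    public static string[] Run()
    {
        BaseAddress = "not a uri"; // configuration starts in a bad state

        // Lazy<T>(Func<T>) uses ExecutionAndPublication, which caches exceptions.
        var handler = new Lazy<Uri>(() => new Uri(BaseAddress));
        string first = Describe(handler);

        // The configuration is corrected, but the factory delegate is never run again.
        BaseAddress = "https://example.com/";
        string second = Describe(handler);

        return new[] { first, second };
    }

    private static string Describe(Lazy<Uri> lazy)
    {
        try { return "ok: " + lazy.Value; }
        catch (UriFormatException) { return "failed"; }
    }

    public static void Main() => Console.WriteLine(string.Join(", ", Run()));
}
```

Both reads report a failure even though the configuration was fixed in between, so only discarding the Lazy instance (in practice, restarting the application) clears the error.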
Proposed solutions
- One solution is to ensure that the HandlerTimer is set before the factory is called. This will ensure that even if exceptions occur, we will always try to re-initialize it after the two-minute default. In the current implementation, the timer depends on the ActiveHandlerTrackingEntry, so this would require a bit of a re-write in the way the timers are handled. (See the code in the PR https://github.com/dotnet/runtime/compare/main...amittleider:runtime:HttpClientFactory_ExceptionsPersist?expand=1#diff-940dba6ffc5ae548fca7105241869c1a78442c04b968f129ce645407022e83cbR118 . This code doesn't compile, but it helps to illustrate the proposal).
- Another solution would be to create a new kind of Lazy object that has a timer built in. That way, there is no need to handle timers at all within the DefaultHttpClientFactory.
- A last solution may be to modify the thread safety mode to use LazyThreadSafetyMode.PublicationOnly, which will not cache exceptions.
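The effect of the last option can be checked with a small comparison of the two thread safety modes. This is only a sketch against a bare `Lazy<string>`, not a patch for DefaultHttpClientFactory; `ModeComparison` and `SuccessfulReads` are hypothetical names, and the factory delegate is contrived to fail only on its first invocation.

```csharp
using System;
using System.Threading;

public static class ModeComparison
{
    // Counts how many of `attempts` reads of Value succeed when the
    // factory delegate throws on its first call and succeeds afterwards.
    public static int SuccessfulReads(LazyThreadSafetyMode mode, int attempts)
    {
        int calls = 0;
        var lazy = new Lazy<string>(() =>
        {
            if (Interlocked.Increment(ref calls) == 1)
                throw new InvalidOperationException("transient failure");
            return "handler";
        }, mode);

        int successes = 0;
        for (int i = 0; i < attempts; i++)
        {
            try
            {
                _ = lazy.Value;
                successes++;
            }
            catch (InvalidOperationException)
            {
                // ExecutionAndPublication rethrows the cached exception here;
                // PublicationOnly reruns the delegate on the next read.
            }
        }
        return successes;
    }

    public static void Main()
    {
        Console.WriteLine(SuccessfulReads(LazyThreadSafetyMode.ExecutionAndPublication, 3));
        Console.WriteLine(SuccessfulReads(LazyThreadSafetyMode.PublicationOnly, 3));
    }
}
```

With three attempts, the ExecutionAndPublication case never succeeds, while the PublicationOnly case recovers on the second read. One trade-off to weigh: under PublicationOnly several threads may run the factory delegate concurrently and only one result is published, so any extra handlers built in a race would presumably need to be disposed.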
Labels
@dotnet/area-extensions-dependencyinjection
@dotnet/ncl
Reproduction Steps
See the test code in the PR: https://github.com/dotnet/runtime/pull/80604/files#diff-7ee446a98cb0ad2039642e909ac0732b37e7f764b534651cd88a3ac910c5b382R20
Expected behavior
See the test code in the PR: https://github.com/dotnet/runtime/pull/80604/files#diff-7ee446a98cb0ad2039642e909ac0732b37e7f764b534651cd88a3ac910c5b382R45
Actual behavior
See the test code in the PR: https://github.com/dotnet/runtime/pull/80604/files#diff-7ee446a98cb0ad2039642e909ac0732b37e7f764b534651cd88a3ac910c5b382R53
Regression?
No response
Known Workarounds
Never throw exceptions in HttpClientHandlerFactories.
Configuration
No response
Other information
No response