[Bug] HttpClient is not fault tolerant for network issues #935
Labels
Enhancement
A request or suggestion to improve some aspect of the library
P2
Normal priority items, should be done after P1
SDK-Consistency
Items that deal with consistency between all MSALs
Library version used
1.15.0
Java version
21
Scenario
ConfidentialClient - web api (AcquireTokenOnBehalfOf)
Is this a new or an existing app?
The app is in production, I haven't upgraded MSAL, but started seeing this issue
Issue description and reproduction steps
At the moment DefaultHttpClient is not fault tolerant for network issues/glitches. It just handles the situation where HTTP 500 or above is thrown. This is causing troubles for all users not knowing they can "write" their own IHttpClient. This fact is forcing anyone to reinvent the wheel again and again. It would be beneficial if the basic implementation would be more robust to withstand these issues
e.g. Proxy failed with 503 but MSAL4J just failed instead of a retry mechanism to kick in from HttpHelper class
com.microsoft.aad.msal4j.MsalClientException: java.io.IOException: Unable to tunnel through proxy. Proxy returns "HTTP/1.1 503 Service Unavailable" at com.microsoft.aad.msal4j.HttpHelper.executeHttpRequest(HttpHelper.java:56) at com.microsoft.aad.msal4j.OAuthHttpRequest.send(OAuthHttpRequest.java:48) at com.microsoft.aad.msal4j.TokenRequestExecutor.executeTokenRequest(TokenRequestExecutor.java:39) at com.microsoft.aad.msal4j.AbstractClientApplicationBase.acquireTokenCommon(AbstractClientApplicationBase.java:256) at com.microsoft.aad.msal4j.AcquireTokenByAuthorizationGrantSupplier.execute(AcquireTokenByAuthorizationGrantSupplier.java:63) at com.microsoft.aad.msal4j.AcquireTokenByClientCredentialSupplier.acquireTokenByClientCredential(AcquireTokenByClientCredentialSupplier.java:86) at com.microsoft.aad.msal4j.AcquireTokenByClientCredentialSupplier.execute(AcquireTokenByClientCredentialSupplier.java:49) at com.microsoft.aad.msal4j.AuthenticationResultSupplier.get(AuthenticationResultSupplier.java:69) at com.microsoft.aad.msal4j.AuthenticationResultSupplier.get(AuthenticationResultSupplier.java:18) at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source) at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(Unknown Source) at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) Caused by: java.io.IOException: Unable to tunnel through proxy. Proxy returns "HTTP/1.1 503 Service Unavailable" at java.base/sun.net.www.protocol.http.HttpURLConnection.doTunneling(Unknown Source) at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source) at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(Unknown Source) at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream(Unknown Source) at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(Unknown Source) at com.microsoft.aad.msal4j.DefaultHttpClient.executeHttpPost(DefaultHttpClient.java:59)
We were experiancing IOException, SocketTimeoutException and ConnectException. None of them are handled correctly as they are thrown in DefaultHttpClient but before
readResponseFromConnection
is executed.I will share PR request shortly
Relevant code snippets
Expected behavior
Retry the request 3 times nevertheless if its 503 or some network glitch.
Identity provider
Microsoft Entra ID (Work and School accounts and Personal Microsoft accounts)
Regression
No response
Solution and workarounds
Create a fault-tolerant resource handler which will retry whole request/response chain in case of expected failures.
Workarounds for now is to write IHttpClient yourself and reinvent the same wheel for all customers which are using MSAL4J and needs more robust error handling
The text was updated successfully, but these errors were encountered: