Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support a handler for checking connection status using Ping frame in HTTP/2 #3612

Open
wants to merge 11 commits into
base: 1.2.x
Choose a base branch
from

Conversation

raccoonback
Copy link
Contributor

@raccoonback raccoonback commented Feb 2, 2025

Description

In some cases, HTTP/2 connections may remain open even when they are no longer functional. By introducing a periodic Ping frame health check, we can proactively detect and close unhealthy connections.

The Http2ConnectionLivenessHandler schedules and sends Ping frames at a user-defined interval to verify the connection's liveness. If an acknowledgment (ACK) is not received within the specified time, the connection is considered unhealthy and is closed automatically.

However, if other frames are actively being sent or received, the scheduler does not send a Ping frame. This is because the server may delay ACK responses for various reasons. To prevent unnecessary connection termination, Ping frames are only sent when no read or write activity is detected.

Additionally, a configurable retry threshold for Ping frame transmission has been introduced. If a Ping frame fails to receive an ACK response, it will be retried up to the specified threshold before considering the connection unhealthy. This allows fine-tuning of the failure detection mechanism, balancing between aggressive failure detection and avoiding premature disconnections.

Scheduler Flow

http2_ping_flows_1

Ping ACK Flow

http2_ping_flow

Key Changes

  1. Added Http2ConnectionLivenessHandler handler to check connection health with Ping frames at a configurable interval.
  2. Introduced a retry threshold setting to limit the number of Ping transmission attempts before marking the connection as unhealthy.
  3. Specify the ping frame ack allowable range through Http2SettingsSpec.
HttpClient.create()
	.protocol(HttpProtocol.H2)
	.secure()            
	.http2Settings(
		 builder -> builder.pingAckTimeout(Duration.ofMillis(600))  // Max wait time for ACK
			.pingScheduleInterval(Duration.ofMillis(300))  // Interval for sending PING frames
			.pingAckDropThreshold(2)  // Maximum retries before considering the connection dead
         );

Related Issue

@raccoonback raccoonback marked this pull request as ready for review February 2, 2025 08:17
@raccoonback raccoonback force-pushed the issue-3301 branch 9 times, most recently from 171dfc5 to 202d036 Compare February 4, 2025 00:11
@raccoonback raccoonback marked this pull request as draft February 5, 2025 16:36
@raccoonback raccoonback marked this pull request as ready for review February 7, 2025 17:12
* @param pingAckTimeout the interval in between consecutive PING frames
* and ACK responses. Must be a positive value.
*/
default Builder pingAckTimeout(Duration pingAckTimeout) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sets the interval for sending HTTP/2 PING frames and receiving ACK responses.

Comment on lines +139 to +141
default Builder pingScheduleInterval(Duration pingScheduleInterval) {
return this;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sets the execution interval for the scheduler that sends HTTP/2 PING frames

Comment on lines +167 to +169
default Builder pingAckDropThreshold(Integer pingAckDropThreshold) {
return this;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sets the threshold for retrying HTTP/2 PING frame transmissions. (default. 0)

Comment on lines +98 to +169
/**
* Sets the interval for sending HTTP/2 PING frames and receiving ACK responses.
*
* <p>
* This method configures the time interval at which PING frames are sent to the peer.
* The interval should be chosen carefully to balance between detecting connection issues
* and minimizing unnecessary network traffic.
* </p>
*
* <p>
* If the interval is set too short, it may cause excessive network overhead.
* If set too long, connection failures may not be detected in a timely manner.
* </p>
*
* @param pingAckTimeout the interval in between consecutive PING frames
* and ACK responses. Must be a positive value.
*/
default Builder pingAckTimeout(Duration pingAckTimeout) {
return this;
}

/**
* Sets the execution interval for the scheduler that sends HTTP/2 PING frames
* and periodically checks for ACK responses.
*
* <p>
* This method configures the time interval at which the scheduler runs
* to send PING frames and verify if ACK responses are received within
* the expected timeframe.
* Proper tuning of this interval helps in detecting connection issues
* while avoiding unnecessary network overhead.
* </p>
*
* <p>
* If the interval is too short, it may increase network and CPU usage.
* Conversely, setting it too long may delay the detection of connection failures.
* </p>
*
* @param pingScheduleInterval the interval in at which the scheduler executes.
* Must be a positive value.
*/
default Builder pingScheduleInterval(Duration pingScheduleInterval) {
return this;
}

/**
* Sets the threshold for retrying HTTP/2 PING frame transmissions.
*
* <p>
* This method defines the maximum number of attempts to send a PING frame
* before considering the connection as unresponsive.
* If the threshold is exceeded without receiving an ACK response,
* the connection may be closed or marked as unhealthy.
* </p>
*
* <p>
* A lower threshold can detect connection failures more quickly but may lead
* to premature disconnections. Conversely, a higher threshold allows more retries
* but may delay failure detection.
* </p>
*
* <p>
* If this value is not specified, it defaults to 0, meaning only one attempt to send a PING frame is made without retries.
* </p>
*
* @param pingAckDropThreshold the maximum number of PING transmission attempts.
* Must be a positive integer.
* The default value is 0, meaning no retries will occur and only one PING frame will be sent.
*/
default Builder pingAckDropThreshold(Integer pingAckDropThreshold) {
return this;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is declared as a default method for backward compatibility.

Comment on lines +62 to +64
private final long pingAckTimeoutNanos;
private final long pingScheduleIntervalNanos;
private final int pingAckDropThreshold;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the configuration value provided by the user.

Comment on lines +232 to +234
private boolean isOutOfTimeRange() {
return pingAckTimeoutNanos < Math.abs(lastReceivedPingTime - lastSendingPingTime);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It determines whether the Ping ACK was received within pingAckTimeoutNanos.

Comment on lines +268 to +284
private class PingWriteListener implements ChannelFutureListener {

@Override
public void operationComplete(ChannelFuture future) throws Exception {
if (future.isSuccess()) {
if (log.isDebugEnabled()) {
log.debug("Wrote PING frame to {} channel.", future.channel());
}

isPingAckPending = true;
lastSendingPingTime = System.nanoTime();
}
else if (log.isDebugEnabled()) {
log.debug("Failed to wrote PING frame to {} channel.", future.channel());
}
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new Ping frame is successfully sent, the status is changed to 'ping awaiting', and lastSendingPingTime is updated to the current time.

Comment on lines +696 to +699
.addBefore(NettyPipeline.ReactiveBridge, NettyPipeline.H2LivenessHandler,
new Http2ConnectionLivenessHandler(codec.encoder(), pingAckTimeout, pingScheduleInterval, pingAckDropThreshold))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

configure Http2ConnectionLivenessHandler.

Comment on lines +391 to +392
REGISTRY.clear();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although not related to the purpose of this PR, handling has been implemented for failing tests.
To prevent interference with other tests, the metrics are initialized.

Comment on lines +653 to +655
Mockito.verify(spyLogger, times(0))
.warn(Mockito.eq("Connection pool creation limit exceeded: {} pools created, maximum expected is {}"),
Mockito.eq(2), Mockito.eq(1));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although not related to the purpose of this PR, handling has been implemented for failing tests.
In PooledConnectionProvider, the connectionPoolCount is handled atomically to ensure that the log warning log.warn("Connection pool creation limit exceeded: {} pools created, maximum expected is {}") is not called multiple times.
As a result, the test outcomes have been adjusted.

@violetagg violetagg added the type/enhancement A general enhancement label Feb 10, 2025
@violetagg violetagg added this to the 1.2.4 milestone Feb 10, 2025
@violetagg
Copy link
Member

@raccoonback Can you rebase against branch 1.2.x and switch the target branch for the PR to be 1.2.x. We branched and now in main it is the next development 1.3.x while we want this feature for version 1.2.x

@raccoonback raccoonback changed the base branch from main to 1.2.x February 11, 2025 23:47
@raccoonback
Copy link
Contributor Author

@violetagg
Hello!
I have rebased the branch onto 1.2.x and changed the target branch of the PR to 1.2.x.

- Added a method to configure the execution interval of the scheduler
  that sends HTTP/2 PING frames and periodically checks for ACK responses
- Introduced a retry threshold setting to limit the number of PING transmission attempts
  before considering the connection as unresponsive
- Default values:
  - Scheduler interval must be explicitly set
  - Retry threshold defaults to 0 (no retries, only one PING attempt)

Signed-off-by: raccoonback <[email protected]>
@violetagg
Copy link
Member

@raccoonback I'll review this PR later this week

@violetagg violetagg self-requested a review February 17, 2025 15:25
@violetagg
Copy link
Member

violetagg commented Feb 20, 2025

@raccoonback Nice idea! I think that we can reuse our current reactor.netty.http.server.IdleTimeoutHandler (we need to check whether the current position in the pipeline will be ok or whether we need to move it) and extend it with the HTTP/2 functionality ACK requirement. At the moment it handles idle timeout for HTTP/2 and H2C on the server. The more ChannelHandler you have in the pipeline the worse is the performance so we always try to keep the ChannelHandlers number to the minimum if possible. Also all methods in reactor.netty.http.server.IdleTimeoutHandler are invoked on the event loop which guarantees that the functionality is thread-safe.

I think this feature is interesting for both server and client, similar to TCP keep-alive that we support for both of them.

@raccoonback
Copy link
Contributor Author

raccoonback commented Feb 20, 2025

@violetagg
Hello, and thanks for your suggestion! 😃
I have a few questions for clarification.

  1. You suggested reusing reactor.netty.http.server.IdleTimeoutHandler.
    Currently, the implementation works by scheduling a PING frame transmission when there is no read/write activity, then checking whether an ACK is received within the allowed time. (It also supports retries up to a certain threshold.)
    '
    I understand that you’re suggesting leveraging IdleTimeoutHandler to detect when both read and write operations are idle and incorporating it into this logic. Is that correct?
    Specifically, do you mean modifying the handler so that when both read and write operations are idle, a PING frame is sent, and if no ACK is received within the retry threshold, the connection is closed?
    '
    If I’ve misunderstood, could you clarify what exactly you mean by "reuse"? 😮

  2. Can I understand that you are proposing to extend this feature not only on the client side, but also on the server side by IdleTimeoutHandler?

    • I thought that this Issue was intended to introduce PING frame support only for the client-side at first.

Looking forward to your feedback! 😊

@violetagg
Copy link
Member

violetagg commented Feb 21, 2025

@violetagg Hello, and thanks for your suggestion! 😃 I have a few questions for clarification.

  1. You suggested reusing reactor.netty.http.server.IdleTimeoutHandler.
    Currently, the implementation works by scheduling a PING frame transmission when there is no read/write activity, then checking whether an ACK is received within the allowed time. (It also supports retries up to a certain threshold.)
    '
    I understand that you’re suggesting leveraging IdleTimeoutHandler to detect when both read and write operations are idle and incorporating it into this logic. Is that correct?

Yes
Currently we use it only for checking that there is no read, because we add this handler between the requests and we are sure there is no write operation. However this handler extends io.netty.handler.timeout.IdleStateHandler and for HTTP/2 we can configure it to check for both read and write.

Specifically, do you mean modifying the handler so that when both read and write operations are idle, a PING frame is sent, and if no ACK is received within the retry threshold, the connection is closed?

Yes
When we receive channelIdle we can add the logic for ACK

'
If I’ve misunderstood, could you clarify what exactly you mean by "reuse"? 😮
2. Can I understand that you are proposing to extend this feature not only on the client side, but also on the server side by IdleTimeoutHandler?

Yes

  • I thought that this Issue was intended to introduce PING frame support only for the client-side at first.

Looking forward to your feedback! 😊

We may need to move reactor.netty.http.server.IdleTimeoutHandler to the reactor.netty.http package so that it can be used by both server and client.

I think that this feature is interesting for both client and server. Wdyt?

@raccoonback
Copy link
Contributor Author

@violetagg

Thank you for the detailed explanation!
I will update reactor.netty.http.server.IdleTimeoutHandler to check the Ping ACK internally when usingHTTP/2 and H2C.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement A general enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP/2 PING frame handling
2 participants