Skip to content

Optimise path decoding in RoutingUtils #47395

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 17, 2025
Merged

Conversation

ia3andy
Copy link
Contributor

@ia3andy ia3andy commented Apr 16, 2025

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

@franz1981 I'll try to submit a follow up PR fixing it even when there is % in the path..

@franz1981
Copy link
Contributor

In theory URLDecoder can do the same here (slow path) - but what looks like a waste to me is that Vertx already spent cycles doing something here which can be replaced by a single method that does it at once.
And in theory the better home for this util method IMO is in vertex itself.
I am surprised they don't need any of this, to avoid security issues @cescoffier

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

@franz1981 all the vertx tooling for this is not exposed as public API

@franz1981
Copy link
Contributor

Let's change it @ia3andy ❤️
@vietj can we summon you buddy?

@franz1981
Copy link
Contributor

Julien is on PTO, let's wait him to come back

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

I think it's a bit more complicated than checking for '%' as we also need the remove dots '../' and replace '' with '/'. Let's hold a bit

@ia3andy ia3andy changed the title Check for % before converting to URI Optimise path decoding in RoutingUtils Apr 16, 2025
@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

This is what's in the internal API that we would need:
io.vertx.core.internal.net.RFC3986

@gsmet
Copy link
Member

gsmet commented Apr 16, 2025

Same here, I would expect Vert.x to do this for us, I'm surprised we have to handle this manually.

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

I don't think we can avoid the dots removal :-/ @franz1981 when this PR is green could you run the perf check again? It will probably be less effective than before the breaking commit but still better than using URI all the time..

@franz1981
Copy link
Contributor

In any case I am on PTO, right now, so cannot perform tests. My suggestion is to use the JMH module we have in quarkus and write a bench..and I can help review it 🙏

@ia3andy ia3andy force-pushed the issue-47384 branch 2 times, most recently from 13b267f to b211f3d Compare April 16, 2025 09:07
@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

Ok, we are good!

There is no need to remove the dots are it's already done by vertx in the ctx.normalizedPath(). It takes care of decoding unreserved characters including '.' (https://datatracker.ietf.org/doc/html/rfc3986#section-2.3) and doing the dot removal with a check on slow path.

Now in getNormalizedAndDecodedPath, we re-check for '%' to also decode the special characters such as é.

This should fix the issue.

This comment has been minimized.

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

@vietj @tsegismont https://github.com/vert-x3/vertx-web/blob/master/vertx-web/src/main/java/io/vertx/ext/web/handler/impl/StaticHandlerImpl.java#L174

It seems doing the remove dots there is not needed here as it's already done as explained in my previous comment.

Copy link

quarkus-bot bot commented Apr 16, 2025

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit 537c43c.

Failing Jobs

Status Name Step Failures Logs Raw logs Build scan
✔️ JVM Integration Tests - JDK 17 Logs Raw logs 🔍
✔️ JVM Integration Tests - JDK 17 Windows Logs Raw logs 🔍
JVM Integration Tests - JDK 21 Build Failures Logs Raw logs 🔍

Full information is available in the Build summary check run.
You can consult the Develocity build scans.

Failures

⚙️ JVM Integration Tests - JDK 21 #

- Failing: integration-tests/observability-lgtm 

📦 integration-tests/observability-lgtm

io.quarkus.observability.test.LgtmReloadTest.testReload - History - More details - Source on GitHub

java.lang.IllegalStateException: 
Bad response: 404 >> 404 page not found

	at io.quarkus.observability.test.utils.GrafanaClient.handle(GrafanaClient.java:51)
	at io.quarkus.observability.test.utils.GrafanaClient.user(GrafanaClient.java:81)
	at org.awaitility.core.AbstractHamcrestCondition.lambda$new$0(AbstractHamcrestCondition.java:48)
	at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:248)
	at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:235)

Flaky tests - Develocity

⚙️ JVM Tests - JDK 21

📦 extensions/smallrye-reactive-messaging/deployment

io.quarkus.smallrye.reactivemessaging.hotreload.ConnectorChangeTest.testUpdatingConnector - History

  • Expecting actual: ["-4","-5","-6","-7","-8","-9","-10","-11"] to start with: ["-3", "-4", "-5", "-6"] - java.lang.AssertionError
java.lang.AssertionError: 

Expecting actual:
  ["-4","-5","-6","-7","-8","-9","-10","-11"]
to start with:
  ["-3", "-4", "-5", "-6"]

	at io.quarkus.smallrye.reactivemessaging.hotreload.ConnectorChangeTest.testUpdatingConnector(ConnectorChangeTest.java:36)

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

@franz1981 this should be safe to merge, can you approve?

Copy link
Contributor

@franz1981 franz1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

Copy link
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not saying we shouldn't merge this as a quick fix but I'm still wondering why we have to handle things ourselves.

My understanding is that the issue is only with GeneratedStaticResourceBuildItem so I'm wondering if we should somehow alter the registered endpoint for these and escape it when we register it, rather than doing things at runtime when the request arrives.

My understanding is that we should be able to deal with this once and for all when we register the resource and then have a standard handling as if the files were on disk.

I suppose I might be missing something so happy to have an explanation :).

@gsmet
Copy link
Member

gsmet commented Apr 16, 2025

FWIW, I have been looking at both 7899c9d and the original issue and I'm not clear on what the original patch was exactly supposed to fix.

Sorry for asking dumb questions but I would appreciate if we could take a step back and clarify things for posterity.

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

@gsmet the test in the original fix should explain the purpose.

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

Here was the original issue: #45775

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

Also the issue is not for GeneratedStaticResourceBuildItem but for all static resources.

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

My understanding is that the issue is only with GeneratedStaticResourceBuildItem so I'm wondering if we should somehow alter the registered endpoint for these and escape it when we register it, rather than doing things at runtime when the request arrives.

This is how the vertx static handler works. @cescoffier or @tsegismont might have answer to this question.

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 16, 2025

Here is where we need this: https://github.com/quarkusio/quarkus/blob/main/extensions/vertx-http/runtime/src/main/java/io/quarkus/vertx/http/runtime/StaticResourcesRecorder.java#L98-L107

Also we are using this method from RoutingUtils from other extensions which rely on path to get resources (Qute Web for example)

@franz1981
Copy link
Contributor

I got the feeling that similarly to how we scan the endpoints to decide how to configure the right handlers for the types required (see

)
we should be able there to check if the endpoint path matches the one which should contains static resources and save any check to happen since they won't deliver anything, but I don't remember by heart

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 17, 2025

While I agree we could investigate ways to have vertx deal with this logic in the first place, until then we should apply this fix and backport it as it solves a perf issue which affect all requests (including rest).

The fix is pretty safe:
• It uses URIDecoder#decodeURIComponent, just like the static handler implementation.
• There’s a test for special characters and dots to validate the behaviour.
• If there’s no %, the behaviour is the same as before the initial patch.

@gsmet gsmet merged commit 0f53641 into quarkusio:main Apr 17, 2025
56 of 57 checks passed
@quarkus-bot quarkus-bot bot added this to the 3.23 - main milestone Apr 17, 2025
@gsmet gsmet modified the milestones: 3.23 - main, 3.22.0 Apr 22, 2025
@gsmet gsmet modified the milestones: 3.22.0, 3.21.4 Apr 22, 2025
@tsegismont
Copy link
Contributor

@vietj @tsegismont https://github.com/vert-x3/vertx-web/blob/master/vertx-web/src/main/java/io/vertx/ext/web/handler/impl/StaticHandlerImpl.java#L174

It seems doing the remove dots there is not needed here as it's already done as explained in my previous comment.

Thanks @ia3andy for the heads-up

tsegismont added a commit to tsegismont/vertx-web that referenced this pull request Apr 22, 2025
The dot segments are already taken care of by the call to context.normalizedPath().

See quarkusio/quarkus#47395 (comment)

Signed-off-by: Thomas Segismont <[email protected]>
tsegismont added a commit to tsegismont/vertx-web that referenced this pull request Apr 22, 2025
The dot segments are already taken care of by the call to context.normalizedPath().

See quarkusio/quarkus#47395 (comment)

Signed-off-by: Thomas Segismont <[email protected]>
@tsegismont
Copy link
Contributor

@vietj @tsegismont https://github.com/vert-x3/vertx-web/blob/master/vertx-web/src/main/java/io/vertx/ext/web/handler/impl/StaticHandlerImpl.java#L174
It seems doing the remove dots there is not needed here as it's already done as explained in my previous comment.

Thanks @ia3andy for the heads-up

@ia3andy in fact, it is required because rc.normalizedPath() will not remove dot segments in case of an encoded backslah (see https://github.com/vert-x3/vertx-web/blob/0e4a13c5dcfe1edd0512f279b272554c47ef4c4b/vertx-web/src/test/java/io/vertx/ext/web/tests/handler/StaticHandlerTest.java#L955)

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 23, 2025

Having to remove dots twice feels quite wrong. Maybe we should consider adding an exception to the RFC decode step to also handle \, since they’re eventually converted to / anyway?

@ia3andy
Copy link
Contributor Author

ia3andy commented Apr 23, 2025

Or maybe we should remove support for windows path, because URI paths \ are not meant to be used as path delimiter (as they are not unreserved).

@ia3andy ia3andy deleted the issue-47384 branch April 23, 2025 08:52
@tsegismont
Copy link
Contributor

I believe it happens in two steps because a route handler (other than a static file server) can be interested in having a normalized path without fully decoding uri.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance regression while resolving path
5 participants