Skip to content

uri_captures are url decoded #7913

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 task done
agringeri opened this issue Sep 30, 2021 · 1 comment · Fixed by #8140
Closed
1 task done

uri_captures are url decoded #7913

agringeri opened this issue Sep 30, 2021 · 1 comment · Fixed by #8140

Comments

@agringeri
Copy link

agringeri commented Sep 30, 2021

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

2.5.0

Current Behavior

URI captures (ngx.ctx.router_matches.uri_captures) are URL decoded, which make them difficult to use in plugins. In prior versions of Kong, uri_captures were not decoded, whereas in 2.5.0 they are.

For example, in 2.0.4, a uri_capture looked like this:

{
  [0] = "/foo/bar/some%20resource%20document"
}

In 2.5.0:

{
  [0] = "/foo/bar/some resource document"
}

I didn't see anything in the changelog that would indicate that this change was intentional. Otherwise, it would be a breaking change for plugins using capturing groups.

Expected Behavior

For a given request
https://example.com/foo/bar/some%20resource%20document

the uri_capture would contain:

{
  [0] = "/foo/bar/some%20resource%20document"
}

Steps To Reproduce

1. Running Kong 2.5.0:
2. Create plugin that uses ngx.ctx.router_matches.uri_captures
3. kong.log(ngx.ctx.router_matches.uri_captures[0])

Anything else?

No response

@yankun-li-kong
Copy link
Contributor

yankun-li-kong commented Nov 10, 2021

Hi, I have found the reason is that below line added from 2.3.x causing this issue
req_uri = normalize(req_uri, true)

So I add a new parameter to decide whether to normalize request uri
Please check more details in my PR.

yankun-li-kong added a commit to yankun-li-kong/kong that referenced this issue Nov 28, 2021
Add a new parameter normalize-req-uri to decide whether to normalize the request URI or not.

New feature for Kong#7913
yankun-li-kong added a commit to yankun-li-kong/kong that referenced this issue Nov 28, 2021
Use a self parameter(normalize_req_uri) of kong.router to load the new parameter.
It will be easier to add a new route parameter to control the URI normalization behavior in the future.

New feature for Kong#7913
yankun-li-kong added a commit to yankun-li-kong/kong that referenced this issue Nov 28, 2021
Modify the 'if condition' for the case self.normalize_req_uri is nil to pass the original unit test.
Add unit tests.

New feature for Kong#7913
yankun-li-kong added a commit to yankun-li-kong/kong that referenced this issue Nov 28, 2021
The normalize-req-uri may affect route matching and the upstream request.
Replace normalize-req-uri with normalize_uri_captures to only affect captured request URI.
If normalize_uri_captures is set to on, the behavior is same as previous.
If normalize_uri_captures is set to off, it will try to the capture the request URI without URI normalization.
Check more details in normalize_uri_captures part from kong.conf.default.

New feature for Kong#7913
yankun-li-kong added a commit to yankun-li-kong/kong that referenced this issue Dec 2, 2021
Rename normalize_uri_captures to decode_uri_captures

New feature for Kong#7913
fffonion pushed a commit that referenced this issue Jul 26, 2022
We decide to let `normalize` function to decode URL-encoded string as much as possible.

**PLEASE REFERER TO**: #8140 (comment)

### Issues resolved

Fix #7913, FTI-2904

Outdated discussion:

----------------------

This is alternative to PR #8139 where we actually fix the normalization function to not do excessive percent-decoding on normalization.

When we added normalization kong.tools.uri.normalize, that function does percent-decoding on everything, except for the reserved characters.

That means that we basically percent-decode more than just the ranges of ALPHA (%41–%5A and %61–%7A), DIGIT (%30–%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E). (so called Unreserved Characters)

Alternative Implementation: See #8139
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants