Skip to content

HTTPS Proxy: Fix large body request. #1191

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 18, 2020

Conversation

eloycoto
Copy link
Contributor

@eloycoto eloycoto commented May 6, 2020

When request need to be stored in a temp file, the
ngx.req.get_body_data() will return nil. On some uses cases, users
report that large body is not correctly set. So things are failing on
their end.

This commit will check if the file has been created, read the data from
there and set the body correctly.

Fix THREESCALE-3863

Signed-off-by: Eloy Coto [email protected]

@eloycoto eloycoto requested a review from a team as a code owner May 6, 2020 16:04
local temp_file_path = ngx.req.get_body_file()
ngx.log(ngx.INFO, "HTTPS Proxy: Request body is too big, read the content from path='", temp_file_path, "'")
if temp_file_path then
request.body = util.read_file(temp_file_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not safe for big files, because it loads everything in memory. It needs to be an iterator that reads the body chunk by chunk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

I'm not sure about this, but Lua already make this on read_all option:

Call:
https://github.com/lua/lua/blob/61a4e64a6667bedaa882571c48a173ef5a4ba73b/liolib.c#L583

Read_all
https://github.com/lua/lua/blob/61a4e64a6667bedaa882571c48a173ef5a4ba73b/liolib.c#L526-L538

So, I'm not sure what do you want me to do here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But apicast.util.read_file is

local function read(file)
local handle, err = open(file)
local output
if handle then
output = handle:read("*a")
handle:close()
else
return nil, err
end
return output
end

Right?
It reads the whole file into memory.

It should not read the whole file in memory, but rather return an iterator that reads each chunk and returns it.

Like https://github.com/ledgetech/lua-resty-http#get_client_body_reader. Actually, that looks exactly like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey,

I was looking at this, but at some point, the information need to be saved in
memmory, maybe I'm missing something, but let explain the body reader that you sent to me:

There are two functions involved:

  • _body_reader: This is using coroutines, and this is correct, all here
    looks good, it returns a iterator and all is ok

https://github.com/ledgetech/lua-resty-http/blob/a6bd2e0eb1390e330e4fb10a48cced5a1f21fb66/lib/resty/http.lua#L444-L505

  • _read_body: In this case, when reader() is called, all the information is
    saved in a table, and after that is all in a single variable. This, in my
    opinion, is a bit worse, because you're allocate a table instead of append of
    the string (But I need to do a performance check here ;-), not a strong word)

https://github.com/ledgetech/lua-resty-http/blob/a6bd2e0eb1390e330e4fb10a48cced5a1f21fb66/lib/resty/http.lua#L512-L536

So, I was thinking to have a metatable with some kind of magic method, but
AFAIK, always the content should be stored in memory. I was thinking something
like Filereader in Golang, but I didn't found a way to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nginx buffers large bodies (more than a few kb) into files, so they don't need to be stored in memory.
Reading the whole file into memory can lead to DoS as it can have megabytes or gigabytes.
So a reading iterator will read only a chunk of the file at the time and iterating over it.

I have no idea if the iterator I linked would actually work when the body is buffered into a file. It was a mere suggestion of what should happen when reading a file from a filesystem - read it in chunks as an iterator. Maybe penlight has something?

@eloycoto eloycoto force-pushed the THREESCALE-3863 branch 3 times, most recently from b6b56ca to 21cd858 Compare May 11, 2020 15:24
body = ngx.req.get_body_data(),
proxy_uri = proxy_uri
}

if not request.body then
local temp_file_path = ngx.req.get_body_file()
ngx.log(ngx.INFO, "HTTPS Proxy: Request body is too big, read the content from path='", temp_file_path, "'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be a mere debug log level. Not fitting few kb does not really mean "too big".

@@ -90,4 +90,6 @@ function _M.to_hash(table)
return t
end

_M.read_file = read
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still needed?

@@ -0,0 +1,32 @@
local co_yield = coroutine.yield
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenResty has own version of yield openresty/lua-resty-core@97b5825
Would be good to investigate and see what is the one we should use.

The API is injected in https://github.com/openresty/lua-nginx-module/blob/2a190736a58674086c3a27bf71a7993383fffb55/src/ngx_http_lua_coroutine.c#L288

Also since openresty/lua-nginx-module#1239 is in it might be worth looking at how it affects us.

Also lunarmodules/Penlight#265 (comment) is quite true :)

if not chunk then
break
end
chunk_size = chunk_size + chunk_size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is increasing the chunk size with every iteration, right?
Can you test this manually calling the iterator?

@eloycoto eloycoto force-pushed the THREESCALE-3863 branch 3 times, most recently from e6eab79 to 2020861 Compare May 13, 2020 10:23
@eloycoto eloycoto requested a review from mikz May 13, 2020 10:25
Copy link
Contributor

@mikz mikz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Unit test of the file reader is easy to do no? Manually calling the iterator and checking the chunks would prevent the issue discovered in the previous review.

Also, an integration test should be fairly easy.

@eloycoto eloycoto force-pushed the THREESCALE-3863 branch 4 times, most recently from 5784e80 to c7cf99b Compare May 14, 2020 11:42
@eloycoto eloycoto requested a review from mikz May 14, 2020 11:52
Copy link
Contributor

@mikz mikz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how it verifies that the body goes through the proxy in the integration test, but other than that 👍

t/http-proxy.t Outdated
assert.equal('https', ngx.var.scheme)
assert.equal('$TEST_NGINX_RANDOM_PORT', ngx.var.server_port)
assert.equal('test', ngx.var.ssl_server_name)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you want to test the received body? At least its size?

@eloycoto
Copy link
Contributor Author

The same problem as the code, If I call ngx.req.get_body_data in the upstream API, it'll return nil, and I need to read the temp file to validate that it's working correctly, so you're testing with the same tools, no?

Also, http_proxy timeout if body is nil, so test it's working as expected.

@jsmadis could you take care of this in the testsuite?

Regards

@mikz
Copy link
Contributor

mikz commented May 14, 2020

You can read it as one chunk using penlight for example. Or use the echo module from nginx to print the body?

@eloycoto eloycoto requested a review from mikz May 14, 2020 20:06
When request need to be stored in a temp file, the
ngx.req.get_body_data() will return nil. On some uses cases, users
report that large body is not correctly set. So things are failing on
their end.

This commit will check if the file has been created, read the data from
there and set the body correctly.

Fix THREESCALE-3863

Signed-off-by: Eloy Coto <[email protected]>
Copy link
Contributor

@mikz mikz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ better than doing it myself. Loving it 🤟

@eloycoto eloycoto merged commit 357f0d4 into 3scale:master May 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants