Skip to content

No error logging on coroutine.resume (pull-request for issue #951) #952

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from

Conversation

bungle
Copy link
Member

@bungle bungle commented Jan 12, 2017

I hereby granted the copyright of the changes in this pull request to the authors of this lua-nginx-module project.

See more details on #951.

@bungle bungle changed the title No error logging on coroutine.resume (Pull-request for issue #951) No error logging on coroutine.resume pPull-request for issue #951) Jan 12, 2017
@bungle bungle changed the title No error logging on coroutine.resume pPull-request for issue #951) No error logging on coroutine.resume (pull-request for issue #951) Jan 12, 2017
@agentzh
Copy link
Member

agentzh commented Jan 12, 2017

@bungle Will you add some tests to the existing test suite to justify this change? Thank you!

@agentzh
Copy link
Member

agentzh commented Jan 12, 2017

@bungle Also, please ensure the Travis CI checks are passing in this PR.

@bungle
Copy link
Member Author

bungle commented Jan 12, 2017

Yes, I will do that tomorrow.

@bungle
Copy link
Member Author

bungle commented Jan 13, 2017

@agentzh, it seems to pass the tests now. I don't think we need another tests for this as I did change the original tests to work with the patch, and they also test that no error is written to log on coroutine.resume where resumed thread errors.

--- error_log eval
["stack traceback:", "coroutine 0:", "coroutine 1:", "coroutine 2:"]
--- no_error_log
[error]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the indentation of this test case is to test the Lua land backtrace. After your change, we no longer have any tests designed for this purpose.

I'm not sure if silencing the error logging is a good idea. Maybe we can make it a lower log level like "warn" or even "info"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, better check the return values of coroutine.resume for better test coverage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coroutine.resume is a pcall and its purpose is to catch the errors aka silence the errors and let users handle them. I will make that test better so it checks the return value.

@@ -4,7 +4,7 @@ use Test::Nginx::Socket::Lua;

repeat_each(2);

plan tests => repeat_each() * (blocks() * 3 + 5);
plan tests => repeat_each() * blocks() * 3 + 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better keep the parens otherwise we need to update this plan every time we change the repeat_each number. Also, the repeat each number is automatically increased on the "check leak" test mode of Test::Nginx::Socket, and your change will break the test plan in that test mode.

@@ -281,7 +281,7 @@ GET /lua
end)
end

N = 10
N = 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better avoid unrelated changes to make your changeset atomic. It's better for version control.

Copy link
Member Author

@bungle bungle Jan 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my editor did it automatically. I will add that space back.

@@ -1315,4 +1315,3 @@ co yield: 2

--- no_error_log
[error]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unrelated change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the editor did this automatically. I will add the new line back.

@bungle
Copy link
Member Author

bungle commented Jan 15, 2017

One thing though, is this code path only used with when user's code calls coroutine.resume, and not for say Nginx managed light threads e.g. ngx.thread or for the main light thread?

@@ -1437,9 +1437,6 @@ ngx_http_lua_run_thread(lua_State *L, ngx_http_request_t *r,

ctx->cur_co_ctx = next_coctx;

ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
"lua coroutine: %s: %s\n%s", err, msg, trace);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned earlier, should we just lower the log level here? Like info?

Copy link
Member Author

@bungle bungle Jan 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do so, this will be different than what Lua / LuaJIT does. They don't write anything. I mean, OpenResty doesn't log anything when wrapping code with pcall. If you want to log errors even when they are raised in a code wrapped with pcall, then I would consider having this logged as well, and of course we need to change pcall then as well. This would differ from what users' expect, though. My only concern is that I'm not sure if that removed log entry removes error logging also from some case where it shouldn't. That's why I asked do anyone of you know if it is only handling when users' code calls coroutine.resume (that's the only place where this logging should be silenced).

But I probably wouldn't worry too much if this is lowered to INFO or even DEBUG. Some library writers can rely on this, e.g. exiting coroutines with error and then handling the errors themselves. If this means that the logs will be filled with these messages, then that could be a problem. E.g. my routing library is relying on coroutines and it will by default log the errors from coroutine.resume, so this would mean that errors are logged twice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bungle pcall is a more explicit try/catch in the Lua world while many Lua programmers may overlook the error handling in coroutine.resume.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agentzh, I agree, but that's how the Lua is. Those that use coroutine.resume should know that it is effectively a pcall. There is some history to this as pcall has not been fully compatible with coroutines (http://lua-users.org/wiki/PcallAndCoroutines).

@agentzh
Copy link
Member

agentzh commented Jan 15, 2017

@thibaultcha @doujiang24 What do you think of this?

@thibaultcha
Copy link
Member

thibaultcha commented Jan 18, 2017

Sorry for the delay, I was on a trip.

I have spent some time testing this change as well as giving some thoughts about @bungle's concerns, and so far here is what I gathered:

  • As we try to stay as close as possible to the standard Lua/LuaJIT's behavior, I believe this change to be aligned with this goal. We should not unexpectedly write to the error logs when Lua would not write to stderr itself.
  • Ultimately, lowering the log level would be a compromise, but as with all compromises, it would leave both parties unsatisfied :)
  • As for @bungle's concern in coroutine.resume and error logging #951 about not providing a stacktrace to the error() call, I tried some cases on my side and it would seem that this behavior is particular to resty-cli, which executes all user code in a pcall(), as seen here. I would be in favor of replacing this with an xpcall in order to provide a full stacktrace to the user and avoid confusions, but that is a different topic than this discussion.

So far with this patch, error() writes the appropriate errors and stacktraces from the various contexts I tried it in (content, log, timer and user threads) and it indeed removes the logging in coroutines only. I feel like more testing would be necessary though.

@agentzh
Copy link
Member

agentzh commented Jan 18, 2017

Okay, I'm fine with removing the logging in this case completely. We just need to ensure that a backtrace is still generated in other cases, i.e., outside a user coroutine.resume() call. In addition, we can avoid generating the backtrace in the coroutine.resume() case if we can since the backtrace is later discarded completely anyway and the generation is a little bit expensive.

@thibaultcha
Copy link
Member

Done as part of #1239, closing this PR, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants