-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python] allow to register any custom logger (fixes #4783) #4880
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for this!
I've added (fixes #4783)
to the pull request title here so GitHub will automatically close #4783 if this is merged, and so the connection between this PR and that issue is clear to anyone reading the release notes (PR titles become release note items in this project, see here for example).
Can you please add a unit test covering this new behavior, to be sure it isn't accidentally broken by future refactorings? See the diff from https://github.com/microsoft/LightGBM/pull/3820/files#diff-57a0d4d593bc1ebac04a99d2e477a178c209de8c50e6cffdb5f9195b26dd6e68 for an example.
Thanks for the thorough guidance! I considered updating the unit test, however not sure what to test. How about add test to make sure the method check is satisfied? Maybe something like this: def test_register_invalid_logger():
class LoggerWithoutInfoMethod:
def warning(self, msg: str) -> None:
print(msg)
def error(self, msg: str) -> None:
print(msg)
class LoggerWithoutWarningMethod:
def info(self, msg: str) -> None:
print(msg)
def error(self, msg: str) -> None:
print(msg)
class LoggerWithoutErrorMethod:
def info(self, msg: str) -> None:
print(msg)
def warning(self, msg: str) -> None:
print(msg)
def _test_raise_type_error(invalid_logger) -> None:
try:
lgb.register_logger(invalid_logger)
except Exception as err:
assert isinstance(err, TypeError)
assert str(err) == "Logger must provide 'info', 'warning' and 'error' method"
_test_raise_type_error(LoggerWithoutInfoMethod())
_test_raise_type_error(LoggerWithoutWarningMethod())
_test_raise_type_error(LoggerWithoutErrorMethod()) |
I think tests covering the following would give us confidence that this feature is working:
For tests that check for errors, please use
|
Is there any way to directly test the logging utility in
OK, this is much better than my clumsy implementation. |
python-package/lightgbm/basic.py
Outdated
def error(self, msg: str) -> None: | ||
warnings.warn(msg, stacklevel=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this was a typo, right?
def error(self, msg: str) -> None: | |
warnings.warn(msg, stacklevel=3) | |
def error(self, msg: str) -> None: | |
warnings.error(msg, stacklevel=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems there's no error()
in warnings
? I seldom use this module, so correct me if I'm wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, indeed! I confused this with error()
function from logging module. Sorry.
https://docs.python.org/3/library/logging.html#logging.Logger.error
For our dummy logger I think we can just raise an exception then, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is the default logger, in the future when error()
is actually needed, we still need to implement it. So I'm inclined to leave a dummy yet usable implementation as default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is a .error()
method being added at all? Per your description in #4783
Since only
info
andwarning
are used to log messages, any logger class providing these two methods would suffice.
And I can confirm that such a method isn't used anywhere.
git grep -E "\.error"
I have a strong preference for not having any "well we might need this in the future" code in the package, as it increases the size of the package and amount of code to be maintained without affecting the user experience. I think this PR should not add a .error()
implementation at all. @StrikerRUS what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's remove requirements to have .error()
method in custom logger class in this PR. But please note that there will be a breaking change in case we'll want to require it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed .error()
related code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
please note that there will be a breaking change in case we'll want to require it in the future
Yes, I understand. I think the case where the lightgbm
library would want to emit an ERROR-level log but not raise an exception is unlikely to come up.
Even the logger on the C++ side does not have a method for ERROR-level messages: https://github.com/microsoft/LightGBM/blob/17d4e007852b4f0b34b211a674cb26e6751c98e2/include/LightGBM/utils/log.h.
I find that maybe I can directly use Also I'm wondering whether we should export |
@RustingSword are you still interested in pursuing this PR? If you can update this to the latest I'm sorry, I realized today that you were waiting for feedback from us on #4880 (comment).
I don't believe those methods should be exported or that calling LightGBM's logger like |
@jameslamb I rebased on master. If we don't expose logging methods, then there's no direct way to test logger. Should I duplicate the tests used in test_register_logger ? Personally I think the test is coupled with other functionalities, such as training, which have been tested in other module, e.g. test_engine.py. |
Since this project uses squash merging, just using Can you please remove those commits and instead merge
Since you are injecting a logger, I think it's possible to design one for a test which modifies a global variable, and then to test the content of those global variables. pseudo-code showing what I mean import lightgbm as lgb
info_logs = []
class _TestLogger:
def info(self, msg: str) -> None:
global info_logs
info_logs.append(msg)
lgb.register_logger(_TestLogger())
# directly invoke `lgb.basic._log_info()`
lgb.basic._log_info("hello")
assert info_logs == ["hello"] Besides that, you could have one similar test that performs training with |
Sorry, I will do a force update on this branch to merge master instead of rebase. The suggested test method is great, will implement later. |
Sounds good, thanks! We are VERY sorry it took so long to respond to your questions. Thanks very much for still being willing to contribute! |
I'm totally agree with this statement. |
@RustingSword
|
Sorry, missed this one, fixed. |
When I'm running pytest locally, the expected log messages in
I'm not sure whether this should be updated, since there seems no related test error in the ci pipeline. |
Are you using the latest version of LightGBM? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests look great, thank you very much! Please just see my one remaining comment about the .error()
methood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great to me, thanks very much for the idea and the contribution!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RustingSword Thank you so much for your work!
Just one nit about docstring style:
python-package/lightgbm/basic.py
Outdated
info_method_name: str, optional (default="info") | ||
Method used to log info messages. | ||
warning_method_name: str, optional (default="warning") | ||
Method used to log warning messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
info_method_name: str, optional (default="info") | |
Method used to log info messages. | |
warning_method_name: str, optional (default="warning") | |
Method used to log warning messages. | |
info_method_name : str, optional (default="info") | |
Method used to log info messages. | |
warning_method_name : str, optional (default="warning") | |
Method used to log warning messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Maybe we can add a corresponding linter rule? Or even better, use a unified formatter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice idea! Do you know any one?
pydocstyle
linter we use here doesn't catch such inconsistency.
Line 81 in d163c2c
pydocstyle --convention=numpy --add-ignore=D105 --match-dir="^(?!^external_libs|test|example).*" --match="(?!^test_|setup).*\.py" . || exit -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this is a known issue with pydocstyle. I tried numpydoc
as mentioned in the comment, however it is not satisfying either. First, numpydoc
is an extension to sphinx
, and the validation rules cannot be set in command line options when used as a standalone tool. Second, numpydoc
is not intelligent enough. We can see this in above issue, where a single missing whitespace will result in multiple false positives.
smalltest.no:PR01:Parameters {'a'} not documented
smalltest.no:PR02:Unknown parameters {'a: int'}
These two errors are due to parsing error caused by the missing whitespace before colon.
So, I don't have suggested tool suited for this 😞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah no problem, thanks very much for investigating!
@jameslamb @StrikerRUS Thank you both for your excellent guidance! |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Allow to register any custom logger, as long as it provides
info
,warning
anderror
method. Refer to #4783.