fix(pu): fix noise layer's usage #866
base: main
@@ -57,6 +58,8 @@ def __init__(
- dropout (:obj:`Optional[float]`): The dropout rate of the dropout layer. \
if ``None`` then default disable dropout layer.
- init_bias (:obj:`Optional[float]`): The initial value of the last layer bias in the head network. \
- noise (:obj:`bool`): Whether use ``NoiseLinearLayer`` as ``layer_fn`` in Q networks' MLP. \
- to use
- use `NoiseLinearLayer` to boost exploration
@@ -57,6 +58,8 @@ def __init__(
- dropout (:obj:`Optional[float]`): The dropout rate of the dropout layer. \
if ``None`` then default disable dropout layer.
- init_bias (:obj:`Optional[float]`): The initial value of the last layer bias in the head network. \
- noise (:obj:`bool`): Whether use ``NoiseLinearLayer`` as ``layer_fn`` in Q networks' MLP. \
Default ``False``.
Default to
@@ -384,6 +386,12 @@ def _forward_collect(self, data: Dict[int, Any], eps: float) -> Dict[int, Any]:
data = default_collate(list(data.values()))
if self._cuda:
data = to_device(data, self._device)
# Use the add_noise parameter to decide noise mode.
Rename to a `noisy_net` field and add it to the default config (at the policy level); don't use `xxx.get`.
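To illustrate the suggestion above, here is a minimal sketch of declaring the flag in a policy-level default config so that it can be read directly instead of through a `.get` fallback. Only the field name `noisy_net` comes from the review; `DEFAULT_POLICY_CONFIG`, `merge_config`, and the other keys are hypothetical stand-ins and not the project's actual config machinery.

```python
import copy

# Hypothetical policy-level defaults. Only the `noisy_net` field name
# comes from the review comment; every other key is illustrative.
DEFAULT_POLICY_CONFIG = {
    "cuda": False,
    "noisy_net": False,  # declared once here, so no `.get(...)` fallback is needed
    "collect": {"n_sample": 8},
}


def merge_config(default: dict, user: dict) -> dict:
    """Return a deep copy of `default` with `user` values recursively applied."""
    merged = copy.deepcopy(default)
    for key, value in user.items():
        if key in merged and isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# The user overrides an unrelated field; `noisy_net` is still present.
cfg = merge_config(DEFAULT_POLICY_CONFIG, {"collect": {"n_sample": 16}})
use_noise = cfg["noisy_net"]  # direct access, always defined by the defaults
```

Because the default is declared in one place, a missing or misspelled key fails loudly instead of silently falling back to a value hidden at the call site.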
@@ -248,6 +248,8 @@ def _forward_learn(self, data: List[Dict[str, Any]]) -> Dict[str, Any]:
.. note::
For more detailed examples, please refer to our unittest for DQNPolicy: ``ding.policy.tests.test_dqn``.
"""
set_noise_mode(self._learn_model, True)
Use `noisy_net` to control this line.
Another question: how should `target_model` be handled with noisy net?
Description
This PR fixes [Issue #850](#850) by adding configurable support for NoisyNet in collect mode for DQN. It introduces a new `force_noise` flag in `NoiseLinearLayer` and a helper function `set_noise_mode`. A new config parameter (`collect.add_noise`) is added so that if set to True during collection, noise is injected; otherwise, it is disabled.
Related Issue
TODO
- `force_noise` control in `NoiseLinearLayer`
- `set_noise_mode` helper to update noise settings
- `_forward_collect` and `_forward_eval` methods based on `collect.add_noise`
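As a rough illustration of the items above, the following pure-Python sketch models how a `force_noise` flag and a `set_noise_mode` helper could interact with the usual train/eval switch. It deliberately avoids PyTorch. `NoisyLinearSketch`, `ModelSketch`, the scalar weights, and the exact semantics assumed for `force_noise` are illustrative assumptions, not the code in this PR.

```python
import random


class NoisyLinearSketch:
    """Toy stand-in for a noisy linear layer: one scalar weight, no bias."""

    def __init__(self, weight: float, sigma: float = 0.5) -> None:
        self.weight = weight
        self.sigma = sigma
        self.training = True      # mirrors the usual train/eval switch
        self.force_noise = False  # assumed meaning: inject noise even outside training

    def forward(self, x: float) -> float:
        if self.training or self.force_noise:
            # Perturb the weight with Gaussian noise scaled by sigma.
            noisy_weight = self.weight + self.sigma * random.gauss(0.0, 1.0)
            return noisy_weight * x
        return self.weight * x


class ModelSketch:
    """Toy container that applies its layers in sequence."""

    def __init__(self, layers) -> None:
        self.layers = list(layers)

    def eval(self) -> "ModelSketch":
        for layer in self.layers:
            layer.training = False
        return self

    def forward(self, x: float) -> float:
        for layer in self.layers:
            x = layer.forward(x)
        return x


def set_noise_mode(model: ModelSketch, enabled: bool) -> None:
    """Assumed behaviour: set `force_noise` on every noisy layer in the model."""
    for layer in model.layers:
        if isinstance(layer, NoisyLinearSketch):
            layer.force_noise = enabled


model = ModelSketch([NoisyLinearSketch(2.0), NoisyLinearSketch(3.0)]).eval()

set_noise_mode(model, False)
deterministic_out = model.forward(1.0)  # eval mode, noise off: plain weights only

set_noise_mode(model, True)
noisy_out = model.forward(1.0)  # eval mode, but noise is forced on
```

Under these assumptions, a collect-mode forward pass would call the helper with the configured flag before running the model, so noise no longer depends only on whether the model is in training mode.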
Check List