
fix(pu): fix noise layer's usage #866


Open: wants to merge 2 commits into main

Conversation

puyuan1996 (Collaborator) commented:

Description

This PR fixes [Issue #850](#850) by adding configurable NoisyNet support in collect mode for DQN. It introduces a new `force_noise` flag in `NoiseLinearLayer` and a helper function `set_noise_mode`. A new config parameter, `collect.add_noise`, is added: when set to `True`, noise is injected during collection; otherwise, it is disabled.
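The description above can be sketched as follows. This is a minimal, self-contained illustration of the `force_noise` flag and the `set_noise_mode` helper, not DI-engine's actual implementation: the noise model here is a simplified additive Gaussian perturbation, whereas the real `NoiseLinearLayer` uses the factorised-noise scheme from the NoisyNet paper.

```python
import torch
import torch.nn as nn


class NoiseLinearLayer(nn.Module):
    """Toy stand-in for DI-engine's NoiseLinearLayer: a linear layer whose
    noisy weights can be switched on or off via ``force_noise`` (the flag
    this PR introduces). Simplified noise model for illustration only."""

    def __init__(self, in_features: int, out_features: int) -> None:
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Learnable noise scale; 0.017 is a common NoisyNet initialisation.
        self.sigma = nn.Parameter(torch.full((out_features, in_features), 0.017))
        self.force_noise = True

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.linear(x)
        if self.force_noise:
            # Sample fresh noise each forward pass and add the noisy term.
            eps = torch.randn_like(self.sigma)
            out = out + nn.functional.linear(x, self.sigma * eps)
        return out


def set_noise_mode(module: nn.Module, noise_enabled: bool) -> None:
    """Recursively enable or disable noise on every NoiseLinearLayer
    contained in ``module``."""
    for m in module.modules():
        if isinstance(m, NoiseLinearLayer):
            m.force_noise = noise_enabled
```

With noise disabled, repeated forward passes on the same input are deterministic; with it enabled, each pass samples fresh noise, which is what drives exploration during collection.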

Related Issue

TODO

  • Implement force_noise control in NoiseLinearLayer
  • Add set_noise_mode helper to update noise settings
  • Update DQN _forward_collect and _forward_eval methods based on collect.add_noise
  • Update DQN config with the new parameter
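The collect/eval gating in the TODO list could look roughly like this. The class and field names below mirror the PR description (`collect.add_noise`) but are otherwise hypothetical; the real methods live in DI-engine's `DQNPolicy`:

```python
from types import SimpleNamespace


class ToyDQNPolicy:
    """Toy policy showing how ``collect.add_noise`` could gate noise
    injection in ``_forward_collect`` vs. ``_forward_eval``."""

    def __init__(self, cfg) -> None:
        self._cfg = cfg
        self.noise_on = None  # stands in for the model's noise state

    def _set_noise_mode(self, enabled: bool) -> None:
        self.noise_on = enabled

    def _forward_collect(self, data):
        # Inject noise during collection only when configured to do so.
        self._set_noise_mode(bool(self._cfg.collect.add_noise))
        return data

    def _forward_eval(self, data):
        # Evaluation stays deterministic: noise is always disabled.
        self._set_noise_mode(False)
        return data
```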

Check List

  • Merge the latest version of the source branch/repo and resolve all conflicts
  • Pass style check
  • Pass all tests

@puyuan1996 puyuan1996 added the bug Something isn't working label Apr 17, 2025
@@ -57,6 +58,8 @@ def __init__(
- dropout (:obj:`Optional[float]`): The dropout rate of the dropout layer. \
if ``None`` then default disable dropout layer.
- init_bias (:obj:`Optional[float]`): The initial value of the last layer bias in the head network. \
- noise (:obj:`bool`): Whether use ``NoiseLinearLayer`` as ``layer_fn`` in Q networks' MLP. \
Member commented:
  • to use
  • use NoiseLinearLayer to boost exploration

@@ -57,6 +58,8 @@ def __init__(
- dropout (:obj:`Optional[float]`): The dropout rate of the dropout layer. \
if ``None`` then default disable dropout layer.
- init_bias (:obj:`Optional[float]`): The initial value of the last layer bias in the head network. \
- noise (:obj:`bool`): Whether use ``NoiseLinearLayer`` as ``layer_fn`` in Q networks' MLP. \
Default ``False``.
Member commented:
Default to

@@ -384,6 +386,12 @@ def _forward_collect(self, data: Dict[int, Any], eps: float) -> Dict[int, Any]:
data = default_collate(list(data.values()))
if self._cuda:
data = to_device(data, self._device)
# Use the add_noise parameter to decide noise mode.
Member commented:
rename to noisy_net field and add it into default config (in the policy level), don't use xxx.get
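The reviewer's suggestion, a `noisy_net` field declared in the policy-level default config rather than looked up with `xxx.get`, might look like the fragment below. Only the `noisy_net` field name comes from the comment; the surrounding keys are illustrative and do not reproduce DI-engine's actual default config:

```python
# Illustrative policy-level default config. Declaring `noisy_net` here means
# the field always exists, so code can read cfg['noisy_net'] directly
# instead of falling back to cfg.get('noisy_net', False).
dqn_default_config = dict(
    type='dqn',
    noisy_net=False,  # policy-level switch for NoisyNet exploration
    collect=dict(n_sample=8),
)
```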

@@ -248,6 +248,8 @@ def _forward_learn(self, data: List[Dict[str, Any]]) -> Dict[str, Any]:
.. note::
For more detailed examples, please refer to our unittest for DQNPolicy: ``ding.policy.tests.test_dqn``.
"""
set_noise_mode(self._learn_model, True)
Member commented:
use noisy_net to control this line

Another question: how to deal with target_model in noisy net
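The target-model question is left open in this thread. One option from the NoisyNet literature (an assumption here, not a decision made in this PR) is to let the target network keep its noisy layers but draw noise samples independently of the online network, resampling before each target forward pass. A sketch, with a toy layer standing in for a real noisy layer:

```python
import torch.nn as nn


class ToyNoisyLayer(nn.Module):
    """Stand-in noisy layer that only records how often its noise was
    resampled. ``reset_noise`` is a hypothetical method name."""

    def __init__(self) -> None:
        super().__init__()
        self.resample_count = 0

    def reset_noise(self) -> None:
        self.resample_count += 1


def resample_noise(model: nn.Module) -> None:
    # Draw fresh noise for every noisy layer in ``model``; calling this on
    # the target model before its forward pass keeps its noise samples
    # independent of the online model's.
    for m in model.modules():
        if hasattr(m, "reset_noise"):
            m.reset_noise()
```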
