fix(pu): fix noise layer's usage #866

puyuan1996 · 2025-04-17T07:28:01Z

Description

This PR fixes [Issue #850](#850) by making NoisyNet usage configurable in DQN's collect mode. It adds a new force_noise flag to NoiseLinearLayer and a helper function, set_noise_mode. It also adds a new config parameter, collect.add_noise: when it is set to True, noise is injected during collection; otherwise, noise is disabled.
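A minimal sketch of how such a flag and helper could interact is given here. It uses stdlib-only stand-ins rather than the real classes: ToyNoisyLinear and ToyModel are hypothetical, and the rule "inject noise when training or when force_noise is set" is an assumption about the intended semantics, not the confirmed DI-engine implementation. Only the names force_noise and set_noise_mode come from this PR.

```python
import random
from typing import Iterator, List


class ToyNoisyLinear:
    """Stdlib-only stand-in for a noisy linear layer (hypothetical, scalar weights)."""

    def __init__(self, weight: float, sigma: float = 0.5) -> None:
        self.weight = weight
        self.sigma = sigma
        self.training = False     # mirrors a module's train/eval flag
        self.force_noise = False  # assumed meaning: inject noise even outside training

    def forward(self, x: float) -> float:
        # Assumed rule: perturb the weight when training or when noise is forced.
        if self.training or self.force_noise:
            return (self.weight + self.sigma * random.gauss(0.0, 1.0)) * x
        return self.weight * x


class ToyModel:
    """Hypothetical container that exposes its layers through modules()."""

    def __init__(self, layers: List[ToyNoisyLinear]) -> None:
        self.layers = layers

    def modules(self) -> Iterator[ToyNoisyLinear]:
        return iter(self.layers)

    def forward(self, x: float) -> float:
        for layer in self.layers:
            x = layer.forward(x)
        return x


def set_noise_mode(model: ToyModel, noise_enabled: bool) -> None:
    """Set force_noise on every noisy layer found in the model."""
    for module in model.modules():
        if isinstance(module, ToyNoisyLinear):
            module.force_noise = noise_enabled
```

With this shape, a policy can call set_noise_mode(model, True) before collection and set_noise_mode(model, False) before evaluation without touching the model's train/eval state.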

Related Issue

Noisy Net [Issue #850](#850)

TODO

Implement force_noise control in NoiseLinearLayer
Add set_noise_mode helper to update noise settings
Update DQN _forward_collect and _forward_eval methods based on collect.add_noise
Update DQN config with the new parameter
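The per-phase behavior implied by the items above could be summarized as a small decision function. This is only a sketch: the function name noise_enabled_for and the phase strings are hypothetical, and the choice to always disable noise in eval is an assumption, since the PR text states only that learn enables noise and that collect follows collect.add_noise.

```python
def noise_enabled_for(phase: str, add_noise: bool) -> bool:
    """Return whether noisy layers should inject noise in the given phase."""
    if phase == "learn":
        # The diff calls set_noise_mode(self._learn_model, True) in _forward_learn.
        return True
    if phase == "collect":
        # Collection follows the collect.add_noise config value.
        return add_noise
    if phase == "eval":
        # Assumption: evaluation is always noise-free.
        return False
    raise ValueError(f"unknown phase: {phase!r}")
```

The result would be passed as the second argument of set_noise_mode at the start of each forward method.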

Check List

Merge the latest version of the source branch/repo and resolve all conflicts
Pass style check
Pass all tests

PaParaZz1 · 2025-04-18T06:55:29Z

ding/model/template/q_learning.py

@@ -57,6 +58,8 @@ def __init__(
            - dropout (:obj:`Optional[float]`): The dropout rate of the dropout layer. \
                if ``None`` then default disable dropout layer.
            - init_bias (:obj:`Optional[float]`): The initial value of the last layer bias in the head network. \
+            - noise (:obj:`bool`): Whether use ``NoiseLinearLayer`` as ``layer_fn`` in Q networks' MLP. \


"Whether use" should be "Whether to use".

Also state the purpose: use NoiseLinearLayer to boost exploration.

PaParaZz1 · 2025-04-18T06:55:42Z

ding/model/template/q_learning.py

@@ -57,6 +58,8 @@ def __init__(
            - dropout (:obj:`Optional[float]`): The dropout rate of the dropout layer. \
                if ``None`` then default disable dropout layer.
            - init_bias (:obj:`Optional[float]`): The initial value of the last layer bias in the head network. \
+            - noise (:obj:`bool`): Whether use ``NoiseLinearLayer`` as ``layer_fn`` in Q networks' MLP. \
+                Default ``False``.


PaParaZz1 · 2025-04-18T06:59:14Z

ding/policy/dqn.py

@@ -384,6 +386,12 @@ def _forward_collect(self, data: Dict[int, Any], eps: float) -> Dict[int, Any]:
        data = default_collate(list(data.values()))
        if self._cuda:
            data = to_device(data, self._device)
+        # Use the add_noise parameter to decide noise mode.


Rename this to a noisy_net field and add it to the default config (at the policy level). Don't use xxx.get.
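A sketch of what this suggestion amounts to, using plain dicts: once noisy_net has a default at the policy level, the merged config always contains the key, so it can be read directly instead of through a .get fallback. DEFAULT_DQN_CONFIG and merge_config are hypothetical stand-ins for the project's own default-config mechanism, and the other keys are illustrative only.

```python
import copy
from typing import Any, Dict

# Hypothetical policy-level defaults; only the noisy_net field is taken from the review.
DEFAULT_DQN_CONFIG: Dict[str, Any] = {
    "type": "dqn",
    "noisy_net": False,
    "collect": {"n_sample": 8},
}


def merge_config(default: Dict[str, Any], user: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay user settings on a deep copy of the defaults."""
    merged = copy.deepcopy(default)
    for key, value in user.items():
        if isinstance(value, dict) and key in merged and isinstance(merged[key], dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# The key is guaranteed to exist after merging, so direct access is safe.
cfg = merge_config(DEFAULT_DQN_CONFIG, {"collect": {"n_sample": 16}})
use_noisy_net = cfg["noisy_net"]
```

Keeping the default in one place also makes the field visible to anyone reading the default config, which a scattered .get fallback does not.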

PaParaZz1 · 2025-04-18T07:00:12Z

ding/policy/dqn.py

@@ -248,6 +248,8 @@ def _forward_learn(self, data: List[Dict[str, Any]]) -> Dict[str, Any]:
        .. note::
            For more detailed examples, please refer to our unittest for DQNPolicy: ``ding.policy.tests.test_dqn``.
        """
+        set_noise_mode(self._learn_model, True)


Use noisy_net to control this line.

Another question: how should target_model be handled with noisy net?

puyuan added 2 commits April 17, 2025 07:18

fix(pu): fix noise layer's usage

5a01fde

polish(pu): polish comments

454334c

puyuan1996 added the bug Something isn't working label Apr 17, 2025

PaParaZz1 requested changes Apr 18, 2025
