DQN for Solving a Maze in Less than 10 Minutes of Training
Is it possible to train a DQN to solve a maze with non-convex obstacles in a long-horizon navigation task (in 10 minutes or less)?
The rules are:
- You cannot use old data, except for the replay buffer
- The inputs are only the x and y coordinates of the agent and its distance to the goal
- Step size should not exceed 2% of the total maze size
- You must start from the same initial state
- The implementation has to be a DQN
- The training should take no longer than 10 minutes
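Under these rules the observation is just a 3-vector and each move is length-bounded. A minimal sketch of that state/step interface (the maze size, goal position, and function names are hypothetical, just to make the constraints concrete):

```python
import math

MAZE_SIZE = 1.0              # hypothetical: maze spans [0, 1] x [0, 1]
MAX_STEP = 0.02 * MAZE_SIZE  # rule: step size must not exceed 2% of maze size
GOAL = (0.9, 0.9)            # hypothetical goal position

def observation(x, y):
    """State fed to the DQN: x, y, and the distance to the goal."""
    dist = math.hypot(x - GOAL[0], y - GOAL[1])
    return (x, y, dist)

def step(x, y, dx, dy):
    """Apply one move, clamping its length to the 2% step-size rule."""
    norm = math.hypot(dx, dy)
    if norm > MAX_STEP:
        dx, dy = dx * MAX_STEP / norm, dy * MAX_STEP / norm
    # keep the agent inside the maze bounds
    nx = min(max(x + dx, 0.0), MAZE_SIZE)
    ny = min(max(y + dy, 0.0), MAZE_SIZE)
    return nx, ny
```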
I have tried Double DQN, Noisy DQN, and prioritized experience replay. I have tried different combinations of rewards (a small negative reward for every step, a high positive reward for reaching the goal, a high negative reward for hitting an obstacle). I even tried shaping the reward in terms of the distance to the goal.
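The reward combinations above can be sketched as one function (the magnitudes are illustrative assumptions, since the post does not give actual numbers):

```python
def reward(reached_goal, hit_obstacle, dist_to_goal, shaped=False):
    """Reward schemes tried: per-step penalty, goal bonus, obstacle penalty,
    and an optional distance-based shaping term (magnitudes are made up)."""
    if reached_goal:
        return 100.0           # high positive reward for reaching the goal
    if hit_obstacle:
        return -100.0          # high negative reward for hitting an obstacle
    if shaped:
        return -dist_to_goal   # reward in terms of the distance to the goal
    return -1.0                # negative reward for every step
```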
I tried different epsilon-greedy decay methods.
No matter what I did, the agent just could not learn to reach the goal.
I think the main problem is that the agent doesn't consistently reach the goal during training; in some runs it never reaches it at all. How can I solve this?
Overall, is this problem even solvable, especially given the time constraint? If so, how? Any advice, please?
submitted by /u/Now200