How to implement RL on a trash-recognizing robot
Hi!
I’m currently working on a robot that recognizes trash and reports it to a server.
It’s a basic robot with four wheels, motors, and several sensors (ultrasonic sensors in four directions, a gyroscope, accelerometers, etc.). It also has a camera and a Raspberry Pi on top.
To recognize trash, I use YOLO, and when it detects trash, it sends a picture to the server.
Right now, I’m using a simple algorithm to explore the area with the robot, but I would like to replace it with a PPO-based approach.
I already tried using the following inputs:
(front_dist, left_dist, right_dist, x_pos, y_pos, x_cell, y_cell, angle_to_the_nearest_cell)
(A cell is a 100 cm × 100 cm square.)
For the outputs, I used a softmax over two actions: move (25 cm) and turn (30°).
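For reference, here is a minimal sketch of how I build and normalize that 8-value observation vector before feeding it to the policy (PPO usually trains much more reliably on inputs scaled to roughly [-1, 1] than on raw centimetres and degrees). The constants are placeholders for my setup, not exact values:

```python
import numpy as np

# Hypothetical constants; adjust to the real sensors and room.
MAX_SENSOR_RANGE_CM = 400.0
ROOM_SIZE_CM = 500.0
CELL_SIZE_CM = 100.0

def make_observation(front, left, right, x, y, cell_x, cell_y, angle_deg):
    """Scale each raw input to roughly [-1, 1] for the policy network.

    Order matches (front_dist, left_dist, right_dist, x_pos, y_pos,
    x_cell, y_cell, angle_to_the_nearest_cell).
    """
    n_cells = ROOM_SIZE_CM / CELL_SIZE_CM
    return np.array([
        front / MAX_SENSOR_RANGE_CM,
        left / MAX_SENSOR_RANGE_CM,
        right / MAX_SENSOR_RANGE_CM,
        x / ROOM_SIZE_CM,
        y / ROOM_SIZE_CM,
        cell_x / n_cells,
        cell_y / n_cells,
        angle_deg / 180.0,   # angle in (-180, 180] -> (-1, 1]
    ], dtype=np.float32)
```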
And for the rewards:
- NEW_CELL_REWARD = 3 (when it discovers a new cell)
- MOVE_REWARD = -0.3 (for each movement)
- PENALTY_REWARD = -50 (when it hits a wall or object)
- END_GAME_REWARD = 50 (when all cells are discovered)
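The reward scheme above can be written as a single per-step function; this is a sketch assuming the three boolean flags come from the environment after each action:

```python
NEW_CELL_REWARD = 3.0    # discovered a new cell
MOVE_REWARD = -0.3       # cost of each action
PENALTY_REWARD = -50.0   # hit a wall or object
END_GAME_REWARD = 50.0   # all cells discovered

def step_reward(discovered_new_cell, collided, all_cells_discovered):
    """Combine the per-step reward terms listed above."""
    r = MOVE_REWARD
    if discovered_new_cell:
        r += NEW_CELL_REWARD
    if collided:
        r += PENALTY_REWARD
    if all_cells_discovered:
        r += END_GAME_REWARD
    return r
```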
However, the robot doesn’t explore the room efficiently. Even after around 1000 episodes, its behavior still looks random and unfocused.
I would also like it to output the amount it should turn, but I’m not sure how to implement that.
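One common approach for a continuous turn amount is a Gaussian policy head: the network outputs a mean and log-std, the action is sampled from that Gaussian, and the sample is squashed into a bounded angle range. A minimal numpy sketch of just the sampling step (not a full PPO head; `MAX_TURN_DEG` is a made-up limit, and a real implementation would include the tanh correction in the log-prob and backprop through the network):

```python
import numpy as np

MAX_TURN_DEG = 45.0  # hypothetical turn limit; tune for your robot

def sample_turn(mean, log_std, rng=None):
    """Sample a bounded turn angle from a squashed Gaussian.

    `mean` and `log_std` would come from the policy network; PPO also
    needs the log-probability of the sample for its probability ratio.
    """
    rng = rng or np.random.default_rng()
    std = np.exp(log_std)
    raw = rng.normal(mean, std)            # unbounded Gaussian sample
    angle = np.tanh(raw) * MAX_TURN_DEG    # squash into [-45, 45] degrees
    # Gaussian log-prob of the raw sample (tanh correction omitted here)
    log_prob = -0.5 * (((raw - mean) / std) ** 2
                       + 2.0 * log_std + np.log(2.0 * np.pi))
    return angle, log_prob
```

With this, the action space becomes hybrid: a softmax choice between "move" and "turn", plus a continuous turn angle used when "turn" is selected.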
submitted by /u/Independent-Key-1329