Control of an octocopter based on RL

Hello everyone, I’m working on my master’s thesis, which involves controlling a coaxial octocopter to land on a moving target using Simulink and MATLAB. My system has 20 observations and generates 3 actions: thrust, an input for x, and an input for y. The complexity increases because two PD controllers add terms to Ux and Uy to control roll and pitch in a cascade control setup.

I’m confident that my model is correct, but I’ve been struggling for two weeks to get the agent to converge. It’s consistently stuck in a suboptimal policy with a constant positive Q value (I’m using DDPG). Despite trying multiple reward-function modifications to make the UAV track the trajectory, the return of each episode remains very negative and doesn’t improve.

Is it feasible to make this work given the high number of observations and the complexity of the system, or should I consider simplifying the environment? Any suggestions or insights would be greatly appreciated. Thank you
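For what it’s worth, one common cause of a constantly negative return with DDPG is a reward dominated by large unbounded penalties, which can drive the critic toward a flat pessimistic Q estimate. Below is a minimal sketch (in Python rather than MATLAB, purely for illustration) of a shaped landing reward with a bounded dense tracking term plus a sparse terminal bonus. The function name, weights, and thresholds are all hypothetical assumptions, not values from the post.

```python
import numpy as np

def landing_reward(pos, target_pos, vel, action, landed):
    """Hypothetical shaped reward for landing on a moving target.

    pos, target_pos, vel: 3-vectors; action: [thrust, ux, uy];
    landed: True when touchdown is detected. All weights are
    illustrative assumptions, not taken from the original post.
    """
    dist = np.linalg.norm(pos - target_pos)
    # Dense tracking term bounded in (0, 1]: avoids huge negative
    # returns that can flatten the critic's Q estimates.
    tracking = np.exp(-dist)
    # Small penalties discourage aggressive control effort and speed.
    effort = 0.01 * np.sum(np.square(action))
    speed = 0.05 * np.linalg.norm(vel)
    # Sparse terminal bonus for touching down close to the target.
    bonus = 10.0 if landed and dist < 0.2 else 0.0
    return tracking - effort - speed + bonus
```

The same structure translates directly to a MATLAB reward block in Simulink; the point is the bounded dense term plus terminal bonus, not the particular coefficients.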

submitted by /u/OkFig243
