SAC + HER can’t exceed success rate around 0.8

Dear all, I am working on an autonomous navigation algorithm for small unmanned vessels. I train with the SB3 SAC implementation and a HER replay buffer. The rules of the env are quite simple.

After a reset, the ship is in the middle of the observation space at [0.5, 0.5]. The destination is then chosen at random in the space [0, 1] × [0, 1]. The reward is the negative Euclidean distance between the current position and the target. Success is defined as the ship being close to the target point (within a radius of 0.05 of the target). The episode is done when the ship gets to the target or hits the wall (in which case it gets a reward of -1). The action space is spaces.Box(low=np.array([-1, 0]), high=np.array([1, 1])), and it maps to heading change and speed as action = np.array([action[0]*10, action[1] * 10], dtype=np.float32). The model is defined as:

model = SAC(
    "MultiInputPolicy",
    env,
    buffer_size=buffer_size,
    replay_buffer_class=HerReplayBuffer,
    replay_buffer_kwargs=dict(
        n_sampled_goal=4,
        goal_selection_strategy='future',
        copy_info_dict=True,
    ),
    verbose=1,
    batch_size=batch_size,
    gamma=gamma,
    learning_rate=learning_rate,
    policy_kwargs=dict(net_arch=net_arch),
    tensorboard_log=log_dir,
    learning_starts=8000,
)
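The reward and success rules described above can be sketched as a vectorized compute_reward function, which is the form HER needs when it relabels goals in batches. This is only a sketch under assumptions, not the env from the post: the "hit_wall" key in the info dict is a hypothetical name, and the function signature follows the usual goal-env convention of (achieved_goal, desired_goal, info).

```python
import numpy as np

SUCCESS_RADIUS = 0.05  # from the post: success within 0.05 of the target
WALL_PENALTY = -1.0    # from the post: reward when the ship hits the wall


def compute_reward(achieved_goal, desired_goal, info):
    """Negative Euclidean distance, or -1 where a wall hit is flagged.

    Works on a single goal pair or on a batch of goal pairs.
    """
    achieved = np.atleast_2d(np.asarray(achieved_goal, dtype=np.float32))
    desired = np.atleast_2d(np.asarray(desired_goal, dtype=np.float32))
    reward = -np.linalg.norm(achieved - desired, axis=-1)

    # Assumption: info is one dict, or a sequence with one dict per sample,
    # and each dict may carry a hypothetical "hit_wall" flag.
    if info is None:
        infos = []
    elif isinstance(info, dict):
        infos = [info]
    else:
        infos = list(info)
    hit_wall = np.array([bool(i.get("hit_wall", False)) for i in infos], dtype=bool)
    if hit_wall.shape[0] == reward.shape[0]:
        reward = np.where(hit_wall, WALL_PENALTY, reward)

    # Return a scalar for a single goal pair, an array for a batch.
    return reward[0] if np.ndim(achieved_goal) == 1 else reward


def is_success(achieved_goal, desired_goal):
    """True where the ship is within SUCCESS_RADIUS of the target."""
    diff = np.asarray(achieved_goal) - np.asarray(desired_goal)
    return np.linalg.norm(diff, axis=-1) <= SUCCESS_RADIUS
```

Note that the wall penalty cannot be recovered from the two goals alone, so in this sketch it has to come from the info dict. That is what copy_info_dict=True in the replay buffer settings is for: it passes the stored info dicts to compute_reward during relabeling.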

I tried to optimize the hyperparameters with the Optuna library over these ranges:

buffer_size = trial.suggest_categorical('buffer_size', [100000])
batch_size = trial.suggest_categorical('batch_size', [64, 128])
gamma = trial.suggest_loguniform('gamma', 0.95, 0.99)
learning_rate = trial.suggest_loguniform('learning_rate', 6.7e-4, 8.5e-4)
net_arch = trial.suggest_categorical('net_arch', [[1024, 1024, 1024], [2048, 2048, 2048], [1024, 1024, 1024, 1024]])
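For reference, the same search space can be written with suggest_float(..., log=True), which replaced suggest_loguniform in Optuna 3.0, and with string keys for the network architectures, since Optuna expects categorical choices to be simple values such as str or int rather than lists. This is a minimal sketch, not the original tuning script: NET_ARCHS, sample_sac_params and the MidpointTrial stand-in are hypothetical names, and the stand-in exists only so the snippet runs without Optuna installed.

```python
# Hypothetical string keys, so each categorical choice is a plain str.
NET_ARCHS = {
    "3x1024": [1024, 1024, 1024],
    "3x2048": [2048, 2048, 2048],
    "4x1024": [1024, 1024, 1024, 1024],
}


def sample_sac_params(trial):
    """Sample the search space from the post using an Optuna-style trial."""
    arch_key = trial.suggest_categorical("net_arch", list(NET_ARCHS))
    return {
        "buffer_size": 100_000,  # only one choice in the post, so fixed here
        "batch_size": trial.suggest_categorical("batch_size", [64, 128]),
        "gamma": trial.suggest_float("gamma", 0.95, 0.99, log=True),
        "learning_rate": trial.suggest_float(
            "learning_rate", 6.7e-4, 8.5e-4, log=True
        ),
        "policy_kwargs": {"net_arch": NET_ARCHS[arch_key]},
    }


class MidpointTrial:
    """Stand-in for an Optuna trial: first choice, midpoint of each range."""

    def suggest_categorical(self, name, choices):
        return choices[0]

    def suggest_float(self, name, low, high, log=False):
        # Geometric midpoint for log-scaled ranges, arithmetic otherwise.
        return (low * high) ** 0.5 if log else (low + high) / 2


params = sample_sac_params(MidpointTrial())
```

Inside a real Optuna objective, the returned dict could be unpacked into the SAC constructor with **params, next to the fixed arguments such as the policy name and the replay buffer settings.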

After several trials, I cannot exceed a success rate of approximately 0.85. The env is simple, and I built it following https://highway-env.farama.org/environments/parking/. The motion model in my env is trivial. Please give me some advice, as I have been stuck on this for several weeks. Thanks!!!

https://preview.redd.it/xdznm9q05fsc1.png?width=1913&format=png&auto=webp&s=b9c500b213b32993d5b5bf7dd4fd25772280b581

https://preview.redd.it/smglb5i22fsc1.png?width=962&format=png&auto=webp&s=af42ede221c5e64048f113e4c162618ed4efd02d

https://preview.redd.it/5xosgnl42fsc1.png?width=3002&format=png&auto=webp&s=f2a2c3068451c98224fbd97cd4e6e8d67b84b816

submitted by /u/Sharp-Record1600