Need help with the rocket landing environment using RL

Hey guys, I hope y'all are doing well. I'm new to RL and working on a project that uses the PyFlyt rocket landing gym environment. I'm currently training a PPO agent, but I'm not getting good results with it. One good thing is that the rocket is at least falling very close to the landing pad. Can you guys please help me out with ideas I can try and algorithms I can use? Thanks in advance!

My current PPO hyperparameters are:

    policy_kwargs = {
        "net_arch": [256, 256, 128],  # Neural network architecture
    }
    ppo_params = {
        "tensorboard_log": "./",
        "policy_kwargs": policy_kwargs,
        "learning_rate": 0.0003,
        "clip_range": 0.2,
        "batch_size": 4096,
        "n_steps": 4096,
        "gamma": 0.99,
        "gae_lambda": 0.95,
        "n_epochs": 10,
        "ent_coef": 0.01,
        "vf_coef": 0.5,
        "max_grad_norm": 0.5,
    }
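For context, here is a rough sketch of what these numbers work out to per rollout. It assumes Stable-Baselines3-style PPO semantics (each rollout collects `n_steps * n_envs` transitions, which are split into minibatches of `batch_size` and reused for `n_epochs` passes). The helper `ppo_update_stats`, the default `n_envs=1`, and the comparison value `batch_size=256` are illustrative assumptions, not part of the setup above.

```python
# Sketch only: assumes Stable-Baselines3-style PPO update semantics.
# n_envs is not stated in the post; 1 is an assumed default.
def ppo_update_stats(n_steps, batch_size, n_epochs, n_envs=1):
    """Return rollout size and the number of gradient steps per rollout."""
    rollout_size = n_steps * n_envs
    # Ceiling division: a final partial minibatch still counts as one update.
    minibatches_per_epoch = -(-rollout_size // batch_size)
    return {
        "rollout_size": rollout_size,
        "minibatches_per_epoch": minibatches_per_epoch,
        "gradient_steps_per_rollout": minibatches_per_epoch * n_epochs,
    }


# The settings from the post: the whole rollout is a single minibatch.
current = ppo_update_stats(n_steps=4096, batch_size=4096, n_epochs=10)

# Hypothetical comparison with a smaller minibatch (256 is an arbitrary choice).
smaller = ppo_update_stats(n_steps=4096, batch_size=256, n_epochs=10)

print("current:", current)
print("smaller:", smaller)
```

Under these assumptions, `batch_size` equal to `n_steps` means each epoch is one full-batch gradient step, so every 4096 collected transitions produce only 10 updates. Whether that is too few for this environment is something to check experimentally rather than a given.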

