[R][D] Should the exploration step be considered when updating the value function?

I asked GPT, and it says that some methods update the value function on every step, while others skip the updates tied to exploratory steps. I know the two approaches are quite different, yet both work. Why is that? What is the essential distinction between them? Any opinions or discussion are welcome!
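To make the question concrete, here is a rough sketch of how I understand the two styles. It is only my reading, not a definitive implementation: the function names, the tabular `defaultdict` storage, and the `alpha`/`gamma` defaults are all assumptions picked for illustration. Q-learning updates on every step because its target takes the max over next actions, so the target does not depend on whether the next action turns out to be exploratory. SARSA also updates on every step, but its target uses the action actually taken, so it ends up evaluating the exploring policy itself. The third function is a state-value TD(0) variant that simply skips the backup when the move out of `s` was exploratory.

```python
from collections import defaultdict


def q_learning_update(q, s, a, r, s_next, n_actions, alpha=0.5, gamma=0.9):
    """Off-policy update, applied on every step.

    The target uses the best next action, not the one the agent will
    actually take, so exploratory behaviour does not change the target.
    """
    best_next = max(q[(s_next, b)] for b in range(n_actions))
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    """On-policy update, applied on every step.

    The target uses the action actually chosen in s_next (exploratory or
    not), so the learned values describe the exploring policy.
    """
    q[(s, a)] += alpha * (r + gamma * q[(s_next, a_next)] - q[(s, a)])


def td0_skip_exploratory(v, s, r, s_next, was_exploratory, alpha=0.5, gamma=0.9):
    """State-value TD(0) that learns only from non-exploratory moves.

    If the move out of s was exploratory, s_next says little about what
    the greedy policy would reach from s, so the backup is skipped.
    """
    if was_exploratory:
        return
    v[s] += alpha * (r + gamma * v[s_next] - v[s])


# Tiny usage example with made-up numbers (hypothetical states 0 and 1).
q = defaultdict(float)
q[(1, 0)], q[(1, 1)] = 1.0, 2.0
q_learning_update(q, s=0, a=0, r=1.0, s_next=1, n_actions=2)
```

If this reading is right, the distinction would mostly come down to which policy the values are meant to describe: the greedy one, or the one that actually explores.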

submitted by /u/CrisYou