Decision Transformer for an infinite-horizon environment

I’m trying to apply the Decision Transformer at work, to optimize a network traffic control algorithm.

-State: network traffic statistics

-Reward function/Goal: minimize the network traffic delay and maximize the throughput

-Action: select the optimal transmission rate

There is no terminal state unless the user stops the algorithm.

I think that, without a discount factor, the accumulated reward from every state is infinite.
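To make that concern concrete, here is a minimal sketch. The helper name `discounted_return` and the constant reward of 1 per step are assumptions for illustration only, not anything taken from the actual traffic controller. With gamma = 1 the return grows with the horizon, while with gamma < 1 it stays below r_max / (1 - gamma).

```python
def discounted_return(rewards, gamma):
    """Return sum_k gamma**k * rewards[k], accumulated from the last step backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical reward stream: a constant reward of 1.0 at every step.
for horizon in (10, 1_000, 100_000):
    rewards = [1.0] * horizon
    undiscounted = discounted_return(rewards, gamma=1.0)   # equals the horizon
    discounted = discounted_return(rewards, gamma=0.99)    # approaches 1 / (1 - 0.99)
    print(horizon, undiscounted, discounted)
```

The undiscounted column keeps growing as the horizon grows, whereas the discounted column levels off near 100, which is the bound 1 / (1 - 0.99) for this reward stream.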

In this case, can I still use the Decision Transformer?

If so, how should I choose the RTG (returns-to-go) values?
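For reference, one possible way to get finite RTG values from a continuing log is to cut it into fixed-length windows and compute the returns-to-go inside each window, optionally with a discount. This is only a sketch under stated assumptions: the function names `returns_to_go` and `segment_rtg`, the window length, and the use of a discount are hypothetical choices for illustration, not something prescribed by the Decision Transformer itself.

```python
def returns_to_go(rewards, gamma=1.0):
    """RTG at step t = sum over k >= t of gamma**(k - t) * rewards[k], within one segment."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def segment_rtg(rewards, window, gamma=1.0):
    """Split a continuing reward log into non-overlapping windows and compute RTG per window.

    Assumption: every window is treated as if it were a complete episode, so
    rewards after the window boundary are ignored. A trailing partial window
    is dropped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    segments = []
    for start in range(0, len(rewards) - window + 1, window):
        segments.append(returns_to_go(rewards[start:start + window], gamma))
    return segments


# Hypothetical per-step rewards, e.g. throughput minus a delay penalty.
log = [1.0, 2.0, 3.0, 4.0, 5.0]
print(segment_rtg(log, window=2))
```

With this kind of windowing, the target RTG given at inference time would be on the same scale as the per-window returns seen in the training data, for example a high value among the observed window returns. Whether that conditioning works well for a continuing task is exactly the open question here.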

submitted by /u/Final-Confusion4484
