How good are humans at RL tasks?

Consider the classic pole-balancing task. If we removed all prior information a human has about the task, which would do better: the human or the computer?

Suppose we gave the human two buttons and four inputs (as numbers, or maybe colors), and told them nothing about the task except that they should always maximize a fifth value (the reward). How many episodes would the human need to figure out a good strategy?
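The blinded setup described above could be mocked up like this. This is a sketch, not any particular library's environment: the dynamics and constants follow the commonly used cart-pole formulation, and the class and function names (`BlindedCartPole`, `play_episode`) are made up for illustration. The point is that the player only ever sees four unlabeled numbers and a running reward.

```python
import math
import random

class BlindedCartPole:
    """Classic cart-pole dynamics, presented as four unlabeled numbers
    and two anonymous 'buttons'. Constants follow the commonly used
    cart-pole formulation (a sketch, not a specific library's API)."""

    GRAVITY = 9.8
    CART_MASS = 1.0
    POLE_MASS = 0.1
    POLE_HALF_LENGTH = 0.5
    FORCE_MAG = 10.0
    TAU = 0.02                          # seconds per step
    X_LIMIT = 2.4                       # cart position bound
    THETA_LIMIT = 12 * math.pi / 180    # pole angle bound (radians)

    def reset(self):
        # Small random initial state, as in the standard task.
        self.state = [random.uniform(-0.05, 0.05) for _ in range(4)]
        return tuple(self.state)        # four unlabeled numbers

    def step(self, button):
        """button: 0 or 1 (the two buttons). Returns (obs, reward, done)."""
        x, x_dot, theta, theta_dot = self.state
        force = self.FORCE_MAG if button == 1 else -self.FORCE_MAG
        total_mass = self.CART_MASS + self.POLE_MASS
        polemass_length = self.POLE_MASS * self.POLE_HALF_LENGTH

        cos_t, sin_t = math.cos(theta), math.sin(theta)
        temp = (force + polemass_length * theta_dot ** 2 * sin_t) / total_mass
        theta_acc = (self.GRAVITY * sin_t - cos_t * temp) / (
            self.POLE_HALF_LENGTH
            * (4.0 / 3.0 - self.POLE_MASS * cos_t ** 2 / total_mass)
        )
        x_acc = temp - polemass_length * theta_acc * cos_t / total_mass

        # Euler integration of the cart and pole state.
        x += self.TAU * x_dot
        x_dot += self.TAU * x_acc
        theta += self.TAU * theta_dot
        theta_dot += self.TAU * theta_acc
        self.state = [x, x_dot, theta, theta_dot]

        done = abs(x) > self.X_LIMIT or abs(theta) > self.THETA_LIMIT
        reward = 1.0    # the fifth value the player is told to maximize
        return tuple(self.state), reward, done


def play_episode(env, policy, max_steps=500):
    """Run one episode with a policy mapping the four numbers to a button."""
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, done = env.step(policy(obs))
        total += reward
        if done:
            break
    return total
```

For example, `play_episode(BlindedCartPole(), lambda obs: random.randint(0, 1))` plays one episode with random button presses; a human (or an RL algorithm) would have to discover from the numbers alone which button to press when.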

My guess is that if all prior information about the task and goal is removed, humans might be worse than good RL algorithms. Does anyone know of any research related to this?

submitted by /u/Ilmari86
