Single-agent or multi-agent setting?

Hello community,

I am working on a problem where the agents can have different output structures. I also use a graph neural network to represent the state, which consists of many categories, and the number of categories can differ between agents. Given this, would a single-agent or a multi-agent formulation be more suitable?

If I go with a multi-agent setting, the agents in our case usually do not request actions at the same time. At any given decision point, one or several agents may be requesting an action while the others are busy. In that situation, what input should the critic network receive?
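
To make the last question concrete, here is a minimal sketch of one possible critic input layout. It is an illustration under assumptions, not a validated design: it assumes each agent's graph state has already been pooled into a fixed-size embedding, and the names `build_critic_input`, `obs_by_agent`, `active_ids`, `n_agents` and `obs_dim` are hypothetical. Each agent gets a fixed slot (zeros if no embedding is available for it), and a 0/1 mask marks which agents are requesting an action at this decision point.

```python
from typing import Dict, List, Sequence


def build_critic_input(
    obs_by_agent: Dict[int, Sequence[float]],
    active_ids: Sequence[int],
    n_agents: int,
    obs_dim: int,
) -> List[float]:
    """Flatten per-agent embeddings into fixed slots and append an activity mask.

    Hypothetical helper: assumes every embedding has length obs_dim,
    e.g. the pooled output of a graph neural network.
    """
    flat: List[float] = []
    for agent_id in range(n_agents):
        # Agents without an embedding at this step get a zero-filled slot.
        embedding = obs_by_agent.get(agent_id, [0.0] * obs_dim)
        if len(embedding) != obs_dim:
            raise ValueError(f"agent {agent_id}: expected {obs_dim} features")
        flat.extend(float(x) for x in embedding)

    # 1.0 for agents requesting an action now, 0.0 for busy agents.
    active = set(active_ids)
    mask = [1.0 if agent_id in active else 0.0 for agent_id in range(n_agents)]
    return flat + mask


# Three agents, 4-dimensional embeddings, only agent 2 is requesting an action.
critic_in = build_critic_input(
    obs_by_agent={0: [1.0] * 4, 2: [2.0] * 4},
    active_ids=[2],
    n_agents=3,
    obs_dim=4,
)
print(len(critic_in))  # 3 * 4 features + 3 mask entries = 15
```

With this layout the critic always sees a vector of the same length, and the mask tells it which agents are acting. Whether busy agents should instead be left out entirely is exactly the part I am unsure about.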

Thank you in advance.

