How Scalable Is Multi-Agent RL for Inventory Optimization?

I’m a data scientist / ML engineer and a novice in RL. I’ve only recently started reading about it and have no hands-on experience yet.

Anyway, my team has a use case coming up centered on warehouse inventory management for thousands of unique products. I’m thinking we might be able to apply multi-agent RL (MARL) to it (e.g. https://github.com/microsoft/maro) to minimize overage/underage. But first I’d like to know: how scalable is MARL?
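
To make the objective concrete, here is a minimal sketch of the overage/underage cost I have in mind. The function names, the unit costs (1.0 per leftover unit, 4.0 per unit of unmet demand), and the sample numbers are all made up for illustration; none of this comes from MARO.

```python
def inventory_cost(stock, demand, overage_cost, underage_cost):
    """Cost for one product in one period (hypothetical cost model)."""
    leftover = max(stock - demand, 0)   # overage: units stocked but not sold
    shortfall = max(demand - stock, 0)  # underage: demand that could not be served
    return overage_cost * leftover + underage_cost * shortfall


def total_cost(stocks, demands, overage_cost=1.0, underage_cost=4.0):
    """Sum the per-product cost over all products (one agent per product)."""
    return sum(
        inventory_cost(s, d, overage_cost, underage_cost)
        for s, d in zip(stocks, demands)
    )


# Three illustrative products: one overstocked, one understocked, one exact.
stocks = [10, 5, 8]
demands = [7, 9, 8]
print(total_cost(stocks, demands))  # 3 leftover * 1.0 + 4 short * 4.0 = 19.0
```

The cost of each product depends only on that product's stock and demand, which is why one agent per product seems like a natural fit.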

If I understand it correctly, the agents act in a decentralized way (so acting is parallelizable), but the policy update is centralized and therefore not parallelizable(?)
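
Here is a toy sketch of the structure I am picturing, so it is clear what I am asking about. It assumes every agent shares a single policy parameter `theta` (parameter sharing), which is my own simplifying assumption and not something I have confirmed about MARO or any particular MARL algorithm. The demand model, cost values, and function names are hypothetical.

```python
import random


def act(theta, forecast):
    """Decentralized step: an agent orders using only its own forecast."""
    return theta * forecast


def cost_gradient(theta, forecast, demand, overage_cost=1.0, underage_cost=4.0):
    """Subgradient of one agent's overage/underage cost with respect to theta."""
    order = act(theta, forecast)
    if order > demand:
        return overage_cost * forecast    # ordered too much: push theta down
    if order < demand:
        return -underage_cost * forecast  # ordered too little: push theta up
    return 0.0


def train(num_agents=1000, steps=200, lr=0.01, seed=0):
    rng = random.Random(seed)
    theta = 1.0  # shared policy parameter (assumption: all agents share it)
    for _ in range(steps):
        # Synthetic per-product forecasts and demands (made-up distribution).
        forecasts = [rng.uniform(0.5, 1.5) for _ in range(num_agents)]
        demands = [f * rng.uniform(0.5, 1.5) for f in forecasts]
        # Per-agent work: each gradient depends on one agent's data only,
        # so this list could in principle be computed in parallel.
        grads = [cost_gradient(theta, f, d) for f, d in zip(forecasts, demands)]
        # Centralized work: one update of the shared parameter per step.
        theta -= lr * sum(grads) / num_agents
    return theta


print(train())
```

In this toy version the per-agent part grows linearly with the number of agents, while the centralized part is a single averaged update per step. Whether real MARL training behaves anything like this at 10k agents is exactly what I am unsure about.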

Is it possible to train, say, 10k agents (one per unique product) in a reasonable amount of time, or would that take too long or cost too much?

submitted by /u/WhyDoTheyAlwaysWin