[D] Current SOTA for varying action spaces or state-dependent action spaces

Hello. I generally don't deal with RL in my work; I mostly work with ML/DL algorithms. But a recent problem statement made me wonder whether RL can be used. The setup is this: you are at a state, you have information about all past states, and you know which actions can currently be taken from this state (the available actions keep changing over time per state, but there is always a countable number of them). What RL algorithm would you use for this task? Again, I am a noob at this.
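
To make the setup concrete, here is a toy sketch of what I mean. Everything in it (`valid_actions`, `step`, `rollout`, the arithmetic for the dynamics) is made up purely for illustration and is not my actual problem; the random choice is just a placeholder for whatever policy an RL algorithm would learn.

```python
import random

# Hypothetical toy setup: all names and dynamics are invented for illustration.

def valid_actions(state, t):
    """Return the (finite) list of actions available in `state` at time step `t`."""
    # The available set depends on both the state and the time step,
    # so the same state can expose different actions at different times.
    n = 1 + (state + t) % 4  # between 1 and 4 actions
    return list(range(n))

def step(state, action):
    """Toy transition: the next state depends on the current state and action."""
    return (state * 3 + action + 1) % 10

def rollout(num_steps, seed=0):
    """Run one episode, recording (state, available actions, chosen action)."""
    rng = random.Random(seed)
    state, history = 0, []
    for t in range(num_steps):
        actions = valid_actions(state, t)
        action = rng.choice(actions)  # placeholder for the policy to be learned
        history.append((state, actions, action))
        state = step(state, action)
    return history

if __name__ == "__main__":
    for state, actions, action in rollout(5):
        print(f"state={state} available={actions} chosen={action}")
```

So at every step the agent sees the history so far plus the current list of available actions, and has to pick one from that list.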

