LSTM/GRU vs. past observations

I have a POMDP control problem which I solved using the TD3 algorithm with a GRU or an LSTM (both worked, but the LSTM was slightly better).

Is it possible to avoid recurrent neural networks and instead feed the agent a window of its past observations, so that it can infer the dynamics of the system from them?
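
To make concrete what I mean by giving the agent its past observations, here is a minimal sketch. It assumes a fixed window of the last k observations, concatenated into one flat vector that a feed-forward TD3 actor and critic would receive as their state. The class name `ObservationHistory`, the window length `k`, and the zero padding at the start of an episode are my own illustrative choices, not taken from any paper or library.

```python
from collections import deque


class ObservationHistory:
    """Hypothetical helper: keeps the last k observations as one flat vector."""

    def __init__(self, obs_dim, k):
        self.obs_dim = obs_dim
        self.k = k
        self.buffer = deque(maxlen=k)  # oldest entries drop out automatically
        self.reset()

    def reset(self, first_obs=None):
        # Assumption: pad with zeros at episode start, since no history exists yet.
        self.buffer.clear()
        for _ in range(self.k):
            self.buffer.append([0.0] * self.obs_dim)
        if first_obs is not None:
            self.push(first_obs)
        return self.stacked()

    def push(self, obs):
        if len(obs) != self.obs_dim:
            raise ValueError("observation has the wrong dimension")
        self.buffer.append(list(obs))
        return self.stacked()

    def stacked(self):
        # Concatenate oldest-to-newest; this vector replaces the single
        # observation as the input to the actor and critic networks.
        return [x for obs in self.buffer for x in obs]


history = ObservationHistory(obs_dim=2, k=3)
s0 = history.reset([1.0, 2.0])   # [0.0, 0.0, 0.0, 0.0, 1.0, 2.0]
s1 = history.push([3.0, 4.0])    # [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
```

The window length would be an extra hyperparameter here, and I assume it would have to be long enough to cover whatever part of the state is hidden from a single observation.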

I searched for papers that use past observations instead of an RNN for POMDP control problems, but all the papers I found used RNNs.

I hope my question is understandable.

