# Inverted pendulum using learning

### To-do list:

• Train a fully connected, one-layer network on (essentially noisy) positional data from CartPole-v0 to learn the first and second derivatives of the cart position $x_t$ and the pole angle $\theta_t$ (this data is provided by OpenAI Gym). The network outputs probability distributions over the derivatives, e.g. $P(\dot{x}_t)$ and $P(\dot{\theta}_t)$, rather than the derivative values themselves.
• Use the network's output on the fly, combined with Q-learning, to keep the pole balanced.
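
The two steps above could fit together roughly as in the sketch below. It is not a settled design. It assumes each derivative is discretized into `N_BINS` bins, that the layer sees a short window of recent positions, and that the most likely bin of each distribution is used as the tabular Q-learning state. `N_BINS`, `derivative_probs`, `state_from_probs`, and `q_update` are hypothetical names introduced here for illustration, and the weights would come from training, which is not shown.

```python
import math

N_BINS = 5      # assumed number of bins per derivative (hypothetical choice)
N_ACTIONS = 2   # CartPole-v0 actions: 0 = push left, 1 = push right


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def derivative_probs(weights, biases, positions):
    """One fully connected layer plus softmax.

    Maps a window of recent positions to a probability for each
    derivative bin. `weights` has one row per bin.
    """
    logits = [
        sum(w * p for w, p in zip(row, positions)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def state_from_probs(p_x_dot, p_theta_dot):
    """Collapse both distributions to one discrete state index.

    Assumption: take the most likely bin of each distribution.
    """
    ix = max(range(N_BINS), key=lambda i: p_x_dot[i])
    it = max(range(N_BINS), key=lambda i: p_theta_dot[i])
    return ix * N_BINS + it


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Standard tabular Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def greedy_action(q, state):
    """Pick the action with the highest Q-value in `state`."""
    return max(range(N_ACTIONS), key=lambda a: q[state][a])


# Q-table with one row per (x_dot bin, theta_dot bin) pair.
q_table = [[0.0] * N_ACTIONS for _ in range(N_BINS * N_BINS)]
```

In a control loop, each environment step would call `derivative_probs` once for $x_t$ and once for $\theta_t$, turn the results into a state with `state_from_probs`, act with `greedy_action` (plus some exploration), and then call `q_update` with the reward returned by the environment. Collapsing each distribution to its most likely bin discards the uncertainty the network provides; feeding the full distributions into a function approximator would be an alternative.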