control systems with MUD points

Michael Pilosov a8c0ad14ac update sd assumption		2022-03-20 20:00:00 -06:00
.gitignore	Initial commit	2022-03-21 01:14:27 +00:00
data.pkl	data file	2022-03-21 01:28:46 +00:00
main.py	update sd assumption	2022-03-20 20:00:00 -06:00
README.md	update docs	2022-03-20 19:59:52 -06:00
requirements.txt	requirements	2022-03-20 19:33:36 -06:00
sample.py	generate sample data	2022-03-21 01:28:21 +00:00

README.md

mud-games

control systems with MUD points

installation

pip install -r requirements.txt

usage

A data.pkl file is provided for your convenience with input / output samples.

The inputs are the entries of a 4x1 weight vector. Its inner product with the four-dimensional observation of the state decides the next action (push left or right): the result is binarized by thresholding at zero.
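As a minimal sketch of this decision rule (not necessarily how main.py implements it), the function below thresholds the inner product at zero. The name `decide`, the example values, and the choice that a positive score maps to action 1 are assumptions for illustration; in gym's CartPole, action 0 pushes the cart left and action 1 pushes it right.

```python
import numpy as np


def decide(weights, observation):
    """Binarize the inner product of weights and observation at zero.

    Assumption: a strictly positive score maps to action 1 (push right in
    gym's CartPole), anything else to action 0 (push left).
    """
    score = float(np.dot(weights, observation))
    return int(score > 0.0)


# Hypothetical weight vector and observation, for illustration only.
weights = np.array([0.1, 0.5, 1.0, 2.0])
observation = np.array([0.0, 0.1, -0.02, 0.3])
action = decide(weights, observation)
```

Flipping the sign of every weight swaps the two actions, so the rule depends on the direction of the weight vector, not on its length.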

The parameters are drawn from a standard normal distribution. No observation error is assumed, so the "data variance" is instead designed to reflect the acceptable ranges described below:

From gym:

The cart x-position (index 0) can take values between (-4.8, 4.8), but the episode terminates if the cart leaves the (-2.4, 2.4) range.
The pole angle can be observed between (-.418, .418) radians (or ±24°), but the episode terminates if the pole angle is not in the range (-.2095, .2095) radians (or ±12°).
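The two termination conditions quoted above can be written as a small check. This is an illustrative sketch, not code from this repository: the names `X_LIMIT`, `THETA_LIMIT`, and `terminated` are hypothetical, and it assumes gym's CartPole observation order (cart position, cart velocity, pole angle, pole angular velocity), so the pole angle is read from index 2.

```python
# Termination thresholds quoted from the gym documentation above.
X_LIMIT = 2.4          # cart position bound
THETA_LIMIT = 0.2095   # pole angle bound in radians (about 12 degrees)


def terminated(observation):
    """Return True if the cart or the pole has left its acceptable band.

    Assumes the order: position, velocity, angle, angular velocity.
    """
    x, _, theta, _ = observation
    return abs(x) > X_LIMIT or abs(theta) > THETA_LIMIT
```

The two velocity components have no termination band of their own, which is why they are ignored here.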

The target "signal" is zero for all four dimensions of the observation space. The presumed "data variance" should actually correspond to the acceptable bands of signal (WIP).
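One possible way to tie the "data variance" to the acceptable bands is sketched below. Everything here is an assumption rather than what main.py does: the termination threshold is treated as `K` standard deviations, with `K = 2` chosen arbitrarily, and the two velocity components, which have no termination band, get a placeholder standard deviation of 1.

```python
import numpy as np

# Target signal: zero in all four observation dimensions.
TARGET = np.zeros(4)

# Hypothetical choice: the termination threshold equals K standard deviations.
K = 2.0

# Position and angle use the termination bands; the velocity entries
# (indices 1 and 3) are placeholders, since no band is given for them.
SD = np.array([2.4 / K, 1.0, 0.2095 / K, 1.0])


def normalized_residuals(observations):
    """Scale the deviation from the target by the presumed standard deviation."""
    observations = np.atleast_2d(np.asarray(observations, dtype=float))
    return (observations - TARGET) / SD


# An observation sitting exactly on both termination thresholds.
residuals = normalized_residuals([2.4, 0.0, 0.2095, 0.0])
```

With this scaling, an observation on a termination threshold has a residual of magnitude `K` in that component, so the four dimensions become comparable.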

python main.py

generate data

You can generate your own data with:

python sample.py

Note: if you change the presumed sample space in sample.py, you should make the corresponding changes to the initial distribution in main.py.
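One way to keep the two in sync is to define the distribution once and use the same constants for both sampling and density evaluation. The sketch below is hypothetical and does not reflect how the scripts in this repository are organized; `INIT_MEAN`, `INIT_SD`, `sample_parameters`, and `initial_log_density` are names invented for illustration.

```python
import numpy as np

# Single source of truth for the presumed sample space (standard normal).
INIT_MEAN = 0.0
INIT_SD = 1.0
N_PARAMS = 4


def sample_parameters(n, seed=0):
    """Draw n parameter vectors from the presumed sample space."""
    rng = np.random.default_rng(seed)
    return rng.normal(INIT_MEAN, INIT_SD, size=(n, N_PARAMS))


def initial_log_density(params):
    """Log density of independent normals with the same mean and sd."""
    z = (np.asarray(params, dtype=float) - INIT_MEAN) / INIT_SD
    log_norm = -0.5 * np.log(2.0 * np.pi) - np.log(INIT_SD)
    return np.sum(log_norm - 0.5 * z**2, axis=-1)


samples = sample_parameters(5)
log_density_at_zero = initial_log_density(np.zeros(N_PARAMS))
```

Changing `INIT_MEAN` or `INIT_SD` then updates the samples and the initial density together.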

improvements

Using the following presumptions, we can establish better values for the "data variance": The angular momentum of the pole is the most important thing to stabilize.