Add POMDP example and change HMM example to work with DiscreteTransition and DirichletCollection #12
base: main
Many thanks @wouterwln This is really great stuff! I've run the example and played around a bit. I'll lay out my thoughts and queries regarding your explanations below. Note: I have only looked at the POMDP example thus far.
init = @initialization begin
    q(A) = DirichletCollection(diageye(25) .+ 0.1)
    q(B) = DirichletCollection(ones(25, 25, 4))
end

instead of

init = @initialization begin
    q(A) = DirichletCollection(diageye(36) .+ 0.1)
    q(B) = DirichletCollection(ones(36, 36, 4))
end

Given that the WindyGridWorld is a
...
    m_A = mean(p_A),
    m_B = mean(p_B)
...

I get that you say: "The real reason we did this is because we do not want messages from the future to influence the model parameters, instead only learning the model parameters from past data." Fair enough, and thank you for that context; otherwise I would have no idea what was going on here. I still don't see why future messages would influence inference over the model parameters if this step were omitted, nor why that would be such a bad thing in principle. Perhaps the experimental setup requires it in this specific case?

There are one or two trivial spelling mistakes, but that's a real nit-pick.

What I think would help me most now is to see the docs on

Many thanks again! I'm keen to help out wherever on this.
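On the 25-versus-36 question raised in this comment, one way to keep the prior shapes tied to the grid is to derive the state count from a single tuple. This is a minimal sketch in plain Julia, assuming a hypothetical `grid_size` variable (not present in the notebook) and using plain arrays in place of the arguments passed to `DirichletCollection`:

```julia
using LinearAlgebra

# Hypothetical single source of truth for the grid dimensions.
# Use (6, 6) instead if the environment is meant to be 6-by-6.
grid_size = (5, 5)
n_states = prod(grid_size)   # 25 for a 5-by-5 grid, 36 for a 6-by-6 grid
n_controls = 4

# Plain arrays with the same shapes as the prior parameters in the snippets above.
A_counts = Matrix{Float64}(I, n_states, n_states) .+ 0.1
B_counts = ones(n_states, n_states, n_controls)
```

With the sizes derived this way, changing the grid means changing one line, and the observation and transition priors cannot end up with mismatched dimensions.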
],
"source": [
"include(\"env.jl\")\n",
"env = RxEnvironment(WindyGridWorld((0, 1, 1, 1, 0), [], (4, 3)))\n",
It looks like this only defines a (5, 5) grid, yet the rest of the text assumes a (6, 6) grid?
"source": [
"## Model Setup\n",
"\n",
"First, we'll define our POMDP model structure. We will use the `DiscreteTransition` node in `RxInfer` to define the state transition model. The `DiscreteTransition` node is a special node that accepts any number of `Categorical` distributions as input, and outputs a `Categorical` distribution. This means that we can use it to define a state transition model that accepts the previous state and the control as `Categorical` random variables, but we can also use it to define our observation model! Furthermore, the `DiscreteTransition` node can be used both for parameter inference and for inference-as-planning, isn't that neat?"
I can't find any such node in `RxInfer`, though there is a `Transition` node defined in `ReactiveMP`.
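Regardless of which package ends up providing the node, the forward computation that the notebook text describes (categorical inputs in, a categorical output) can be sketched in plain Julia. The sizes, the tensor `B`, and the variable names below are illustrative assumptions; this is not the RxInfer or ReactiveMP implementation and it shows only a forward prediction, not message passing:

```julia
# Toy sizes, chosen only for illustration.
n_states, n_controls = 3, 2

# B[:, j, k] is the distribution over the next state,
# given previous state j and control k (each column sums to 1).
B = zeros(n_states, n_states, n_controls)
B[:, :, 1] = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0]   # control 1: stay in place
B[:, :, 2] = [0.0 0.0 1.0; 1.0 0.0 0.0; 0.0 1.0 0.0]   # control 2: move to the next state, cyclically

s_prev = [0.7, 0.2, 0.1]   # categorical belief over the previous state
u = [0.5, 0.5]             # categorical belief over the control

# Predicted belief over the next state: sum out the previous state and the control.
s_next = [
    sum(B[i, j, k] * s_prev[j] * u[k] for j in 1:n_states, k in 1:n_controls)
    for i in 1:n_states
]
```

Because both inputs are probability vectors and every column of `B` is normalised, `s_next` is again a probability vector, which is what allows the same kind of node to serve as either a transition model or an observation model.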
"output_type": "display_data"
}
],
"source": [
25 or 36?
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, in order to use this model, we have to define the priors for the model parameters. The WindyGridworld environment has a 6-by-6 grid, so we need to instantiate a prior 36-by-36 transition matrices for every control! That's quite a lot of parameters, but as we will see, `RxInfer` will handle this just fine. We will give our agent a control space of 4 actions, so we need to instantiate 4 transition matrices. Furthermore, we have to transform the output from the environment to a 1-in-36 index, and the controls from a 1-in-4 index to a direction tuple."
It seems like the text assumes a (6, 6) grid, while the code actually uses a (5, 5).
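The two conversions the notebook text mentions (grid position to a single state index, and control index to a direction tuple) can both be driven by the grid size, which would make a (5, 5) versus (6, 6) mismatch harder to introduce. This is a sketch only: `grid_size`, the helper names, 1-based positions, and the ordering of `directions` are all assumptions, and the notebook's own helpers may differ.

```julia
# Assumed grid size; swap in (6, 6) if the environment is meant to be 6-by-6.
grid_size = (5, 5)

# Map a 1-based (x, y) position to a single state index, and back again.
to_index(pos) = LinearIndices(grid_size)[pos...]
to_position(idx) = Tuple(CartesianIndices(grid_size)[idx])

# Hypothetical control-to-direction table; the actual ordering may differ.
directions = ((0, 1), (1, 0), (0, -1), (-1, 0))
to_direction(u) = directions[u]

to_index((4, 3))    # 14 on a 5-by-5 grid
to_position(14)     # (4, 3)
```

The largest index returned by `to_index` is `prod(grid_size)`, so the same tuple can also size the priors.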
"cell_type": "markdown",
"metadata": {},
"source": [
"`RxEnvironments.jl` is a package that allows us to easily communicate between our agent and our environment. We can senc actions to the environment, and the environment will automatically respond with the corresponding observations. In order to access these in our model, we can subscribe to the observations and then use the `data` function to access the last observation."
"sync" not "senc"?
@FraserP117 will check the POMDP tutorial, but I think it is in a mergeable state already