Conversation
Codecov Report
```
@@            Coverage Diff             @@
##           master     #618      +/-   ##
==========================================
+ Coverage   60.63%   61.18%   +0.55%
==========================================
  Files         156      159       +3
  Lines        9069     9159      +90
  Branches     1241     1242       +1
==========================================
+ Hits         5499     5604     +105
+ Misses       3260     3238      -22
- Partials      310      317       +7
```

Continue to review the full report at Codecov.
|
| out = input_var | ||
| for model in self._models[:-1]: | ||
| out = model.build(out, name=name) | ||
| self.model = self._models[-1] |
There was a problem hiding this comment.
could you remind me why the last model is self.model?
There was a problem hiding this comment.
This is actually a temp-fix from my previous implementation. We don't need self.model anymore, as they now become models. Nice catch.
| """ | ||
| strides = [1, stride, stride, 1] | ||
|
|
||
| if padding not in ['SAME', 'VALID']: |
There was a problem hiding this comment.
i think TensorFlow would also throw a ValueError. Any reason you want to raise the error here?
There was a problem hiding this comment.
I was not aware of that. Then I think it's fine to let TensorFlow handle it.
```python
CNN Model.

Args:
    filter_dims: Dimension of the filters.
```

Please add types to these parameters, e.g. filter_dims (tuple[int]): ...
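For reference, a hedged sketch of what the requested Google-style typed docstring might look like; the parameter list and descriptions are illustrative, not the actual signature from the PR.

```python
def cnn(input_var, filter_dims, num_filters, strides, padding, name=None):
    """CNN model (illustrative signature only).

    Args:
        filter_dims (tuple[int]): Dimension of the filters, one entry
            per convolutional layer.
        num_filters (tuple[int]): Number of filters per layer.
        strides (tuple[int]): Stride of the sliding window per layer.
        padding (str): The type of padding algorithm to use,
            either 'SAME' or 'VALID'.
        name (str): Variable scope of the CNN.
    """
```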
ryanjulian left a comment:

This LGTM mostly. Other than basic comments, please resolve:
- Whether the inputs/outputs API should be on Policy or Model
- The question of the add_model API for policy -- can't this just be a simple derived model class instead?
```python
def build_models(self, input_var, name=None):
    out = input_var
    for model in self._models:
```

What if my models are not sequential? Perhaps instead you can define a Sequential model class:

```python
class Sequential(Model):
    def __init__(self, *models):
        self._models = models

    def _build(self, input_var, name=None):
        out = input_var
        for model in self._models:
            out = model.build(out, name=name)
        return out
```

```python
    return self._models[0].networks['default'].input

@property
def outputs(self):
```
Maybe this should just be an API on Model instead?

Yes. We should add this into Model. Something like:

```python
class Model(...):
    ...

    @property
    def input(self):
        return self.networks['default'].input

    @property
    def output(self):
        return self.networks['default'].output
```

and for the Sequential model, we can override as:

```python
class Sequential(Model):
    ...

    @property
    def input(self):
        return self._models[0].networks['default'].input

    @property
    def output(self):
        return self._models[-1].networks['default'].output
```

```python
It only works with akro.tf.Discrete action space.

Args:
    env_spec: Environment specification.
```
Please include types in docstrings.
```python
@overrides
def get_action(self, observation):
    """Return a single action."""
    flat_obs = self.observation_space.flatten(observation)
```

Does this flatten the 2D image, or only the batch?

Currently it flattens the 2D image, since we are doing self.obs_dim = env_spec.observation_space.flat_dim in the policy. I actually think this is a mistake: it should not flatten the observation here. For example, in a pixel environment we want to pass the original image input with shape (w, h, c) to the policy. Therefore, we should do self.obs_dim = env_spec.observation_space.shape instead. This was missed because the CNNModel was mocked out.

Can you fix it? I don't think the image should be flattened. That seems wrong.
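The problem discussed above can be illustrated with NumPy (the shapes are chosen arbitrarily for the example): flattening collapses the spatial dimensions a CNN needs, while keeping the observation's original shape preserves them.

```python
import numpy as np

# A single (w, h, c) image observation; shape is illustrative.
obs = np.zeros((32, 32, 3))

# What observation_space.flatten() effectively does today:
flat = obs.flatten()
assert flat.shape == (3072,)  # 32 * 32 * 3 -- spatial structure lost

# What a CNN policy should receive instead (observation_space.shape):
assert obs.shape == (32, 32, 3)
```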
```python
    of intermediate dense layer(s).
hidden_b_init: Initializer function for the bias
    of intermediate dense layer(s).
output_nonlinearity: Activation function for
```

Please take a moment to add types to this docstring.
```python
pool_stride: The stride of the pooling layer(s).
pool_shapes: Dimension of the pooling layer(s).
pool_strides: The strides of the pooling layer(s).
padding: The type of padding algorithm to use, from "SAME", "VALID".
```
```python
num_filters=self._num_filters,
strides=self._strides,
padding=self._padding,
name="cnn")
```
```python
__all__ = [
    "Policy",
    "StochasticPolicy",
    "CategoricalConvPolicy",
```

Please take a moment to replace all these with single quotes (when you visit a file).
```python
hidden_w_init=tf.glorot_uniform_initializer(),
hidden_b_init=tf.zeros_initializer()):
"""
CNN model. Based on 'NHWC' data format: [batch, height, width, channel].
```
ryanjulian left a comment:

LGTM. Please submit once docstrings are updated. See my suggestion about how to name inner models in a Sequential.
```python
Sequential Model.

Args:
    name: Variable scope of the Sequential model.
```
```python
CNN. Based on 'NHWC' data format: [batch, height, width, channel].

Args:
    input_var: Input tf.Tensor to the CNN.
```
```python
CNN model. Based on 'NHWC' data format: [batch, height, width, channel].

Args:
    input_var: Input tf.Tensor to the CNN.
```
```python
pool_strides(tuple[int]): The strides of the pooling layer(s). For
    example, (2, 2) means that all the pooling layers have
    strides (2, 2).
padding: The type of padding algorithm to use,
```
```python
inputs: Tensor input(s), recommended to be positional arguments, e.g.
    def build(self, state_input=None, action_input=None, name=None).
    It would usually be the same as the inputs in build().
name: Variable scope of the inner model, if it exists.
```

Please update the docstrings with types.
```python
def inputs(self):
    return self.networks['default'].inputs

@property
```

You should document these new properties with docstrings.
```python
strides(tuple[int]): The stride of the sliding window. For example,
    (1, 2) means there are two convolutional layers. The stride of the
    filter for the first layer is 1 and that of the second layer is 2.
name: Variable scope of the cnn model.
```

Please provide a type for every parameter.
```diff
     return ['sample', 'mean', 'log_std', 'std_param', 'dist']

-def _build(self, state_input):
+def _build(self, state_input, name=None):
```

Please take a moment to update the docstrings here with types.
```diff
 self._layer_normalization = layer_normalization

-def _build(self, state_input):
+def _build(self, state_input, name=None):
```

Please take a moment to update the docstring here with types.
```python
self._name = name
self._env_spec = env_spec
self._variable_scope = tf.VariableScope(reuse=False, name=name)
self._models = []
```

Please take a moment to make these docstrings complete.
```python
@overrides
def dist_info_sym(self, obs_var, state_info_vars=None, name=None):
    """Symbolic graph of the distribution."""
```

Please provide full docstrings for all methods (unless the parent class provides a docstring which is equivalent).
```diff
 hidden_nonlinearity=hidden_nonlinearity,
 output_nonlinearity=output_nonlinearity,
-layer_normalization=layer_normalization)
+layer_normalization=layer_normalization,
```

Please take a moment to add types to these docstrings.
```diff
 self.model = MLPModel(
     output_dim=action_dim,
-    name=name,
+    name='MLPModel',
```

Please take a moment to add types to these docstrings.
```diff
 std_output_nonlinearity=std_output_nonlinearity,
 std_parameterization=std_parameterization,
-layer_normalization=layer_normalization)
+layer_normalization=layer_normalization,
```

Please take a moment to add types to these docstrings.
```diff
 ]

-def _build(self, state_input):
+def _build(self, state_input, name=None):
```

Please take a moment to add types to these docstrings.
```python
    For example, (32, 32) means this MLP consists of two
    hidden layers, each with 32 hidden units.
name (str): Network name, also the variable scope.
hidden_nonlinearity: Activation function for
```

They are functions. Make it hidden_nonlinearity (function)?
```python
name (str): Model name, also the variable scope.
padding (str): The type of padding algorithm to use,
    either 'SAME' or 'VALID'.
hidden_nonlinearity: Activation function for
```
```python
Args:
    name: Variable scope of the Sequential model.
    name (str): Model name, also the variable scope.
    models (list[garage.Model]): The models to be connected
```

Oh yes. It only takes garage.tf.models.Model.
```python
hidden_sizes (list[int]): Output dimension of dense layer(s).
    For example, (32, 32) means the MLP of this policy consists
    of two hidden layers, each with 32 hidden units.
hidden_nonlinearity: Activation function for
```

They are tf.Operation, right?

I don't think they are tf.Operation. In Python, doing type(tf.nn.relu) returns <class 'function'>.

It's a function which returns a tf.Tensor: https://github.com/tensorflow/tensorflow/blob/r1.13/tensorflow/python/ops/nn_ops.py#L2011. See https://mypy.readthedocs.io/en/latest/cheat_sheet_py3.html#functions for a guide on how to write the type hint for a function.
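Following the mypy guide linked above, here is a hedged sketch of how such an activation-function parameter could be annotated. The Tensor alias and the mlp signature are stand-ins for illustration, since tf.Tensor is not imported here.

```python
from typing import Any, Callable, Optional

Tensor = Any  # stand-in for tf.Tensor in this sketch

# An activation such as tf.nn.relu is a plain Python function mapping
# a tensor to a tensor, so it can be annotated as a Callable.
Activation = Callable[[Tensor], Tensor]


def mlp(hidden_nonlinearity: Optional[Activation] = None) -> None:
    """Illustrative signature only; a real MLP would build layers here."""


# Any tensor-to-tensor function satisfies the annotation:
mlp(hidden_nonlinearity=lambda x: max(x, 0))
```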
ryanjulian left a comment:

Looking great. Thanks for updating the docstrings. I think activation functions are just tf.Tensor. Feel free to submit once they are all cleared up.
```python
    For example, (32, 32) means the MLP of this policy consists
    of two hidden layers, each with 32 hidden units.
hidden_nonlinearity: Activation function for
    intermediate dense layer(s).
```
```python
    For example, (32, 32) means the MLP of this policy consists of two
    hidden layers, each with 32 hidden units.
hidden_nonlinearity: Activation function for
    intermediate dense layer(s).
```
Force-pushed from a963ba1 to 4d51f86 (Compare).
Force-pushed from caafe45 to a444a91 (Compare).
Time to work on CNN. This PR does the following:
- Adds self.add_model() and self.build_models() in policy. This is needed for stacking multiple models. In the future, when we eventually also make policy is-a model, we will put this notion into the rest of the models as well.
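A minimal, runnable sketch of the add_model()/build_models() pattern described above. The FakeModel and PolicyWithModels classes are mock-ups for illustration; the real garage Model API builds TensorFlow graphs rather than applying plain functions.

```python
class FakeModel:
    """Stand-in for a garage model: build() just applies a function."""

    def __init__(self, fn):
        self._fn = fn

    def build(self, input_var, name=None):
        return self._fn(input_var)


class PolicyWithModels:
    """Illustrative holder mirroring self.add_model()/self.build_models()."""

    def __init__(self):
        self._models = []

    def add_model(self, model):
        # Register a model to be stacked onto the policy's graph.
        self._models.append(model)
        return model

    def build_models(self, input_var, name=None):
        # Chain the registered models sequentially, as in the PR.
        out = input_var
        for model in self._models:
            out = model.build(out, name=name)
        return out


policy = PolicyWithModels()
policy.add_model(FakeModel(lambda x: x + 1))
policy.add_model(FakeModel(lambda x: x * 2))
assert policy.build_models(3) == 8  # (3 + 1) * 2
```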