Conversation
ryanjulian left a comment
overall this is very good.
please make the test of lstm() a little more comprehensive and less reliant on comparisons to another library; that way we will also detect breakage if keras breaks. it should also remain a useful test if someone decides to replace the keras layer with something else.
        i.obs_var,
        i.policy_state_info_vars,
-       name='policy_dist_info')
+       name='policy_dist_info_entropy')
i don't think this is 'policy_dist_info_entropy' -- dist_info contains the parameters of the distribution (e.g. mean and std for a Gaussian, probabilities for a categorical, etc.)
I renamed it because we can't call dist_info_sym with the same name twice (the underlying model would build twice under the same name, which is not allowed). I think Chang is going to fix it by changing the name_scope to variable_scope. @CatherineSue
Changing it from name_scope to variable_scope won't fix the error. The cause is in Model, not here.
        state_include_action=True,
        forget_bias=True,
        layer_normalization=False):
    assert isinstance(env_spec.action_space, Discrete), (
this is probably better as a ValueError
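A minimal sketch of the suggested change: an explicit ValueError instead of an assert, so the check survives `python -O` and gives a clearer error message. The class and function names below are illustrative stand-ins, not garage's real API:

```python
# Hypothetical stand-ins for garage's action-space classes (illustration only).
class Discrete:
    """Stand-in for a discrete action space."""

class Box:
    """Stand-in for a continuous action space."""

def validate_action_space(action_space):
    """Raise ValueError (rather than AssertionError) for unsupported spaces."""
    if not isinstance(action_space, Discrete):
        raise ValueError(
            'This policy only works with Discrete action spaces, '
            'got {}'.format(type(action_space).__name__))

validate_action_space(Discrete())  # passes silently
try:
    validate_action_space(Box())
except ValueError as e:
    caught = str(e)
```

Unlike an assert, the ValueError carries a message that names the offending type, which is more useful to a user who passed the wrong env_spec.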
from tests.fixtures import TfGraphTestCase


class TestLSTM(TfGraphTestCase):
can you test at least a couple sets of known inputs/outputs here?
I know that this is implemented using tf.keras.layers.LSTM, but if we are going to rely on that we should at least also have one smoke test to make sure that the values are correct.
most of the work can be done by mocking out the keras layer and ensuring we are calling the correct set of keras APIs.
it is also possible to assert some basic truths which should hold for all LSTMs, e.g. does the output sequence have the correct length for the input? etc.
i think overall this testing strategy is okay, but please add some simple checks too, e.g.
- output length vs input length
- gradient path (or lack thereof) between pairs of inputs/outputs
- non-zero hidden states
- etc.
perhaps you can take a look at the tests from Lasagne for some ideas: https://github.com/Lasagne/Lasagne/blob/master/lasagne/tests/layers/test_recurrent.py
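The simple checks above can be prototyped without any framework at all. A sketch using a hand-rolled NumPy LSTM step; the gate ordering (i, f, c, o) and forget-bias handling follow the common Keras convention, which is an assumption here:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, w_x, w_h, forget_bias=1.0):
    """One LSTM step; gates ordered i, f, c, o (Keras convention)."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))
    gates = x @ w_x + h_prev @ w_h  # shape (batch, 4 * hidden_dim)
    i, f, c_tilde, o = np.split(gates, 4, axis=1)
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(c_tilde)
    h = sigmoid(o) * np.tanh(c)
    return h, c

rng = np.random.RandomState(0)
batch, input_dim, hidden_dim, steps = 3, 5, 4, 7
w_x = rng.randn(input_dim, 4 * hidden_dim)
w_h = rng.randn(hidden_dim, 4 * hidden_dim)
h = np.zeros((batch, hidden_dim))
c = np.zeros((batch, hidden_dim))
outputs = []
for _ in range(steps):
    h, c = lstm_step(rng.randn(batch, input_dim), h, c, w_x, w_h)
    outputs.append(h)

# Basic truths that should hold for any LSTM implementation:
assert len(outputs) == steps              # output length == input length
assert h.shape == (batch, hidden_dim)     # hidden state shape is preserved
assert np.any(h != 0) and np.any(c != 0)  # states become non-zero after input
assert np.all(np.abs(h) < 1.0)            # |h| < 1 since h = sigmoid(o) * tanh(c)
```

Checks like these do not depend on Keras at all, so they keep working even if the underlying layer implementation is swapped out.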
It seems you may have forgotten to cover the lines at https://codecov.io/gh/rlworkgroup/garage/commit/07d0d3155ce9961affb152589e9dc3545f487831#D12-207
    with tf.variable_scope(self._variable_scope):
        outputs, _, _, _, _, _ = self.model.build(
            all_input_var,
            *self.model.networks['default'].inputs[1:4],
I think this line is hard to read.
yeah this is pretty crazy. what even is inputs[1:4]?
It is crazy. I will fix it.
07d0d31 to d7d6d50
Codecov Report
@@            Coverage Diff             @@
##           master     #642      +/-   ##
==========================================
+ Coverage   61.49%   62.32%    +0.83%
==========================================
  Files         160      163        +3
  Lines        9247     9444      +197
  Branches     1249     1262       +13
==========================================
+ Hits         5686     5886      +200
+ Misses       3254     3240       -14
- Partials      307      318       +11
Continue to review full report at Codecov.
f034d66 to 1aa4980
        return loss, pol_mean_kl

-   def _build_entropy_term(self, i):
+   def _build_entropy_term(self, i, policy_dist_info):
please don't make these functions dependent on anything but i
it is very easy to make a loss function which is a tangle of calls to other functions. we instead make sure all components of the loss function depend only on i (the input spec, basically), which makes it easy to follow.
this will be much easier once the loss function is a model.
I see. I modified that since _build_entropy_term actually reuses the same policy_dist_info symbolic tensor as _build_policy_loss. I will come up with another way.
We have to reuse it if we don't call dist_info_sym with a different name, because each network name must be unique. That's why I appended "_entropy".
We can just append a "_2" to the name in the second call of dist_info_sym.
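The suffix idea could look something like this — a hypothetical helper, not garage's actual implementation:

```python
# Hypothetical helper: return a unique network name by appending a
# numeric suffix when the requested name has already been used.
def unique_name(existing, name):
    """Return `name`, or `name_2`, `name_3`, ... if already taken."""
    if name not in existing:
        existing.add(name)
        return name
    i = 2
    while '{}_{}'.format(name, i) in existing:
        i += 1
    new = '{}_{}'.format(name, i)
    existing.add(new)
    return new

names = set()
first = unique_name(names, 'policy_dist_info')   # 'policy_dist_info'
second = unique_name(names, 'policy_dist_info')  # 'policy_dist_info_2'
```

This keeps the distribution-parameter semantics of dist_info intact while still satisfying the unique-name constraint.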
    # We store the variable_scope to reenter later when we reuse it
-   with tf.variable_scope(
-           self._name, reuse=False) as self._variable_scope:
+   with tf.variable_scope(self._name) as self._variable_scope:
please explicitly assign the new scope to a member variable -- do not capture it as a coincidence of a context manager
Forgot to modify it here. Will fix.
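An illustration of the pattern the reviewer asks for, using a plain context-manager class as a stand-in for tf.variable_scope (the real code targets TensorFlow 1.x scopes, which is an assumption here):

```python
# FakeScope stands in for tf.variable_scope purely for illustration.
class FakeScope:
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        return self
    def __exit__(self, *exc_info):
        return False

class Policy:
    def __init__(self, name):
        # Explicit assignment: the member is set on its own line, not as a
        # side effect of the `with ... as self._variable_scope:` form.
        self._variable_scope = FakeScope(name)
        with self._variable_scope:
            pass  # build the network inside the scope

p = Policy('categorical_lstm_policy')
```

The behavior is identical; the point is readability: a reader scanning `__init__` sees the member assignment directly instead of having to notice it inside a `with` header.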
    if spec:
        c = namedtuple(network_name,
                       [*spec, 'input', 'output', 'inputs', 'outputs'])
    in_spec = self.network_input_spec()
this logic about spec is getting kind of complicated and i'm not sure i understand what's going on.
can you explain what the purpose of this code is? perhaps it would be good to break it into a well-documented helper function.
can you also help me understand the purpose of the input and output specs?
Basically the specs tell us what the inputs and outputs are. There are 4 cases:

Case 1: Single-input/single-output model
We don't need any specs, since we don't need extra attributes on the networks. We only need input, inputs, output, outputs, which are the defaults in network.

Case 2: Single-input/multiple-output model
We set network_output_spec() so that we can assign names to the outputs, e.g.

def network_output_spec(self):
    return ['state', 'action']

Then we can do model.networks['default'].state and model.networks['default'].action.
In this case we don't need network_input_spec(), since there is only one input. What we need is only input, which is already included by default.

Case 3: Multiple-input/single-output model
Same as above, but instead of network_output_spec we have network_input_spec this time.

Case 4: Multiple-input/multiple-output model
We have both network_output_spec and network_input_spec. Then we can have something like model.networks['default'].state and model.networks['default'].action as the outputs, and model.networks['default'].state_input and model.networks['default'].action_input as the inputs, to make it more readable.

Without the spec, we can only do model.networks['default'].inputs[1:4].
this seems pretty complex -- is there a way we can always have an input/output spec, even for multi-input and multi-output models?
i think always having a spec would simplify the logic significantly.
Do you mean we should make the specs not optional?
Right now the specs return an empty list by default, which means networks will only have input, inputs, output, outputs.
i mean that you internally maintain the specs data structure regardless of whether the user has chosen to do multi input/output or not. then the single-input and single-output cases don't need to be handled with so much special logic.
the outer interface can stay the same.
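To make the cases in this thread concrete, here is a toy version of the namedtuple mechanism from the diff above; the helper name and its arguments are invented for illustration:

```python
from collections import namedtuple

# Toy sketch: an output spec adds named attributes alongside the
# default input/output fields on the network record.
def make_network(network_name, out_spec, inputs, outputs):
    fields = [*out_spec, 'input', 'output', 'inputs', 'outputs']
    Network = namedtuple(network_name, fields)
    return Network(*outputs, inputs[0], outputs[0], inputs, outputs)

net = make_network('default', ['state', 'action'],
                   inputs=['obs_in'], outputs=['state_out', 'action_out'])
```

With the spec, callers can write `net.state` instead of the opaque `net.outputs[0]`; maintaining this structure even for single-input/single-output models (with a one-element spec) would remove the special-casing.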
    x_ifco = np.matmul(input_val, w_x_ifco)
    h_ifco = np.matmul(step_hidden, w_h_ifco)

    x_i, x_f, x_c, x_o = np.split(x_ifco, 4, axis=1)  # noqa: E501, pylint: disable=unbalanced-tuple-unpacking
what is the justification for disabling E501 and the pylint warning?
I couldn't figure out why pylint is complaining about that error, so I disabled the check; and since the comment made the line too long, I disabled E501 as well. I will dig into it and see what pylint is talking about.
perhaps one of your return values is itself a tuple or list?
no... They are all numpy.ndarray.
It could be because the interpreter cannot ensure the right-hand side can be split into 4 values. Is there any other way to pass pylint here?
generally pylint rules try to encourage being very explicit and writing code so that it can be statically analyzed for errors (e.g. using pylint).
in this case i think pylint has a good point, because the np.split operation is totally static (all inputs are known at compile time), but it has been written as a dynamic operation unnecessarily.
the rationale for this is that code is read much more often than it is written, so it's usually better to make it longer and easier to read rather than shorter and easier to write.
maybe it would be clearer if you just wrote it out; this way a reader doesn't have to look up the documentation for np.split to find out how your array is unpacked.
x_i, x_f, x_c, x_o = x_ifco[:, :4], x_ifco[:, 4:8], x_ifco[:, 8:12], x_ifco[:, 12:16]
# or
x_i = x_ifco[:, :4]
x_f = x_ifco[:, 4:8]
x_c = x_ifco[:, 8:12]
x_o = x_ifco[:, 12:16]

ea988ff to 3d37267
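For reference, a quick check that the explicit slicing suggested in that review comment is equivalent to the original np.split call (assuming, as in the snippet, a hidden dimension of 4, i.e. 16 gate columns):

```python
import numpy as np

rng = np.random.RandomState(0)
x_ifco = rng.randn(2, 16)  # batch of 2; 4 gates * hidden_dim of 4

# np.split: a static split into 4 equal chunks along the column axis
s_i, s_f, s_c, s_o = np.split(x_ifco, 4, axis=1)

# explicit slices, as suggested in the review
x_i = x_ifco[:, :4]
x_f = x_ifco[:, 4:8]
x_c = x_ifco[:, 8:12]
x_o = x_ifco[:, 12:16]

# the two forms agree element-for-element
assert np.array_equal(s_i, x_i)
assert np.array_equal(s_f, x_f)
assert np.array_equal(s_c, x_c)
assert np.array_equal(s_o, x_o)
```

The explicit form also makes the 4-way unpacking statically obvious, which is exactly what silences the unbalanced-tuple-unpacking warning.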
3d37267 to a8beb38