ryanjulian left a comment
so how does my env get into the snapshot .pkl file?
Codecov Report
@@            Coverage Diff             @@
##           master     #575      +/-   ##
==========================================
+ Coverage   58.55%   58.66%   +0.11%
==========================================
  Files         139      139
  Lines        9143     9124      -19
  Branches     1338     1331       -7
==========================================
- Hits         5354     5353       -1
+ Misses       3404     3385      -19
- Partials      385      386       +1
Continue to review full report at Codecov.
@ryanjulian The snapshot is taken by the algorithm, and since I removed env from the pickled file, there is currently no way to save env. I assumed that saving env was not very important, so I just removed it. Also, I doubt that experiment resuming really worked previously, because of garage/scripts/run_experiment.py Lines 174 to 178 in 1c299fd. I plan to resolve this issue together with #516.
@naeioi it's quite important to serialize the environment, and we can't really merge this without saving the environment. Look at https://github.com/rlworkgroup/garage/blob/master/examples/sim_policy.py -- this PR breaks it. TODO: We should add a test to make sure the snapshotter produces complete .pkl files, so that this PR would fail in the CI immediately. Reasons:
154a86c to 4b4595b
@ryanjulian That makes sense. I added a test to verify the presence and integrity of both the environment and the policy in the snapshot. As a workaround for algo not having access to the actual environment, the environment is now saved in
awesome!
ryanjulian left a comment
awesome work. these PRs are making a huge positive impact on the codebase.
86735a5 to 996a176
8b18cd2 to a8821f0
    plot=False,
    target_update_tau=0.05,
    n_epoch_cycles=20,
    max_path_length=100,
I thought this line was deleted in the off-policy algorithms?
Actually not. I keep n_epoch_cycles so that an algorithm can compute epoch by iteration/n_epoch_cycles. This is a little bit awkward but required in DDPG.
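For reference, the epoch computation described in this comment can be sketched as follows. This is only an illustration and not the code in this PR: the helper name `epoch_of` is hypothetical, and it assumes that "iteration/n_epoch_cycles" means integer division.

```python
def epoch_of(itr, n_epoch_cycles):
    """Return the epoch index that iteration `itr` falls into.

    Hypothetical helper: assumes iterations are counted from 0 and that
    each epoch consists of exactly `n_epoch_cycles` iterations.
    """
    return itr // n_epoch_cycles


n_epoch_cycles = 20  # same value as in the snippet under review
epochs = [epoch_of(itr, n_epoch_cycles) for itr in (0, 19, 20, 45)]
print(epochs)  # [0, 0, 1, 2]
```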
I don't quite understand. I meant: isn't max_path_length=100 unnecessary here?
bump. are there any hard blockers on this PR?
I will have this merged after rebase.
This commit saves env in runner because algo no longer has access to the actual env. A test is also added to check the presence and integrity of both env and policy in the snapshot. Signed-off-by: Keren Zhu <naeioi@hotmail.com>
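A rough sketch of the idea in this commit message, under stated assumptions: the function names `save_snapshot` and `load_snapshot`, the file name `params.pkl`, and the dict-based stand-ins for env and policy are all hypothetical and are not the garage API. It shows a runner-side save that adds env to the state the algorithm provides, plus the kind of check a snapshot test might make.

```python
import os
import pickle
import tempfile


def save_snapshot(path, env, algo_state):
    """Hypothetical runner-side save.

    The algorithm only supplies `algo_state` (e.g. the policy); the runner,
    which still holds the real env, adds it before pickling.
    """
    snapshot = dict(algo_state)
    snapshot['env'] = env
    with open(path, 'wb') as f:
        pickle.dump(snapshot, f)


def load_snapshot(path):
    """Load a snapshot written by save_snapshot."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Plain dicts stand in for a real environment and policy (assumption).
env = {'name': 'toy-env', 'max_steps': 100}
policy = {'weights': [0.1, 0.2, 0.3]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'params.pkl')
    save_snapshot(path, env, {'policy': policy})
    restored = load_snapshot(path)

# Presence and integrity checks, in the spirit of the test described above.
assert 'env' in restored and 'policy' in restored
assert restored['env'] == env
assert restored['policy'] == policy
```

A test of this shape would fail as soon as env is missing from the .pkl file, which is the regression discussed earlier in this thread.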
78254b4 to 7a8caa8
Disable not-context-manager check. Pylint has had this bug for a long time. See pylint-dev/pylint#782. Disable c-extension-no-member check. Some C modules don't report their exported functions and variables in Python. In our case, baselines uses mpi4py, which makes pylint complain. See https://travis-ci.com/rlworkgroup/garage/builds/105359651#L690.
I disabled two pylint checks.
This PR replaces env with env_spec in the algorithm constructor.
This closes #513.
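As a minimal illustration of the change described above (not the actual garage classes): `EnvSpec`, `ToyEnv`, and `ToyAlgo` below are hypothetical stand-ins. The algorithm receives only a specification of the environment, while the caller keeps the environment itself.

```python
from collections import namedtuple

# Hypothetical stand-in for an environment specification.
EnvSpec = namedtuple('EnvSpec', ['observation_dim', 'action_dim'])


class ToyEnv:
    """Hypothetical environment exposing its spec."""

    def __init__(self):
        self.spec = EnvSpec(observation_dim=4, action_dim=2)


class ToyAlgo:
    """Hypothetical algorithm constructed from env_spec instead of env."""

    def __init__(self, env_spec, max_path_length=100):
        # Only the shapes are known here; the env object is never stored.
        self.env_spec = env_spec
        self.max_path_length = max_path_length


env = ToyEnv()
algo = ToyAlgo(env_spec=env.spec)
print(algo.env_spec.observation_dim, algo.env_spec.action_dim)  # 4 2
```

Because the algorithm no longer holds the environment, anything that needs the real env (such as snapshotting) has to happen in whatever component still owns it.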