@@ -18,29 +18,53 @@ class DiscreteCNNQFunction(QFunction2):
     of Q(s, a).
 
     Args:
-        env_spec: environment specification
-        filter_dims: Dimension of the filters.
-        num_filters: Number of filters.
-        strides: The strides of the sliding window.
-        hidden_sizes: Output dimension of dense layer(s).
-        name: Variable scope of the cnn.
-        padding: The type of padding algorithm to use, from "SAME", "VALID".
-        max_pooling: Boolean for using max pooling layer or not.
-        pool_shape: Dimension of the pooling layer(s).
-        hidden_nonlinearity: Activation function for
-            intermediate dense layer(s).
-        hidden_w_init: Initializer function for the weight
-            of intermediate dense layer(s).
-        hidden_b_init: Initializer function for the bias
-            of intermediate dense layer(s).
-        output_nonlinearity: Activation function for
-            output dense layer.
-        output_w_init: Initializer function for the weight
-            of output dense layer(s).
-        output_b_init: Initializer function for the bias
-            of output dense layer(s).
-        dueling: Bool for using dueling network or not.
-        layer_normalization: Bool for using layer normalization or not.
+        env_spec (garage.envs.env_spec.EnvSpec): Environment specification.
+        filter_dims (tuple[int]): Dimension of the filters. For example,
+            (3, 5) means there are two convolutional layers. The filter for
+            the first layer is of dimension (3 x 3) and the second one is of
+            dimension (5 x 5).
+        num_filters (tuple[int]): Number of filters. For example, (3, 32) means
+            there are two convolutional layers. The filter for the first layer
+            has 3 channels and the second one has 32 channels.
+        strides (tuple[int]): The stride of the sliding window. For example,
+            (1, 2) means there are two convolutional layers. The stride of the
+            filter for the first layer is 1 and that of the second layer is 2.
+        hidden_sizes (list[int]): Output dimension of dense layer(s).
+            For example, (32, 32) means the MLP of this q-function consists of
+            two hidden layers, each with 32 hidden units.
+        name (str): Variable scope of the CNN.
+        padding (str): The type of padding algorithm to use,
+            either 'SAME' or 'VALID'.
+        max_pooling (bool): Whether to use a max pooling layer.
+        pool_shapes (tuple[int]): Dimension of the pooling layer(s). For
+            example, (2, 2) means that all the pooling layers have
+            shape (2, 2).
+        pool_strides (tuple[int]): The strides of the pooling layer(s). For
+            example, (2, 2) means that all the pooling layers have
+            strides (2, 2).
+        cnn_hidden_nonlinearity (callable): Activation function for
+            intermediate dense layer(s) in the CNN. It should return a
+            tf.Tensor. Set it to None to maintain a linear activation.
+        hidden_nonlinearity (callable): Activation function for intermediate
+            dense layer(s) in the MLP. It should return a tf.Tensor. Set it to
+            None to maintain a linear activation.
+        hidden_w_init (callable): Initializer function for the weight
+            of intermediate dense layer(s) in the MLP. The function should
+            return a tf.Tensor.
+        hidden_b_init (callable): Initializer function for the bias
+            of intermediate dense layer(s) in the MLP. The function should
+            return a tf.Tensor.
+        output_nonlinearity (callable): Activation function for output dense
+            layer in the MLP. It should return a tf.Tensor. Set it to None
+            to maintain a linear activation.
+        output_w_init (callable): Initializer function for the weight
+            of output dense layer(s) in the MLP. The function should return
+            a tf.Tensor.
+        output_b_init (callable): Initializer function for the bias
+            of output dense layer(s) in the MLP. The function should return
+            a tf.Tensor.
+        dueling (bool): Whether to use a dueling network.
+        layer_normalization (bool): Whether to use layer normalization.
     """
 
     def __init__(self,
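As a side note on the convention the new docstring describes: `filter_dims`, `num_filters`, and `strides` are parallel tuples, with index `i` of each describing the `i`-th convolutional layer. A minimal sketch of how the tuples line up, using the example values from the docstring (this only illustrates the pairing, it is not a call into the garage API):

```python
# Hypothetical example values, matching the docstring's examples:
# two conv layers with 3x3/5x5 filters, 3/32 output channels, strides 1/2.
filter_dims = (3, 5)
num_filters = (3, 32)
strides = (1, 2)

# zip pairs up index i of each tuple into the spec for conv layer i.
layers = [
    {'filter': f'{k}x{k}', 'channels': n, 'stride': s}
    for k, n, s in zip(filter_dims, num_filters, strides)
]
for i, spec in enumerate(layers):
    print(f'conv{i}: {spec}')
```

All three tuples must therefore have the same length; the number of convolutional layers is implied by it.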