iwae_estimator

tfsnippet.iwae_estimator(log_values, axis, keepdims=False, name=None)

Derive the gradient estimator for \(\mathbb{E}_{q(\mathbf{z}^{(1:K)}|\mathbf{x})}\Big[\log \frac{1}{K} \sum_{k=1}^K f\big(\mathbf{x},\mathbf{z}^{(k)}\big)\Big]\), via the IWAE algorithm (Burda, Y., Grosse, R. and Salakhutdinov, R., 2015).

\[\begin{aligned} \nabla\,\mathbb{E}_{q(\mathbf{z}^{(1:K)}|\mathbf{x})}\Big[\log \frac{1}{K} \sum_{k=1}^K f\big(\mathbf{x},\mathbf{z}^{(k)}\big)\Big] &= \nabla\,\mathbb{E}_{q(\boldsymbol{\epsilon}^{(1:K)})}\Bigg[\log \frac{1}{K} \sum_{k=1}^K w_k\Bigg] = \mathbb{E}_{q(\boldsymbol{\epsilon}^{(1:K)})}\Bigg[\nabla \log \frac{1}{K} \sum_{k=1}^K w_k\Bigg] \\ &= \mathbb{E}_{q(\boldsymbol{\epsilon}^{(1:K)})}\Bigg[\frac{\nabla\,\frac{1}{K} \sum_{k=1}^K w_k}{\frac{1}{K} \sum_{i=1}^K w_i}\Bigg] = \mathbb{E}_{q(\boldsymbol{\epsilon}^{(1:K)})}\Bigg[\frac{\sum_{k=1}^K w_k\,\nabla \log w_k}{\sum_{i=1}^K w_i}\Bigg] = \mathbb{E}_{q(\boldsymbol{\epsilon}^{(1:K)})}\Bigg[\sum_{k=1}^K \widetilde{w}_k\,\nabla \log w_k\Bigg] \end{aligned}\]

where \(w_k = f\big(\mathbf{x},\mathbf{z}^{(k)}\big)\), with \(\mathbf{z}^{(k)}\) reparameterized in terms of the noise \(\boldsymbol{\epsilon}^{(k)}\), and \(\widetilde{w}_k = w_k \big/ \sum_{i=1}^K w_i\) are the self-normalized importance weights.
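
The final identity can be sanity-checked numerically: the gradient of the surrogate \(\log \frac{1}{K} \sum_{k=1}^K w_k\) with respect to each \(\log w_k\) is exactly the normalized weight \(\widetilde{w}_k\). A minimal NumPy sketch (illustration only, not library code):

    import numpy as np

    # K = 5 importance samples; log_w stands for the log weights log w_k.
    log_w = np.random.randn(5)
    surrogate = np.log(np.mean(np.exp(log_w)))      # log (1/K) sum_k w_k
    w_tilde = np.exp(log_w) / np.exp(log_w).sum()   # normalized weights

    # Finite-difference estimate of d(surrogate) / d(log w_k).
    eps = 1e-6
    grad = np.array([
        (np.log(np.mean(np.exp(log_w + eps * np.eye(5)[k]))) - surrogate) / eps
        for k in range(5)
    ])
    print(np.allclose(grad, w_tilde, atol=1e-4))    # True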
Parameters:
  • log_values – Log values of the target function given \(\mathbf{z}\) and \(\mathbf{x}\), i.e., \(\log f(\mathbf{x},\mathbf{z})\).
  • axis – The sampling axes to be reduced in the outputs.
  • keepdims (bool) – Whether or not to keep the reduced axes when axis is specified. (default False)
  • name (str) – Default name of the name scope. If not specified, a name is generated according to the method name.
Returns:

The surrogate for optimizing the original target.

Maximizing/minimizing this surrogate via gradient descent will effectively maximize/minimize the original target.

Return type:

tf.Tensor
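
A minimal usage sketch in TensorFlow 1.x graph mode. The toy form of \(\log f(\mathbf{x},\mathbf{z}^{(k)})\) below (a quadratic in a single trainable parameter theta) and the choice of axis 0 for the K importance samples are illustrative assumptions, not prescribed by this API:

    import numpy as np
    import tensorflow as tf
    import tfsnippet

    # K importance samples per data point, stacked along axis 0.
    K, batch_size = 8, 16
    theta = tf.get_variable('theta', shape=[], dtype=tf.float32,
                            initializer=tf.zeros_initializer())
    z = tf.constant(np.random.randn(K, batch_size), dtype=tf.float32)

    # Toy log f(x, z^(k)); shape [K, batch_size].  In a real model this would
    # be log p(x, z^(k)) - log q(z^(k) | x), computed from reparameterized samples.
    log_values = -0.5 * tf.square(z - theta)

    # Reduce the sampling axis; the surrogate has shape [batch_size].
    surrogate = tfsnippet.iwae_estimator(log_values, axis=0)

    # Maximizing the surrogate maximizes the original target, so minimize
    # its negative mean with a gradient-based optimizer.
    loss = -tf.reduce_mean(surrogate)
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(100):
            sess.run(train_op)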