tfsnippet.variational

class tfsnippet.variational.VariationalChain(variational, model, log_joint=None, latent_names=None, latent_axis=None)

Bases: object

Chain of the variational and model nets for variational inference.

In the context of variational inference, it is common practice to chain the variational net and the model net, feeding the samples of latent variables from the variational net to the model net as its observations. VariationalChain holds the BayesianNet instances of the variational and the model nets, and the VariationalInference object for this chain.

__init__(variational, model, log_joint=None, latent_names=None, latent_axis=None)

Construct the VariationalChain.

Parameters:
  • variational (BayesianNet) – The variational net.
  • model (BayesianNet) – The model net.
  • log_joint (tf.Tensor) – The log-joint of the model net. If None, the log-densities of all variables within model net will be summed up as the log-joint. (default None)
  • latent_names (Iterable[str]) – Names of the latent variables in variational inference. If None, all of the variables within variational net will be collected. (default None)
  • latent_axis – The axis or axes to be considered as the sampling dimensions of latent variables. The specified axes will be summed up in the variational lower-bounds or training objectives. (default None)
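
Example: a minimal sketch of constructing a chain and deriving training and evaluation outputs from it. Here q_net and p_net are assumed to be pre-built BayesianNet instances, where the samples of the latent variable 'z' drawn from q_net (with axis 0 as the sampling dimension) are observed by p_net:

    import tensorflow as tf
    from tfsnippet.variational import VariationalChain

    # `q_net` and `p_net` are assumed to be BayesianNet instances built
    # elsewhere; `p_net` observes the samples of 'z' drawn from `q_net`.
    chain = VariationalChain(
        variational=q_net,
        model=p_net,
        latent_names=['z'],
        latent_axis=0,  # axis 0 is the sampling dimension of 'z'
    )

    elbo = chain.vi.lower_bound.elbo()               # per-data lower-bound
    loss = tf.reduce_mean(chain.vi.training.sgvb())  # objective to minimize
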
latent_axis

Get the axes of sampling dimensions of latent variables.

latent_names

Get the names of the latent variables for variational inference.

Returns:The names of the latent variables.
Return type:tuple[str]
log_joint

Get the log-joint of the model.

Returns:The log-joint of the model.
Return type:tf.Tensor
model

Get the model net.

Returns:The model net.
Return type:BayesianNet
variational

Get the variational net.

Returns:The variational net.
Return type:BayesianNet
vi

Get the variational inference object.

Returns:The variational inference object.
Return type:VariationalInference
tfsnippet.variational.sgvb_estimator(values, axis=None, keepdims=False, name=None)

Derive the gradient estimator for \(\mathbb{E}_{q(\mathbf{z}|\mathbf{x})}\big[f(\mathbf{x},\mathbf{z})\big]\), by the SGVB (Kingma, D.P. and Welling, M., 2013) algorithm.

\[\nabla \, \mathbb{E}_{q(\mathbf{z}|\mathbf{x})}\big[f(\mathbf{x},\mathbf{z})\big] = \nabla \, \mathbb{E}_{q(\mathbf{\epsilon})}\big[f(\mathbf{x},\mathbf{z}(\mathbf{\epsilon}))\big] = \mathbb{E}_{q(\mathbf{\epsilon})}\big[\nabla f(\mathbf{x},\mathbf{z}(\mathbf{\epsilon}))\big]\]
Parameters:
  • values – Values of the target function given z and x, i.e., \(f(\mathbf{z},\mathbf{x})\).
  • axis – The sampling dimensions to be averaged out. If None, no dimensions will be averaged out.
  • keepdims (bool) – When axis is specified, whether or not to keep the averaged dimensions. (default False)
  • name (str) – Name of this operation in TensorFlow graph. (default “sgvb_estimator”)
Returns: The surrogate for optimizing the target function with the SGVB gradient estimator.
Return type: tf.Tensor
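
Example: a minimal sketch of deriving an SGVB surrogate from pre-computed log-densities. The two placeholder tensors are assumptions, shaped [n_samples, batch_size] with axis 0 as the sampling dimension:

    import tensorflow as tf
    from tfsnippet.variational import sgvb_estimator

    # assumed shape: [n_samples, batch_size]; axis 0 is the sampling dimension
    log_joint = tf.placeholder(tf.float32, [None, None])
    latent_log_prob = tf.placeholder(tf.float32, [None, None])

    # f(x,z) = log p(x,z) - log q(z|x); averaging over axis 0 gives the
    # per-data ELBO surrogate, whose negation can be minimized directly
    elbo = sgvb_estimator(log_joint - latent_log_prob, axis=0)
    loss = tf.reduce_mean(-elbo)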

tfsnippet.variational.iwae_estimator(log_values, axis, keepdims=False, name=None)

Derive the gradient estimator for \(\mathbb{E}_{q(\mathbf{z}^{(1:K)}|\mathbf{x})}\Big[\log \frac{1}{K} \sum_{k=1}^K f\big(\mathbf{x},\mathbf{z}^{(k)}\big)\Big]\), by the IWAE (Burda, Y., Grosse, R. and Salakhutdinov, R., 2015) algorithm.

\[\begin{split}\begin{aligned} &\nabla\,\mathbb{E}_{q(\mathbf{z}^{(1:K)}|\mathbf{x})}\Big[\log \frac{1}{K} \sum_{k=1}^K f\big(\mathbf{x},\mathbf{z}^{(k)}\big)\Big] = \nabla \, \mathbb{E}_{q(\mathbf{\epsilon}^{(1:K)})}\Bigg[\log \frac{1}{K} \sum_{k=1}^K w_k\Bigg] = \mathbb{E}_{q(\mathbf{\epsilon}^{(1:K)})}\Bigg[\nabla \log \frac{1}{K} \sum_{k=1}^K w_k\Bigg] = \\ & \quad \mathbb{E}_{q(\mathbf{\epsilon}^{(1:K)})}\Bigg[\frac{\nabla \frac{1}{K} \sum_{k=1}^K w_k}{\frac{1}{K} \sum_{i=1}^K w_i}\Bigg] = \mathbb{E}_{q(\mathbf{\epsilon}^{(1:K)})}\Bigg[\frac{\sum_{k=1}^K w_k \nabla \log w_k}{\sum_{i=1}^K w_i}\Bigg] = \mathbb{E}_{q(\mathbf{\epsilon}^{(1:K)})}\Bigg[\sum_{k=1}^K \widetilde{w}_k \nabla \log w_k\Bigg] \end{aligned}\end{split}\]
Parameters:
  • log_values – Log values of the target function given z and x, i.e., \(\log f(\mathbf{z},\mathbf{x})\).
  • axis – The sampling dimensions to be averaged out.
  • keepdims (bool) – When axis is specified, whether or not to keep the averaged dimensions. (default False)
  • name (str) – Name of this operation in TensorFlow graph. (default “iwae_estimator”)
Returns: The surrogate for optimizing the target function with the IWAE gradient estimator.
Return type: tf.Tensor
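
Example: the same style of pre-computed log-density tensors (assumed shape [n_samples, batch_size]) can be fed to the IWAE estimator; note that axis is required here, since the objective needs multiple samples per data point:

    import tensorflow as tf
    from tfsnippet.variational import iwae_estimator

    # assumed shape: [n_samples, batch_size]; axis 0 is the sampling dimension
    log_joint = tf.placeholder(tf.float32, [None, None])
    latent_log_prob = tf.placeholder(tf.float32, [None, None])

    # log f(x,z) = log p(x,z) - log q(z|x); axis 0 is averaged out
    lower_bound = iwae_estimator(log_joint - latent_log_prob, axis=0)
    loss = tf.reduce_mean(-lower_bound)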

tfsnippet.variational.importance_sampling_log_likelihood(log_joint, latent_log_prob, axis, keepdims=False, name=None)

Compute \(\log p(\mathbf{x})\) by importance sampling.

\[\log p(\mathbf{x}) = \log \mathbb{E}_{q(\mathbf{z}|\mathbf{x})} \Big[\exp\big(\log p(\mathbf{x},\mathbf{z}) - \log q(\mathbf{z}|\mathbf{x})\big) \Big]\]
Parameters:
  • log_joint – Values of \(\log p(\mathbf{z},\mathbf{x})\), computed with \(\mathbf{z} \sim q(\mathbf{z}|\mathbf{x})\).
  • latent_log_prob – Values of \(\log q(\mathbf{z}|\mathbf{x})\).
  • axis – The sampling dimensions to be averaged out.
  • keepdims (bool) – When axis is specified, whether or not to keep the averaged dimensions. (default False)
  • name (str) – Name of this operation in TensorFlow graph. (default “importance_sampling_log_likelihood”)
Returns: The computed \(\log p(\mathbf{x})\).
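
Example: a minimal sketch with assumed placeholder tensors of shape [n_samples, batch_size]; a large number of samples is usually needed for the estimate to be tight:

    import tensorflow as tf
    from tfsnippet.variational import importance_sampling_log_likelihood

    # assumed shape: [n_samples, batch_size]; axis 0 is the sampling dimension
    log_joint = tf.placeholder(tf.float32, [None, None])
    latent_log_prob = tf.placeholder(tf.float32, [None, None])

    # per-data estimate of log p(x)
    log_likelihood = importance_sampling_log_likelihood(
        log_joint, latent_log_prob, axis=0)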

class tfsnippet.variational.VariationalInference(log_joint, latent_log_probs, axis=None)

Bases: object

Class for variational inference.

__init__(log_joint, latent_log_probs, axis=None)

Construct the VariationalInference.

Parameters:
  • log_joint (tf.Tensor) – The log-joint of model.
  • latent_log_probs (Iterable[tf.Tensor]) – The log-densities of latent variables from the variational net.
  • axis – The axis or axes to be considered as the sampling dimensions of latent variables. The specified axes will be summed up in the variational lower-bounds or training objectives. (default None)
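
Example: a minimal sketch of constructing a VariationalInference from pre-computed log-densities (the placeholder tensors are assumptions, shaped [n_samples, batch_size] with axis 0 as the sampling dimension) and drawing outputs from its factories:

    import tensorflow as tf
    from tfsnippet.variational import VariationalInference

    # assumed shape: [n_samples, batch_size]; axis 0 is the sampling dimension
    log_joint = tf.placeholder(tf.float32, [None, None])
    z_log_prob = tf.placeholder(tf.float32, [None, None])

    vi = VariationalInference(
        log_joint=log_joint,
        latent_log_probs=[z_log_prob],
        axis=0,
    )
    elbo = vi.lower_bound.elbo()               # per-data lower-bound
    loss = tf.reduce_mean(vi.training.sgvb())  # objective to minimize
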
axis

Get the axis or axes to be considered as the sampling dimensions of latent variables.

evaluation

Get the factory for evaluation outputs.

Returns:The factory for evaluation outputs.
Return type:VariationalEvaluation
latent_log_prob

Get the summed log-density of latent variables.

Returns:The summed log-density of latent variables.
Return type:tf.Tensor
latent_log_probs

Get the log-densities of latent variables.

Returns:The log-densities of latent variables.
Return type:tuple[tf.Tensor]
log_joint

Get the log-joint of the model.

Returns:The log-joint of the model.
Return type:tf.Tensor
lower_bound

Get the factory for variational lower-bounds.

Returns:The factory for variational lower-bounds.
Return type:VariationalLowerBounds
training

Get the factory for training objectives.

Returns:The factory for training objectives.
Return type:VariationalTrainingObjectives
zs_elbo()

Create a zhusuan.variational.EvidenceLowerBoundObjective, with pre-computed log-joint.

Returns: The constructed per-data ELBO objective.
Return type:zhusuan.variational.EvidenceLowerBoundObjective
zs_importance_weighted_objective()

Create a zhusuan.variational.ImportanceWeightedObjective, with pre-computed log-joint.

Returns: The constructed per-data importance weighted objective.
Return type:zhusuan.variational.ImportanceWeightedObjective
zs_klpq()

Create a zhusuan.variational.InclusiveKLObjective, with pre-computed log-joint.

Returns: The constructed per-data inclusive KL objective.
Return type:zhusuan.variational.InclusiveKLObjective
zs_objective(func, **kwargs)

Create a zhusuan.variational.VariationalObjective with pre-computed log-joint, by the specified algorithm.

Parameters:
  • func – The variational algorithm from ZhuSuan. Supported functions are: 1. zhusuan.variational.elbo() 2. zhusuan.variational.importance_weighted_objective() 3. zhusuan.variational.klpq()
  • **kwargs – Named arguments passed to func.
Returns: The constructed per-data variational objective.
Return type: zhusuan.variational.VariationalObjective

class tfsnippet.variational.VariationalLowerBounds(vi)

Bases: object

Factory for variational lower-bounds.

__init__(vi)

Construct a new VariationalLowerBounds.

Parameters:vi (VariationalInference) – The variational inference object.
elbo(name=None)

Get the evidence lower-bound.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “elbo”)
Returns:The evidence lower-bound.
Return type:tf.Tensor
importance_weighted_objective(name=None)

Get the importance weighted lower-bound.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “monte_carlo_objective”)
Returns:The per-data importance weighted lower-bound.
Return type:tf.Tensor
monte_carlo_objective(name=None)

Get the importance weighted lower-bound.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “monte_carlo_objective”)
Returns:The per-data importance weighted lower-bound.
Return type:tf.Tensor
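
Example: a sketch of deriving both lower-bounds from the factory, assuming vi is a VariationalInference object whose axis covers multiple samples per data point:

    import tensorflow as tf

    # `vi` is an assumed VariationalInference object (see above)
    elbo = vi.lower_bound.elbo()                       # per-data ELBO
    iw_bound = vi.lower_bound.monte_carlo_objective()  # importance weighted
    mean_elbo = tf.reduce_mean(elbo)                   # scalar for reporting
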
class tfsnippet.variational.VariationalTrainingObjectives(vi)

Bases: object

Factory for variational training objectives.

__init__(vi)

Construct a new VariationalTrainingObjectives.

Parameters:vi (VariationalInference) – The variational inference object.
iwae(name=None)

Get the SGVB training objective for the importance weighted objective.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “iwae”)
Returns: The per-data SGVB training objective for the importance weighted objective.
Return type:tf.Tensor
reinforce(variance_reduction=True, baseline=None, decay=0.8, name=None)

Get the REINFORCE training objective.

Parameters:
  • variance_reduction (bool) – Whether to use variance reduction.
  • baseline (tf.Tensor) – A trainable estimation for the scale of the elbo value.
  • decay (float) – The moving average decay for variance normalization.
  • name (str) – Name of this operation in TensorFlow graph. (default “reinforce”)
Returns: The per-data REINFORCE training objective.
Return type: tf.Tensor

See also

zhusuan.variational.EvidenceLowerBoundObjective.reinforce()

rws_wake(name=None)

Get the wake-phase Reweighted Wake-Sleep (RWS) training objective.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “rws_wake”)
Returns:The per-data wake-phase RWS training objective.
Return type:tf.Tensor

See also

zhusuan.variational.InclusiveKLObjective.rws()

sgvb(name=None)

Get the SGVB training objective.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “sgvb”)
Returns: The per-data SGVB training objective. It is the negative of the ELBO, and should be minimized directly.
Return type: tf.Tensor
vimco(name=None)

Get the VIMCO training objective.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “vimco”)
Returns:The per-data VIMCO training objective.
Return type:tf.Tensor

See also

zhusuan.variational.ImportanceWeightedObjective.vimco()
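
Example: a sketch of turning the per-data training objectives into scalar losses and a training op, assuming vi is a VariationalInference object with a sampling axis:

    import tensorflow as tf

    # `vi` is an assumed VariationalInference object (see above); each
    # factory method returns a per-data objective to be minimized
    sgvb_loss = tf.reduce_mean(vi.training.sgvb())    # reparameterized latents
    iwae_loss = tf.reduce_mean(vi.training.iwae())    # importance weighted
    vimco_loss = tf.reduce_mean(vi.training.vimco())  # non-reparameterized

    train_op = tf.train.AdamOptimizer(1e-3).minimize(sgvb_loss)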

class tfsnippet.variational.VariationalEvaluation(vi)

Bases: object

Factory for variational evaluation outputs.

__init__(vi)

Construct a new VariationalEvaluation.

Parameters:vi (VariationalInference) – The variational inference object.
importance_sampling_log_likelihood(name=None)

Compute \(\log p(\mathbf{x})\) by importance sampling.

Parameters:name (str) – Name of this operation in TensorFlow graph. (default “importance_sampling_log_likelihood”)
Returns:The per-data \(\log p(\mathbf{x})\).
Return type:tf.Tensor

See also

zhusuan.evaluation.is_loglikelihood()

is_loglikelihood(name=None)

Short-cut for importance_sampling_log_likelihood().
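
Example: a sketch of estimating the per-data test log-likelihood, assuming vi is a VariationalInference object whose sampling axis covers a sufficiently large number of samples:

    import tensorflow as tf

    # `vi` is an assumed VariationalInference object (see above)
    per_data_ll = vi.evaluation.is_loglikelihood()
    test_ll = tf.reduce_mean(per_data_ll)  # scalar for reporting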

tfsnippet.variational.elbo_objective(log_joint, latent_log_prob, axis=None, keepdims=False, name=None)

Derive the ELBO objective.

\[\mathbb{E}_{\mathbf{z} \sim q_{\phi}(\mathbf{z}|\mathbf{x})}\big[ \log p_{\theta}(\mathbf{x},\mathbf{z}) - \log q_{\phi}(\mathbf{z}|\mathbf{x}) \big]\]
Parameters:
  • log_joint – Values of \(\log p(\mathbf{z},\mathbf{x})\), computed with \(\mathbf{z} \sim q(\mathbf{z}|\mathbf{x})\).
  • latent_log_prob – Values of \(\log q(\mathbf{z}|\mathbf{x})\).
  • axis – The sampling dimensions to be averaged out. If None, no dimensions will be averaged out.
  • keepdims (bool) – When axis is specified, whether or not to keep the averaged dimensions. (default False)
  • name (str) – Name of this operation in TensorFlow graph. (default “elbo_objective”)
Returns: The ELBO objective. Not applicable for training.
Return type: tf.Tensor
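
Example: a minimal sketch with assumed placeholder tensors of shape [n_samples, batch_size]; the result is the per-data ELBO, suitable for monitoring rather than for training:

    import tensorflow as tf
    from tfsnippet.variational import elbo_objective

    # assumed shape: [n_samples, batch_size]; axis 0 is the sampling dimension
    log_joint = tf.placeholder(tf.float32, [None, None])
    latent_log_prob = tf.placeholder(tf.float32, [None, None])

    elbo = elbo_objective(log_joint, latent_log_prob, axis=0)  # per-data ELBO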

tfsnippet.variational.monte_carlo_objective(log_joint, latent_log_prob, axis=None, keepdims=False, name=None)

Derive the Monte-Carlo objective.

\[\mathcal{L}_{K}(\mathbf{x};\theta,\phi) = \mathbb{E}_{\mathbf{z}^{(1:K)} \sim q_{\phi}(\mathbf{z}|\mathbf{x})}\Bigg[ \log \frac{1}{K} \sum_{k=1}^K { \frac{p_{\theta}(\mathbf{x},\mathbf{z}^{(k)})} {q_{\phi}(\mathbf{z}^{(k)}|\mathbf{x})} } \Bigg]\]
Parameters:
  • log_joint – Values of \(\log p(\mathbf{z},\mathbf{x})\), computed with \(\mathbf{z} \sim q(\mathbf{z}|\mathbf{x})\).
  • latent_log_prob – Values of \(\log q(\mathbf{z}|\mathbf{x})\).
  • axis – The sampling dimensions to be averaged out.
  • keepdims (bool) – When axis is specified, whether or not to keep the averaged dimensions. (default False)
  • name (str) – Name of this operation in TensorFlow graph. (default “monte_carlo_objective”)
Returns: The Monte Carlo objective. Not applicable for training.
Return type: tf.Tensor
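
Example: a minimal sketch with assumed placeholder tensors of shape [n_samples, batch_size]; the result is the per-data importance weighted lower-bound, again for evaluation rather than training:

    import tensorflow as tf
    from tfsnippet.variational import monte_carlo_objective

    # assumed shape: [n_samples, batch_size]; axis 0 is the sampling dimension
    log_joint = tf.placeholder(tf.float32, [None, None])
    latent_log_prob = tf.placeholder(tf.float32, [None, None])

    mc_bound = monte_carlo_objective(log_joint, latent_log_prob, axis=0)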