
AllCandidateSampler


tensorflow C++ API

tensorflow::ops::AllCandidateSampler

Generates labels for candidate sampling with a learned unigram distribution.


Summary

See explanations of candidate sampling and the data formats at go/candidate-sampling.

For each batch, this op picks a single set of sampled candidate labels.

The advantages of sampling candidates per-batch are simplicity and the possibility of efficient dense matrix multiplication. The disadvantage is that the sampled candidates must be chosen independently of the context and of the true labels.

Arguments:

  • scope: A Scope object
  • true_classes: A batch_size * num_true matrix, in which each row contains the IDs of the num_true target_classes in the corresponding original label.
  • num_true: Number of true labels per context.
  • num_sampled: Number of candidates to produce.
  • unique: If unique is true, we sample with rejection, so that all sampled candidates in a batch are unique. This requires some approximation to estimate the post-rejection sampling probabilities.
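
The effect of the unique flag can be illustrated with a minimal sketch. It is not the TensorFlow kernel: SampleCandidates and range_max are hypothetical names, and a uniform distribution over the class IDs is assumed purely for illustration. When unique is true, an ID that was already drawn is rejected and the draw is repeated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper, not part of TensorFlow: draws num_sampled class IDs
// from [0, range_max), using a uniform distribution as a stand-in.
std::vector<int64_t> SampleCandidates(int64_t num_sampled, int64_t range_max,
                                      bool unique, std::mt19937_64& rng) {
  assert(num_sampled >= 0 && range_max > 0);
  // Unique sampling can only terminate if enough distinct IDs exist.
  assert(!unique || num_sampled <= range_max);

  std::uniform_int_distribution<int64_t> dist(0, range_max - 1);
  std::unordered_set<int64_t> seen;
  std::vector<int64_t> sampled;
  sampled.reserve(static_cast<std::size_t>(num_sampled));

  while (static_cast<int64_t>(sampled.size()) < num_sampled) {
    const int64_t id = dist(rng);
    // With unique == true, reject an ID that was already drawn and retry.
    if (unique && !seen.insert(id).second) continue;
    sampled.push_back(id);
  }
  return sampled;
}
```

With unique set to false the loop performs exactly num_sampled draws and duplicates may appear. With unique set to true the number of draws is random, which is why the post-rejection probabilities have to be estimated.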

Optional attributes (see Attrs):

  • seed: If either seed or seed2 is set to a non-zero value, the random number generator is seeded by the given seed. Otherwise, it is seeded by a random seed.
  • seed2: A second seed to avoid seed collision.
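
The seeding rule described above can be sketched as follows. This is an illustration under stated assumptions, not TensorFlow's own seeding code: MakeGenerator and CheckReproducible are hypothetical helpers, and std::mt19937_64 stands in for whatever generator the op uses internally.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: builds a generator from the two seed attributes.
std::mt19937_64 MakeGenerator(int64_t seed, int64_t seed2) {
  if (seed != 0 || seed2 != 0) {
    // Deterministic path: mix both seeds so that the same (seed, seed2)
    // pair always reproduces the same random stream.
    const uint64_t a = static_cast<uint64_t>(seed);
    const uint64_t b = static_cast<uint64_t>(seed2);
    std::seed_seq seq{static_cast<uint32_t>(a), static_cast<uint32_t>(a >> 32),
                      static_cast<uint32_t>(b), static_cast<uint32_t>(b >> 32)};
    std::mt19937_64 gen(seq);
    return gen;
  }
  // Both seeds are zero: fall back to a non-deterministic random seed.
  std::random_device rd;
  std::seed_seq seq{rd(), rd(), rd(), rd()};
  std::mt19937_64 gen(seq);
  return gen;
}

// Usage check: the same non-zero seed pair reproduces the same stream.
void CheckReproducible() {
  std::mt19937_64 first = MakeGenerator(7, 0);
  std::mt19937_64 second = MakeGenerator(7, 0);
  assert(first() == second());
}
```

Leaving both seeds at zero is convenient for training runs, while a fixed non-zero pair is the usual choice when results have to be reproducible.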

Returns:

  • Output sampled_candidates: A vector of length num_sampled, in which each element is the ID of a sampled candidate.
  • Output true_expected_count: A batch_size * num_true matrix, representing the number of times each candidate is expected to occur in a batch of sampled candidates. If unique=true, then this is a probability.
  • Output sampled_expected_count: A vector of length num_sampled, for each sampled candidate representing the number of times the candidate is expected to occur in a batch of sampled candidates. If unique=true, then this is a probability.
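
To make the meaning of the expected counts concrete, the following sketch computes one value for a single class. It rests on assumptions that do not come from this page: the per-draw probability p of the class is known, draws are treated as independent, and num_tries (the number of draws actually made when sampling with rejection) is a hypothetical parameter. ExpectedCount is an illustrative name, not a TensorFlow function.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: expected number of occurrences of one class in a
// batch of sampled candidates, given its per-draw probability p.
double ExpectedCount(double p, int64_t num_sampled, int64_t num_tries,
                     bool unique) {
  assert(p >= 0.0 && p <= 1.0);
  assert(num_sampled >= 0 && num_tries >= 1);
  if (!unique) {
    // Sampling with replacement: each of the num_sampled draws hits the
    // class with probability p, so the expected count is p * num_sampled.
    return p * static_cast<double>(num_sampled);
  }
  // Unique sampling: the class appears at most once, so the value is the
  // probability that at least one of num_tries draws hit it,
  // 1 - (1 - p)^num_tries, written in a numerically stable form.
  return -std::expm1(static_cast<double>(num_tries) * std::log1p(-p));
}
```

For example, with p = 0.5 and two draws, the unique case gives 1 - 0.5^2 = 0.75, which is a probability, while the non-unique case gives 0.5 * 2 = 1.0, which is a count.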

AllCandidateSampler block

Source link: https://github.com/EXPNUNI/enuSpaceTensorflow/blob/master/enuSpaceTensorflow/tf_candidate_sampling_ops.cpp

Argument:

  • Scope scope: A Scope object. (A scope is generated automatically for each page. Scopes are not connected to one another.)
  • Input true_classes: A batch_size * num_true matrix, in which each row contains the IDs of the num_true target_classes in the corresponding original label.
  • Int64 num_true: Number of true labels per context.
  • Int64 num_sampled: Number of candidates to produce.
  • bool unique: If unique is true, we sample with rejection, so that all sampled candidates in a batch are unique. This requires some approximation to estimate the post-rejection sampling probabilities.
  • AllCandidateSampler::Attrs attrs:
    • seed: If either seed or seed2 is set to a non-zero value, the random number generator is seeded by the given seed. Otherwise, it is seeded by a random seed.
    • seed2: A second seed to avoid seed collision.

Return:

  • Output sampled_candidates: Output object of the AllCandidateSampler class.
  • Output true_expected_count: Output object of the AllCandidateSampler class.
  • Output sampled_expected_count: Output object of the AllCandidateSampler class.

Result:

  • std::vector<Tensor> result_sampled_candidates: A vector of length num_sampled, in which each element is the ID of a sampled candidate.
  • std::vector<Tensor> result_true_expected_count: A batch_size * num_true matrix, representing the number of times each candidate is expected to occur in a batch of sampled candidates. If unique=true, then this is a probability.
  • std::vector<Tensor> result_sampled_expected_count: A vector of length num_sampled, for each sampled candidate representing the number of times the candidate is expected to occur in a batch of sampled candidates. If unique=true, then this is a probability.

Usage