"CTCBeamSearch" (Net Decoder)

NetDecoder[{"CTCBeamSearch",alphabet}]

represents a decoder that interprets a sequence of probability vectors and gives the most likely sequence decoding.

NetDecoder[{"CTCBeamSearch",…,"BeamSize"->n}]

represents a decoder with the specified beam size n.

Details

  • NetDecoder[…][input] applies the decoder to an input to produce an output.
  • NetDecoder[…][{input1,input2,…}] applies the decoder to a list of inputs to produce a list of outputs.
  • The input array to the "CTCBeamSearch" decoder must be a sequence of vectors, each of size n+1, where n is the size of the alphabet. The last element of each vector represents the special blank class.
  • The output of "CTCBeamSearch" is a sequence of elements from the alphabet whose maximum length is equal to the length of the input sequence. Fewer elements are typically returned.
  • A decoder can be attached to an output port of a net by specifying "port"->NetDecoder[…] when constructing the net.
  • Parameters
  • With the parameter "BeamSize"->n, the "CTCBeamSearch" decoder will maintain a set of n candidate decodings during processing. The default is 100.
  • A "BeamSize" of 1 is equivalent to greedy search, where the most probable class is chosen at each element of the sequence.
  • Properties
  • NetDecoder[…][data,prop] can be used to calculate a specific property for the input data.
  • When a "CTCBeamSearch" decoder is attached to a net, net[data,prop] or net[data,"oport"->prop] can be used to calculate a specific property of the decoded output.
  • The "CTCBeamSearch" decoder supports the following properties:
  • "Decoding"                        the single most probable sequence found (default)
    "Decodings"                       all the most probable sequences found
    "TopDecodings"->n                 gives the n most probable sequences
    "NegativeLogLikelihoods"          gives the negative log-likelihood of each decoding, returned as a list of rules
    "TopNegativeLogLikelihoods"->n    gives the negative log-likelihoods of the top n decodings
    None                              bypass decoding and return the input
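
As a rough illustration of the parameter and property syntax described above, the following sketch assumes a hypothetical three-character alphabet {"a","b","c"} and made-up probability vectors (four entries each, the last being the blank class). Neither the alphabet nor the numbers come from this page, and no outputs are shown because they depend on the input values.

```wolfram
(* Illustrative alphabet and input; these values are assumptions, not taken from the reference page *)
alphabet = {"a", "b", "c"};
probs = {{0.6, 0.2, 0.1, 0.1}, {0.1, 0.1, 0.1, 0.7}, {0.1, 0.7, 0.1, 0.1}};

(* One decoder with the default beam size, one restricted to a single candidate (greedy search) *)
beamDec = NetDecoder[{"CTCBeamSearch", alphabet}];
greedyDec = NetDecoder[{"CTCBeamSearch", alphabet, "BeamSize" -> 1}];

(* With no property given, the default property "Decoding" is used *)
beamDec[probs]
greedyDec[probs]

(* Request specific properties as the second argument *)
beamDec[probs, "Decodings"]
beamDec[probs, "NegativeLogLikelihoods"]
beamDec[probs, None]  (* bypasses decoding and returns the input *)
```

Comparing beamDec[probs] with greedyDec[probs] on the same input is a simple way to see whether the wider beam changes the result for a given sequence.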

Examples

Basic Examples  (1)

Create a CTC beam search decoder:
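
A minimal sketch, assuming an illustrative three-character alphabet {"a","b","c"} (the alphabet and the variable name dec are assumptions chosen for this example):

```wolfram
(* The alphabet is an assumption chosen for illustration *)
dec = NetDecoder[{"CTCBeamSearch", {"a", "b", "c"}}]
```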

Use the decoder on a sequence of probability vectors:
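
A minimal sketch, assuming the same illustrative alphabet {"a","b","c"} and a made-up sequence of three probability vectors. Each vector has four entries: one per character plus the trailing blank class.

```wolfram
(* Decoder over an assumed three-character alphabet *)
dec = NetDecoder[{"CTCBeamSearch", {"a", "b", "c"}}];

(* Made-up input: three time steps, each a probability vector of size 3 + 1 (last entry is the blank) *)
probs = {{0.6, 0.2, 0.1, 0.1}, {0.1, 0.1, 0.1, 0.7}, {0.1, 0.7, 0.1, 0.1}};

(* Returns a list of alphabet elements no longer than the input sequence *)
result = dec[probs]
```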

Obtain the top three decoded sequences and their negative log-likelihoods:
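
A minimal sketch using the "TopDecodings" and "TopNegativeLogLikelihoods" properties listed under Details, again assuming an illustrative alphabet {"a","b","c"} and made-up probability vectors:

```wolfram
(* Decoder and input are assumptions chosen for illustration *)
dec = NetDecoder[{"CTCBeamSearch", {"a", "b", "c"}}];
probs = {{0.6, 0.2, 0.1, 0.1}, {0.1, 0.1, 0.1, 0.7}, {0.1, 0.7, 0.1, 0.1}};

(* The three most probable decoded sequences *)
dec[probs, "TopDecodings" -> 3]

(* Negative log-likelihoods of the top three decodings *)
dec[probs, "TopNegativeLogLikelihoods" -> 3]
```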