"UTF8" (Net Encoder)
NetEncoder["UTF8"]
represents an encoder that converts a string to a sequence of integers corresponding to the UTF-8 encoding of its characters.
NetEncoder[{"UTF8",form}]
represents an encoder that converts a string to the output type form according to the UTF-8 encoding of its characters.
Details
- NetEncoder[…][input] applies the encoder to an input string to produce an output.
- NetEncoder[…][{input1,input2,…}] applies the encoder to a list of input strings to produce a list of outputs.
- When form is "Index" (the default), the output of the encoder consists of integer codes in the range 1 to 248, one for each byte of the UTF-8 encoding of the input string. A character whose UTF-8 encoding spans several bytes produces several integers.
- When form is "UnitVector", the output of the encoder consists of 248-dimensional unit vectors, where the i-th vector is in the p_i direction and p_i is the i-th integer code produced for the input string.
- An encoder can be attached to an input port of a net by specifying "port"->NetEncoder[…] when constructing the net.
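The two output forms described above can be modeled outside the Wolfram Language. The following Python sketch is only an illustration of the documented mapping, not the NetEncoder implementation; the function names utf8_index_codes and utf8_unit_vectors are hypothetical, and the byte-plus-one rule is assumed from the equivalence with ToCharacterCode[input,"UTF8"]+1 stated on this page.

```python
def utf8_index_codes(text):
    # Assumed model of the "Index" form: each UTF-8 byte shifted up by one,
    # giving 1-based codes.
    return [byte + 1 for byte in text.encode("utf-8")]


def utf8_unit_vectors(text, dim=248):
    # Assumed model of the "UnitVector" form: one dim-dimensional one-hot
    # vector per code, with the 1 placed at (1-based) position p_i.
    vectors = []
    for code in utf8_index_codes(text):
        row = [0] * dim
        row[code - 1] = 1
        vectors.append(row)
    return vectors


# "\u00e9" (e with acute accent) is a single character that occupies two
# UTF-8 bytes, so it yields two codes and two unit vectors.
codes = utf8_index_codes("\u00e9")
vectors = utf8_unit_vectors("\u00e9")
```

In this model a single accented character produces two codes, which illustrates why the output length can exceed the number of characters in the input.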
Examples
Scope (1)
Properties & Relations (1)
NetEncoder["UTF8"][input] is equivalent to ToCharacterCode[input,"UTF8"]+1, that is, each UTF-8 byte value of the input shifted up by one.
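Because the codes are just UTF-8 byte values plus one, the mapping should be reversible. The Python sketch below is a hedged illustration of that relation rather than Wolfram Language code; encode_index and decode_index are hypothetical helper names, and the behavior is assumed from the equivalence stated above.

```python
def encode_index(text):
    # Assumed equivalent of ToCharacterCode[text, "UTF8"] + 1.
    return [byte + 1 for byte in text.encode("utf-8")]


def decode_index(codes):
    # Undo the shift and reinterpret the bytes as UTF-8 text.
    return bytes(code - 1 for code in codes).decode("utf-8")


sample = "na\u00efve"  # five characters; the i with diaeresis takes two bytes
codes = encode_index(sample)
restored = decode_index(codes)
```

Under these assumptions the five-character sample yields six codes, and subtracting one from each code and decoding the bytes recovers the original string.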