System.Text.UTF8Encoding Class

public class UTF8Encoding : Encoding

Base Types

Object
  Encoding
    UTF8Encoding

Assembly

mscorlib

Library

BCL

Summary

Represents a UTF-8 character encoding.

Description

UTF8Encoding encodes Unicode characters using the UTF-8 encoding (UCS Transformation Format, 8-bit form). This encoding supports all Unicode character values.

[Note: UTF-8 encodes Unicode characters with a variable number of bytes per character. This encoding is optimized for the 128 ASCII characters (U+0000 through U+007F), each of which is encoded in a single byte, yielding an efficient mechanism to encode English in an internationalizable way. The UTF-8 identifier is the Unicode byte order mark (U+FEFF) written in UTF-8 (0xEF 0xBB 0xBF). The byte order mark is used to distinguish UTF-8 text from other encodings.

This class offers an error-checking feature that can be turned on when an instance of the class is constructed. Certain methods in this class check for invalid sequences of surrogate pairs. If error-checking is turned on and an invalid sequence is detected, ArgumentException is thrown. If error-checking is not turned on and an invalid sequence is detected, no exception is thrown and execution continues in a method-defined manner. For more information regarding surrogate pairs, see UnicodeCategory.

]
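
Example

The variable-width behavior described in the note above can be observed with GetByteCount. The following is a minimal sketch; the class name Utf8WidthDemo and the helper ByteLength are hypothetical names chosen for illustration, and the sample characters are arbitrary.

```csharp
using System;
using System.Text;

public static class Utf8WidthDemo
{
    // Hypothetical helper: returns the number of UTF-8 bytes needed for the text.
    public static int ByteLength(string text)
    {
        return new UTF8Encoding().GetByteCount(text);
    }

    public static void Main()
    {
        Console.WriteLine(ByteLength("A"));            // U+0041, ASCII: 1
        Console.WriteLine(ByteLength("\u00E9"));       // U+00E9: 2
        Console.WriteLine(ByteLength("\u20AC"));       // U+20AC: 3
        Console.WriteLine(ByteLength("\uD83D\uDE00")); // U+1F600 as a surrogate pair: 4
    }
}
```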

See Also

System.Text Namespace

Members

UTF8Encoding Constructors

UTF8Encoding(bool, bool) Constructor
UTF8Encoding(bool) Constructor
UTF8Encoding() Constructor

UTF8Encoding Methods

UTF8Encoding.Equals Method
UTF8Encoding.GetByteCount(System.String) Method
UTF8Encoding.GetByteCount(char[], int, int) Method
UTF8Encoding.GetBytes(System.String, int, int, byte[], int) Method
UTF8Encoding.GetBytes(System.String) Method
UTF8Encoding.GetBytes(char[], int, int, byte[], int) Method
UTF8Encoding.GetCharCount Method
UTF8Encoding.GetChars Method
UTF8Encoding.GetDecoder Method
UTF8Encoding.GetEncoder Method
UTF8Encoding.GetHashCode Method
UTF8Encoding.GetMaxByteCount Method
UTF8Encoding.GetMaxCharCount Method
UTF8Encoding.GetPreamble Method


UTF8Encoding(bool, bool) Constructor

public UTF8Encoding(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes);

Summary

Constructs a new instance of the UTF8Encoding class using the specified Boolean flags.

Parameters

encoderShouldEmitUTF8Identifier
A Boolean that indicates whether the Unicode byte order mark in UTF-8 is recognized or emitted when reading from or writing to a Stream.
throwOnInvalidBytes
A Boolean that indicates whether error-checking is turned on for the current instance.
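
Example

The effect of throwOnInvalidBytes can be sketched as follows. The class name StrictUtf8Demo and the helper IsRejected are hypothetical; the sketch assumes only what the class description states, namely that ArgumentException (or a type derived from it) is thrown when error-checking is turned on and an invalid surrogate sequence is found.

```csharp
using System;
using System.Text;

public static class StrictUtf8Demo
{
    // Hypothetical helper: returns true if encoding the text with
    // error-checking turned on throws ArgumentException.
    public static bool IsRejected(string text)
    {
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetBytes(text);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsRejected("ok"));           // well-formed text: False
        Console.WriteLine(IsRejected("\uD83D\uDE00")); // complete surrogate pair: False
        Console.WriteLine(IsRejected("\uD800"));       // lone high surrogate: True
    }
}
```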

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding(bool) Constructor

public UTF8Encoding(bool encoderShouldEmitUTF8Identifier);

Summary

Constructs a new instance of the UTF8Encoding class with the specified Boolean that indicates whether the Unicode byte order mark in UTF-8 is recognized or emitted when reading from or writing to a Stream.

Parameters

encoderShouldEmitUTF8Identifier
A Boolean that indicates whether the Unicode byte order mark in UTF-8 is recognized or emitted when reading from or writing to a Stream.

Description

This constructor is equivalent to UTF8Encoding(encoderShouldEmitUTF8Identifier, false).

[Note: By default, this constructor turns error-checking off for the new instance.]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding() Constructor

public UTF8Encoding();

Summary

Constructs a new instance of the UTF8Encoding class.

Description

This constructor is equivalent to UTF8Encoding(false, false).

[Note: By default, this constructor turns error-checking off for the new instance.]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.Equals Method

public override bool Equals(object value);

Summary

Determines whether the current instance and the specified Object represent the same type and value.

Parameters

value
An Object to compare with the current instance.

Return Value

true if value is a UTF8Encoding and represents the same type and value as the current instance; otherwise, false.

Description

[Note: This method overrides System.Object.Equals(System.Object).]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetByteCount(System.String) Method

public override int GetByteCount(string chars);

Summary

Determines the number of bytes required to encode the characters in the specified String as a UTF8Encoding.

Parameters

chars
A String to encode as a UTF8Encoding.

Return Value

An Int32 that specifies the number of bytes necessary to encode chars as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentNullException        chars is null.
ArgumentException            Error-checking is turned on for the current instance and chars contains an invalid surrogate sequence.
ArgumentOutOfRangeException  The return value is greater than System.Int32.MaxValue.

Description

If error-checking is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and do not affect the return value, and no exception is thrown.

[Note: This method overrides System.Text.Encoding.GetByteCount(System.String).]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetByteCount(char[], int, int) Method

public override int GetByteCount(char[] chars, int index, int count);

Summary

Determines the number of bytes required to encode the specified range of characters in the specified Unicode character array as a UTF8Encoding.

Parameters

chars
The Char array to encode as a UTF8Encoding.
index
An Int32 that specifies the first index of chars to encode.
count
An Int32 that specifies the number of characters to encode.

Return Value

An Int32 containing the number of bytes necessary to encode the range in chars from index to index + count - 1 as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentNullException        chars is null.
ArgumentOutOfRangeException  The return value is greater than System.Int32.MaxValue.
                             -or-
                             index or count is less than zero.
                             -or-
                             index and count do not specify a valid range in chars (that is, index + count > chars.Length).
ArgumentException            Error-checking is turned on for the current instance and chars contains an invalid surrogate sequence.

Description

If error-checking is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and do not affect the return value, and no exception is thrown.

[Note: This method overrides System.Text.Encoding.GetByteCount(System.Char[],System.Int32,System.Int32).

]
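
Example

A short sketch of counting bytes for a sub-range of a character array. The class name RangeCountDemo, the helper CountRange, and the sample array are hypothetical and serve only to illustrate how index and count select the range.

```csharp
using System;
using System.Text;

public static class RangeCountDemo
{
    // Hypothetical helper: counts the UTF-8 bytes needed for
    // chars[index] through chars[index + count - 1].
    public static int CountRange(char[] chars, int index, int count)
    {
        return new UTF8Encoding().GetByteCount(chars, index, count);
    }

    public static void Main()
    {
        // 'a' and 'z' need 1 byte each, U+00E9 needs 2, U+20AC needs 3.
        char[] chars = { 'a', '\u00E9', '\u20AC', 'z' };

        Console.WriteLine(CountRange(chars, 1, 2));            // 2 + 3 = 5
        Console.WriteLine(CountRange(chars, 0, chars.Length)); // 1 + 2 + 3 + 1 = 7
    }
}
```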

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetBytes(System.String, int, int, byte[], int) Method

public override int GetBytes(string s, int charIndex, int charCount, byte[] bytes, int byteIndex);

Summary

Encodes the specified range of the specified String into the specified range of the specified Byte array as a UTF8Encoding.

Parameters

s
The String to encode as a UTF8Encoding.
charIndex
An Int32 that specifies the first index of s to encode.
charCount
An Int32 that specifies the number of characters to encode.
bytes
The Byte array to encode into.
byteIndex
An Int32 that specifies the first index of bytes to encode into.

Return Value

An Int32 that indicates the number of bytes encoded into bytes as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentException            bytes does not contain sufficient space to store the encoded characters.
                             -or-
                             Error-checking is turned on for the current instance and s contains an invalid surrogate sequence.
ArgumentNullException        s or bytes is null.
ArgumentOutOfRangeException  charIndex, charCount, or byteIndex is less than zero.
                             -or-
                             (s.Length - charIndex) < charCount.
                             -or-
                             byteIndex >= bytes.Length.

Description

If error-checking is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and are not encoded into bytes, and no exception is thrown.

[Note: This method overrides System.Text.Encoding.GetBytes(System.String,System.Int32,System.Int32,System.Byte[],System.Int32).]
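
Example

The following sketch encodes part of a string into a caller-supplied buffer and trims the result to the number of bytes actually written. The class name SliceEncodeDemo and the helper EncodeSlice are hypothetical; sizing the buffer with GetMaxByteCount is one possible choice, not a requirement of this method.

```csharp
using System;
using System.Text;

public static class SliceEncodeDemo
{
    // Hypothetical helper: encodes s[charIndex .. charIndex + charCount - 1]
    // and returns exactly the bytes that were produced.
    public static byte[] EncodeSlice(string s, int charIndex, int charCount)
    {
        UTF8Encoding encoding = new UTF8Encoding();

        // Worst-case buffer, so the call cannot run out of space.
        byte[] buffer = new byte[encoding.GetMaxByteCount(charCount)];
        int written = encoding.GetBytes(s, charIndex, charCount, buffer, 0);

        byte[] result = new byte[written];
        Array.Copy(buffer, result, written);
        return result;
    }

    public static void Main()
    {
        // Characters at index 1 and 2 are U+00E9 and 'l'.
        byte[] bytes = EncodeSlice("h\u00E9llo", 1, 2);
        Console.WriteLine(BitConverter.ToString(bytes)); // C3-A9-6C
    }
}
```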

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetBytes(System.String) Method

public override byte[] GetBytes(string s);

Summary

Encodes the specified String as a UTF8Encoding.

Parameters

s
The String to encode as a UTF8Encoding.

Return Value

A Byte array containing the values encoded from s as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentException            Error-checking is turned on for the current instance and s contains an invalid surrogate sequence.
ArgumentNullException        s is null.

Description

If error-checking is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and are not encoded into the returned Byte array, and no exception is thrown.

[Note: This method overrides System.Text.Encoding.GetBytes(System.String).]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetBytes(char[], int, int, byte[], int) Method

public override int GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex);

Summary

Encodes the specified range of the specified Char array into the specified range of the specified Byte array as a UTF8Encoding.

Parameters

chars
The Char array to encode as a UTF8Encoding.
charIndex
An Int32 that specifies the first index of chars to encode.
charCount
An Int32 that specifies the number of characters to encode.
bytes
The Byte array to encode into.
byteIndex
An Int32 that specifies the first index of bytes to encode into.

Return Value

An Int32 that indicates the number of bytes encoded into bytes as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentException            bytes does not contain sufficient space to store the encoded characters.
                             -or-
                             Error-checking is turned on for the current instance and chars contains an invalid surrogate sequence.
ArgumentNullException        chars or bytes is null.
ArgumentOutOfRangeException  charIndex, charCount, or byteIndex is less than zero.
                             -or-
                             (chars.Length - charIndex) < charCount.
                             -or-
                             byteIndex > bytes.Length.

Description

If error-checking is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and are not encoded into bytes, and no exception is thrown.

[Note: This method overrides System.Text.Encoding.GetBytes(System.Char[],System.Int32,System.Int32,System.Byte[],System.Int32).

System.Text.UTF8Encoding.GetByteCount(System.Char[],System.Int32,System.Int32) can be used to determine the exact number of bytes that will be produced for a given range of characters. Alternatively, System.Text.UTF8Encoding.GetMaxByteCount(System.Int32) can be used to determine the maximum number of bytes that will be produced for a specified number of characters, regardless of the actual character values.

]
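
Example

The two buffer-sizing strategies mentioned in the note can be sketched as follows. The class name BufferSizingDemo and the helper EncodeExact are hypothetical. The value returned by GetMaxByteCount depends on the implementation, so the sketch does not assume a particular number for it.

```csharp
using System;
using System.Text;

public static class BufferSizingDemo
{
    // Hypothetical helper: sizes the destination with GetByteCount so the
    // buffer is exactly as large as the encoded output.
    public static byte[] EncodeExact(char[] chars)
    {
        UTF8Encoding encoding = new UTF8Encoding();
        byte[] bytes = new byte[encoding.GetByteCount(chars, 0, chars.Length)];
        encoding.GetBytes(chars, 0, chars.Length, bytes, 0);
        return bytes;
    }

    public static void Main()
    {
        char[] chars = { 'a', '\u20AC' };
        UTF8Encoding encoding = new UTF8Encoding();

        // Exact size: 1 byte for 'a' plus 3 bytes for U+20AC.
        Console.WriteLine(EncodeExact(chars).Length); // 4

        // Upper bound for any two characters; never smaller than the exact size.
        Console.WriteLine(encoding.GetMaxByteCount(chars.Length));
    }
}
```

The exact count costs an extra pass over the characters; the maximum count avoids that pass at the price of a larger buffer.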

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetCharCount Method

public override int GetCharCount(byte[] bytes, int index, int count);

Summary

Returns the number of characters produced by decoding the specified range of the specified Byte array as a UTF8Encoding.

Parameters

bytes
The Byte array to decode as a UTF8Encoding.
index
An Int32 that specifies the first index of bytes to decode.
count
An Int32 that specifies the number of bytes to decode.

Return Value

An Int32 that indicates the number of characters produced by decoding the range in bytes from index to index + count - 1 as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentNullException        bytes is null.
ArgumentOutOfRangeException  index or count is less than zero.
                             -or-
                             index and count do not specify a valid range in bytes (that is, index + count > bytes.Length).
ArgumentException            Error-checking is turned on for the current instance and bytes contains an invalid surrogate sequence.

Description

If error-checking is turned off and an invalid surrogate sequence is detected, the invalid bytes are ignored and do not affect the return value, and no exception is thrown.

[Note: This method overrides System.Text.Encoding.GetCharCount(System.Byte[],System.Int32,System.Int32).]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetChars Method

public override int GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex);

Summary

Decodes the specified range of the specified Byte array into the specified range of the specified Char array as a UTF8Encoding.

Parameters

bytes
The Byte array to decode as a UTF8Encoding.
byteIndex
An Int32 that specifies the first index of bytes to decode.
byteCount
An Int32 that specifies the number of bytes to decode.
chars
The Char array to decode into.
charIndex
An Int32 that specifies the first index of chars to decode into.

Return Value

An Int32 that indicates the number of characters decoded into chars as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentException            chars does not contain sufficient space to store the decoded characters.
                             -or-
                             Error-checking is turned on for the current instance and bytes contains an invalid surrogate sequence.
ArgumentNullException        bytes or chars is null.
ArgumentOutOfRangeException  byteIndex, byteCount, or charIndex is less than zero.
                             -or-
                             (bytes.Length - byteIndex) < byteCount.
                             -or-
                             charIndex > chars.Length.

Description

If error-checking is turned off and an invalid surrogate sequence is detected, the invalid bytes are ignored and are not decoded into chars, and no exception is thrown.

[Note: This method overrides System.Text.Encoding.GetChars(System.Byte[],System.Int32,System.Int32,System.Char[],System.Int32).

System.Text.UTF8Encoding.GetCharCount(System.Byte[],System.Int32,System.Int32) can be used to determine the exact number of characters that will be produced for a specified range of bytes. Alternatively, System.Text.UTF8Encoding.GetMaxCharCount(System.Int32) can be used to determine the maximum number of characters that will be produced for a specified number of bytes, regardless of the actual byte values.

]
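
Example

A minimal sketch of decoding a byte range, using GetCharCount to size the destination array as the note suggests. The class name RangeDecodeDemo and the helper DecodeRange are hypothetical, and the sample bytes are chosen so that the range falls on character boundaries.

```csharp
using System;
using System.Text;

public static class RangeDecodeDemo
{
    // Hypothetical helper: decodes bytes[byteIndex .. byteIndex + byteCount - 1]
    // into a string.
    public static string DecodeRange(byte[] bytes, int byteIndex, int byteCount)
    {
        UTF8Encoding encoding = new UTF8Encoding();

        // Exact number of characters the range will produce.
        char[] chars = new char[encoding.GetCharCount(bytes, byteIndex, byteCount)];
        int decoded = encoding.GetChars(bytes, byteIndex, byteCount, chars, 0);

        return new string(chars, 0, decoded);
    }

    public static void Main()
    {
        // 'h', then U+00E9 as 0xC3 0xA9, then '!'.
        byte[] bytes = { 0x68, 0xC3, 0xA9, 0x21 };

        Console.WriteLine(DecodeRange(bytes, 1, 2) == "\u00E9");              // True
        Console.WriteLine(DecodeRange(bytes, 0, bytes.Length) == "h\u00E9!"); // True
    }
}
```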

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetDecoder Method

public override Decoder GetDecoder();

Summary

Returns a Decoder for the current instance.

Return Value

A Decoder for the current instance.

Description

[Note: This method overrides System.Text.Encoding.GetDecoder.

Unlike System.Text.UTF8Encoding.GetChars(System.Byte[],System.Int32,System.Int32,System.Char[],System.Int32), a decoder can convert partial sequences of bytes into partial sequences of characters by maintaining the appropriate state between the conversions.

]
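
Example

The stateful behavior described in the note can be sketched by splitting one multi-byte character across two input chunks. The class name ChunkedDecodeDemo and the helper DecodeInChunks are hypothetical; sizing each temporary buffer with GetMaxCharCount is an assumption made for simplicity.

```csharp
using System;
using System.Text;

public static class ChunkedDecodeDemo
{
    // Hypothetical helper: decodes a sequence of byte chunks with a single
    // Decoder, so a character split across chunks is still decoded correctly.
    public static string DecodeInChunks(byte[][] chunks)
    {
        UTF8Encoding encoding = new UTF8Encoding();
        Decoder decoder = encoding.GetDecoder();
        StringBuilder result = new StringBuilder();

        foreach (byte[] chunk in chunks)
        {
            char[] buffer = new char[encoding.GetMaxCharCount(chunk.Length)];
            // The decoder remembers any trailing partial sequence until
            // the next call supplies the remaining bytes.
            int decoded = decoder.GetChars(chunk, 0, chunk.Length, buffer, 0);
            result.Append(buffer, 0, decoded);
        }
        return result.ToString();
    }

    public static void Main()
    {
        // U+20AC is 0xE2 0x82 0xAC; the last byte arrives in a second chunk.
        byte[][] chunks = { new byte[] { 0xE2, 0x82 }, new byte[] { 0xAC } };
        Console.WriteLine(DecodeInChunks(chunks) == "\u20AC"); // True
    }
}
```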

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetEncoder Method

public override Encoder GetEncoder();

Summary

Returns an Encoder for the current instance.

Return Value

An Encoder for the current instance.

Description

[Note: This method overrides System.Text.Encoding.GetEncoder.

Unlike System.Text.UTF8Encoding.GetBytes(System.Char[],System.Int32,System.Int32,System.Byte[],System.Int32), an encoder can convert partial sequences of characters into partial sequences of bytes by maintaining the appropriate state between the conversions.

]
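
Example

The following sketch shows an encoder carrying state across calls when a surrogate pair is split between two character chunks. The class name ChunkedEncodeDemo and the helper EncodeInChunks are hypothetical; passing true for the flush argument only on the last chunk is the usual pattern, and the buffer sizing via GetMaxByteCount is an assumption made for simplicity.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkedEncodeDemo
{
    // Hypothetical helper: encodes a sequence of char chunks with a single
    // Encoder, so a surrogate pair split across chunks is encoded as one character.
    public static byte[] EncodeInChunks(char[][] chunks)
    {
        UTF8Encoding encoding = new UTF8Encoding();
        Encoder encoder = encoding.GetEncoder();
        MemoryStream output = new MemoryStream();

        for (int i = 0; i < chunks.Length; i++)
        {
            char[] chunk = chunks[i];
            bool flush = (i == chunks.Length - 1); // flush state on the final chunk only
            byte[] buffer = new byte[encoding.GetMaxByteCount(chunk.Length)];
            int written = encoder.GetBytes(chunk, 0, chunk.Length, buffer, 0, flush);
            output.Write(buffer, 0, written);
        }
        return output.ToArray();
    }

    public static void Main()
    {
        // U+1F600 is the surrogate pair U+D83D U+DE00, split across two chunks.
        char[][] chunks = { new char[] { '\uD83D' }, new char[] { '\uDE00' } };
        Console.WriteLine(BitConverter.ToString(EncodeInChunks(chunks))); // F0-9F-98-80
    }
}
```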

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetHashCode Method

public override int GetHashCode();

Summary

Generates a hash code for the current instance.

Return Value

An Int32 value containing a hash code for the current instance.

Description

The algorithm used to generate the hash code is unspecified.

[Note: This method overrides System.Object.GetHashCode.]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetMaxByteCount Method

public override int GetMaxByteCount(int charCount);

Summary

Returns the maximum number of bytes required to encode the specified number of characters as a UTF8Encoding, regardless of the actual character values.

Parameters

charCount
An Int32 that specifies the number of characters to encode as a UTF8Encoding.

Return Value

An Int32 that specifies the maximum number of bytes required to encode charCount characters as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentOutOfRangeException  charCount < 0.

Description

[Note: This method overrides System.Text.Encoding.GetMaxByteCount(System.Int32).

This method can be used to determine an appropriate minimum buffer size for byte arrays passed to System.Text.UTF8Encoding.GetBytes(System.Char[],System.Int32,System.Int32,System.Byte[],System.Int32). Using this minimum buffer size can help ensure that no buffer overflow exceptions will occur.

]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetMaxCharCount Method

public override int GetMaxCharCount(int byteCount);

Summary

Returns the maximum number of characters produced by decoding the specified number of bytes as a UTF8Encoding, regardless of the actual byte values.

Parameters

byteCount
An Int32 that specifies the number of bytes to decode as a UTF8Encoding.

Return Value

An Int32 that specifies the maximum number of characters produced by decoding byteCount bytes as a UTF8Encoding.

Exceptions

Exception Type               Condition
ArgumentOutOfRangeException  byteCount < 0.

Description

[Note: This method overrides System.Text.Encoding.GetMaxCharCount(System.Int32).

This method can be used to determine an appropriate minimum buffer size for character arrays passed to System.Text.UTF8Encoding.GetChars(System.Byte[],System.Int32,System.Int32,System.Char[],System.Int32). Using this minimum buffer size can help ensure that no buffer overflow exceptions will occur.

]

See Also

System.Text.UTF8Encoding Class, System.Text Namespace

UTF8Encoding.GetPreamble Method

public override byte[] GetPreamble();

Summary

Returns the bytes used at the beginning of a stream to determine which encoding a file was created with.

Return Value

A Byte array containing the UTF-8 encoding preamble.

Description

[Note: This method overrides System.Text.Encoding.GetPreamble.

System.Text.UTF8Encoding.GetPreamble returns the Unicode byte order mark (U+FEFF) written in UTF-8 (0xEF, 0xBB, 0xBF) if this instance was constructed with a request to emit the UTF-8 identifier.

]
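
Example

A brief sketch of how the constructor flag affects the preamble. The class name PreambleDemo and the helper PreambleHex are hypothetical. The empty result for an instance constructed without the identifier is the behavior of Microsoft's implementation and is an assumption here, since the note above describes only the case where the identifier is requested.

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Hypothetical helper: formats the preamble bytes as hyphen-separated hex.
    public static string PreambleHex(bool emitIdentifier)
    {
        UTF8Encoding encoding = new UTF8Encoding(emitIdentifier);
        return BitConverter.ToString(encoding.GetPreamble());
    }

    public static void Main()
    {
        // Identifier requested: the UTF-8 form of U+FEFF.
        Console.WriteLine(PreambleHex(true)); // EF-BB-BF

        // Identifier not requested: assumed to be an empty array,
        // which formats as an empty string.
        Console.WriteLine(PreambleHex(false).Length); // 0
    }
}
```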

See Also

System.Text.UTF8Encoding Class, System.Text Namespace