java.lang.Object
- org.apache.directory.api.util.Unicode

```
public final class Unicode
extends Object
```
Various Unicode manipulation methods that are more efficient than chaining operations: everything is done in the same buffer, without creating a bunch of String objects.

Author:

Apache Directory Project

Method Summary

Modifier and Type	Method	Description
`static char`	`bytesToChar(byte[] bytes)`	Return the Unicode char which is coded in the bytes at position 0.
`static char`	`bytesToChar(byte[] bytes, int pos)`	Return the Unicode char which is coded in the bytes at the given position.
`static byte[]`	`charToBytes(char car)`	Return the byte array which encodes the given Unicode char.
`static int`	`countBytes(char[] chars)`	Count the number of bytes included in the given char[].
`static int`	`countBytesPerChar(byte[] bytes, int pos)`	Count the number of bytes needed to return a Unicode char.
`static int`	`countChars(byte[] bytes)`	Count the number of chars included in the given byte[].
`static int`	`countNbBytesPerChar(char car)`	Return the number of bytes that hold a Unicode char.
`static boolean`	`isUnicodeSubset(byte b)`	Check if the current byte is in the unicodeSubset: all chars but '\0', '(', ')', '*' and '\'
`static boolean`	`isUnicodeSubset(char c)`	Check if the current char is in the unicodeSubset: all chars but '\0', '(', ')', '*' and '\'
`static boolean`	`isUnicodeSubset(String str, int pos)`	Check if the current char is in the unicodeSubset: all chars but '\0', '(', ')', '*' and '\'
`static String`	`readUTF(ObjectInput objectInput)`	Reads in a string that has been encoded using a modified UTF-8 format.
`static void`	`writeUTF(ObjectOutput objectOutput, String str)`	Writes four bytes of length information to the output stream, followed by the modified UTF-8 representation of every character in the string str.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Method Detail
  - countBytesPerChar
```
public static int countBytesPerChar(byte[] bytes,
                                    int pos)
```
    Count the number of bytes needed to return a Unicode char. This can be from 1 to 6.
    
    Parameters:
    
    bytes - The bytes to read
    
    pos - Position to start counting. It must be a valid start of an encoded char!
    
    Returns:
    
    The number of bytes to create a char, or -1 if the encoding is wrong.
    
    TODO: Should stop after the third byte, as a char is only 2 bytes long.
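    The rule described above can be illustrated with a minimal sketch. This is not the library's source: it assumes the input is UTF-8 and derives the sequence length (1 to 6) from the lead byte alone. The class name `Utf8LengthSketch` and the handling of a null array or an out-of-range position are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Apache Directory implementation.
final class Utf8LengthSketch {
    /** Returns the sequence length announced by the lead byte at pos, or -1. */
    static int countBytesPerChar(byte[] bytes, int pos) {
        // Assumption: invalid arguments are reported as -1 rather than thrown.
        if (bytes == null || pos < 0 || pos >= bytes.length) {
            return -1;
        }
        int b = bytes[pos] & 0xFF;          // treat the byte as unsigned
        if ((b & 0x80) == 0x00) return 1;   // 0xxxxxxx
        if ((b & 0xE0) == 0xC0) return 2;   // 110xxxxx
        if ((b & 0xF0) == 0xE0) return 3;   // 1110xxxx
        if ((b & 0xF8) == 0xF0) return 4;   // 11110xxx
        if ((b & 0xFC) == 0xF8) return 5;   // 111110xx
        if ((b & 0xFE) == 0xFC) return 6;   // 1111110x
        return -1;                          // continuation byte, 0xFE or 0xFF
    }

    public static void main(String[] args) {
        byte[] euro = "\u20AC".getBytes(StandardCharsets.UTF_8); // E2 82 AC
        System.out.println(countBytesPerChar(euro, 0)); // prints 3
        System.out.println(countBytesPerChar(euro, 1)); // prints -1
    }
}
```

    Position 1 of the euro sign is a continuation byte, which is why `pos` must point at the start of an encoded char.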
  - bytesToChar
```
public static char bytesToChar(byte[] bytes)
```
    Return the Unicode char which is coded in the bytes at position 0.
    
    Parameters:
    
    bytes - The byte[] representation of a Unicode string.
    
    Returns:
    
    The first char found.
  - bytesToChar
```
public static char bytesToChar(byte[] bytes,
                               int pos)
```
    Return the Unicode char which is coded in the bytes at the given position.
    
    Parameters:
    
    bytes - The byte[] representation of a Unicode string.
    
    pos - The current position to start decoding the char
    
    Returns:
    
    The decoded char, or -1 if no char can be decoded.
    
    TODO: Should stop after the third byte, as a char is only 2 bytes long.
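    A minimal sketch of this kind of decoding, assuming UTF-8 input and handling only the 1- to 3-byte forms that fit in a Java char. It is not the library's source: the class name `Utf8DecodeSketch` is hypothetical, continuation bytes are not validated, and the failure value is assumed to be -1 cast to char (that is, '\uFFFF').

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Apache Directory implementation.
final class Utf8DecodeSketch {
    /** Decodes the char starting at pos, or returns (char) -1 on failure. */
    static char bytesToChar(byte[] bytes, int pos) {
        if (bytes == null || pos < 0 || pos >= bytes.length) {
            return (char) -1;
        }
        int b0 = bytes[pos] & 0xFF;
        if ((b0 & 0x80) == 0x00) {
            return (char) b0;                                   // 1 byte: 7 payload bits
        }
        if ((b0 & 0xE0) == 0xC0 && pos + 1 < bytes.length) {    // 2 bytes: 5 + 6 bits
            return (char) (((b0 & 0x1F) << 6) | (bytes[pos + 1] & 0x3F));
        }
        if ((b0 & 0xF0) == 0xE0 && pos + 2 < bytes.length) {    // 3 bytes: 4 + 6 + 6 bits
            return (char) (((b0 & 0x0F) << 12)
                    | ((bytes[pos + 1] & 0x3F) << 6)
                    | (bytes[pos + 2] & 0x3F));
        }
        return (char) -1; // truncated input, continuation byte, or a longer form
    }

    public static void main(String[] args) {
        byte[] bytes = "\u00E9".getBytes(StandardCharsets.UTF_8); // C3 A9
        System.out.println((int) bytesToChar(bytes, 0)); // prints 233
    }
}
```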
  - countNbBytesPerChar
```
public static int countNbBytesPerChar(char car)
```
    Return the number of bytes that hold a Unicode char.
    
    Parameters:
    
    car - The character whose encoded length is counted
    
    Returns:
    
    The number of bytes to hold the char.
    
    TODO: Should stop after the third byte, as a char is only 2 bytes long.
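    For reference, a 16-bit Java char needs at most three bytes in UTF-8. The sketch below shows the thresholds under that assumption. The class name `CharLengthSketch` is hypothetical, and the library's own method may apply different or wider ranges.

```java
// Hypothetical sketch, not the Apache Directory implementation.
final class CharLengthSketch {
    /** Number of UTF-8 bytes needed for a single 16-bit char. */
    static int countNbBytesPerChar(char c) {
        if (c < 0x80) {
            return 1;   // U+0000..U+007F
        }
        if (c < 0x800) {
            return 2;   // U+0080..U+07FF
        }
        return 3;       // U+0800..U+FFFF (surrogates counted individually)
    }

    public static void main(String[] args) {
        System.out.println(countNbBytesPerChar('A'));      // prints 1
        System.out.println(countNbBytesPerChar('\u00E9')); // prints 2
        System.out.println(countNbBytesPerChar('\u20AC')); // prints 3
    }
}
```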
  - countBytes
```
public static int countBytes(char[] chars)
```
    Count the number of bytes included in the given char[].
    
    Parameters:
    
    chars - The char array to decode
    
    Returns:
    
    The number of bytes in the char array
  - countChars
```
public static int countChars(byte[] bytes)
```
    Count the number of chars included in the given byte[].
    
    Parameters:
    
    bytes - The byte array to decode
    
    Returns:
    
    The number of chars in the byte array
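    The two counting methods can be pictured as inverse bookkeeping over the same encoding. The following is a sketch under stated assumptions, not the library's source: input is UTF-8, only characters in the Basic Multilingual Plane are considered, and `countChars` simply counts bytes that are not continuation bytes. The class name `Utf8CountSketch` is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Apache Directory implementation.
final class Utf8CountSketch {
    /** Bytes needed to encode the chars as UTF-8 (1, 2 or 3 per char). */
    static int countBytes(char[] chars) {
        if (chars == null) {
            return 0; // assumption: null is treated as empty
        }
        int total = 0;
        for (char c : chars) {
            total += (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;
        }
        return total;
    }

    /** Chars encoded in the bytes: every non-continuation byte starts one. */
    static int countChars(byte[] bytes) {
        if (bytes == null) {
            return 0; // assumption: null is treated as empty
        }
        int total = 0;
        for (byte b : bytes) {
            if ((b & 0xC0) != 0x80) { // skip 10xxxxxx continuation bytes
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String s = "a\u00E9\u20AC"; // 1 + 2 + 3 bytes
        System.out.println(countBytes(s.toCharArray()));                    // prints 6
        System.out.println(countChars(s.getBytes(StandardCharsets.UTF_8))); // prints 3
    }
}
```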
  - charToBytes
```
public static byte[] charToBytes(char car)
```
    Return the byte array which encodes the given Unicode char.
    
    Parameters:
    
    car - The character to be transformed to an array of bytes
    
    Returns:
    
    The byte array representing the char.
    
    TODO: Should stop after the third byte, as a char is only 2 bytes long.
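    A minimal sketch of encoding a single char as UTF-8, for illustration only. It is not the library's source: the class name `Utf8EncodeSketch` is hypothetical, and surrogate chars are encoded individually as three bytes, which differs from standard UTF-8 for supplementary characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not the Apache Directory implementation.
final class Utf8EncodeSketch {
    /** Encodes one 16-bit char as 1, 2 or 3 UTF-8 bytes. */
    static byte[] charToBytes(char c) {
        if (c < 0x80) {
            return new byte[] { (byte) c };                 // 0xxxxxxx
        }
        if (c < 0x800) {
            return new byte[] {
                (byte) (0xC0 | (c >> 6)),                   // 110xxxxx
                (byte) (0x80 | (c & 0x3F))                  // 10xxxxxx
            };
        }
        return new byte[] {
            (byte) (0xE0 | (c >> 12)),                      // 1110xxxx
            (byte) (0x80 | ((c >> 6) & 0x3F)),              // 10xxxxxx
            (byte) (0x80 | (c & 0x3F))                      // 10xxxxxx
        };
    }

    public static void main(String[] args) {
        byte[] expected = "\u20AC".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(expected, charToBytes('\u20AC'))); // prints true
    }
}
```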
  - isUnicodeSubset
```
public static boolean isUnicodeSubset(String str,
                                      int pos)
```
    Check if the current char is in the unicodeSubset: all chars but '\0', '(', ')', '*' and '\'
    
    Parameters:
    
    str - The string to check
    
    pos - Position of the current char
    
    Returns:
    
    True if the current char is in the unicode subset
  - isUnicodeSubset
```
public static boolean isUnicodeSubset(char c)
```
    Check if the current char is in the unicodeSubset: all chars but '\0', '(', ')', '*' and '\'
    
    Parameters:
    
    c - The char to check
    
    Returns:
    
    True if the current char is in the unicode subset
  - isUnicodeSubset
```
public static boolean isUnicodeSubset(byte b)
```
    Check if the current byte is in the unicodeSubset: all chars but '\0', '(', ')', '*' and '\'
    
    Parameters:
    
    b - The byte to check
    
    Returns:
    
    True if the current byte is in the unicode subset
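    The five excluded characters are the ones that need escaping in an LDAP search filter. The predicate can be sketched as follows, assuming only the description above. The class name `UnicodeSubsetSketch` is hypothetical, and returning false for a null string or an out-of-range position is an assumption of this example rather than documented library behaviour.

```java
// Hypothetical sketch, not the Apache Directory implementation.
final class UnicodeSubsetSketch {
    /** True for every char except '\0', '(', ')', '*' and '\'. */
    static boolean isUnicodeSubset(char c) {
        return c != '\0' && c != '(' && c != ')' && c != '*' && c != '\\';
    }

    /** Same check for the char at pos; false if pos is out of range. */
    static boolean isUnicodeSubset(String str, int pos) {
        return str != null
                && pos >= 0
                && pos < str.length()
                && isUnicodeSubset(str.charAt(pos));
    }

    public static void main(String[] args) {
        String filter = "(cn=a*)";
        System.out.println(isUnicodeSubset(filter, 0)); // prints false, '('
        System.out.println(isUnicodeSubset(filter, 1)); // prints true, 'c'
        System.out.println(isUnicodeSubset(filter, 5)); // prints false, '*'
    }
}
```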
  - writeUTF
```
public static void writeUTF(ObjectOutput objectOutput,
                            String str)
                     throws IOException
```
    Writes four bytes of length information to the output stream, followed by the modified UTF-8 representation of every character in the string str. If str is null, the string value 'null' is written with a length of 0 instead of throwing a NullPointerException. Each character in str is converted to a group of one, two, or three bytes, depending on the value of the character.
    
    Because a single write cannot exceed 65535 bytes in total, the total length is written first as four bytes (writeInt), and the string is then split into smaller parts if necessary, each written separately. As each character converts to a group of at most 3 bytes and at most 65535 bytes can be written at once, writing chunks of at most 21845 (65535/3) characters is always safe.
    
    See also DataOutput.writeUTF(String).
    
    Parameters:
    
    objectOutput - The objectOutput to write to
    
    str - The value to write
    
    Throws:
    
    IOException - If the value can't be written to the file
  - readUTF
```
public static String readUTF(ObjectInput objectInput)
                      throws IOException
```
    Reads in a string that has been encoded using a modified UTF-8 format. The general contract of readUTF is that it reads a representation of a Unicode character string encoded in modified UTF-8 format; this string of characters is then returned as a String.
    
    First, four bytes are read (readInt) to obtain the length information written by writeUTF. This integer value is called the UTF length and specifies the number of additional bytes to be read. These bytes are then converted to characters by considering them in groups. The length of each group is computed from the value of the first byte of the group. The byte following a group, if any, is the first byte of the next group.
    
    See also DataInput.readUTF().
    
    Parameters:
    
    objectInput - The objectInput to read from
    
    Returns:
    
    The read string
    
    Throws:
    
    IOException - If the value can't be read
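    The length-prefix-plus-chunks scheme described for writeUTF and readUTF can be sketched as a round trip. This is an illustration under stated assumptions, not the library's source: the class name `ChunkedUtfSketch`, the `roundTrip` helper, the choice to store the char count as the length, and the rule that at least one chunk is always written are all assumptions of this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch, not the Apache Directory implementation.
final class ChunkedUtfSketch {
    /** 65535 / 3: a chunk of this many chars never exceeds 65535 bytes. */
    static final int CHUNK_CHARS = 65535 / 3;

    static void writeUTF(ObjectOutput out, String str) throws IOException {
        if (str == null) {
            out.writeInt(0);        // length 0 ...
            out.writeUTF("null");   // ... plus the marker value 'null'
            return;
        }
        out.writeInt(str.length()); // assumption: length is counted in chars
        int start = 0;
        do {                        // always writes at least one chunk
            int end = Math.min(str.length(), start + CHUNK_CHARS);
            out.writeUTF(str.substring(start, end));
            start = end;
        } while (start < str.length());
    }

    static String readUTF(ObjectInput in) throws IOException {
        int length = in.readInt();
        StringBuilder sb = new StringBuilder(in.readUTF());
        if (length == 0 && "null".contentEquals(sb)) {
            return null;            // the special 'null' marker
        }
        while (sb.length() < length) {
            sb.append(in.readUTF()); // keep reading chunks until complete
        }
        return sb.toString();
    }

    /** Writes str to an in-memory stream and reads it back. */
    static String roundTrip(String str) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                writeUTF(out, str);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return readUTF(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 50000 euro signs need 150000 bytes, more than one writeUTF call allows.
        String big = new String(new char[50000]).replace('\0', '\u20AC');
        System.out.println(big.equals(roundTrip(big))); // prints true
        System.out.println(roundTrip(null) == null);    // prints true
    }
}
```

    In this sketch an empty string and a null value both carry a length of 0, and they are told apart only by the content of the first chunk.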
