Class BOMInputStream

All Implemented Interfaces:
Closeable, AutoCloseable

public class BOMInputStream extends ProxyInputStream
This class wraps a stream that begins with an encoded ByteOrderMark. It detects these bytes and, if required, automatically skips them so that the first byte returned is the one that follows the BOM. The ByteOrderMark implementation has the following pre-defined BOMs:

  • UTF-8 - ByteOrderMark.UTF_8
  • UTF-16BE - ByteOrderMark.UTF_16BE
  • UTF-16LE - ByteOrderMark.UTF_16LE
  • UTF-32BE - ByteOrderMark.UTF_32BE
  • UTF-32LE - ByteOrderMark.UTF_32LE

Example 1 - Detect and exclude a UTF-8 BOM

 BOMInputStream bomIn = new BOMInputStream(in);
 if (bomIn.hasBOM()) {
     // has a UTF-8 BOM
 }
 

Example 2 - Detect a UTF-8 BOM (but don't exclude it)

 boolean include = true;
 BOMInputStream bomIn = new BOMInputStream(in, include);
 if (bomIn.hasBOM()) {
     // has a UTF-8 BOM
 }
 

Example 3 - Detect Multiple BOMs

 BOMInputStream bomIn = new BOMInputStream(in,
     ByteOrderMark.UTF_16LE, ByteOrderMark.UTF_16BE,
     ByteOrderMark.UTF_32LE, ByteOrderMark.UTF_32BE);
 if (!bomIn.hasBOM()) {
     // No BOM found
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_16LE)) {
     // has a UTF-16LE BOM
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_16BE)) {
     // has a UTF-16BE BOM
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_32LE)) {
     // has a UTF-32LE BOM
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_32BE)) {
     // has a UTF-32BE BOM
 }
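
The detect-then-skip behavior described above can be illustrated with the JDK alone. The following is only a minimal sketch, not the Commons IO implementation: the class Utf8BomSketch and its methods wrap and firstByte are hypothetical names chosen for this illustration. It peeks at the first three bytes through a PushbackInputStream and pushes them back unless they form a UTF-8 BOM that should be excluded.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not part of Commons IO: shows detect-then-skip using only the JDK.
final class Utf8BomSketch {

    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Wraps {@code in}; drops a leading UTF-8 BOM unless {@code include} is true. */
    static InputStream wrap(InputStream in, boolean include) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
        byte[] head = new byte[UTF8_BOM.length];
        int n = readFully(pushback, head);
        boolean hasBom = n == UTF8_BOM.length && Arrays.equals(head, UTF8_BOM);
        if (n > 0 && (!hasBom || include)) {
            pushback.unread(head, 0, n); // give back everything that was peeked at
        }
        return pushback;
    }

    // Reads until the buffer is full or the stream ends; returns the number of bytes read.
    private static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) {
                break;
            }
            total += r;
        }
        return total;
    }

    /** Convenience for in-memory data: the first byte a reader would see, or -1 if empty. */
    static int firstByte(byte[] data, boolean include) {
        try (InputStream in = wrap(new ByteArrayInputStream(data), include)) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(firstByte(data, false)); // 104, the 'h' after the BOM
        System.out.println(firstByte(data, true));  // 239, the first BOM byte
    }
}
```

With include set to false the caller never sees the BOM, which corresponds to Example 1. With include set to true the BOM bytes are returned as ordinary data, which corresponds to Example 2.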
 
Since:
2.0
  • Constructor Details

    • BOMInputStream

      public BOMInputStream(InputStream delegate)
      Constructs a new BOM InputStream that excludes a ByteOrderMark.UTF_8 BOM.
      Parameters:
      delegate - the InputStream to delegate to
    • BOMInputStream

      public BOMInputStream(InputStream delegate, boolean include)
      Constructs a new BOM InputStream that detects a ByteOrderMark.UTF_8 BOM and optionally includes it.
      Parameters:
      delegate - the InputStream to delegate to
      include - true to include the UTF-8 BOM or false to exclude it
    • BOMInputStream

      public BOMInputStream(InputStream delegate, ByteOrderMark... boms)
      Constructs a new BOM InputStream that excludes the specified BOMs.
      Parameters:
      delegate - the InputStream to delegate to
      boms - The BOMs to detect and exclude
    • BOMInputStream

      public BOMInputStream(InputStream delegate, boolean include, ByteOrderMark... boms)
      Constructs a new BOM InputStream that detects the specified BOMs and optionally includes them.
      Parameters:
      delegate - the InputStream to delegate to
      include - true to include the specified BOMs or false to exclude them
      boms - The BOMs to detect and optionally exclude
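
When several BOMs are configured, the order in which candidates are compared matters, because the UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE). The sketch below illustrates one way to handle this, trying longer BOMs first. It is an assumption-laden illustration using only the JDK, not the Commons IO implementation, and BomMatchSketch and detect are hypothetical names.

```java
// Hypothetical helper, not part of Commons IO: longest-first BOM matching on a byte array.
final class BomMatchSketch {

    // Candidates ordered longest first so that FF FE 00 00 is tried before FF FE.
    private static final String[] NAMES = {"UTF-32LE", "UTF-32BE", "UTF-8", "UTF-16LE", "UTF-16BE"};
    private static final int[][] BYTES = {
        {0xFF, 0xFE, 0x00, 0x00},
        {0x00, 0x00, 0xFE, 0xFF},
        {0xEF, 0xBB, 0xBF},
        {0xFF, 0xFE},
        {0xFE, 0xFF},
    };

    /** Returns the name of the first (longest) BOM that prefixes {@code data}, or null. */
    static String detect(byte[] data) {
        for (int i = 0; i < BYTES.length; i++) {
            if (startsWith(data, BYTES[i])) {
                return NAMES[i];
            }
        }
        return null;
    }

    private static boolean startsWith(byte[] data, int[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf32le = {(byte) 0xFF, (byte) 0xFE, 0, 0, 'a', 0, 0, 0};
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 'a', 0};
        System.out.println(detect(utf32le)); // UTF-32LE
        System.out.println(detect(utf16le)); // UTF-16LE
    }
}
```

Note that byte patterns alone cannot resolve every case: UTF-16LE content that starts with a NUL character after its BOM is indistinguishable from a UTF-32LE BOM, so restricting the candidate list to the encodings actually expected is the safer choice.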
  • Method Details

    • hasBOM

      public boolean hasBOM() throws IOException
      Indicates whether the stream contains one of the specified BOMs.
      Returns:
      true if the stream has one of the specified BOMs, otherwise false
      Throws:
      IOException - if an error reading the first bytes of the stream occurs
    • hasBOM

      public boolean hasBOM(ByteOrderMark bom) throws IOException
      Indicates whether the stream contains the specified BOM.
      Parameters:
      bom - The BOM to check for
      Returns:
      true if the stream has the specified BOM, otherwise false
      Throws:
      IllegalArgumentException - if the BOM is not one the stream is configured to detect
      IOException - if an error reading the first bytes of the stream occurs
    • getBOM

      public ByteOrderMark getBOM() throws IOException
      Returns the BOM (Byte Order Mark) detected in the stream.
      Returns:
      the BOM, or null if none was found
      Throws:
      IOException - if an error reading the first bytes of the stream occurs
    • getBOMCharsetName

      public String getBOMCharsetName() throws IOException
      Returns the charset name of the detected BOM, as given by ByteOrderMark.getCharsetName().
      Returns:
      the BOM charset name, or null if no BOM was found
      Throws:
      IOException - if an error reading the first bytes of the stream occurs
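
Because the charset name is null when no BOM is found, callers usually need a fallback charset before decoding. The following sketch shows that pattern on an in-memory byte array using only the JDK. It is not the Commons IO implementation: BomCharsetSketch, bomCharsetName and decode are hypothetical names, and UTF-32 is deliberately left out to keep the example short.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Commons IO. UTF-32 is left out for brevity,
// so a UTF-32LE BOM (FF FE 00 00) would be reported as UTF-16LE here.
final class BomCharsetSketch {

    /** Returns the charset name implied by a leading BOM, or null if there is none. */
    static String bomCharsetName(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return "UTF-8";
        }
        if (startsWith(data, 0xFE, 0xFF)) {
            return "UTF-16BE";
        }
        if (startsWith(data, 0xFF, 0xFE)) {
            return "UTF-16LE";
        }
        return null;
    }

    /** Decodes data with the BOM's charset when present, otherwise with fallbackName. */
    static String decode(byte[] data, String fallbackName) {
        String name = bomCharsetName(data);
        int bomLength = name == null ? 0 : ("UTF-8".equals(name) ? 3 : 2);
        Charset charset = Charset.forName(name != null ? name : fallbackName);
        // Start after the BOM so it does not show up as U+FEFF in the decoded text.
        return new String(data, bomLength, data.length - bomLength, charset);
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 'h', 0, 'i', 0};
        System.out.println(decode(utf16le, "ISO-8859-1"));               // hi
        System.out.println(decode(new byte[] {'h', 'i'}, "ISO-8859-1")); // hi
    }
}
```

The fallback is passed in by the caller rather than hard-coded, since the right default depends on where the data comes from.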
    • read

      public int read() throws IOException
      Invokes the delegate's read() method, detecting and optionally skipping BOM.
      Overrides:
      read in class ProxyInputStream
      Returns:
      the byte read (excluding the BOM), or -1 if the end of the stream has been reached
      Throws:
      IOException - if an I/O error occurs
    • read

      public int read(byte[] buf, int off, int len) throws IOException
      Invokes the delegate's read(byte[], int, int) method, detecting and optionally skipping BOM.
      Overrides:
      read in class ProxyInputStream
      Parameters:
      buf - the buffer to read the bytes into
      off - the start offset
      len - the number of bytes to read (excluding the BOM)
      Returns:
      the number of bytes read, or -1 if the end of the stream has been reached
      Throws:
      IOException - if an I/O error occurs
    • read

      public int read(byte[] buf) throws IOException
      Invokes the delegate's read(byte[]) method, detecting and optionally skipping BOM.
      Overrides:
      read in class ProxyInputStream
      Parameters:
      buf - the buffer to read the bytes into
      Returns:
      the number of bytes read (excluding the BOM), or -1 if the end of the stream has been reached
      Throws:
      IOException - if an I/O error occurs
    • mark

      public void mark(int readlimit)
      Invokes the delegate's mark(int) method.
      Overrides:
      mark in class ProxyInputStream
      Parameters:
      readlimit - read ahead limit
    • reset

      public void reset() throws IOException
      Invokes the delegate's reset() method.
      Overrides:
      reset in class ProxyInputStream
      Throws:
      IOException - if an I/O error occurs
    • skip

      public long skip(long n) throws IOException
      Invokes the delegate's skip(long) method, detecting and optionally skipping the BOM.
      Overrides:
      skip in class ProxyInputStream
      Parameters:
      n - the number of bytes to skip
      Returns:
      the number of bytes skipped or -1 if the end of stream
      Throws:
      IOException - if an I/O error occurs