Java Tutorial

The Standard Streams

Your operating system will typically define three standard streams that are accessible through members of the System class in Java:

A standard input stream that usually corresponds to the keyboard by default. This is encapsulated by the in member of the System class, and is of type InputStream.
A standard output stream that corresponds to output on the command line. This is encapsulated by the out member of the System class, and is of type PrintStream.
A standard error output stream for error messages that usually maps to the command line output by default. This is encapsulated by the err member of the System class, and is also of type PrintStream.

You can reassign any of these to another stream within a Java application. The System class provides the static methods setIn(), setOut(), and setErr() for this purpose. The setIn() method requires an argument of type InputStream that specifies the new source of standard input. The other two methods expect an argument of type PrintStream.
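As a minimal sketch of reassigning a standard stream, the following fragment points System.out at an in-memory buffer for the duration of a task and then restores it. The class name RedirectDemo and the helper method capture() are hypothetical names chosen for illustration; only setOut() and the stream classes come from the standard library.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demo class; not part of the tutorial's own code
class RedirectDemo {

  // Runs the task with System.out redirected to a buffer and
  // returns whatever the task wrote to standard output
  static String capture(Runnable task) {
    PrintStream original = System.out;          // Remember the current stream
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    PrintStream replacement = new PrintStream(buffer, true);
    System.setOut(replacement);                 // Reassign standard output
    try {
      task.run();
    } finally {
      replacement.flush();
      System.setOut(original);                  // Always restore the original
    }
    return buffer.toString();
  }

  public static void main(String[] args) {
    String text = capture(() -> System.out.print("hello"));
    System.out.println("Captured: " + text);
  }
}
```

Restoring the original stream in a finally block is a design choice here, so that later output still reaches the command line even if the task throws an exception.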

Since the standard input stream is of type InputStream, we are not exactly overwhelmed by the capabilities for reading data from the keyboard in Java. Basically, we can read a byte or an array of bytes using a read() method as standard, and that's it. If you want more than that, reading integers, or decimal values, or strings as keyboard input, you're on your own. Let's see what we can do to remedy that.
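To give an idea of what being on your own involves, here is a rough sketch that assembles an int value from individual bytes returned by read(). The class RawReadDemo and the method readDigits() are hypothetical, and the logic is deliberately naive: it ignores signs, overflow, and anything that is not a digit.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the tutorial's own code
class RawReadDemo {

  // Reads bytes one at a time up to a line feed or the end of the stream
  // and builds an int from the digits seen; all other bytes are skipped
  static int readDigits(InputStream in) {
    int value = 0;
    try {
      int b;
      while ((b = in.read()) != -1 && b != '\n') {
        if (b >= '0' && b <= '9') {
          value = value * 10 + (b - '0');
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return value;
  }

  public static void main(String[] args) {
    // Simulate keyboard input by reassigning standard input
    System.setIn(new ByteArrayInputStream("42\n".getBytes()));
    System.out.println(readDigits(System.in));   // prints 42
  }
}
```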

Getting Data From the Keyboard

To get sensible input from the keyboard, you have to be able to scan the stream of characters and recognize what they are. When you read a numerical value from the stream, you have to look for the digits and possibly the sign and decimal point, figure out where the number starts and ends in the stream, and finally convert it to the appropriate value. To write the code to do this from scratch would take quite a lot of work. Fortunately, we can get a lot of help from the StreamTokenizer class in the java.io package.

The term token refers to a data item, such as a number or a string, that will in general consist of several consecutive characters of a particular kind from the stream. For example, a number is usually a sequence of characters that consists of digits, maybe a decimal point, and sometimes a sign in front. The class has the name StreamTokenizer because it can read characters from a stream, and parse them into a series of tokens that it recognizes.

You create a StreamTokenizer object from a stream reader object that reads data from the underlying input stream. Since we want to read the standard input stream, System.in, we shall use an InputStreamReader that converts the raw bytes from the stream from the local character encoding to Unicode characters before the StreamTokenizer object sees them. In the interests of efficiency we will also buffer the data from the InputStreamReader through a BufferedReader object that will buffer the data in memory. We will therefore create our StreamTokenizer object like this:

StreamTokenizer tokenizer = new StreamTokenizer(
                              new BufferedReader(
                                new InputStreamReader(System.in)));

The argument to the StreamTokenizer is the original standard input stream, System.in, inside an InputStreamReader object that converts the bytes to Unicode, inside a BufferedReader that supplies the stream of Unicode characters via a buffer in memory.

Before we can make use of our StreamTokenizer object for keyboard input, we need to understand a bit more about how it works.

Tokenizing a Stream

The StreamTokenizer class defines objects that can read an input stream and parse it into tokens. The input stream is read and treated as a series of separate bytes, and each byte is regarded as a character in the range '\u0000' to '\u00FF'. A StreamTokenizer object in its default state can recognize the following kinds of tokens:

Token	Description
Numbers	A sequence consisting of the digits 0 to 9, plus possibly a decimal point and a leading - sign. A leading + sign is not treated as part of a number.
Strings	Any sequence of characters between a pair of single quotes or a pair of double quotes.
Words	Any sequence of letters or digits 0 to 9 beginning with a letter. A letter is defined as any of A to Z and a to z or \u00A0 to \u00FF. A word follows a whitespace character and is terminated by another whitespace character, or any character other than a letter or a digit.
Comments	Any sequence of characters beginning with a forward slash, /, and ending with the end-of-line character. Comments are ignored and not returned by the tokenizer.
Whitespace	All byte values from \u0000 to \u0020, which includes space, backspace, horizontal tab, vertical tab, line feed, form feed, and carriage return. Whitespace acts as a delimiter between tokens and is ignored (except within a quoted string).

To retrieve a token from the stream, you call the nextToken() method for the StreamTokenizer object:

int tokenType = 0;
try {
  while((tokenType = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
     // Do something with the token...
  }

} catch (IOException e) {
  e.printStackTrace(System.err);
  System.exit(1);
}

The method can throw an IOException, so the call is in a try block. The value returned depends on the token recognized, and from the value you can determine where to find the token itself. In the fragment above we compare the value returned with the static constant TT_EOF that is defined in the StreamTokenizer class. This value is returned when the end of the stream has been reached. The token that was read from the stream is itself stored in one of two instance variables of the StreamTokenizer object. If the data item is a number, it is stored in a public data member, nval, which is of type double. If the data item is a quoted string or a word, a reference to a String object is stored in the public data member, sval, which is of type String. The analysis that segments the stream into tokens is fairly simple, and the way in which an arbitrary stream is broken into tokens is illustrated below.

As we have said, the int value returned by the nextToken() method indicates what kind of data item was read. It can be any of the following constant values defined in the StreamTokenizer class:

Token Value	Description
TT_NUMBER	The token is a number that has been stored in the public field, nval, of type double, in the tokenizer object.
TT_WORD	The token is a word that has been stored in the public field, sval, of type String, in the tokenizer object.
TT_EOF	The end of the stream has been reached.
TT_EOL	An end of line character has been read. This is only set if the eolIsSignificant() method has been called with the argument, true. Otherwise end-of-line characters are treated as whitespace and ignored.

If a quoted string is read from the stream, the value that is returned by nextToken() will be the quote character used for the string as type int – either a single or a double quote. In this case, you retrieve the reference to the string that was read from the sval member of the tokenizer object. The value indicating what kind of token was read last is also available from a public data member, ttype, of the StreamTokenizer object, which is of type int.
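The following sketch shows one way to act on the value returned by nextToken(). It tokenizes a String through a StringReader rather than the keyboard so that the result is repeatable. The class TokenKindsDemo and its describe() method are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the tutorial's own code
class TokenKindsDemo {

  // Returns a description of each token found by a default tokenizer
  static List<String> describe(String input) {
    StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
    List<String> result = new ArrayList<>();
    try {
      while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
        switch (tokenizer.ttype) {
          case StreamTokenizer.TT_NUMBER:       // Value is in nval
            result.add("number " + tokenizer.nval);
            break;
          case StreamTokenizer.TT_WORD:         // Text is in sval
            result.add("word " + tokenizer.sval);
            break;
          case '"':                             // Quoted string: ttype is the
          case '\'':                            // quote, text is in sval
            result.add("string " + tokenizer.sval);
            break;
          default:                              // Ordinary character
            result.add("char " + (char) tokenizer.ttype);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return result;
  }

  public static void main(String[] args) {
    // prints [word width, number 12.5, string two words, char =, word x1]
    System.out.println(describe("width 12.5 \"two words\" = x1"));
  }
}
```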

Customizing a Stream Tokenizer

The default tokenizing mode can be modified by calling one or other of the following methods:

Method	Description
resetSyntax()	Resets the state of the tokenizer object so no characters have any special significance. This has the effect that all characters are regarded as ordinary, and will be read from the stream as single characters. The value of each character will be stored in the ttype field.
ordinaryChar(int ch)	Sets the character, ch, as an ordinary character. An ordinary character is a character that has no special significance. It will be read as a single character whose value will be stored in the ttype field. Calling this method will not alter the state of characters other than the argument value.
ordinaryChars(int low, int hi)	Causes all characters from low to hi inclusive to be treated as ordinary characters. Calling this method will not alter the state of characters other than those specified by the argument values.
whitespaceChars(int low, int hi)	Causes all characters from low to hi inclusive to be treated as whitespace characters. Unless they appear in a string, whitespace characters are treated as delimiters between tokens. Calling this method will not alter the state of characters other than those specified by the argument values.
wordChars(int low, int hi)	Specifies that the characters from low to hi inclusive are word characters. A word is at least one of these characters. Calling this method will not alter the state of characters other than those specified by the argument values.
commentChar(int ch)	Specifies that ch is a character that indicates the start of a comment. All characters to the end of the line following the character, ch, will be ignored. Calling this method will not alter the state of characters other than the argument value.
quoteChar(int ch)	Specifies that matching pairs of the character, ch, enclose a string. Calling this method will not alter the state of characters other than the argument value.
slashStarComments(boolean flag)	If the argument is true, the tokenizer recognizes comments between /* and */ and ignores them. An argument of false, which is the default state, switches this off.
slashSlashComments(boolean flag)	If the argument is true, the tokenizer recognizes comments starting with a double slash and ignores them. An argument of false, which is the default state, switches this off.
lowerCaseMode(boolean flag)	An argument of true causes word tokens to be converted to lower case before being stored in sval. An argument of false switches off lower case mode.
pushBack()	Calling this method causes the next call of the nextToken() method to return the ttype value that was set by the previous nextToken() call and to leave sval and nval unchanged.

If you want to alter a tokenizer, it is usually better to reset it by calling the resetSyntax() method, then call the other methods to set the tokenizer up the way that you want. If you adopt this approach, any special significance attached to particular characters will be apparent from your code. The resetSyntax() method makes all characters, including whitespace, ordinary characters, so that no character has any special significance. In some situations you may need to set a tokenizer up dynamically to suit retrieving each specific kind of data that you want to extract from the stream. When you want to read the next character as a character, whatever it is, you just need to call resetSyntax() before calling nextToken(). The character will be returned by nextToken() and stored in the ttype field. To read anything else subsequently, you have to set the tokenizer up appropriately.
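As an illustration of the reset-then-configure approach, the sketch below sets a tokenizer up to split comma-separated fields. Every printable character is made a word character, and then spaces, control characters, and the comma are made whitespace. Because parseNumbers() is not called after the reset, digits are returned as part of words rather than as numbers. The class CsvFieldsDemo and the method fields() are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the tutorial's own code
class CsvFieldsDemo {

  // Splits the input into fields separated by commas and/or whitespace
  static List<String> fields(String input) {
    StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
    tokenizer.resetSyntax();                  // Start from a clean slate
    tokenizer.wordChars('!', 0xFF);           // All printable characters
    tokenizer.whitespaceChars(0, ' ');        // Control characters and space
    tokenizer.whitespaceChars(',', ',');      // The comma is now a delimiter

    List<String> result = new ArrayList<>();
    try {
      while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
        if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
          result.add(tokenizer.sval);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(fields("alpha,beta-2, 3.5"));   // prints [alpha, beta-2, 3.5]
  }
}
```

The order of the calls matters in this sketch: the comma falls inside the range passed to wordChars(), so whitespaceChars() has to be called afterwards to override it.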

Let's see how we can use this class to read data items from the keyboard.

Try It Out – Creating a Formatted Input Class

One way of reading formatted input is to define our own class that uses a StreamTokenizer object to read from standard input. We can define a class, FormattedInput, which will define methods to return various types of data items entered via the keyboard:

import java.io.*;

public class FormattedInput {

  // Method to read an int value...

  // Method to read a double value...

  // Plus methods to read various other data types...

  // Helper method to read the next token
  private int readToken() {
    try {
      ttype = tokenizer.nextToken();
      return ttype;

    } catch (IOException e) {  // Error reading in nextToken()
      e.printStackTrace(System.err);
      System.exit(1);         // End the program
    } 
    return 0;
  } 

  // Object to tokenize input from the standard input stream
  private StreamTokenizer tokenizer = new StreamTokenizer(
                                        new BufferedReader(
                                          new InputStreamReader(System.in)));
  private int ttype;   // Stores the token type code
}

The default constructor will be quite satisfactory for this class, because the instance variable, tokenizer, is already initialized. The readToken() method is there for use in the methods that will read values of various types. It makes the ttype value returned by nextToken() available directly, and saves having to repeat the try and catch blocks in all the other methods.

All we need to add are the methods to read the data values that we want. Here is one way to read a value of type int:

  // Method to read an int value
  public int readInt() {
    for (int i = 0; i < 5; i++) {

      if (readToken() == tokenizer.TT_NUMBER) {
        return (int) tokenizer.nval;   // Value is numeric, so return as int
      } else {
        System.out.println("Incorrect input: " + tokenizer.sval 
                           + " Re-enter an integer");
        continue;                      // Retry the read operation
      } 

    } 
    System.out.println("Five failures reading an int value" 
                       + " - program terminated");
    System.exit(1);   // End the program
    return 0;
  }

This method gives the user five chances to enter a valid input value before terminating the program. Terminating the program is likely to be inconvenient to say the least in many circumstances. If we make the method throw an exception in the case of failure here instead, and let the calling method decide what to do, this would be a much better way of signaling that the right kind of data could not be found.

We can define our own exception class for this. Let's define it as the type InvalidUserInputException:

public class InvalidUserInputException extends Exception {
  public InvalidUserInputException() {}

  public InvalidUserInputException(String message) {
    super(message);
  }
}

We haven't had to add anything to the base class capability. We just need the ability to pass our own message to the class. The significant thing we have added is our own exception type name.

Now we can change the code for the readInt() method so it works like this:

  public int readInt() throws InvalidUserInputException {
    if (readToken() != tokenizer.TT_NUMBER) {
      throw new InvalidUserInputException(" readInt() failed. " 
                                          + "Input data not numeric");
    }
    return (int) tokenizer.nval;
  }

If you need a method to read an integer value and return it as one of the other integer types, byte, short, or long, you could implement it in the same way but just cast the value in nval to the appropriate type. You might want to add checks that the original value was an integer, and maybe that it was not out of range for the shorter integer types. For instance, to do this for type int, we could code it as:

  public int readInt() throws InvalidUserInputException {
    if (readToken() != tokenizer.TT_NUMBER) {
      throw new InvalidUserInputException(" readInt() failed. " 
                                          + "Input data not numeric");
    } 

    if (tokenizer.nval > (double) Integer.MAX_VALUE 
            || tokenizer.nval < (double) Integer.MIN_VALUE) {
      throw new InvalidUserInputException(" readInt() failed. " 
                                          + "Input outside int range");
    } 

    if (tokenizer.nval != (double) (int) tokenizer.nval) {
      throw new InvalidUserInputException(" readInt() failed. " 
                                          + "Input not an integer");
    } 
    return (int) tokenizer.nval;
  }

The Integer class makes the maximum and minimum values of type int available in the public members MAX_VALUE and MIN_VALUE. Other classes corresponding to the basic numeric types provide similar fields. To determine whether the value in nval is really a whole number, we cast it to an integer, then cast it back to double and see whether it is the same value.
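The same pair of checks carries over to the shorter integer types. The sketch below applies them for type short as a standalone helper. The class NarrowingDemo and the method toShort() are hypothetical, and the helper throws an IllegalArgumentException purely to keep the example self-contained; in the FormattedInput class you would throw InvalidUserInputException instead.

```java
// Hypothetical demo class; not part of the tutorial's own code
class NarrowingDemo {

  // Converts a double, such as the value found in nval, to type short,
  // rejecting values that are out of range or not whole numbers
  static short toShort(double value) {
    if (value > (double) Short.MAX_VALUE
            || value < (double) Short.MIN_VALUE) {
      throw new IllegalArgumentException("Input outside short range");
    }

    // Cast to short and back; a whole number survives the round trip
    if (value != (double) (short) value) {
      throw new IllegalArgumentException("Input not an integer");
    }
    return (short) value;
  }

  public static void main(String[] args) {
    System.out.println(toShort(1234.0));   // prints 1234
    try {
      toShort(40000.0);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());  // prints Input outside short range
    }
  }
}
```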

To implement readDouble(), the code is very simple. You don't need the cast for the value in nval since it is type double anyway:

  public double readDouble() throws InvalidUserInputException {
    if (readToken() != tokenizer.TT_NUMBER) {
      throw new InvalidUserInputException(" readDouble() failed. " 
                                          + "Input data not numeric");
    }
    return tokenizer.nval;
  }

A readFloat() method would just need to cast nval to type float.

Reading a string is slightly more involved. You could allow input strings to be quoted, or unquoted as long as they were alphanumeric and did not contain whitespace characters. Here's how the method might be coded to allow that:

  public String readString() throws InvalidUserInputException {
    if (readToken() == tokenizer.TT_WORD || ttype == '\"' 
            || ttype == '\'') {
      return tokenizer.sval;
    } else {
      throw new InvalidUserInputException(" readString() failed. " 
                                          + "Input data is not a string");
    }
  }

If either a word or a string is recognized, the token is stored as type String in the sval field of the StreamTokenizer object.

Let's see if it works.

Try It Out – Formatted Keyboard Input

We can try out our FormattedInput class in a simple program that iterates round a loop a few times to give you the opportunity to try out correct and incorrect input:

public class TestFormattedInput {
  public static void main(String[] args) {
    FormattedInput kb = new FormattedInput();
    for (int i = 0; i < 5; i++) {
      try {
        System.out.print("Enter an integer: ");
        System.out.println("Integer read: " + kb.readInt());
        System.out.print("Enter a double value: ");
        System.out.println("Double value read: " + kb.readDouble());
        System.out.print("Enter a string: ");
        System.out.println("String read: " + kb.readString());
      } catch (InvalidUserInputException e) {
        System.out.println("InvalidUserInputException thrown.\n"
                           + e.getMessage());
      }
    }
  }
}

It is best to run this example from the command line. Some Java IDEs are not terrific when it comes to keyboard input. If you try a few wrong values, you should see our exception being thrown.

How It Works

This just repeats requests for input of each of the three types of value we have provided methods for, over five iterations. Of course, after an exception of type InvalidUserInputException is thrown, the loop will go straight to the start of the next iteration – if there is one.

This code isn't foolproof. Bits of an incorrect entry can be left in the stream to confuse subsequent input and you can't enter floating-point values with exponents. However, it does work after a fashion and it's best not to look a gift horse in the mouth.
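Both limitations are easy to see if you feed a default tokenizer a value with an exponent. In the sketch below, which reads from a StringReader instead of the keyboard, the input 3.5e2 comes back as the number 3.5 followed by the word e2, and that word is what would be left in the stream for the next read operation. The class LeftoverDemo and the method tokens() are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the tutorial's own code
class LeftoverDemo {

  // Lists the tokens found, prefixed by N: for numbers, W: for words,
  // and C: for any other single character
  static List<String> tokens(String input) {
    StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
    List<String> found = new ArrayList<>();
    try {
      while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
        if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
          found.add("N:" + tokenizer.nval);
        } else if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
          found.add("W:" + tokenizer.sval);
        } else {
          found.add("C:" + (char) tokenizer.ttype);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return found;
  }

  public static void main(String[] args) {
    System.out.println(tokens("3.5e2"));   // prints [N:3.5, W:e2]
  }
}
```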

Writing to the Command Line

Up to now, we have made extensive use of the println() method from the PrintStream class in our examples to output formatted information to the screen. The out object in the expression, System.out.println(), is of type, PrintStream. This class outputs data of any of the basic types as a string. For example, an int value of 12345 becomes the string, "12345", as generated by the valueOf() method from the String class. However, we also have the PrintWriter class that we discussed earlier to do the same thing since this class has all the methods that PrintStream provides.

The principal difference between the two classes lies in when the stream buffer is flushed automatically. Both classes allow you to enable automatic flushing through a constructor argument, but they apply it differently. A PrintWriter object will only flush the stream buffer when one of the println() methods is called, and only if automatic flushing is enabled. A PrintStream object with automatic flushing enabled will flush the stream buffer whenever a newline character is written to the stream, regardless of whether it was written by a print() or a println() method.
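The flushing behavior can be observed with a small experiment, sketched here under the assumption that both objects are created with automatic flushing switched on and that they write through a BufferedOutputStream to a byte array, so that we can count the bytes that have actually arrived. The class AutoflushDemo and its methods are hypothetical names.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.PrintWriter;

// Hypothetical demo class; not part of the tutorial's own code
class AutoflushDemo {

  // PrintWriter with autoflush: print() of a newline does not flush
  static int writerBytesAfterPrint() {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    PrintWriter writer = new PrintWriter(new BufferedOutputStream(sink), true);
    writer.print("a\n");
    return sink.size();            // Nothing has reached the sink yet
  }

  // PrintWriter with autoflush: println() does flush
  static int writerBytesAfterPrintln() {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    PrintWriter writer = new PrintWriter(new BufferedOutputStream(sink), true);
    writer.println("a");
    return sink.size();            // The character and line separator arrived
  }

  // PrintStream with autoflush: print() of a newline does flush
  static int streamBytesAfterPrint() {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    PrintStream stream = new PrintStream(new BufferedOutputStream(sink), true);
    stream.print("a\n");
    return sink.size();            // The data has already arrived
  }

  public static void main(String[] args) {
    System.out.println("PrintWriter after print():   " + writerBytesAfterPrint());
    System.out.println("PrintWriter after println(): " + writerBytesAfterPrintln());
    System.out.println("PrintStream after print():   " + streamBytesAfterPrint());
  }
}
```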

Both the PrintWriter and PrintStream classes format basic data as characters. What the print() and println() methods of these classes do not provide is the ability to specify a field width for each output value. However, it is quite easy to line your numeric output up in columns by defining your own subclass of either PrintStream or PrintWriter. The approach is similar with both so let's arbitrarily try the latter.

Try It Out – Creating a Formatted Output Class

There is more than one approach possible to producing output in a given field width. We will create a FormattedWriter class that defines objects that can write values of any of the basic types to a stream, with a given field width. The class will implement overloaded print() and println() methods for each of the primitive types.

We will define the class with a data member containing the width of the output field for data items. The basic class definition will be:

import java.io.*;

public class FormattedWriter extends PrintWriter {
  public final static int LEFT_JUSTIFIED  = 1;
  public final static int RIGHT_JUSTIFIED = 2;
  private int justification = RIGHT_JUSTIFIED; 

  private int width = 0;                // Field width required for output

  // Constructor with a specified field width, autoflush, and justification 
  public FormattedWriter(Writer output, boolean autoflush, int width,
                                                int justification) {
    super(output, autoflush);     // Call PrintWriter constructor
    if(width>0)
      this.width = width;           // Store the field width
    if(justification == LEFT_JUSTIFIED || justification == RIGHT_JUSTIFIED)
      this.justification = justification; 
  }

  // Constructor with a specified field width
  public FormattedWriter(Writer output, int width) {
    this(output, false, width, RIGHT_JUSTIFIED);        
  }

  // Constructor with a specified field width and justification
  public FormattedWriter(Writer output, int width, int justification) {
    this(output, false, width, justification);        
  }

  // Constructor with a specified field width and autoflush option 
  public FormattedWriter(Writer output, boolean autoflush, int width) {
    this(output, autoflush, width, RIGHT_JUSTIFIED); 
  }

  // Lots of overloaded print() and println() methods 
  // for basic data types...
}

How It Works

There are four fields in our FormattedWriter class. We have defined two static constants that identify whether the data is to be left or right justified in the output field. The justification member records this, and it has the value RIGHT_JUSTIFIED by default. The variable, width, of type, int, holds the output field width.

We have defined four constructors to provide flexibility in what you need to specify when you create an object of type FormattedWriter. As a minimum, two arguments are required, a reference of type Writer to an object encapsulating the output stream, and the field width. You can optionally specify the justification of the output in the field as one of the class constants we have defined for this purpose, and a boolean value that determines whether autoflushing of the stream is to be applied. All the constructors with fewer than four parameters call the constructor that has four by passing default values for the unspecified parameters. Note that we only set the width if the value supplied is positive, and we only set the justification if the argument is one or other of our constants. This ensures that our class object is always initialized correctly.

Since we derive our class from PrintWriter, we have all the facilities of the PrintWriter class available. At the moment, if you call print() or println() for a FormattedWriter object, it will call the base class method, so the behavior will be exactly the same as a PrintWriter object. To change this, we will add our own print() and println() methods that override the base class methods. First, we will add a helper method.

Overriding print() and println()

We know that if width is non-zero, we want to output width characters for each value that we write to the stream. We need to figure out how many characters there are in each data value, subtract that from the total field width, and add that many blanks to the beginning or the end of the string representing the data value, depending on whether it is to be right or left justified. We can then write the resultant string of characters to the stream. If the character representation of the data exceeds the field width, we can output it as XXX...X with the number of X's corresponding to the specified width. This will show the user that the value cannot be displayed within the specified width.

The starting point for outputting a data value is to create a character representation for it. Once we have that, we need to extend it with spaces, to the left or to the right depending on the justification, to the required field width. We can implement a helper method, pad(), in our FormattedWriter class that will accept a String object as an argument, and pad out the string appropriately before returning it:

  // Helper method to form string
  private String pad(String str) {
    if (width == 0) {
      return str;
    }

    int blanks = width - str.length();         // Number of blanks needed
    StringBuffer result = new StringBuffer();  // Will hold the output
 
    if(blanks<0) {                              // Data does not fit
      for(int i = 0 ; i<width ; i++)
        result.append('X');                    // so append X's
      return result.toString();                // and return the result       
    }
    
    if(blanks>0)                              // If we need some blanks
      for(int i = 0 ; i<blanks ; i++)
        result.append(' ');                   // append them

    // Insert the value string at the beginning or the end
    result.insert(justification == LEFT_JUSTIFIED ? 0 : result.length(),
                                                                     str);
    return result.toString();
  }

We will only use this method inside the class, so we make it private. If the width is zero then we just return the original string. Otherwise, we assemble the string to be output in the StringBuffer object, result. If the string, str, has more characters than the field width, then we fill result with X's. Alternatively, you could mess up the nice neat columns here and output the whole string as-is instead.

If the length of str is less than the field width, we append the appropriate number of spaces to result. We then insert str at the beginning of result (index position 0), or the end (index position result.length()). Finally, we return a String object that we create from result by calling its toString() method.
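To check the padding logic in isolation, here is a standalone variant of the same idea that takes the width and justification as parameters instead of reading them from fields. The class PadDemo and the three-parameter signature are hypothetical, and it uses StringBuilder rather than StringBuffer, which makes no difference to the result.

```java
// Hypothetical demo class; not part of the tutorial's own code
class PadDemo {

  // Pads str with spaces to the given width, or returns a row of X's
  // if str does not fit; a width of zero means no padding at all
  static String pad(String str, int width, boolean leftJustified) {
    if (width == 0) {
      return str;
    }

    int blanks = width - str.length();       // Number of blanks needed
    StringBuilder result = new StringBuilder();

    if (blanks < 0) {                        // Data does not fit
      for (int i = 0; i < width; i++) {
        result.append('X');
      }
      return result.toString();
    }

    for (int i = 0; i < blanks; i++) {       // Build the run of blanks
      result.append(' ');
    }

    // Insert the value before the blanks or after them
    result.insert(leftJustified ? 0 : result.length(), str);
    return result.toString();
  }

  public static void main(String[] args) {
    System.out.println("[" + pad("42", 6, false) + "]");      // prints [    42]
    System.out.println("[" + pad("42", 6, true) + "]");       // prints [42    ]
    System.out.println("[" + pad("1234567", 6, false) + "]"); // prints [XXXXXX]
  }
}
```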

We can now implement the print() methods and println() methods in our class very easily using the pad() method. Here's how the print() method for type long looks:

// Output type long formatted in a given width
public void print(long value) {
  super.print(pad(String.valueOf(value)));   // Pad to width and output
}

The print(long value) method calls the static valueOf() method in the String class to convert the value to a character string. The string is then passed to the pad() method to create a string of the required length, and this is passed to the print() method belonging to the superclass, PrintWriter.

The print() method for a double value will be almost identical – well the body of the method is identical:

// Output type double formatted in a given width
public void print(double value) {
  super.print(pad(String.valueOf(value)));   // Pad to width and output
}

The print() method for a String value is not a lot different:

// Output type String formatted in a given width
public void print(String str) {
  super.print(pad(str));   // Pad to width and output
}

You should now be able to implement all the other versions of print() similarly to these so add print() methods to the FormattedWriter class for types int, boolean, char, and float.

The println() methods that you also need to add are not very different. You just need to call the println() method for the base class in each case. For instance, we can implement the println() method for type int like this:

  public void println(int value) {
    super.println(pad(String.valueOf(value)));   // Pad to width and output
  }

Add the other println() methods for the remaining primitive types. You can block copy all the print() methods and then modify the copies as a shortcut to save typing. Make sure you change the method name and the print() call in the body in each case though.

If you want more flexibility with objects of the FormattedWriter class, you can add a setWidth() member to change the field width and perhaps a getWidth() member to find out what it is currently. The setWidth() method will be:

public void setWidth(int width) {
  if(width >= 0)
   this.width = width;
}

We test for a non-negative value before setting the width. Here we need to allow the possibility of resetting the width to zero, whereas in the constructor we only want to set width for a non-zero positive argument value. Now we can dynamically set the width for a FormattedWriter object, and all subsequent output using the object will be in a field of the width that we specify.

It's ready to roll, so let's give it a whirl.

Try It Out – Outputting Data in Fixed Fields

Let's create a simple example that exercises our FormattedWriter class by outputting integers, floating point values and strings:

import java.io.*;

public class TestFormattedWriter {
  public static void main(String[] args) {

    // Some arbitrary data to output
    int[] numbers = {
      1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377
    };

    double[] values = {
      1.0, 1.0, 1.414, 1.732, 2.236, 2.828, 3.606, 4.582, 5.831, 
      -123456789.23456
    };
    String[] strings = {
      "one", "one", "two", "three", "five", "eight", "thirteen"
    };

    // Create a formatted writer for a buffered output to the command line
    FormattedWriter out = new FormattedWriter(
                                new BufferedWriter(
                                    new OutputStreamWriter(System.out)), 12,
                                           FormattedWriter.RIGHT_JUSTIFIED);
    
    for (int i = 0; i < numbers.length; i++) {
      if (i % 6 == 0) {   // New line before each line of six values 
        out.println();

      } 
      out.print(numbers[i]);
    } 

    out.setWidth(10);
    for (int i = 0; i < values.length; i++) {
      if (i % 5 == 0) {   // New line before each line of five values
        out.println();

      } 
      out.print(values[i]);
    } 

    out.setWidth(14);              // Use a wider field for the strings
    for (int i = 0; i < strings.length; i++) {
      if (i % 4 == 0) {            // New line before each line of four strings
        out.println();

      } 
      out.print(strings[i]);
    }
    out.flush();                   // Write out anything still in the buffer
  }
}

Of course, the file containing this class definition needs to be in the same directory as the definition for the FormattedWriter class. If you have typed in the FormattedWriter class and its multitude of methods correctly, this example should produce the following output, preceded by a blank line:

           1           1           2           3           5           8
          13          21          34          55          89         144
         233         377
       1.0       1.0     1.414     1.732     2.236
     2.828     3.606     4.582     5.831XXXXXXXXXX
           one           one           two         three
          five         eight      thirteen

How It Works

We first set up three arrays of different types containing interesting data to write to the command line. We then create a FormattedWriter object that will write data right-justified in a field width of 12 to a BufferedWriter object. The BufferedWriter buffers the OutputStreamWriter object that we wrap around System.out, the stream for output to the command line.

We then exercise our FormattedWriter object by writing each of the arrays to the stream differently. You can see the effect of exceeding the specified field width with the last value of type double. You might like to try this with the left-justified option specified in the constructor.

A FormattedWriter object can write to any type of Writer object so you are not limited to just command line output.
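For instance, the output can be collected in a String by writing to a StringWriter. The sketch below uses a condensed stand-in for our class, MiniFormattedWriter, so that it can be compiled on its own. This is a hypothetical cut-down version that supports only right-justified output and two print() methods, not the full FormattedWriter class.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical cut-down stand-in for FormattedWriter, right-justified only
class MiniFormattedWriter extends PrintWriter {
  private final int width;                   // Field width for output

  MiniFormattedWriter(Writer output, int width) {
    super(output);
    this.width = width > 0 ? width : 0;
  }

  // Pads on the left with spaces, or returns X's if str does not fit
  private String pad(String str) {
    if (width == 0) {
      return str;
    }
    StringBuilder result = new StringBuilder();
    if (str.length() > width) {
      for (int i = 0; i < width; i++) {
        result.append('X');
      }
      return result.toString();
    }
    for (int i = str.length(); i < width; i++) {
      result.append(' ');
    }
    return result.append(str).toString();
  }

  @Override
  public void print(int value) {
    super.print(pad(String.valueOf(value)));
  }

  @Override
  public void print(String str) {
    super.print(pad(String.valueOf(str)));
  }

  // Writes two values in a field width of 5 to a StringWriter
  static String demo() {
    StringWriter target = new StringWriter();
    MiniFormattedWriter out = new MiniFormattedWriter(target, 5);
    out.print(7);
    out.print("ab");
    out.flush();
    return target.toString();
  }

  public static void main(String[] args) {
    System.out.println("[" + demo() + "]");  // prints [    7   ab]
  }
}
```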