Back in Chapter 6 we put together a basic class defining a linked list. An object of type LinkedList represented an example of a collection of objects that could be of any type. A collection is used as a generic term for any object that represents a set of objects grouped together by some means. A linked list is only one of a number of ways of grouping a number of objects together in a collection.
There are three main types of collections: sets, sequences, and maps. Let's first get an understanding of how these three types of collections work in principle, and then come back to look at the Java classes that implement versions of these. One point I would like to emphasize about the following discussion is that when we talk about a collection of objects, we mean a collection of references to objects. In Java collections only stores references – the objects themselves are external to the collection.
A set is probably the simplest kind of collection you can have. Here the objects are not usually ordered in any particular way at all and objects are simply added to the set without any control over where they go. It's a bit like putting things in your pocket – you just put things in and they rattle around inside your pocket in no particular order.
The principal access mechanism that you have for a set is simply to check whether a given object is a member of the set or not. For this reason, you cannot have duplicate objects in a set – each object in the set must be unique. Of course, you can also remove a given object from a set, but only if you know what the object is in the first place – in other words if you have a reference to the object in the set.
There are variations on the basic set that we have described here. For instance, sets can be ordered, so objects added to a set will be inserted into a sequence of objects ordered according to some criterion of comparison. Such sets require that the class that defines the objects to be stored implements suitable methods for comparing objects.
A linked list is an example of a more general type of collection called a sequence or a list. A primary characteristic of a list is that the objects are stored in a linear fashion, not necessarily in any particular order, with a beginning and an end. This contrasts with a set where there is no order at all. An ordinary array is basically another example of a list, but is much more limited than a collection because it is fixed.
Collections generally have the ability to expand to accommodate as many elements as necessary. The Vector class, for example, is a collection class that provides similar functionality to an array, but which also has this ability to accommodate new elements as and when required.
Because a list is linear, you will only be able to add a new object to the list at the beginning, or at the end, or inserted following a given object in sequence – after the fifth say. Generally, you can retrieve an object from a list in several ways. You can select the first or the last; you can get the object at a given position – as in indexing an array; or you can search for an object identical to a given object by checking all the objects in the sequence either backwards or forwards. You can also iterate through the list backwards or forwards accessing each object in sequence. We didn't implement all these capabilities in our linked list class in Chapter 6, but we could have done.
You can delete objects from a list in the same sorts of ways that you retrieve an object; that is, you can remove the first or the last, an object at a particular position in the sequence, or an object that is equal to a given object. Sequences or lists have the facility that they can store several copies of the same object at different places in the sequence. This is not true of all types of collections, as we will see.
A stack, which is a last-in first-out storage mechanism, is also considered to be a list, as is a queue, which is a first-in first-out mechanism. It is easy to see that a linked list can act as a stack, since using the methods to add and remove objects at the end of a list makes the list operate as a stack. Similarly, only adding objects by using the method to add an object to the end of a linked list, and only retrieving objects from the head of the list, makes it operate as a queue.
A map is rather different from a set or a sequence because the entries involve pairs of objects. A map is also referred to sometimes as a dictionary because of the way it works. Each object that is stored in a map has an associated key object, and both the object and its key are stored as a pair. The key determines where the object is stored in the map, and when you want to retrieve an object you must supply the appropriate key – so it acts as the equivalent of a word that you look up in a regular dictionary.
A key can be any kind of object that you want to use to reference the object stored. Because the key has to uniquely identify the object, all the keys in a map must be different. To put this in context let's take an example. Suppose you were creating a program to provide an address book. You might store all the details of each person – their name, address, phone number, or whatever – in a single object of type Entry perhaps, and store a reference to the object in a map. The key is the mechanism for retrieving objects, so assuming that all names are different, the name of the person would be a natural choice for the key. Thus the entries in the map in this case would be Name/Entry pairs. You would supply a Name object as the key, and get back the Entry object corresponding to the key. You might well have another map in this application where entries were keyed on the phone number. Then you could retrieve an entry corresponding to a given number. Of course, in practice, names are not unique – hence the invention of such delightful attachments to the person as social security numbers.
Where a key/object pair is stored in a map is determined from the key by a process known as hashing. Hashing processes the key object to produce an integer value called a hash code. The hashCode() method that is defined in the Object class produces a hash code of type int for an object. The hash code is typically used as an offset from the start of the memory that has been allocated within the map, to determine the location where the key/object pair is to be stored. Ideally the hashing process should result in values that are uniformly distributed within a given range, and every key should produce a different hash code. In general, this may not be the case, but there are ways around this so it is not a problem. We will look at keys and hash codes in a little more detail when we discuss using maps later in this chapter.
Now let's look at how we can move through a collection.