Version: 0.3.38

3. Linked Lists#

Lecture:

Lecture 3.3 (slides)

Objectives:

What is a linked list, and how to implement the Sequence ADT using a linked list

Concepts:

Linked-lists, Iterators

Now we know about recursion, and how it applies to both algorithms and data types, we can look at our first recursive data type: The linked List. We will see how a linked list yields another implementation of the sequence ADT with different efficiency as opposed to array-based implementation. We will also briefly look at other members of linked-list family, including the doubly linked list, and the circular list. Finally, we will look at Iterator, a concepts that makes linked-list shine when arrays are inefficient.

3.1. Linked Lists#

The idea behind linked lists is to store each item in a specific record (sometimes called a node) that has two fields. The first hold the item itself, and the second a reference to the record that contains the next item. The sequence is therefore the chain made by these records, as shown by Fig. 3.7. Additionally, one can also add another list record that holdss a reference to the first “node”, as well as any other useful information (length, etc.).

../_images/singly_linked_list.svg

Fig. 3.7 Structure of a linked list of string items, where each node “points” to the next#

In practice, all these references to nodes are implemented with pointers or object references. By contrast to an array that allocates a single continuous block of memory, a linked list is made of many separate fragments of memory, linked using references.

Note that the record type of a node shown on Listing 3.8 is a recursive data type.

3.1.1. Creation#

To implement the seq.create() we allocate a new List record and we initialize its fields with default values, 0 for the length, and NULL for the first item.

List*
list_create ()
{
  List* new_list = malloc(sizeof(List));
  new_list->length = 0;
  new_list->head = NULL;
  return new_list;
}

3.1.2. Length#

Computing the length of the sequence can be done easily, if store the number of item in a dedicated field, as we did in Listing 3.8. Listing 3.9 shows one way to do that in C.

Listing 3.9 Exposing the length of the sequence#
int
list_length(List* list) {
  assert(list != NULL);
  return list->length;
}

3.1.3. Access#

To implement the seq.get() function and find the ith item, we need to follows the chain of nodes until the ith node, and return the corresponding item. Listing 3.10 shows the corresponding C procedure.

Listing 3.10 Find the ith item#
int
list_get(List* list, int index)
{
  assert(list != NULL);
  Node* target = find_node_at(list, index);
  return target->item;
}

Node*
find_node_at(List* list, int index) {
  assert(index > 0 && index <= list->length);
  Node* current = list->head;
  int i=1;
  while (current != NULL && i != index) {
    i++;
    current = current->next;
  }
  return current;
}

Important

With a linked list, accessing the ith item is slow: It takes a time linear to the length of the sequence.

3.1.4. Insertion#

To implement seq.insert(), we allocate a new Node record and we link it with the appropriate nodes in the chain. In the general case, we proceed as follows, althought care must be taken to insert in the front of the list:

  1. We create a new Node record, that carries the given item.

  2. We find the node that precedes the insertion points, so called previous node.

  3. We set the next node of the new node to be the next node of previous node

  4. We set the next node of the previous node to be the new node

  5. We increment the length

Fig. 3.8 shows the structure of the list, the new node (in gray), and how the links are updated.

../_images/insertion.svg

Fig. 3.8 Insertion of a new item at the 3rd position. We create a new node record and we update the chaining.#

The listing below shows a possible implementation of this insertion in C. To find the previous node, we use a separate procedure that follows the links from node to node. Following these links requires \(\Theta(n)\) accesses.

Listing 3.11 Insertion in linked-list in C#
void
list_insert(List* list, int item, int index) {
  assert(index > 0 && index <= list->length + 1);

  Node* new_node = malloc(sizeof(Node));
  new_node->item = item;

  if (index == 1) {
    list->head = new_node;

  } else {
    Node* previous = find_node_at(list, index-1);
    new_node->next = previous->next;
    previous->next = new_node;

  }
  list->length++;
}

Important

The insertion in linked list runs in \(\Theta(n)\). As opposed to array-based implementation, an linked list does not require shifting any item. We simply creates a new node and “link it” to the chain. The “expensive” part is to traverse the chain to find the insertion point.

3.1.5. Deletion#

To implement the seq.remove() operation, we mirror the insertion algorithm as follows (care must also be taken to delete the first item). Fig. 3.9 illustrates this removal process.

  1. We locate the node that precedes the target (so called previous node)

  2. We save the reference to the target node, to free that record

  3. We set the next field of the previous record to point to the next field of the target

  4. We free the target record.

../_images/deletion.svg

Fig. 3.9 Deleting an item in a list by re-wiring the links#

Listing 3.12 gives a possible implementation in C, which reuses the find_node_at procedure defined above.

Listing 3.12 A C implementation of the deletion algorithm#
void
list_remove(List* list, int index)
{
  assert(list != NULL);
  if (index == 1) {
    free(list->head);
    list->head = NULL;

  } else {
    Node* previous = find_node_at(list, index-1);
    Node* target = previous->next;
    previous->next = target->next;
    free(target);
  }
  list->length--;
}

3.2. Other Flavors of List#

The concept of “linked list” is in fact a general idea and there are many variations around. We look here at two common ones, doubly linked lists and circular lists, but there are others such as skip lists, self-adjusting lists, etc.

3.2.1. Doubly Linked Lists#

The idea of doubly linked list is to extend the nodes with a reference to the previous node. This permits navigating the list in both direction. Listing 3.13 opposite, shows a C implementation of these extended nodes.

../_images/doubly_linked_lists.svg

Fig. 3.10 Doubly linked list where each node points to the next and to the previous node.#

Handling this extra reference make the insertion and deletion a little more complex, but there is no need to search for “previous” item, since we can now navigate the list in both direction.

3.2.2. Circular Lists#

A circular list is a regular linked list where the last node points back to the first one, creating a loop, as shown on Fig. 3.11. A common use-case is the rolling log for instance, where one continuously appends new observation. Past a given maximum length \(n\), the list stops growing and one simply overwrites the beginning, preserving only the last \(n\) items.

../_images/circular_list.svg

Fig. 3.11 A circular list, where the last node points towards the first one, creating a loop-like structure.#

3.3. Iterators#

Consider our Sequence ADT, and how one can traverse it, say to print all items. Given the operations we defined, one could write the program shown by recursion/sequence/traversal.

Listing 3.14 Traversing a sequence using the operations of the ADT#
for (int i=1 ; i<=seq_length(sequence) ; i++) {
   int item = seq_get(sequence, index);
   // Do something with item ...
}

If use an array to implement our sequence (see Lecture 2.2 <sequences/arrays), this traversal would run in \(O(n)\). Indeed, each access to a specific items (see seq_get() takes \(O(1)\) and we will do this for each of the \(n\) items.

However, if we used a linked list, this traversal would run in \(O(n^2)\). With a linked list, each access runs in \(O(n)\) because we have to navigate the list from its “head” to the desired node, and doing that for the \(n\) items would take \(O(n^2)\). But we can do better! Recall the find_node_at, it also traverses the sequences, and yet runs in \(O(n)\) because it moves from one node to the next without restarting at the beginning, as shown on Listing 3.15.

Listing 3.15 Traversing a linked list efficiently, from node to node.#
Node* current = list->head;
while (current != NULL) {
   int item = current->item;
   // .. do something with item
   current = current->next;
}

3.3.1. The Iterator ADT#

The iterator ADT captures the notion of “position within a container” (i.e., sequences, trees, maps, etc.). It defines at least the following operations:

iterator.create(s: Sequence, i: Natural) Iterator#

Create a new sequence iterator, setup at the given index. The resulting iterator is bound to the given sequence \(s\).

iterator.item(i: Iterator) Item#

Returns the item at the known index

iterator.hasNext(i: Iterator) Boolean#

Returns true if and only if there are more value beyond the position represented by the iterator,

iterator.next(i: Iterator) Iterator#

Move the iterator from its current position to the next in the sequence

With this new ADT and its operations, we can now write a list traversal as follows on Listing 3.16.

Listing 3.16 Traversing a sequence using an iterator#
Iterator* position = iterator_create(sequence, 1);
while (iterator_has_next(position)) {
   int item = iterator_item(position);
   // .. do something with item ...
   iterator_next(position);
}

3.3.2. C Implementation#

Iterators can be implemented against an array, or a linked list, as we will do below. To implement an iterator, we must keep track of the list of interest, the current node, and its predecessor (if any). Listing 3.17 shows how that could look like in C, where we use a record to hold a reference to each of these bits and pieces.

Fig. 3.12 shows the structure of a list, with an iterator “pointing” at the 3rd item. The previous field enables extending the set of operations, with insertion and deletion in \(O(1)\).

../_images/iterator.svg

Fig. 3.12 An iterator referencing a list and one of its node.#

3.4. Linked Lists vs. Dynamic Arrays#

Table 3.6 summarizes the performance of linked lists and dynamic arrays. Regarding the memory, a linked list requires more memory than an equivalent array, because we need to store all these extra pointers. As for the runtime, there is no clear winner, and using one or the other depends on the problem at hand.

Table 3.6 Comparing runtime efficiency of linked list and dynamic arrays (worst cases)#

Scenario

Dynamic Array

Linked List

Create

\(O(1)\)

\(O(1)\)

Access

\(O(1)\)

\(O(n)\)

Insert first

\(O(n)\)

\(O(1)\)

Insert last

\(O(1)*\)

\(O(n)\)

Insert (iterator)

\(O(n)\)

\(O(1)\)

Remove first

\(O(n)\)

\(O(1)\)

Remove last

\(O(1)*\)

\(O(n)\)

Remove (iterator)

\(O(n)\)

\(O(1)\)