Version: 0.3.38
3. Linked Lists#
- Lecture:
Lecture 3.3
(slides)
- Objectives:
What is a linked list, and how to implement the Sequence ADT using a linked list
- Concepts:
Linked-lists, Iterators
Now we know about recursion, and how it applies to both algorithms and data types, we can look at our first recursive data type: The linked List. We will see how a linked list yields another implementation of the sequence ADT with different efficiency as opposed to array-based implementation. We will also briefly look at other members of linked-list family, including the doubly linked list, and the circular list. Finally, we will look at Iterator, a concepts that makes linked-list shine when arrays are inefficient.
3.1. Linked Lists#
The idea behind linked lists is to store each item in a specific record (sometimes called a node) that has two fields. The first hold the item itself, and the second a reference to the record that contains the next item. The sequence is therefore the chain made by these records, as shown by Fig. 3.7. Additionally, one can also add another list record that holdss a reference to the first “node”, as well as any other useful information (length, etc.).
Fig. 3.7 Structure of a linked list of string items, where each node “points” to the next#
In practice, all these references to nodes are implemented with pointers or object references. By contrast to an array that allocates a single continuous block of memory, a linked list is made of many separate fragments of memory, linked using references.
Note that the record type of a node shown on Listing 3.8 is a recursive data type.
3.1.1. Creation#
To implement the seq.create()
we allocate a new List
record
and we initialize its fields with default values, 0 for the length,
and NULL
for the first item.
List*
list_create ()
{
List* new_list = malloc(sizeof(List));
new_list->length = 0;
new_list->head = NULL;
return new_list;
}
3.1.2. Length#
Computing the length of the sequence can be done easily, if store the number of item in a dedicated field, as we did in Listing 3.8. Listing 3.9 shows one way to do that in C.
int
list_length(List* list) {
assert(list != NULL);
return list->length;
}
3.1.3. Access#
To implement the seq.get()
function and find the ith
item, we need to follows the chain of nodes until the ith node,
and return the corresponding
item. Listing 3.10 shows the corresponding C
procedure.
int
list_get(List* list, int index)
{
assert(list != NULL);
Node* target = find_node_at(list, index);
return target->item;
}
Node*
find_node_at(List* list, int index) {
assert(index > 0 && index <= list->length);
Node* current = list->head;
int i=1;
while (current != NULL && i != index) {
i++;
current = current->next;
}
return current;
}
Important
With a linked list, accessing the ith item is slow: It takes a time linear to the length of the sequence.
3.1.4. Insertion#
To implement seq.insert()
, we allocate a new Node
record
and we link it with the appropriate nodes in the chain. In the general
case, we proceed as follows, althought care must be taken to insert in
the front of the list:
We create a new
Node
record, that carries the given item.We find the node that precedes the insertion points, so called previous node.
We set the next node of the new node to be the next node of previous node
We set the next node of the previous node to be the new node
We increment the length
Fig. 3.8 shows the structure of the list, the new node (in gray), and how the links are updated.
Fig. 3.8 Insertion of a new item at the 3rd position. We create a new node record and we update the chaining.#
The listing below shows a possible implementation of this insertion in C. To find the previous node, we use a separate procedure that follows the links from node to node. Following these links requires \(\Theta(n)\) accesses.
void
list_insert(List* list, int item, int index) {
assert(index > 0 && index <= list->length + 1);
Node* new_node = malloc(sizeof(Node));
new_node->item = item;
if (index == 1) {
list->head = new_node;
} else {
Node* previous = find_node_at(list, index-1);
new_node->next = previous->next;
previous->next = new_node;
}
list->length++;
}
Important
The insertion in linked list runs in \(\Theta(n)\). As opposed to array-based implementation, an linked list does not require shifting any item. We simply creates a new node and “link it” to the chain. The “expensive” part is to traverse the chain to find the insertion point.
3.1.5. Deletion#
To implement the seq.remove()
operation, we mirror the insertion
algorithm as follows (care must also be taken to delete the first
item). Fig. 3.9 illustrates this removal
process.
We locate the node that precedes the target (so called previous node)
We save the reference to the target node, to free that record
We set the
next
field of the previous record to point to thenext
field of the targetWe free the target record.
Fig. 3.9 Deleting an item in a list by re-wiring the links#
Listing 3.12 gives a possible
implementation in C, which reuses the find_node_at
procedure
defined above.
void
list_remove(List* list, int index)
{
assert(list != NULL);
if (index == 1) {
free(list->head);
list->head = NULL;
} else {
Node* previous = find_node_at(list, index-1);
Node* target = previous->next;
previous->next = target->next;
free(target);
}
list->length--;
}
3.2. Other Flavors of List#
The concept of “linked list” is in fact a general idea and there are many variations around. We look here at two common ones, doubly linked lists and circular lists, but there are others such as skip lists, self-adjusting lists, etc.
3.2.1. Doubly Linked Lists#
The idea of doubly linked list is to extend the nodes with a reference to the previous node. This permits navigating the list in both direction. Listing 3.13 opposite, shows a C implementation of these extended nodes.
Fig. 3.10 Doubly linked list where each node points to the next and to the previous node.#
Handling this extra reference make the insertion and deletion a little more complex, but there is no need to search for “previous” item, since we can now navigate the list in both direction.
3.2.2. Circular Lists#
A circular list is a regular linked list where the last node points back to the first one, creating a loop, as shown on Fig. 3.11. A common use-case is the rolling log for instance, where one continuously appends new observation. Past a given maximum length \(n\), the list stops growing and one simply overwrites the beginning, preserving only the last \(n\) items.
Fig. 3.11 A circular list, where the last node points towards the first one, creating a loop-like structure.#
3.3. Iterators#
Consider our Sequence ADT, and how one can
traverse it, say to print all items. Given the operations we
defined, one could write the program shown by
recursion/sequence/traversal
.
for (int i=1 ; i<=seq_length(sequence) ; i++) {
int item = seq_get(sequence, index);
// Do something with item ...
}
If use an array to implement our sequence (see Lecture 2.2
<sequences/arrays), this traversal would run in \(O(n)\). Indeed,
each access to a specific items (see seq_get()
takes
\(O(1)\) and we will do this for each of the \(n\) items.
However, if we used a linked list, this traversal would run in
\(O(n^2)\). With a linked list, each access runs in \(O(n)\)
because we have to navigate the list from its “head” to the desired
node, and doing that for the \(n\) items would take
\(O(n^2)\). But we can do better! Recall the find_node_at
,
it also traverses the sequences, and yet runs in \(O(n)\) because
it moves from one node to the next without restarting at the
beginning, as shown on Listing 3.15.
Node* current = list->head;
while (current != NULL) {
int item = current->item;
// .. do something with item
current = current->next;
}
3.3.1. The Iterator ADT#
The iterator ADT captures the notion of “position within a container” (i.e., sequences, trees, maps, etc.). It defines at least the following operations:
- iterator.create(s: Sequence, i: Natural) Iterator #
Create a new sequence iterator, setup at the given index. The resulting iterator is bound to the given sequence \(s\).
- iterator.item(i: Iterator) Item #
Returns the item at the known index
- iterator.hasNext(i: Iterator) Boolean #
Returns true if and only if there are more value beyond the position represented by the iterator,
- iterator.next(i: Iterator) Iterator #
Move the iterator from its current position to the next in the sequence
With this new ADT and its operations, we can now write a list traversal as follows on Listing 3.16.
Iterator* position = iterator_create(sequence, 1);
while (iterator_has_next(position)) {
int item = iterator_item(position);
// .. do something with item ...
iterator_next(position);
}
3.3.2. C Implementation#
Iterators can be implemented against an array, or a linked list, as we will do below. To implement an iterator, we must keep track of the list of interest, the current node, and its predecessor (if any). Listing 3.17 shows how that could look like in C, where we use a record to hold a reference to each of these bits and pieces.
Fig. 3.12 shows the structure of a list, with an iterator “pointing” at the 3rd item. The previous field enables extending the set of operations, with insertion and deletion in \(O(1)\).
Fig. 3.12 An iterator referencing a list and one of its node.#
3.4. Linked Lists vs. Dynamic Arrays#
Table 3.6 summarizes the performance of linked lists and dynamic arrays. Regarding the memory, a linked list requires more memory than an equivalent array, because we need to store all these extra pointers. As for the runtime, there is no clear winner, and using one or the other depends on the problem at hand.
Scenario |
Dynamic Array |
Linked List |
---|---|---|
Create |
\(O(1)\) |
\(O(1)\) |
Access |
\(O(1)\) |
\(O(n)\) |
Insert first |
\(O(n)\) |
\(O(1)\) |
Insert last |
\(O(1)*\) |
\(O(n)\) |
Insert (iterator) |
\(O(n)\) |
\(O(1)\) |
Remove first |
\(O(n)\) |
\(O(1)\) |
Remove last |
\(O(1)*\) |
\(O(n)\) |
Remove (iterator) |
\(O(n)\) |
\(O(1)\) |