Linked List was developed in 1955 by Allen Newell, Cliff Shaw, and Herbert A. Simon at RAND Corporation. The Linked List is a dynamically-allocated data structure, meaning it grows dynamically in memory on its own very time-efficiently (as opposed to an Array List, for which we needed to explicitly reallocate memory as the backing array fills, which is a very time-costly operation).
A Linked List is a data structure composed of nodes: containers that each hold a single element. These nodes are "linked" to one another via pointers. Typically, we maintain one global head pointer (i.e., a pointer to the first node in the Linked List) and one global tail pointer (i.e., a pointer to the last node in a Linked List). These are the only two nodes to which we have direct access, and all other nodes can only be accessed by traversing pointers starting at either the head or the tail node.
In a Singly-Linked List, each node only has a pointer pointing to the node directly after it (for internal nodes) or a null pointer (for the tail node). As a result, we can only traverse a Singly-Linked List in one direction.
In a Doubly-Linked List, each node has two pointers: one pointing to the node directly after it and one pointing to the node directly before it (of course, the head and the tail nodes only have a single pointer each). As a result, we can traverse a Doubly-Linked List in both directions.
To access an arbitrary element in a Linked List, because we don't have direct access to the internal nodes (we only have direct access to head and tail), we have to iterate through all n elements in the worst case. If we are looking for the i-th element of a Linked List, this is a O(i) operation, whereas in an array, accessing the i-th element was a O(1) operation (because of "random access"). This is one main drawback of a Linked List: even if we know exactly what index we want to access, because the data is not stored contiguously in memory, we need to slowly iterate through the elements one-by-one until we reach the node we want.
Below is pseudocode for the "find" operation of a Linked List:
find(element): // returns True if element exists in Linked List, otherwise returns False current = head // start at the "head" node while current is not NULL: if current.data == element: return True current = current.next // follow the forward pointer of current return False // we iterated over all nodes but did not find "element"
If we want to find what element is at a given "index" of the Linked List, we can perform a similar algorithm, where we start at the head node and iterate through the nodes via their forward pointers until we find the element we desire. Note that in the pseudocode below, we use 0-based indexing.
find(index): // returns element at position "index" of Linked List (or NULL if invalid) if index < 0 or index >= n: // check for invalid indices return NULL // we will use NULL to denote an invalid node curr = head // looking for an element in the middle of the Linked List repeat index times: // move forward index times curr = curr.next return curr
The "insert" algorithm is almost identical to the "find" algorithm: you first execute the "find" algorithm just like before, but once you find the insertion site, you rearrange pointers to fit the new node in its rightful spot. Below is an example of inserting the number 5 to index 1 of the Linked List (using 0-based counting):
Notice how we do a regular "find" operation to the index directly before index i, then we point the new node's "next" pointer to the node that was previously at index i (and the new node's "prev" pointer to the node that is directly before index i in the case of a Doubly-Linked List, as above), and then we point the "next" pointer of the node before the insertion site to point to the new node (and the "prev" pointer of the node previously at index i to point to the new node in the case of a Doubly-Linked List).
Because the structure of a Linked List is only based on pointers, the insertion algorithm is complete simply after changing those pointers.
Below is pseudocode for the "insert" operation of a Linked List (with added corner cases for the first and last indices):
insert(newnode, index): // inserts newnode at position "index" of Linked List if index == 0: // special case for insertion at beginning of list newnode.next = head head.prev = newnode head = newnode else if index == size: // special case for insertion at end of list newnode.prev = tail tail.next = newnode tail = newnode else: // general case for insertion in middle of list curr = head repeat index-1 times: // move curr to directly before insertion site curr = curr.next newnode.next = curr.next // update the pointers newnode.prev = curr curr.next = newnode newnode.next.prev = newnode size = size + 1 // increment size
The "remove" algorithm is also almost identical to the "find" algorithm: you first execute the "find" algorithm just like before, but once you find the insertion site, you rearrange pointers to remove the node of interest. Below is pseudocode for the "remove" operation of a Linked List (with added corner cases for the first and last indices):
remove(index): // removes the element at position "index" of Linked List if index == 0: // special case for removing from beginning of list head = head.next head.prev = NULL else if index == n-1: // special case for removing from end of list tail = tail.prev tail.next = NULL else: // general case for removing from middle of list curr = head repeat index-1 times: // move curr to directly before removal site curr = curr.next curr.next = curr.next.next // update the pointers curr.next.prev = curr n = n - 1 // decrement n
In summation, Linked Lists are great (constant-time) when we add or remove elements from the beginning or the end of the list, but finding elements in a Linked List (even one in which elements are sorted) cannot be optimized like it can in an Array List, so we are stuck with O(n) "find" operations. Also, recall that, with Array Lists, we needed to allocate extra space to avoid having to recreate the backing array repeatedly, but because of the dynamic allocation of memory for new nodes in a Linked List, we have no wasted memory here.
Array-based and Linked data structures are two basic starting points for many more complicated data structures. Because one strategy does not entirely dominate the other (i.e., they both have their pros and cons), you must analyze each situation to see which approach would be better.