ListModification

This is based on a posting by Thomas Burdick. – AlexSchroeder

Newbies are often confused when creating new lists. This page presents some examples showing box diagrams. Maybe that helps.

Of the Lisp functions described here, the following can use side effects to modify ListStructure: ‘delete’, ‘add-to-list’, ‘nconc’, ‘setcar’, and ‘setcdr’. The other functions described here are non-destructive.

The list we shall be using in our examples is this:

 ELISP> (setq list1 '(alpha beta gamma))
 (alpha beta gamma)

Here’s the corresponding box diagram:

          +-------+---+    +------+---+  +-------+-----+
 list1--->| alpha | *----->| beta | *--->| gamma | nil |
          +-------+---+    +------+---+  +-------+-----+

See ListStructure if you don’t feel comfortable with the box diagram above.

The ELISP prompt is from InferiorEmacsLispMode, ‘M-x ielm’.

delete

Try some deleting:

 ELISP> (setq list1 '(alpha beta gamma))
 (alpha beta gamma)
 
 ELISP> (setq list2 (delete 'beta list1))
 (alpha gamma)
 
 ELISP> (setq list3 (delete 'alpha list1))
 (gamma)

What are the values of list1, list2, and list3?

Here’s some explanation first: list1 is a pointer. It points to alpha cell, which is a cons cell containing alpha, and a pointer to beta cell. beta cell contains beta and a pointer to gamma cell, and gamma cell contains gamma and nil. nil is the end of the list.

          +-------+---+    +------+---+  +-------+-----+
 list1--->| alpha | *----->| beta | *--->| gamma | nil |
          +-------+---+    +------+---+  +-------+-----+

Now you call ‘delete’. This will change pointers around. If you delete beta, then alpha cell will point to gamma cell, and beta is deleted. Since ‘delete’ returns the pointer to the first cell, list2 points to the modified cell, list1 has the same value!

          +-------+---+    +------+---+       +-------+-----+
 list1--->| alpha | *---+  | beta | *------+->| gamma | nil |
 list2    +-------+---+ |  +------+---+    |  +-------+-----+
                        |                  |
                        +------------------+

Both list1 and list2 have the value (alpha gamma).

What’s the value of list3? Since alpha is the first item in the list, there is no pointer to repoint. delete returns the beginning of the new list starting with the gamma cell. That’s why the value of list3 is (gamma) and both list1 and list2 remain unchanged:

          +-------+---+    +-------+-----+
 list1--->| alpha | *----->| gamma | nil |
 list2    +-------+---+    +-------+-----+
                           ^
                           |
 list3---------------------+

The gamma cell is shared between all three lists. There is no more reference to the beta cell and it will eventually be garbage collected.

remove

To “delete” elements of a list without modifying the original list, use ‘remove’ instead of ‘delete’. It returns a copy of the original list with elements elided. See the source code of ‘remove’ and see it is implemented using ‘copy-sequence’ and ‘delete’.

 ELISP> (setq list1 '(alpha beta gamma))
 (alpha beta gamma)
 
 ELISP> (setq list2 (remove 'beta list1))
 (alpha gamma)
 
 ELISP> (setq list3 (remove 'alpha list1))
 (beta gamma)

The same pattern applies for ‘delq’ and ‘remq’ as for ‘delete’ and ‘remove’ except they use ‘eq’ for comparisons instead of ‘equal’.

add-to-list

You pass ‘add-to-list’ a variable whose value is a list and an element to be added to the list. ‘add-to-list’ adds the element to the front of the list if it is not already a member of the list. This avoids duplicates, but if you use this a lot in your code, remember that ‘add-to-list’ has to go through the entire list in order to check for duplicates.

 ELISP> (setq list1 '(alpha beta gamma))
 (alpha beta gamma)
 
 ELISP> (add-to-list 'list1 'omega)
 (omega alpha beta gamma)
 
 ELISP> list1
 (omega alpha beta gamma)

Here’s the result:

          +-------+---+   +-------+---+   +------+---+   +-------+-----+
 list1--->| omega | *---->| alpha | *---->| beta | *---->| gamma | nil |
          +-------+---+   +-------+---+   +------+---+   +-------+-----+

remove duplicates

Duplicates elements can be easily removed from a list, but you may need to supply the correct comparison test. The following example indicates how:

 ELISP> (setq a '("one" "one" "two"))
 ("one" "one" "two")
 ELISP> (remove-duplicates a :test 'string=)
 ("one" "two")

In the case of strings, if you neglect the comparison test, you get an incorrect result:

 ELISP> (remove-duplicates a)
 ("one" "one" "two")

Numbers would work as expected here, because the default test is ‘eq’.

Similar to ‘remove’, the original list is unchanged. If you want to update the value of a variable to the new list, then do that:

    (setq a (remove-duplicates a))

If you don’t mind having the original list modified and you want to conserve memory, you can use the DestructiveOperation, ‘delete-dups’. It returns the new duplicate-free list, possibly reusing some of the memory from the original list. ‘delete-dups’ always uses ‘equals’ for comparison.

Again, if you want to update the value of a variable to the returned list then do that:

    (setq a (delete-dups a))

See the Elisp manual, node [Rearrangement](http://www.gnu.org/software/emacs/manual/html_node/elisp/Rearrangement.html) for information about list-modifying functions.

cons

A cons is a pair of pointers; they are called the car and the cdr of the cell. Several cons cells form a list if the car always points to an element of the list and the cdr points to the next cons cell. The cdr of the last cons cell in the list is nil.

Using ‘cons’ to add to the front of a list is the easiest and most natural thing to do! After adding lots of elements to the list in this way, you can reverse the list using ‘nreverse’.

Here is an example where we take the cdr of a list (ie. we skip the first element), and “cons a new element onto the list.” Note how the tail of the list is shared.

 ELISP> (setq list1 '(alpha beta gamma))
 (alpha beta gamma)
 
 ELISP> (setq list2 (cdr list1))
 (beta gamma)
 
 ELISP> (setq list2 (cons 'ypsilon list2))
 (ypsilon beta gamma)
 
 ELISP> (eq list1 list2)
 nil
 ELISP> (eq (cdr list1) (cdr list2))
 t

And here’s the box diagram:

          +-------+---+    +------+---+  +-------+-----+
 list1--->| alpha | *----->| beta | *--->| gamma | nil |
          +-------+---+    +------+---+  +-------+-----+
                           ^
          +---------+---+  |
 list2--->| ypsilon | *----+
          +---------+---+

In terms of efficiency, consing is something expensive because it allocates new storage for a cons cell. Usually that doesn’t matter. If you want to add an existing cons cell to your list (eg. moving it from one list to another) you might get away with lots of ‘setcar’ and ‘setcdr’. But that usually results in code which is difficult to maintain.

nconc

‘nconc’ modifies the cdr of the last cell of a list: Its target is changed from nil to the new element.

Here’s what happens if you add something other than a cons cell to the list. The result is no longer a “proper” list. The cdr of the last cons cell is no longer nil.

 ELISP> (setq list1 '(alpha beta gamma))
      => (alpha beta gamma)
 
 ELISP> (setq var (cdr list1))
      => (beta gamma)
 
 ELISP> (nconc var 'ypsilon)
      => (beta gamma . ypsilon)
 
 ELISP> list1
      => (alpha beta gamma . ypsilon)

This situation looks like:

          +-------+---+  +------+---+  +-------+---------+
 list1--->| alpha | *--->| beta | *--->| gamma | ypsilon |
          +-------+---+  +------+---+  +-------+---------+
                         ^
                         |
 var---------------------+
 ELISP> (setq list1 '(alpha beta gamma))
 (alpha beta gamma)
 
 ELISP> (setq list2 '(omega ypsilon))
 (omega ypsilon)
 
 ELISP> (setq list3 (nconc list1 list2))
 (alpha beta gamma omega ypsilon)

Note how list1 was changed:

 ELISP> list1
 (alpha beta gamma omega ypsilon)
 
 ELISP> list2
 (omega ypsilon)

Actually, if this were real code, you would have assigned the result of ‘nconc’ to one of the existing variables instead of using a third one.

 ELISP> (setq list1 (nconc list1 list2))
 (alpha beta gamma omega ypsilon)

Anyway, here’s the picture:

 list3----+
          |
          v
          +-------+---+  +------+---+  +-------+---+  +-------+---+  +---------+-----+
 list1--->| alpha | *--->| beta | *--->| gamma | *--->| omega | *--->| ypsilon | nil |
          +-------+---+  +------+---+  +-------+---+  +-------+---+  +---------+-----+
                                                      ^
                                                      |
 list2------------------------------------------------+

Just to prove that the tail (list2) is shared:

 ELISP> (eq list1 list3)
 t
 
 ELISP> (eq (nthcdr 3 list1) list2)
 t

See also CircularLists.

append

‘append’ creates a copy of the list that is added to the front of the other list. The front is new; the tail is shared. Because a new list structure is created this can be more expensive than using ‘nconc’.

 ELISP> (setq list1 '(alpha beta gamma))
 (alpha beta gamma)
 
 ELISP> (setq list2 '(omega ypsilon))
 (omega ypsilon)
 
 ELISP> (setq list3 (append list1 list2))
 (alpha beta gamma omega ypsilon)

Note how list1 has not been changed:

 ELISP> list1
 (alpha beta gamma)
 
 ELISP> list2
 (omega ypsilon)
          +-------+---+    +------+---+  +-------+-----+
 list1--->| alpha | *----->| beta | *--->| gamma | nil |
          +-------+---+    +------+---+  +-------+-----+


          +-------+---+  +------+---+  +-------+---+  +-------+---+  +---------+-----+
 list3--->| alpha | *--->| beta | *--->| gamma | *--->| omega | *--->| ypsilon | nil |
          +-------+---+  +------+---+  +-------+---+  +-------+---+  +---------+-----+
                                                      ^
                                                      |
 list2------------------------------------------------+

Just to prove that the tail (list2) is shared with list3 (and that list1 is not part of list3):

 ELISP> (eq list1 list3)
 nil
 
 ELISP> (eq (nthcdr 3 list1) list2)
 nil
 
 ELISP> (eq (nthcdr 3 list3) list2)
 t

setcar

‘setcar’ sets the car of a cell. You can use it to change a single element of a list, once you know that a list is composed of cons cells, where the cdr points to the next cell, and the car points to the elements.

 ELISP> (setq list '(alpha beta gamma delta))
 (alpha beta gamma delta)
 
 ELISP> (setcar (nthcdr 2 list) 'epsilon)
 epsilon
 
 ELISP> list
 (alpha beta epsilon delta)

What happened? The list got modified destructively. ‘nthcdr’ is like using a number of ‘cdr’ calls one after another. It returns the rest of the list. The beginning of the list is a cons cell, therefore calling setcar on it changes the corresponding element of the list.

          +-------+---+    +------+---+  +-------+---+  +-------+-----+
 list1--->| alpha | *----->| beta | *--->| gamma | *--->| delta | nil |
          +-------+---+    +------+---+  +-------+---+  +-------+-----+
                                         ^
                                         |
 nthcdr 2--------------------------------+

setcdr

Like ‘setcar’, ‘setcdr’ can be used to modify a list destructively, though instead of modifying the car (ie the “contents”) of the cell, it modifies its cdr, or “pointer cell”.

  ELISP> (setq list '(alpha beta gamma delta))
  (alpha beta gamma delta)
  
  ELISP> (setcdr (last list) '(epsilon))
  (epsilon)
  
  ELISP> list
  (alpha beta gamma delta epsilon)
    
  ELISP> (setq list '(alpha beta (gamma delta)))
  (alpha beta (gamma delta))
  
  ELISP> (setcdr (nth 2 list) '(eta))
  (eta)
  
  ELISP> list
  (alpha beta (gamma eta))

last

‘last’ returns the nth-to-last link of a list. It is used to discard the front part of a list.

 ELISP> (setq list '(alpha beta gamma delta))
 (alpha beta gamma delta)
 ELISP> (last list 2)
 (gamma delta)

butlast

‘butlast’ removes the last n elements. It is used to discard the tail of a list. There is also a destructive variant, ‘nbutlast’.

 ELISP> (setq list '(alpha beta gamma delta))
 (alpha beta gamma delta)
 ELISP> (butlast list 2)
 (alpha beta)

See Also: DestructiveOperations

Extra

Here’s my personal helper function for manipulating lists:

It would be great if someone can show me a better way to delete 0th element in delete-nth

 (defun delete-nth (index seq)
   "Delete the INDEX th element of SEQ.
 Return result sequence, SEQ __is__ modified."
   (if (equal index 0)
       (progn
         (setcar seq (car (cdr seq)))
         (setcdr seq (cdr (cdr seq))))
     (setcdr (nthcdr (1- index) seq) (nthcdr (1+ index) seq))))
 (defun set-nth (index seq newval)
   "Set the INDEX th element of SEQ to NEWVAL.
 SEQ __is__ modified."
   (setcar (nthcdr index seq) newval))

CategoryCode