I need to join two lists, sort them and remove duplicates. Is there a better way to do this?

Go To StackoverFlow.com

6

I have two unsorted lists and I need to produce another list which is sorted and where all the elements are unique.

The elements can occur multiple times in both lists and they are originally unsorted.

My function looks like this:

(defun merge-lists (list-a list-b sort-fn)
    "Merges two lists of (x, y) coordinates sorting them and removing dupes"
    (let   ((prev nil))
        (remove-if
            (lambda (point)
                (let   ((ret-val (equal point prev)))
                    (setf prev point)
                    ret-val))
            (sort
                (merge 'list list-a list-b sort-fn) ;'
                 sort-fn))))

Is there a better way to achieve the same?

Sample call:

[CL]> (merge-lists '(9 8 4 8 9 7 2) '(1 7 3 9 2 6) #'>)
  ==> (9 8 7 6 4 3 2 1)
2008-09-19 06:29
by dsm
You might want to clarify what you mean with "better" - mweerden 2008-09-19 06:58
Did you try the untested snippet and did it work? I'd love to edit my answer so that the generations after us don't have to live in fear of the snippet crashing their Lisp 3000.. - Antti Rasinen 2008-09-19 15:30
I did indeed test it and it did indeed work. Many thanks for the answer - dsm 2008-09-19 15:40
First, is sort-fn compatible with equal? i.e., does (sort-fn a b) = NIL and (sort-fn b a) = NIL imply (equal a b) = T? Second, you should merge after sort for speed. Third, sort-fn should be named compare or something similar - sds 2013-03-14 16:58


11

Our neighbourhood friendly Lisp guru pointed out the remove-duplicates function.

He also provided the following snippet:

(defun merge-lists (list-a list-b sort-fn test-fn)
    (sort (remove-duplicates (append list-a list-b) :test test-fn) sort-fn))
2008-09-19 06:49
by Antti Rasinen
I knew there was an elegant way of doing this : - dsm 2008-09-19 07:03
This is not elegant - this is quadratic in sum of lengths of the lists instead of linearithmic in the length of the longest list - sds 2013-03-14 16:53


1

I think I would first sort the two lists separately and then merge them with a function that also skips over duplicates. This should be a bit faster as it requires one less traversal of both lists.

P.S.: I doubt it can be done much faster as you basically always need at least one sort and one merge. Perhaps you can combine both in one function, but I wouldn't be surprised if that doesn't make a (big) difference.

2008-09-19 06:38
by mweerden


1

If the lists are sorted before you merge them, they can be merged, duplicate-removed and sorted at the same time. If they are sorted AND duplicate-free, then the merge/sort/duplicate-remove function becomes really trivial.

In fact, it might be better to change your insert function so that it performs a sorted insertion that checks for duplicates. Then you always have sorted lists that are free of duplicates, and merging them is a trivial matter.

Then again, you might prefer to have a fast insert function at the cost of sorting/removing duplicates later on.

2008-09-19 06:54
by aib


0

Wouldn't the remove-duplicates function operate better if the sort was applied before the remove-duplicates?

2008-09-19 10:46
by jdkoftinoff
Probably not. According to HyperSpec it does not require the list to be ordered - Antti Rasinen 2008-09-19 12:54


0

As Antti pointed out, you probably want to leverage REMOVE-DUPLICATES and SORT, though I'd probably use a keyword (or optional argument) for the test function: (defun merge-lists (list-1 list-2 sort-fn &key (test #'eql)) ...) or (defun merge-lists (list-1 list-2 sort-fn &optional (test #'eql) ...)

This way, you won't have to specify the test function (used by REMOVE-DUPLICATES to test for "is these considered duplicates"), unless EQL is not good enough.

2008-11-06 12:05
by Vatine


-2

Sounds like you need to be using Sets.

2008-09-19 06:44
by RKitson
Sets are generally not sorted - Matthew Schinckel 2008-09-19 10:14
RKitson has a point. If it's easy to use sets, then one never has to deal with duplicates in the first place. On the other hand, to use sets one needs to map from the animalcules used by dsm to integers. This sounds nontrivial. I came here with a similar question, and mapping to integers would not have been easy - Bill Evans at Mariposa 2012-01-29 00:59
Ads