How to slice a deque?

Go To StackoverFlow.com

23

I've changed some code that used a list to using a deque. I can no longer slice into it, as I get the error:

TypeError: sequence index must be integer, not 'slice'

Here's a REPL that shows the problem.

>>> import collections
>>> d = collections.deque()
>>> for i in range(3):
...     d.append(i)
...
>>> d
deque([0, 1, 2])
>>> d[2:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence index must be integer, not 'slice'

So, is there a workaround to support slicing into deques in Python?

2012-04-03 23:56
by Drew Noakes
@HunterMcMillen, thank you for your comment. I'm using a deque for performance (as a circular buffer actually) and so copying data to a list each cycle isn't ideal - Drew Noakes 2012-04-04 00:02
rel: http://stackoverflow.com/questions/7064289/use-slice-notation-with-collections-dequ - georg 2012-04-04 00:08
@thg435, thanks for the cross reference. I Googled the error string and didn't find anything on SO, hence posting this new question. Some good insights there as well - Drew Noakes 2012-04-04 00:17


26

Try itertools.islice().

 deque_slice = collections.deque(itertools.islice(my_deque, 10, 20))

Indexing into a deque requires following a linked list from the beginning each time, so the islice() approach, skipping items to get to the start of the slice, will give the best possible performance (better than coding it as an index operation for each element).

You could easily write a deque subclass that does this automagically for you.

class sliceable_deque(collections.deque):
    def __getitem__(self, index):
        if isinstance(index, slice):
            return type(self)(itertools.islice(self, index.start,
                                               index.stop, index.step))
        return collections.deque.__getitem__(self, index)

Note that you can't use negative indices or step values with islice. It's possible to code around this, and might be worthwhile to do so if you take the subclass approach. For negative start or stop you can just add the length of the deque; for negative step, you'll need to throw a reversed() in there somewhere. I'll leave that as an exercise. :-)

The performance of retrieving individual items from the deque will be slightly reduced by the if test for the slice. If this is an issue, you can use an EAFP pattern to ameliorate this somewhat -- at the cost of making the slice path slightly less performant due to the need to process the exception:

class sliceable_deque(collections.deque):
    def __getitem__(self, index):
        try:
            return collections.deque.__getitem__(self, index)
        except TypeError:
            return type(self)(itertools.islice(self, index.start,
                                               index.stop, index.step))

Of course there's an extra function call in there still, compared to a regular deque, so if you really care about performance, you really want to add a separate slice() method or the like.

2012-04-04 00:04
by kindall
I'm constantly amazed at how magical the itertools module is. Seems I keep learning more and more about it every time I review SO - jdi 2012-04-04 00:18
My big go-to modules are itertools, functools, and collections. So much power - kindall 2012-04-04 00:20
Thanks, I was using negative indexing on a variable length buffer, but it was fixed length per instance. So, I flipped my logic around and made it positive indexing per instance. A tradeoff that was more than smart - RobotHumans 2017-07-03 23:10
I would expect a sliceable deque to implement __setitem__ and __delitem__ for slices too. Quick question, why do you return type(self)(...) in the except: branch (which suggests you want cooperative inheritance), but you reference collections.deque explicitly in the try: block instead of using super (which suggests you don't want cooperative inheritance). Am I missing something - wim 2018-04-24 04:49


8

If performance is a concern, consider a direct access/comprehension method as suggested in this answer. It's much faster than islice on large collections:

import timeit

setup = """
import collections, itertools
d = collections.deque(range(10000))
"""

print timeit.timeit('list(itertools.islice(d, 9000, 9010))', setup, number=10000)
## 0.631947040558
print timeit.timeit('[d[i] for i in range(9000, 9010)]', setup, number=10000)
## 0.0292208194733

As per @RaymondHettinger comment below, the comprehension method is only better when slices are short. On longer slices, islice convincingly wins. For example, here are timings for slicing a 10,000 items deque from the offset 6000:

offset  length      islice       compr
 6000      10      400.496      46.611
 6000      50      424.600     183.988
 6000      90      432.277     237.894
 6000     130      441.289     352.383
 6000     170      431.299     404.596
 6000     210      456.405     546.503
 6000     250      448.895     575.995
 6000     290      485.802     778.294
 6000     330      483.704     781.703
 6000     370      490.904     948.501
 6000     410      500.011     875.807
 6000     450      508.213    1045.299
 6000     490      518.894    1010.203
 6000     530      530.887    1192.784
 6000     570      534.415    1151.013
 6000     610      530.887    1504.779
 6000     650      539.279    1486.802
 6000     690      536.084    1650.810
 6000     730      549.698    1454.687
 6000     770      564.909    1576.114
 6000     810      545.001    1588.297
 6000     850      564.504    1711.607
 6000     890      584.197    1760.793
 6000     930      564.480    1963.091
 6000     970      586.390    1955.199
 6000    1010      590.706    2117.491

The comprehension does first few slices very fast, but the performance falls down dramatically as the length grows. islice is slower on smaller slices, but its average speed is much better.

This is how I tested:

import timeit

size = 10000
repeats = 100

setup = """
import collections, itertools
d = collections.deque(range(%d))
""" % size

print '%5s\t%5s\t%10s\t%10s' % ('offset', 'length', 'islice', 'compr')

for offset in range(0, size - 2000, 2000):
    for length in range(10, 2000, 40):
        t1 = timeit.timeit('list(itertools.islice(d, %d, %d))' % (offset, offset + length), setup, number=repeats)
        t2 = timeit.timeit('[d[i] for i in range(%d, %d)]' % (offset, offset + length), setup, number=repeats)
        print '%5d\t%5d\t%10.3f\t%10.3f' % (offset, length, t1 * 100000, t2  * 100000)
2012-04-04 00:27
by georg
And of course this could be wrapped in an overridden __getitem__ method as well - aaronasterling 2012-04-04 00:30
That's a really surprising result - kindall 2012-04-04 03:28
This answer is really misleading and draws erroneous conclusions from the timing. Indexing using d[i] is generally fast because the lookups strides the deque 62 elements at time and because it is smart enough to traverse from the right when i > len(d)//2. However, when you need a longer slice, then indexing one-at-a-time is a bad idea. For example, you would get a much different answer if you tried this timing for the slice 9000:10000 in a deque of 30000 elements - Raymond Hettinger 2012-04-04 08:00
Hi @RaymondHettinger, thanks for your comment! I edited the post to reflect your point - georg 2012-04-04 10:52
Didn't realize that deque uses a stride technique, that's nice to know - kindall 2012-04-04 22:13
Ads