Can iterators be reset in Python?
I see many answers suggesting itertools.tee, but that's ignoring one crucial warning in the docs for it:
"This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee()."
Basically, tee is designed for those situations where two (or more) clones of one iterator, while getting out of sync with each other, don't do so by much; rather, they stay in the same vicinity (a few items behind or ahead of each other). It is not suitable for the OP's problem of "redo from the start".
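To illustrate the kind of use tee handles well, here is a minimal sketch in which two clones stay exactly one item apart, so tee only ever has to buffer a single element. The pairwise helper name is just for illustration (recent Python versions ship a similar itertools.pairwise).

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    first, second = tee(iterable)
    next(second, None)  # move the second clone one item ahead
    return zip(first, second)

# The two clones advance in lockstep, never more than one item apart.
pairs = list(pairwise(iter([1, 2, 3, 4])))
```

If one clone instead ran to the end before the other started, tee would have to hold every item in its internal buffer, which is the case the docs warn about.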
L = list(DictReader(...)), on the other hand, is perfectly suitable, as long as the list of dicts can fit comfortably in memory. A new "iterator from the start" (very lightweight and low-overhead) can be made at any time with iter(L), and used in part or in whole without affecting new or existing ones; other access patterns are also easily available.
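A small sketch of that approach, assuming Python 3 and using an in-memory io.StringIO with made-up sample data in place of a real file:

```python
import csv
import io

# Hypothetical stand-in for a real CSV file opened with open(...).
source = io.StringIO("a,b,c,d\n1,2,3,4\n2,3,4,5\n3,4,5,6\n")

rows = list(csv.DictReader(source))  # read the whole file once

first_pass = iter(rows)
next(first_pass)           # consume one row from the first iterator

second_pass = iter(rows)   # a fresh iterator, independent of the first
first_of_second = next(second_pass)  # starts from the beginning again
second_of_first = next(first_pass)   # the first iterator is unaffected
```

Since rows is an ordinary list, indexing, slicing, and len(rows) are available as well.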
As several answers rightly remarked, in the specific case of csv you can also .seek(0) the underlying file object (a rather special case). I'm not sure that's documented and guaranteed, though it does currently work; it would probably be worth considering only for truly huge csv files, for which the list I recommend as the general approach would have too large a memory footprint.
If you have a csv file named blah.csv that looks like
a,b,c,d
1,2,3,4
2,3,4,5
3,4,5,6
you know that you can open the file for reading, and create a DictReader with
import csv
blah = open('blah.csv', 'r')
reader = csv.DictReader(blah)
Then, you will be able to get the next line with reader.next() (next(reader) in Python 3), which should output
{'a': '1', 'b': '2', 'c': '3', 'd': '4'}
Using it again will produce
{'a': '2', 'b': '3', 'c': '4', 'd': '5'}
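However, at this point if you use blah.seek(0), the reader starts again from the top of the file. Because the DictReader has already taken its field names from the header, the next call to reader.next() returns the header line itself as a row,
{'a': 'a', 'b': 'b', 'c': 'c', 'd': 'd'}
and the call after that gives you
{'a': '1', 'b': '2', 'c': '3', 'd': '4'}
again.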
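For what it's worth, here is a Python 3 sketch of the seek(0) approach. It uses io.StringIO as a stand-in for the real file (my assumption, so that it runs without blah.csv on disk). Note that the DictReader keeps the field names it has already read, so the first row after the seek is the header line parsed as data; the sketch reads it separately and discards it.

```python
import csv
import io

# In-memory stand-in for open('blah.csv', 'r').
blah = io.StringIO("a,b,c,d\n1,2,3,4\n2,3,4,5\n3,4,5,6\n")
reader = csv.DictReader(blah)

first = next(reader)    # row "1,2,3,4"
second = next(reader)   # row "2,3,4,5"

blah.seek(0)                # rewind the underlying file object
header_row = next(reader)   # the header line, returned as a data row
first_again = next(reader)  # back at row "1,2,3,4"
```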
This seems to be the functionality you're looking for. I'm sure there are some tricks associated with this approach that I'm not aware of, however. @Brian suggested simply creating another DictReader. This won't work if your first reader is halfway through reading the file, as your new reader will have unexpected keys and values from wherever you are in the file.
Can iterators be reset in Python?
No. Python's iterator protocol is very simple: it provides only a single method for advancing (.next() in Python 2, __next__() in Python 3), and no method to reset an iterator in general.
The common pattern is to instead create a new iterator using the same procedure again.
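As a minimal sketch of that pattern, wrap whatever builds the iterator in a function and call it again when you need to start over. The make_numbers name and its contents are hypothetical, purely for illustration.

```python
def make_numbers():
    """Hypothetical factory: every call builds a brand-new iterator."""
    return (n * n for n in range(4))

it = make_numbers()
consumed = list(it)   # runs the iterator to exhaustion
leftover = list(it)   # an exhausted iterator yields nothing more

fresh = list(make_numbers())  # same procedure again, new iterator
```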
If you want to "save off" an iterator so that you can go back to its beginning, you may also fork the iterator by using itertools.tee.