Iteratator Design
Q: Why would anyone use iterators when there is a foreach construct?
-
// examples in the D programming language
-
// http://digitalmars.com/d/
-
auto iter = collection.iterator;
-
while(iter.more) {
-
auto v = iter.next;
-
// …
-
}
-
foreach (v; collection) {
-
…
-
}
Answers:
1. Enumerable is the only common interface.
Why? Mixing I/O streams and containers.
Solution:
Make ‘enumerable’ compatible with the foreach construct (ie. it implements opApply).
2. Iterators are used to transform the data structure (ala c++ stl).
Solution A:
Have mutable iteration either as default or as an extra call.
-
foreach (inout i; collection.mutating)
-
i++;
Two reasons to have them as an extra protocol:
- The normal interface should be used for guarded operations
- Some collections (eg. sets) should not be able to be mutated in such a manner. Although this can be overcome by the default iterator itself, it supplies a consistent interface.
Solution B:
But what about the “walk of death?” One idea I’ve seen is a purging iterator.
-
foreach (inout purge, inout i; collection.purging) {
-
if (i.stale)
-
purge = true;
-
else
-
i++;
-
}
3. Iterators are used as markers.
Why? It Gives constant time access to elements in expensive constructs, such as
linked lists.
-
list.collection.insert(i, b);
This kind of operation greatly speeds insertion in middle of lists; it is
otherwise a O(n) operation: the cost to move every element in an array-based
list, as well as the cost of finding position i in a linked list.
This should not be used with containers with no concept of order, such as
hash-based constructs.
Solution A:
Make an easy way to iterate from one iterator to another or to the end of a container
-
foreach (i; Range[iter1..iterEnd])
-
{ … }
-
-
foreach (i; Range[iter])
-
{ … }
-
-
foreach (iter, i; IteratorRange[iter1..iterEnd])
-
{ … }
Solution B:
Make it easy to use the iterator in connection with the collection.
-
collection.insertBefore(iter);
Solution C:
Make it easy to re-get the value from an iterator.
-
iter.value
Solution D:
In light of Solutions A-C, rename Iterator to Marker and have it non-enumerable (ie. it only has a value() method.) This, however, will break compatibility with I/O iterators; therefore, it will probably never happen.
4. The iterators are wrapped into transformations or chains.
Why? Self evident-but highly useful.
5. Parallel Iteration
Solution:
Why? It just cannot be done in a foreach construct.
6. Speed.
Is this really an issue? Some of the solutions above assume a speed aspect (2A,B and 3B, for example).
Conclusions:
Looking at these answers, I can see which ones are likely to be useful, at least used most often. Iterators are not cursors (markers). The concern of speed whilst inserting into the middle of a data structure should govern the use of the data structure. Of course, if the data structure does not support that crucial operation it has lost value; but in the case of the a structure like a deque is more appropriate.
Markers are out, but I have learned some useful things in this little exercize.
Wednesday, July 9th, 2008 @ 1:04 pm