Iteratee
In functional programming, an iteratee is a composable abstraction for incrementally processing sequentially presented chunks of input data in a purely functional fashion. With iteratees, it is possible to lazily transform how a resource will emit data, for example, by converting each chunk of the input to uppercase as they are retrieved or by limiting the data to only the five first chunks without loading the whole input data into memory. Iteratees are also responsible for opening and closing resources, providing predictable resource management.
On each step, an iteratee is presented with one of three possible types of values: the next chunk of data, a value to indicate no data is available, or a value to indicate the iteration process has finished. It may return one of three possible types of values, to indicate to the caller what should be done next: one that means "stop", one that means "continue", and one that means "signal an error". The latter types of values in effect represent the possible "states" of an iteratee. An iteratee would typically start in the "continue" state.
Iteratees are used in Haskell and Scala, and are also available for F#. Various slightly different implementations of iteratees exist. For example, in the Play framework, they involve Futures so that asynchronous processing can be performed.
Because iteratees are called by other code which feeds them with data, they are an example of inversion of control. However, unlike many other examples of inversion of control such as SAX XML parsing, the iteratee retains a limited amount of control over the process. It cannot reverse back and look at previous data, but it can stop the process cleanly without throwing an exception.
Commonly associated abstractions
The following abstractions are not strictly speaking necessary to work with iteratees, but they do make it more convenient.Enumerators
An Enumerator is a convenient abstraction for feeding data into an iteratee from an arbitrary data source. Typically the enumerator will take care of any necessary resource cleanup associated with the data source. Because the enumerator knows exactly when the iteratee has finished reading data, it will do the resource cleanup at exactly the right time – neither too early nor too late. However, it can do this without needing to know about, or being co-located to, the implementation of the iteratee – so enumerators and iteratees form an example of separation of concerns.Enumeratees
An Enumeratee is a convenient abstraction for transforming the output of either an enumerator or iteratee, and feeding that output to an iteratee. For example, a "map" enumeratee would map a function over each input chunk.Motivations
Iteratees were created due to problems with existing purely functional solutions to the problem of making input/output composable yet correct. Lazy I/O in Haskell allowed pure functions to operate on data on disk as if it were in memory, without explicitly doing I/O at all after opening the file - a kind of memory-mapped file feature - but because it was impossible in general for the runtime to know whether the file or other resource was still needed, excessive numbers of files could be left open unnecessarily, resulting in file descriptor exhaustion at the operating system level. Traditional C-style I/O, on the other hand, was too low-level and required the developer to be concerned with low-level details such as the current position in the file, which hindered composability. Iteratees and enumerators combine the high-level functional programming benefits of lazy I/O, with the ability to control resources and low-level details where necessary afforded by C-style I/O.Examples
Uses
Iteratees are used in the Play framework to push data out to long-running Comet and WebSocket connections to web browsers.Iteratees may also be used to perform incremental parsing, for example of JSON.
It is important to note, however, that iteratees are a very general abstraction and can be used for arbitrary kinds of sequential information processing - and need not involve any I/O at all. This makes it easy to repurpose an iteratee to work on an in-memory dataset instead of data flowing in from the network.
History
In a sense, a distant predecessor of the notion of an enumerator pushing data into a chain of one or more iteratees, was the pipeline concept in operating systems. However, unlike a typical pipeline, iteratees are not separate processes - or even separate threads, although they can perform work in a similar manner to a chain of worker threads sending messages to each other. This means that iteratees are more lightweight than processes or threads - unlike the situations with separate processes or threads, no extra stacks are needed.Iteratees and enumerators were invented by Oleg Kiselyov for use in Haskell. Later, they were introduced into Scalaz and into Play Framework 2.0.
Formal semantics
Iteratees have been formally modelled as free monads, allowing equational laws to be validated, and employed to optimise programs using iteratees.Alternatives
- Iterators may be used instead of iteratees in Scala, but they are imperative, so are not a purely functional solution.
- In Haskell, two alternative abstractions known as Conduits and Pipes have been developed.. Conduits in particular are associated with substantially richer libraries of primitives and combinators than iteratees; conduit adapters for incremental functionalities such as parsing HTML, XML, generalised parsing, making HTTP requests and processing the responses, exist, making conduits more suitable than iteratees for industrial software development in Haskell, out of the box.
- There is also a high-level abstraction named . In Scala there is a package called , whose ancestry can be traced back to machines via several ports, renames and refactors.
- In Haskell, the package exists. It provides a simpler solution to some of the same problems, which essentially involves being "strict enough" to pull all data that is required, or might be required, through a pipeline which takes care of cleaning up the resources on completion.