Async Entity Component System based on the ideas of specs (https://github.com/amethyst/specs)
# Parallel Iterators

These are some notes on the design of the parallel iterator traits.
This file does not describe how to **use** parallel iterators.

## The challenge

Parallel iterators are more complicated than sequential iterators.
The reason is that they have to be able to split themselves up and
operate in parallel across the two halves.

The current design for parallel iterators has two distinct modes in
which they can be used; as we will see, not all iterators support both
modes (which is why there are two):

- **Pull mode** (the `Producer` and `UnindexedProducer` traits): in this mode,
  the iterator is asked to produce the next item using a call to `next`. This
  is basically like a normal iterator, but with a twist: you can split the
  iterator in half to produce disjoint items in separate threads.
  - in the `Producer` trait, splitting is done with `split_at`, which accepts
    an index where the split should be performed. Only indexed iterators can
    work in this mode, as they know exactly how much data they will produce,
    and how to locate the requested index.
  - in the `UnindexedProducer` trait, splitting is done with `split`, which
    simply requests that the producer divide itself *approximately* in half.
    This is useful when the exact length and/or layout is unknown, as with
    `String` characters, or when the length might exceed `usize`, as with
    `Range<u64>` on 32-bit platforms.
  - In theory, any `Producer` could act unindexed, but we don't currently
    use that possibility. When you know the exact length, a `split` can
    simply be implemented as `split_at(length/2)`.
- **Push mode** (the `Consumer` and `UnindexedConsumer` traits): in
  this mode, the iterator instead is *given* each item in turn, which
  is then processed. This is the opposite of a normal iterator. It's
  more like a `for_each` call: each time a new item is produced, the
  `consume` method is called with that item. (The traits themselves are
  a bit more complex, as they support state that can be threaded
  through and ultimately reduced.) Like producers, there are two
  variants of consumers. The difference is how the split is performed:
  - in the `Consumer` trait, splitting is done with `split_at`, which
    accepts an index where the split should be performed. All
    consumers can work in this mode. The resulting halves thus have an
    idea about how much data they expect to consume.
  - in the `UnindexedConsumer` trait, splitting is done with
    `split_off_left`. There is no index: the resulting halves must be
    prepared to process any amount of data, and they don't know where that
    data falls in the overall stream.
    - Not all consumers can operate in this mode. It works for
      `for_each` and `reduce`, for example, but it does not work for
      `collect_into_vec`, since in that case the position of each item is
      important for knowing where it ends up in the target collection.

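
To make the two pull-mode splitting styles concrete, here is a minimal sketch. The traits `ToyProducer` and `ToyUnindexedProducer` and the types `SliceProducer` and `CharsProducer` are hypothetical stand-ins invented for this note. They are much simpler than the real traits and only illustrate the difference between splitting at an exact index and splitting approximately in half.

```rust
/// Toy indexed producer: it can be split at an exact index.
/// (Hypothetical trait for illustration, not the real `Producer`.)
trait ToyProducer: Sized {
    type Item;
    fn split_at(self, index: usize) -> (Self, Self);
    fn pull_all(self) -> Vec<Self::Item>;
}

/// Toy unindexed producer: it can only be asked to split "roughly in half",
/// and it may decline by returning `None` for the second half.
trait ToyUnindexedProducer: Sized {
    type Item;
    fn split(self) -> (Self, Option<Self>);
    fn pull_all(self) -> Vec<Self::Item>;
}

struct SliceProducer<'a>(&'a [i32]);

impl<'a> ToyProducer for SliceProducer<'a> {
    type Item = i32;

    fn split_at(self, index: usize) -> (Self, Self) {
        // A slice knows its length and layout, so an exact split is cheap.
        let (left, right) = self.0.split_at(index);
        (SliceProducer(left), SliceProducer(right))
    }

    fn pull_all(self) -> Vec<i32> {
        self.0.to_vec()
    }
}

struct CharsProducer<'a>(&'a str);

impl<'a> ToyUnindexedProducer for CharsProducer<'a> {
    type Item = char;

    fn split(self) -> (Self, Option<Self>) {
        // We do not know how many chars there are, only how many bytes,
        // so aim for the middle byte and move forward to a char boundary.
        let mut mid = self.0.len() / 2;
        while !self.0.is_char_boundary(mid) {
            mid += 1;
        }
        if mid == 0 || mid == self.0.len() {
            return (self, None); // too small to split
        }
        let (left, right) = self.0.split_at(mid);
        (CharsProducer(left), Some(CharsProducer(right)))
    }

    fn pull_all(self) -> Vec<char> {
        self.0.chars().collect()
    }
}
```

In this sketch each half owns a disjoint view of the data, so the two halves could be handed to separate threads and pulled independently.
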
## How iterator execution proceeds

We'll walk through this example iterator chain to start. This chain
demonstrates more-or-less the full complexity of what can happen.

```rust
vec1.par_iter()
    .zip(vec2.par_iter())
    .flat_map(some_function)
    .for_each(some_other_function)
```

To handle an iterator chain, we start by creating consumers. This
works from the end. So in this case, the call to `for_each` is the
final step, so it will create a `ForEachConsumer` that, given an item,
just calls `some_other_function` with that item. (`ForEachConsumer` is
a very simple consumer because it doesn't need to thread any state
between items at all.)

Now, the `for_each` call will pass this consumer to the base iterator,
which is the `flat_map`. It will do this by calling the `drive_unindexed`
method on the `ParallelIterator` trait. `drive_unindexed` basically
says "produce items for this iterator and feed them to this consumer";
it only works for unindexed consumers.

(As an aside, it is interesting that only some consumers can work in
unindexed mode, but all producers can *drive* an unindexed consumer.
In contrast, only some producers can drive an *indexed* consumer, but
all consumers can be supplied indexes. Isn't variance neat?)

As it happens, `FlatMap` only works with unindexed consumers anyway.
This is because flat-map basically has no idea how many items it will
produce. If you ask flat-map to produce the 22nd item, it can't do it,
at least not without some intermediate state. It doesn't know whether
processing the first item will create 1 item, 3 items, or 100;
therefore, to produce an arbitrary item, it would basically just have
to start at the beginning and execute sequentially, which is not what
we want. But for unindexed consumers, this doesn't matter, since they
don't need to know how much data they will get.

Therefore, `FlatMap` can wrap the `ForEachConsumer` with a
`FlatMapConsumer` that feeds to it. This `FlatMapConsumer` will be
given one item. It will then invoke `some_function` to get a parallel
iterator out. It will then ask this new parallel iterator to drive the
`ForEachConsumer`. The `drive_unindexed` method on `flat_map` can then
pass the `FlatMapConsumer` up the chain to the previous item, which is
`zip`. At this point, something interesting happens.

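
The consumer wrapping described above can be sketched in a few lines. This is a sequential toy under stated assumptions: the `ToyConsumer` trait and the `drive` helper are hypothetical, there is no splitting or reduction, and the inner "parallel iterator" returned by the map operation is replaced by an ordinary `IntoIterator`. The struct names mirror the ones in the text, but these are not the real definitions.

```rust
/// Toy push-mode consumer: it is handed items one at a time.
/// (Hypothetical trait; the real one also supports splitting and reduction.)
trait ToyConsumer<T> {
    fn consume(&mut self, item: T);
}

/// The final step of the chain: just call a function on each item.
struct ForEachConsumer<F>(F);

impl<T, F: FnMut(T)> ToyConsumer<T> for ForEachConsumer<F> {
    fn consume(&mut self, item: T) {
        (self.0)(item)
    }
}

/// Wraps another consumer: each incoming item is expanded by `map_op`,
/// and every resulting inner item is pushed on to `base`.
struct FlatMapConsumer<C, F> {
    base: C,
    map_op: F,
}

impl<T, U, I, C, F> ToyConsumer<T> for FlatMapConsumer<C, F>
where
    F: FnMut(T) -> I,
    I: IntoIterator<Item = U>,
    C: ToyConsumer<U>,
{
    fn consume(&mut self, item: T) {
        for inner in (self.map_op)(item) {
            self.base.consume(inner);
        }
    }
}

/// Stand-in for `drive_unindexed`: feed every item to the consumer.
fn drive<T, C: ToyConsumer<T>>(items: impl IntoIterator<Item = T>, consumer: &mut C) {
    for item in items {
        consumer.consume(item);
    }
}

/// Builds the chain from the end, as the text describes: first the
/// for-each consumer, then the flat-map consumer wrapped around it.
fn flat_map_demo() -> Vec<u32> {
    let mut seen = Vec::new();
    let for_each = ForEachConsumer(|x: u32| seen.push(x));
    let mut flat_map = FlatMapConsumer { base: for_each, map_op: |n: u32| 0..n };
    drive(vec![1, 3, 2], &mut flat_map);
    seen
}
```

Note that nothing in `FlatMapConsumer` needs to know how many inner items each input expands to, which is why it fits the unindexed style.
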
## Switching from push to pull mode

If you think about `zip`, it can't really be implemented as a
consumer, at least not without an intermediate thread and some
channels or something (or maybe coroutines). The problem is that it
has to walk two iterators *in lockstep*. Basically, it can't call two
`drive` methods simultaneously; it can only call one at a time. So at
this point, the `zip` iterator needs to switch from *push mode* into
*pull mode*.

You'll note that `Zip` is only usable if its inputs implement
`IndexedParallelIterator`, meaning that they can produce data starting
at random points in the stream. This need to switch to pull mode is
exactly why. If we want to split a zip iterator at position 22, we
need to be able to start zipping items from index 22 right away,
without having to start from index 0.

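
As a rough illustration of why both inputs must be indexed, the sketch below zips two slices and splits them at the same index. `ZipSlices` and `zip_tail` are hypothetical names made up for this example; the point is only that the back half can start zipping at `index` immediately, without visiting the earlier items.

```rust
/// Toy zip producer over two slices. Both sides must support splitting at
/// the same exact index, which is what being "indexed" buys us.
struct ZipSlices<'a, A, B> {
    left: &'a [A],
    right: &'a [B],
}

impl<'a, A, B> ZipSlices<'a, A, B> {
    /// Number of pairs this producer will yield.
    fn len(&self) -> usize {
        self.left.len().min(self.right.len())
    }

    /// Split both inputs at `index`; the halves can then be zipped
    /// independently, for example on different threads.
    fn split_at(self, index: usize) -> (Self, Self) {
        debug_assert!(index <= self.len());
        let (left_front, left_back) = self.left.split_at(index);
        let (right_front, right_back) = self.right.split_at(index);
        (
            ZipSlices { left: left_front, right: right_front },
            ZipSlices { left: left_back, right: right_back },
        )
    }

    /// Pull mode: walk both inputs in lockstep.
    fn into_iter(self) -> impl Iterator<Item = (&'a A, &'a B)> {
        self.left.iter().zip(self.right.iter())
    }
}

/// Zips only the items from `index` onward, never touching the front half.
fn zip_tail(index: usize) -> Vec<(u32, char)> {
    let nums = [10u32, 20, 30, 40];
    let letters = ['a', 'b', 'c', 'd'];
    let zip = ZipSlices { left: &nums[..], right: &letters[..] };
    let (_front, back) = zip.split_at(index);
    let tail: Vec<(u32, char)> = back.into_iter().map(|(n, c)| (*n, *c)).collect();
    tail
}
```
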

Anyway, so at this point, the `drive_unindexed` method for `Zip` stops
creating consumers. Instead, it creates a *producer*, a `ZipProducer`,
to be exact, and calls the `bridge` function in the `internals`
module. Creating a `ZipProducer` will in turn create producers for
the two iterators being zipped. This is possible because they both
implement `IndexedParallelIterator`.

The `bridge` function will then connect the consumer, which is
handling the `flat_map` and `for_each`, with the producer, which is
handling the `zip` and its predecessors. It will split down until the
chunks seem reasonably small, then pull items from the producer and
feed them to the consumer.

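
The "split until small, then pull and feed" loop can be sketched as follows. This is not the real `bridge`: `toy_bridge`, `SEQUENTIAL_CUTOFF`, and `sum_of_squares` are hypothetical, the producer is just a slice, the consumer is a plain closure with no reduction step, and scoped threads from the standard library stand in for the thread pool.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical threshold below which we stop splitting.
const SEQUENTIAL_CUTOFF: usize = 4;

/// Toy bridge: split the producer (here a slice) until the chunks are
/// small, then pull items sequentially and feed them to the consumer.
fn toy_bridge<T, C>(items: &[T], consumer: &C)
where
    T: Sync,
    C: Fn(&T) + Sync,
{
    if items.len() <= SEQUENTIAL_CUTOFF {
        // Base case: pull from the producer, push into the consumer.
        for item in items {
            consumer(item);
        }
        return;
    }
    let (left, right) = items.split_at(items.len() / 2);
    thread::scope(|s| {
        // One half runs on a new scoped thread, the other on this thread.
        s.spawn(|| toy_bridge(left, consumer));
        toy_bridge(right, consumer);
    });
}

/// Example consumer: accumulate squares into a shared atomic counter.
fn sum_of_squares(values: &[u64]) -> u64 {
    let total = AtomicU64::new(0);
    toy_bridge(values, &|v: &u64| {
        total.fetch_add(v * v, Ordering::Relaxed);
    });
    total.load(Ordering::Relaxed)
}
```

Spawning a thread per split is only acceptable in a toy; it is used here to keep the example free of dependencies.
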
## The base case

The other time that `bridge` gets used is when we bottom out in an
indexed producer, such as a slice or range. There is also a
`bridge_unindexed` equivalent for - you guessed it - unindexed producers,
such as string characters.

<a name="producer-callback"></a>

## What on earth is `ProducerCallback`?

We saw that when you call a parallel action method like
`par_iter.reduce()`, that will create a "reducing" consumer and then
invoke `par_iter.drive_unindexed()` (or `par_iter.drive()`) as
appropriate. This may create yet more consumers as we proceed up the
parallel iterator chain. But at some point we're going to get to the
start of the chain, or to a parallel iterator (like `zip()`) that has
to coordinate multiple inputs. At that point, we need to start
converting parallel iterators into producers.

The way we do this is by invoking the method `with_producer()`, defined on
`IndexedParallelIterator`. This is a callback scheme. In an ideal world,
it would work like this:

```rust
base_iter.with_producer(|base_producer| {
    // here, `base_producer` is the producer for `base_iter`
});
```

In that case, we could implement a combinator like `map()` by getting
the producer for the base iterator, wrapping it to make our own
`MapProducer`, and then passing that to the callback. Something like
this:

```rust
// Idealized sketch: this is how it would look if closures could be used.
struct MapProducer<'f, P, F: 'f> {
    base: P,
    map_op: &'f F,
}

impl<I, F> IndexedParallelIterator for Map<I, F>
    where I: IndexedParallelIterator,
          F: MapOp<I::Item>,
{
    fn with_producer<CB>(self, callback: CB) -> CB::Output {
        let map_op = &self.map_op;
        self.base_iter.with_producer(|base_producer| {
            // Here `base_producer` is the producer for `self.base_iter`.
            // Wrap that to make a `MapProducer`
            let map_producer = MapProducer {
                base: base_producer,
                map_op: map_op
            };

            // invoke the callback with the wrapped version
            callback(map_producer)
        })
    }
}
```

This example demonstrates some of the power of the callback scheme.
It winds up being a very flexible setup. For one thing, it means we
can take ownership of `par_iter`; we can then in turn give ownership
of its bits and pieces away to the producer (this is very useful if
the iterator owns an `&mut` slice, for example), or create shared
references and put *those* in the producer. In the case of map, for
example, the parallel iterator owns the `map_op`, and we borrow
references to it which we then put into the `MapProducer` (this means
the `MapProducer` can easily split itself and share those references).
The `with_producer` method can also create resources that are needed
during the parallel execution, since the producer does not have to be
returned.

Unfortunately there is a catch. We can't actually use closures the way
I showed you. To see why, think about the type that `map_producer`
would have to have. If we were going to write the `with_producer`
method using a closure, it would have to look something like this:

```rust
pub trait IndexedParallelIterator: ParallelIterator {
    type Producer;
    fn with_producer<CB, R>(self, callback: CB) -> R
        where CB: FnOnce(Self::Producer) -> R;
    // ...
}
```

Note that we had to add this associated type `Producer` so that
we could specify the argument of the callback to be `Self::Producer`.
Now, imagine trying to write the impl for `Map` using this style:

```rust
impl<I, F> IndexedParallelIterator for Map<I, F>
    where I: IndexedParallelIterator,
          F: MapOp<I::Item>,
{
    type Producer = MapProducer<'f, I::Producer, F>;
    //                          ^^ wait, what is this `'f`?

    fn with_producer<CB, R>(self, callback: CB) -> R
        where CB: FnOnce(Self::Producer) -> R
    {
        let map_op = &self.map_op;
        //  ^^^^^^ `'f` is (conceptually) the lifetime of this reference,
        //         so it will be different for each call to `with_producer`!
    }
}
```

This may look familiar to you: it's the same problem that we have
trying to define an `Iterable` trait. Basically, the producer type
needs to include a lifetime (here, `'f`) that refers to the body of
`with_producer` and hence is not in scope at the impl level.

If we had [associated type constructors][1598], we could solve this
problem that way. But there is another solution. We can use a
dedicated callback trait like `ProducerCallback`, instead of `FnOnce`:

[1598]: https://github.com/rust-lang/rfcs/pull/1598

```rust
pub trait ProducerCallback<T> {
    type Output;
    fn callback<P>(self, producer: P) -> Self::Output
        where P: Producer<Item=T>;
}
```

Using this trait, the signature of `with_producer()` looks like this:

```rust
fn with_producer<CB: ProducerCallback<Self::Item>>(self, callback: CB) -> CB::Output;
```

Notice that this signature **never has to name the producer type** --
there is no associated type `Producer` anymore. This is because the
`callback()` method is generic over **all** producers `P`.

The problem is that now the `||` sugar doesn't work anymore. So we
have to manually create the callback struct, which is a mite tedious.
So our `MapProducer` code looks like this:

```rust
impl<I, F> IndexedParallelIterator for Map<I, F>
    where I: IndexedParallelIterator,
          F: MapOp<I::Item>,
{
    fn with_producer<CB>(self, callback: CB) -> CB::Output
        where CB: ProducerCallback<Self::Item>
    {
        return self.base.with_producer(Callback { callback: callback, map_op: self.map_op });
        //                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        //                             Manual version of the closure sugar: create an instance
        //                             of a struct that implements `ProducerCallback`.

        // The struct declaration. Each field is something that we need to
        // capture from the creating scope.
        struct Callback<CB, F> {
            callback: CB,
            map_op: F,
        }

        // Implement the `ProducerCallback` trait. This is pure boilerplate.
        impl<T, F, CB> ProducerCallback<T> for Callback<CB, F>
            where F: MapOp<T>,
                  CB: ProducerCallback<F::Output>
        {
            type Output = CB::Output;

            fn callback<P>(self, base: P) -> CB::Output
                where P: Producer<Item=T>
            {
                // The body of the closure is here:
                let producer = MapProducer { base: base,
                                             map_op: &self.map_op };
                self.callback.callback(producer)
            }
        }
    }
}
```

OK, a bit tedious, but it works!
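
For readers who want to see the whole scheme compile end to end, here is a self-contained sketch. It reuses the names `Producer`, `ProducerCallback`, and `IndexedParallelIterator`, but these are heavily simplified stand-ins and not the real traits: `Producer` only has a hypothetical `into_vec` method, there is no splitting, and a plain `Fn` bound replaces `MapOp`. The types `VecIter`, `BorrowedProducer`, and `CollectCallback` and the function `doubled` are likewise made up for the example.

```rust
/// Simplified stand-in: a producer can only be drained sequentially.
trait Producer {
    type Item;
    fn into_vec(self) -> Vec<Self::Item>;
}

/// The callback is generic over *all* producers `P`, so no trait ever
/// has to name a producer type (or its borrowed lifetime).
trait ProducerCallback<T> {
    type Output;
    fn callback<P: Producer<Item = T>>(self, producer: P) -> Self::Output;
}

trait IndexedParallelIterator: Sized {
    type Item;
    fn with_producer<CB: ProducerCallback<Self::Item>>(self, callback: CB) -> CB::Output;
}

/// Base iterator that owns its data.
struct VecIter(Vec<i32>);

/// Producer that borrows from data owned by `with_producer`'s stack frame.
struct BorrowedProducer<'a>(&'a [i32]);

impl<'a> Producer for BorrowedProducer<'a> {
    type Item = i32;
    fn into_vec(self) -> Vec<i32> {
        self.0.to_vec()
    }
}

impl IndexedParallelIterator for VecIter {
    type Item = i32;
    fn with_producer<CB: ProducerCallback<Self::Item>>(self, callback: CB) -> CB::Output {
        // The lifetime of this borrow is local to this call.
        callback.callback(BorrowedProducer(&self.0))
    }
}

struct Map<I, F> {
    base: I,
    map_op: F,
}

struct MapProducer<'f, P, F> {
    base: P,
    map_op: &'f F,
}

impl<'f, P, F, R> Producer for MapProducer<'f, P, F>
where
    P: Producer,
    F: Fn(P::Item) -> R,
{
    type Item = R;
    fn into_vec(self) -> Vec<R> {
        self.base.into_vec().into_iter().map(self.map_op).collect()
    }
}

impl<I, F, R> IndexedParallelIterator for Map<I, F>
where
    I: IndexedParallelIterator,
    F: Fn(I::Item) -> R,
{
    type Item = R;
    fn with_producer<CB: ProducerCallback<Self::Item>>(self, callback: CB) -> CB::Output {
        return self.base.with_producer(Callback { callback, map_op: self.map_op });

        // Hand-written "closure": captures the outer callback and the map op.
        struct Callback<CB, F> {
            callback: CB,
            map_op: F,
        }

        impl<T, R, F, CB> ProducerCallback<T> for Callback<CB, F>
        where
            F: Fn(T) -> R,
            CB: ProducerCallback<R>,
        {
            type Output = CB::Output;
            fn callback<P: Producer<Item = T>>(self, base: P) -> CB::Output {
                let producer = MapProducer { base, map_op: &self.map_op };
                self.callback.callback(producer)
            }
        }
    }
}

/// Final action: drain whatever producer we are handed.
struct CollectCallback;

impl<T> ProducerCallback<T> for CollectCallback {
    type Output = Vec<T>;
    fn callback<P: Producer<Item = T>>(self, producer: P) -> Vec<T> {
        producer.into_vec()
    }
}

fn doubled() -> Vec<i32> {
    let iter = Map { base: VecIter(vec![1, 2, 3]), map_op: |x: i32| x * 2 };
    iter.with_producer(CollectCallback)
}
```

In this sketch both `BorrowedProducer` and `MapProducer` carry lifetimes that exist only inside a `with_producer` call, yet no trait or impl header ever has to mention them.
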