|
36 | 36 | "cell_type": "markdown",
|
37 | 37 | "metadata": {},
|
38 | 38 | "source": [
|
39 |
| - "In chapter Distributed, we showed that executing a calculation (created using delayed) with the distributed executor is identical to any other executor. However, we now have access to additional functionality, and control over what data is held in memory.\n", |
| 39 | + "In the previous chapter, we showed that executing a calculation (created using delayed) with the distributed executor is identical to any other executor. However, we now have access to additional functionality, and control over what data is held in memory.\n", |
40 | 40 | "\n",
|
41 |
| - "To begin, the `futures` interface (derived from the built-in `concurrent.futures`) allow map-reduce like functionality. We can submit individual functions for evaluation with one set of inputs, or evaluated over a sequence of inputs with `submit()` and `map()`. Notice that the call returns immediately, giving one or more *futures*, whose status begins as \"pending\" and later becomes \"finished\". There is no blocking of the local Python session." |
| 41 | + "To begin, the `futures` interface (derived from the built-in `concurrent.futures`) allows map-reduce like functionality. We can submit individual functions for evaluation with one set of inputs, or evaluated over a sequence of inputs with `submit()` and `map()`. Notice that the call returns immediately, giving one or more *futures*, whose status begins as \"pending\" and later becomes \"finished\". There is no blocking of the local Python session." |
42 | 42 | ]
|
43 | 43 | },
|
44 | 44 | {
|
|
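Review note on the hunk above: the paragraph says the `futures` interface is derived from the built-in `concurrent.futures`, so the submit/map pattern can be sketched with the standard library alone. This is only an analogue under stated assumptions, not the notebook's own code: `inc` is a hypothetical task, and `ThreadPoolExecutor` stands in for the distributed client. One difference worth keeping in mind is that the standard-library `map()` yields results lazily rather than returning futures.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def inc(x):
    """Hypothetical slow task used only for illustration."""
    time.sleep(0.2)
    return x + 1


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a future immediately; the local session is not blocked.
    fut = pool.submit(inc, 1)
    finished_right_away = fut.done()  # typically False while the task sleeps

    # result() blocks until the task has finished.
    single = fut.result()

    # map() applies the function over a sequence of inputs.
    # In the standard library it yields results, not futures.
    mapped = list(pool.map(inc, range(4)))

print(finished_right_away, single, mapped)
```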
542 | 542 | "\n",
|
543 | 543 | "@delayed\n",
|
544 | 544 | "def summation(*a):\n",
|
545 |
| - " return sum(*a)\n", |
| 545 | + " return sum(a)\n", |
546 | 546 | "\n",
|
547 | 547 | "ina = [5, 25, 30]\n",
|
548 | 548 | "inb = [5, 5, 6]\n",
|
549 |
| - "out = summation([ratio(a, b) for (a, b) in zip(ina, inb)])\n", |
| 549 | + "out = summation(*[ratio(a, b) for (a, b) in zip(ina, inb)])\n", |
550 | 550 | "f = c.compute(out)\n",
|
551 | 551 | "f"
|
552 | 552 | ]
|
|
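Review note on the hunk above: the definition of `summation` and its call site change together, and the pairing matters. The sketch below uses plain Python without `@delayed` to show why. The names `summation_old` and `summation_new` and the sample values are assumptions for illustration only, since `ratio` is not shown in this diff.

```python
def summation_old(*a):
    # Old form: unpacks the argument tuple into sum(), so it only
    # works when called with a single iterable.
    return sum(*a)


def summation_new(*a):
    # New form: sums the argument tuple itself, so it expects the
    # values as separate positional arguments.
    return sum(a)


def raises_type_error(fn, *args):
    """Return True if calling fn(*args) raises TypeError."""
    try:
        fn(*args)
    except TypeError:
        return True
    return False


values = [1.0, 5.0, 5.0]  # hypothetical stand-ins for the ratio results

old_total = summation_old(values)    # old definition, old call style
new_total = summation_new(*values)   # new definition, new call style

# Mixing one side of the change with the other fails in both directions.
mixed_old = raises_type_error(summation_old, *values)
mixed_new = raises_type_error(summation_new, values)

print(old_total, new_total, mixed_old, mixed_new)
```

Both consistent pairings give the same total, so the change is about making the variadic signature mean what it says rather than about fixing a wrong result.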
586 | 586 | "source": [
|
587 | 587 | "ina = [5, 25, 30]\n",
|
588 | 588 | "inb = [5, 0, 6]\n",
|
589 |
| - "out = summation([ratio(a, b) for (a, b) in zip(ina, inb)])\n", |
| 589 | + "out = summation(*[ratio(a, b) for (a, b) in zip(ina, inb)])\n", |
590 | 590 | "f = c.compute(out)\n",
|
591 | 591 | "c.gather(f)"
|
592 | 592 | ]
|
|
634 | 634 | "metadata": {},
|
635 | 635 | "source": [
|
636 | 636 | "The trouble with this approach is that Dask is meant for the execution of large datasets/computations - you probably can't simply run the whole thing \n",
|
637 |
| - "in one local thread, else you wouldn't have used Dask in the first place. So the code above should only be used on a small part of the data that also exchibits the error. \n", |
| 637 | + "in one local thread, else you wouldn't have used Dask in the first place. So the code above should only be used on a small part of the data that also exhibits the error. \n", |
638 | 638 | "Furthermore, the method will not work when you are dealing with futures (such as `f`, above, or after persisting) instead of delayed-based computations.\n",
|
639 | 639 | "\n",
|
640 |
| - "As alternative, you can ask the scheduler to analyze your calculation and find the specific sub-task responsible for the error, and pull only it and its dependnecies locally for execution." |
| 640 | + "As an alternative, you can ask the scheduler to analyze your calculation and find the specific sub-task responsible for the error, and pull only it and its dependencies locally for execution." |
641 | 641 | ]
|
642 | 642 | },
|
643 | 643 | {
|
|