Commit 72064b1

Merge pull request #71 from dmatrix/br_jsd_add_good_bad_usage_tip_3_4
added good/bad usage comments
2 parents e05010e + 6325aa3 commit 72064b1

1 file changed

+102
-44
lines changed

ex_06_ray_api_calls.ipynb

Lines changed: 102 additions & 44 deletions
@@ -162,7 +162,7 @@
162162
"</div>\n"
163163
],
164164
"text/plain": [
165-
"RayContext(dashboard_url='127.0.0.1:8265', python_version='3.8.13', ray_version='2.2.0', ray_commit='b6af0887ee5f2e460202133791ad941a41f15beb', address_info={'node_ip_address': '127.0.0.1', 'raylet_ip_address': '127.0.0.1', 'redis_address': None, 'object_store_address': '/tmp/ray/session_2022-12-31_16-01-17_526291_25937/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2022-12-31_16-01-17_526291_25937/sockets/raylet', 'webui_url': '127.0.0.1:8265', 'session_dir': '/tmp/ray/session_2022-12-31_16-01-17_526291_25937', 'metrics_export_port': 62563, 'gcs_address': '127.0.0.1:62593', 'address': '127.0.0.1:62593', 'dashboard_agent_listen_port': 52365, 'node_id': '4e9cbd35b4e72abbf50b1b6201b666cb5ce50f1aab8c5753b21f2283'})"
165+
"RayContext(dashboard_url='127.0.0.1:8265', python_version='3.8.13', ray_version='2.2.0', ray_commit='b6af0887ee5f2e460202133791ad941a41f15beb', address_info={'node_ip_address': '127.0.0.1', 'raylet_ip_address': '127.0.0.1', 'redis_address': None, 'object_store_address': '/tmp/ray/session_2023-01-01_08-43-09_914236_61566/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2023-01-01_08-43-09_914236_61566/sockets/raylet', 'webui_url': '127.0.0.1:8265', 'session_dir': '/tmp/ray/session_2023-01-01_08-43-09_914236_61566', 'metrics_export_port': 61622, 'gcs_address': '127.0.0.1:54042', 'address': '127.0.0.1:54042', 'dashboard_agent_listen_port': 52365, 'node_id': 'b5a66792d6abaa014788485a35caaa837573e16d451ea93aef504f8d'})"
166166
]
167167
},
168168
"execution_count": 3,
@@ -187,7 +187,7 @@
187187
"cell_type": "markdown",
188188
"metadata": {},
189189
"source": [
190-
"## Fetching Cluster Information\n",
190+
"### Fetching Cluster Information\n",
191191
"\n",
192192
"Many methods return information:\n",
193193
"\n",
@@ -214,9 +214,9 @@
214214
"text": [
215215
"\n",
216216
"ray.get_gpu_ids(): []\n",
217-
"ray.nodes(): [{'NodeID': '4e9cbd35b4e72abbf50b1b6201b666cb5ce50f1aab8c5753b21f2283', 'Alive': True, 'NodeManagerAddress': '127.0.0.1', 'NodeManagerHostname': 'Juless-MacBook-Pro-16', 'NodeManagerPort': 60973, 'ObjectManagerPort': 60972, 'ObjectStoreSocketName': '/tmp/ray/session_2022-12-31_16-01-17_526291_25937/sockets/plasma_store', 'RayletSocketName': '/tmp/ray/session_2022-12-31_16-01-17_526291_25937/sockets/raylet', 'MetricsExportPort': 62563, 'NodeName': '127.0.0.1', 'alive': True, 'Resources': {'node:127.0.0.1': 1.0, 'CPU': 4.0, 'memory': 49407454413.0, 'object_store_memory': 2147483648.0}}]\n",
218-
"ray.cluster_resources(): {'CPU': 4.0, 'node:127.0.0.1': 1.0, 'memory': 49407454413.0, 'object_store_memory': 2147483648.0}\n",
219-
"ray.available_resources(): {'memory': 49407454413.0, 'object_store_memory': 2147483648.0, 'CPU': 4.0, 'node:127.0.0.1': 1.0}\n",
217+
"ray.nodes(): [{'NodeID': 'b5a66792d6abaa014788485a35caaa837573e16d451ea93aef504f8d', 'Alive': True, 'NodeManagerAddress': '127.0.0.1', 'NodeManagerHostname': 'Juless-MacBook-Pro-16', 'NodeManagerPort': 50600, 'ObjectManagerPort': 50599, 'ObjectStoreSocketName': '/tmp/ray/session_2023-01-01_08-43-09_914236_61566/sockets/plasma_store', 'RayletSocketName': '/tmp/ray/session_2023-01-01_08-43-09_914236_61566/sockets/raylet', 'MetricsExportPort': 61622, 'NodeName': '127.0.0.1', 'alive': True, 'Resources': {'node:127.0.0.1': 1.0, 'CPU': 4.0, 'object_store_memory': 2147483648.0, 'memory': 49037236634.0}}]\n",
218+
"ray.cluster_resources(): {'memory': 49037236634.0, 'object_store_memory': 2147483648.0, 'node:127.0.0.1': 1.0, 'CPU': 4.0}\n",
219+
"ray.available_resources(): {'memory': 49037236634.0, 'CPU': 4.0, 'object_store_memory': 2147483648.0, 'node:127.0.0.1': 1.0}\n",
220220
"\n"
221221
]
222222
}
@@ -283,7 +283,7 @@
283283
"cell_type": "markdown",
284284
"metadata": {},
285285
"source": [
286-
"### @ray.method()\n",
286+
"## @ray.method()\n",
287287
"\n",
288288
"Related to `@ray.remote()`, [@ray.method()](https://ray.readthedocs.io/en/latest/package-ref.html#ray.method) allows you to specify the number of return values for a method in a task or an actor, by passing the `num_returns` keyword argument. None of the other `@ray.remote()` keyword arguments are allowed. Here is an example:"
289289
]
@@ -297,7 +297,7 @@
297297
"name": "stdout",
298298
"output_type": "stream",
299299
"text": [
300-
"(LIONEL MESSIE, 5, 12.100000000000001)\n"
300+
"(LIONEL MESSIE, 8, 12.100000000000001)\n"
301301
]
302302
}
303303
],
@@ -333,7 +333,7 @@
333333
"name": "stdout",
334334
"output_type": "stream",
335335
"text": [
336-
"(LIONEL MESSIE, 9, 12.100000000000001)\n"
336+
"(LIONEL MESSIE, 5, 12.100000000000001)\n"
337337
]
338338
}
339339
],
@@ -369,7 +369,7 @@
369369
"name": "stdout",
370370
"output_type": "stream",
371371
"text": [
372-
"(LIONEL MESSIE, 10, 12.100000000000001)\n"
372+
"(LIONEL MESSIE, 7, 12.100000000000001)\n"
373373
]
374374
}
375375
],
@@ -394,7 +394,7 @@
394394
"cell_type": "markdown",
395395
"metadata": {},
396396
"source": [
397-
"# Tips and Tricks for first-time users\n",
397+
"## Tips and Tricks for first-time users\n",
398398
"Because Ray's core APIs are simple and flexible, first time users can trip upon certain API calls in Ray's usage patterns. This short tips & tricks will insure you against unexpected results. Below we briefly explore a handful of API calls and their best practices."
399399
]
400400
},
@@ -438,8 +438,8 @@
438438
"name": "stdout",
439439
"output_type": "stream",
440440
"text": [
441-
"CPU times: user 54.1 ms, sys: 24.9 ms, total: 79 ms\n",
442-
"Wall time: 5.1 s\n"
441+
"CPU times: user 45 ms, sys: 21.5 ms, total: 66.5 ms\n",
442+
"Wall time: 5.09 s\n"
443443
]
444444
},
445445
{
@@ -486,8 +486,8 @@
486486
"name": "stdout",
487487
"output_type": "stream",
488488
"text": [
489-
"CPU times: user 19.6 ms, sys: 11.4 ms, total: 31 ms\n",
490-
"Wall time: 2.31 s\n"
489+
"CPU times: user 15 ms, sys: 9.86 ms, total: 24.8 ms\n",
490+
"Wall time: 2.26 s\n"
491491
]
492492
},
493493
{
@@ -554,7 +554,7 @@
554554
"name": "stdout",
555555
"output_type": "stream",
556556
"text": [
557-
"CPU times: user 144 ms, sys: 195 ms, total: 339 ms\n",
557+
"CPU times: user 136 ms, sys: 175 ms, total: 311 ms\n",
558558
"Wall time: 12.9 s\n"
559559
]
560560
},
@@ -603,8 +603,8 @@
603603
"name": "stdout",
604604
"output_type": "stream",
605605
"text": [
606-
"CPU times: user 7.23 s, sys: 2.32 s, total: 9.55 s\n",
607-
"Wall time: 10.8 s\n"
606+
"CPU times: user 7.22 s, sys: 3.33 s, total: 10.5 s\n",
607+
"Wall time: 11.9 s\n"
608608
]
609609
},
610610
{
@@ -638,6 +638,14 @@
638638
"One way to mitigate is to make the remote tasks \"larger\" in order to amortize invocation overhead. This is achieved by aggregating tasks into bigger chunks of 1000.\n"
639639
]
640640
},
641+
{
642+
"cell_type": "markdown",
643+
"metadata": {},
644+
"source": [
645+
"#### Bad Usage\n",
646+
"Avoid small many tiny tasks as the overhead to scheduler may be slower than serial execution"
647+
]
648+
},
641649
{
642650
"cell_type": "code",
643651
"execution_count": 16,
@@ -658,8 +666,8 @@
658666
"name": "stdout",
659667
"output_type": "stream",
660668
"text": [
661-
"CPU times: user 204 ms, sys: 26.7 ms, total: 230 ms\n",
662-
"Wall time: 3.92 s\n"
669+
"CPU times: user 223 ms, sys: 33.3 ms, total: 257 ms\n",
670+
"Wall time: 3.95 s\n"
663671
]
664672
}
665673
],
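The batching step timed in this part of the notebook can be sketched independently of Ray. The snippet below is a minimal illustration using only the standard library; `chunked` and `process_chunk` are hypothetical names, not taken from the notebook. In a Ray program, each batch would presumably be handed to a single `@ray.remote` task rather than processed inline.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `items`, each at most `chunk_size` long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


def process_chunk(chunk: List[int]) -> int:
    """Hypothetical stand-in for the body of one larger task."""
    return sum(x * x for x in chunk)


work = list(range(10_000))

# Ten batches of 1000 items instead of 10,000 one-item tasks.
batches = list(chunked(work, 1000))

# Assumption: with Ray, each batch would be submitted as one remote task;
# here the batches are processed serially to keep the sketch stdlib-only.
total = sum(process_chunk(batch) for batch in batches)
```

The point of the design is that per-task invocation overhead is paid once per batch rather than once per item.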
@@ -674,9 +682,20 @@
674682
"cell_type": "markdown",
675683
"metadata": {},
676684
"source": [
677-
"A huge difference in execution time, almost **4X** faster!"
685+
"A huge difference in execution time, almost **4X** faster!\n",
686+
"\n",
687+
"#### Good Usage\n",
688+
"Break or restructure many small tasks into batches or chunks of large Ray remote tasks, as demonstrated above\n",
689+
"\n",
690+
"#### Takeway tip 2:\n",
691+
"Where possible strive to batch tiny smaller Ray tasks into chuncks to reap the benefits of distributing them."
678692
]
679693
},
694+
{
695+
"cell_type": "markdown",
696+
"metadata": {},
697+
"source": []
698+
},
680699
{
681700
"cell_type": "markdown",
682701
"metadata": {},
@@ -732,7 +751,7 @@
732751
},
733752
{
734753
"cell_type": "code",
735-
"execution_count": 31,
754+
"execution_count": 20,
736755
"metadata": {},
737756
"outputs": [],
738757
"source": [
@@ -780,16 +799,16 @@
780799
},
781800
{
782801
"cell_type": "code",
783-
"execution_count": 32,
802+
"execution_count": 21,
784803
"metadata": {},
785804
"outputs": [
786805
{
787806
"name": "stdout",
788807
"output_type": "stream",
789808
"text": [
790-
"Duration: 9.13 seconds and predictions: [0, 0, 1, 1, 2, 3]\n",
791-
"CPU times: user 55.2 ms, sys: 30.5 ms, total: 85.7 ms\n",
792-
"Wall time: 9.13 s\n"
809+
"Duration: 8.96 seconds and predictions: [0, 0, 1, 1, 2, 3]\n",
810+
"CPU times: user 61.2 ms, sys: 33.2 ms, total: 94.4 ms\n",
811+
"Wall time: 8.96 s\n"
793812
]
794813
}
795814
],
@@ -802,6 +821,15 @@
802821
"print(f\"Duration: {round(time.time() - start, 2)} seconds and predictions: {predictions}\")"
803822
]
804823
},
824+
{
825+
"cell_type": "markdown",
826+
"metadata": {},
827+
"source": [
828+
"#### Bad Usage\n",
829+
"Waiting for large number of tasks to finish using `ray.get()` on all of them before processing\n",
830+
"the results returned."
831+
]
832+
},
805833
{
806834
"cell_type": "markdown",
807835
"metadata": {},
@@ -811,16 +839,16 @@
811839
},
812840
{
813841
"cell_type": "code",
814-
"execution_count": 33,
842+
"execution_count": 22,
815843
"metadata": {},
816844
"outputs": [
817845
{
818846
"name": "stdout",
819847
"output_type": "stream",
820848
"text": [
821-
"Duration: 6.37 seconds and predictions: [0, 1, 3, 1, 0, 2]\n",
822-
"CPU times: user 40.9 ms, sys: 22.9 ms, total: 63.8 ms\n",
823-
"Wall time: 6.37 s\n"
849+
"Duration: 6.88 seconds and predictions: [0, 1, 0, 1, 2, 3]\n",
850+
"CPU times: user 50.5 ms, sys: 28.3 ms, total: 78.9 ms\n",
851+
"Wall time: 6.88 s\n"
824852
]
825853
}
826854
],
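The pipelining pattern exercised by this cell (handle each result as soon as its task finishes, rather than blocking on all of them) can be sketched without a Ray cluster. The example below swaps in the standard library's `concurrent.futures.wait` as a stand-in for `ray.wait()`; it is an analogy under that assumption, not Ray code. `transform` and `postprocess` are hypothetical helpers.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def transform(size: int) -> int:
    """Hypothetical task: sleep in proportion to `size`, then return it."""
    time.sleep(size / 1000)
    return size


def postprocess(value: int) -> int:
    """Hypothetical per-result processing step."""
    return value * 2


sizes = [300, 100, 200, 50]
processed = []

with ThreadPoolExecutor(max_workers=4) as pool:
    pending = {pool.submit(transform, s) for s in sizes}
    while pending:
        # Block only until at least one task finishes, much as ray.wait()
        # returns a list of ready references and a list of remaining ones.
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        for future in done:
            processed.append(postprocess(future.result()))
```

Results are post-processed while slower tasks are still running, so the order of `processed` follows completion order rather than submission order.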
@@ -846,6 +874,31 @@
846874
"**Notice**: You get some incremental difference. However, for compute intensive and many tasks, and overtime, this difference will be in order of magnitude."
847875
]
848876
},
877+
{
878+
"cell_type": "markdown",
879+
"metadata": {},
880+
"source": [
881+
"#### Good Usage:\n",
882+
"For large number of tasks in flight, use `ray.get()` and `ray.wait()` to implement pipeline execution of processing\n",
883+
"those tasks already finished. \n",
884+
"\n",
885+
"#### Takeaway Tip 3: \n",
886+
"Use pipeline execution to process results returned from the finished Ray tasks using `ray.get()` and `ray.wait()`"
887+
]
888+
},
889+
{
890+
"cell_type": "markdown",
891+
"metadata": {},
892+
"source": [
893+
"#### Exercise for **Tip 3**:\n",
894+
" * Extend or add more images of sizes: 1024, 2048, ...\n",
895+
" * Increase the number of returns to 2 from the `ray.wait`()`\n",
896+
" * Process the images\n",
897+
" \n",
898+
" \n",
899+
" Is there a difference in processing time between serial and pipelining?"
900+
]
901+
},
849902
{
850903
"cell_type": "markdown",
851904
"metadata": {},
@@ -875,7 +928,7 @@
875928
"name": "stdout",
876929
"output_type": "stream",
877930
"text": [
878-
" results = 125005622.08 and duration = 0.703 sec\n"
931+
" results = 124995931.27 and duration = 0.729 sec\n"
879932
]
880933
}
881934
],
@@ -889,6 +942,15 @@
889942
"print(f\" results = {results:.2f} and duration = {time.time() - start:.3f} sec\")"
890943
]
891944
},
945+
{
946+
"cell_type": "markdown",
947+
"metadata": {},
948+
"source": [
949+
"#### Bad Usage\n",
950+
"Avoid sending the same large objects to Ray remote tasks. This creates multiple copies of the same\n",
951+
"object in the Ray distributed object store. Storing and fetching and copying identical object can degrade performance overtime."
952+
]
953+
},
892954
{
893955
"cell_type": "markdown",
894956
"metadata": {},
@@ -906,7 +968,7 @@
906968
"name": "stdout",
907969
"output_type": "stream",
908970
"text": [
909-
" results = 124977503.45 and duration = 0.330 sec\n"
971+
" results = 124998578.35 and duration = 0.418 sec\n"
910972
]
911973
}
912974
],
@@ -924,20 +986,16 @@
924986
"cell_type": "markdown",
925987
"metadata": {},
926988
"source": [
927-
"### Exercise\n",
989+
"#### Good Usage\n",
990+
"Place or insert the large object store into Ray's remote object store and only send the object Ref to the Ray remote task.\n",
928991
"\n",
929-
"For **Tip 3**:\n",
930-
" * Extend or add more images of sizes: 1024, 2048, ...\n",
931-
" * Increase the number of returns to 2 from the `ray.wait`()`\n",
932-
" * Process the images\n",
933-
" \n",
934-
" \n",
935-
" Is there a difference in processing time between serial and pipelining?"
992+
"#### Takeaway Tip 4:\n",
993+
"Avoid sending the same large object to a Ray remote tasks. Instead, put it into the object store and only send the object ref."
936994
]
937995
},
938996
{
939997
"cell_type": "code",
940-
"execution_count": null,
998+
"execution_count": 26,
941999
"metadata": {},
9421000
"outputs": [],
9431001
"source": [
@@ -950,13 +1008,13 @@
9501008
"source": [
9511009
"### Summary\n",
9521010
"\n",
953-
"In this short tutorial, we got a short glimpse at the Ray Core APIs. By no means it was comprehensive, but we touched on some methods we \n",
954-
"have seen in the previous lessons; however, here with those methods, we explored additional arguments to the `.remote()` call such as number of return\n",
1011+
"In this short tutorial, we got a short glimpse at the Ray Core APIs. By no means it was comprehensive, but we touched upon some methods we \n",
1012+
"have seen in the previous lessons. With those methods, we explored additional arguments to the `.remote()` call such as number of return\n",
9551013
"statements as well as how to supply runtime environments and dependencies for your Ray cluster during `ray.init()` call. Note that some arguments to `ray.init()` \n",
956-
"can also be supplied to `ray.remote()` decorator, such as num_cpus, num_gpus, runtime_env, etc. \n",
1014+
"can also be supplied to `ray.remote()` decorator, such as `num_cpus`, `num_gpus`, `runtime_env`, etc. \n",
9571015
"\n",
9581016
"More importantly, we walked through some tips and tricks that many developers new to Ray can easily stumble upon. Although the examples were short and simple,\n",
959-
"the idea and cautionary tales are important part of the learning process."
1017+
"the lessons behind the cautionary tales are important part of the learning process."
9601018
]
9611019
},
9621020
{
