I have this script, dask_mpi_test.py:
from dask_mpi import initialize
initialize()  # must run first: rank 0 becomes the scheduler, ranks 2+ become workers

from distributed import Client
import dask
client = Client()  # rank 1 runs this client code and connects to the scheduler
df = dask.datasets.timeseries()
print(df.groupby(['time', 'name']).mean().compute())
print(client)
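For context, dask_mpi.initialize() splits MPI_COMM_WORLD: rank 0 runs the scheduler, rank 1 continues past initialize() and executes the client code above, and the remaining ranks become workers (so -np 4 gives two workers, matching the log below). On HPC systems it can also matter which network interface the scheduler binds to; here is a minimal sketch of pinning it, assuming your dask-mpi version supports the interface keyword and that the interconnect is named ib0 (a placeholder, check with ip addr):

from dask_mpi import initialize

# 'ib0' is an assumed interface name for illustration only; replace it
# with the actual high-speed interface on your cluster.
initialize(interface='ib0', nthreads=1)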
When I try to run this script with:
mpirun -np 4 python dask_mpi_test.py
I get these errors:
~/workdir $ mpirun -np 4 python dask_mpi_test.py
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO - Scheduler at: tcp://xxxxxx:8786
distributed.scheduler - INFO - bokeh at: :8787
distributed.worker - INFO - Start worker at: tcp://xxxxx:44712
/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/distributed/bokeh/core.py:57: UserWarning:
Port 8789 is already in use.
Perhaps you already have a cluster running?
Hosting the diagnostics dashboard on a random port instead.
warnings.warn('\n' + msg)
distributed.worker - INFO - Start worker at: tcp://xxxxxx:36782
distributed.worker - INFO - Listening to: tcp://:44712
distributed.worker - INFO - bokeh at: :8789
distributed.worker - INFO - Listening to: tcp://:36782
distributed.worker - INFO - Waiting to connect to: tcp://xxxxxx:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - bokeh at: :43876
distributed.worker - INFO - Waiting to connect to: tcp://xxxxx:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Memory: 3.76 GB
distributed.worker - INFO - Memory: 3.76 GB
distributed.worker - INFO - Local Directory: /gpfs/fs1/scratch/abanihi/worker-uoz0vtci
distributed.worker - INFO - Local Directory: /gpfs/fs1/scratch/abanihi/worker-bb0u_737
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - -------------------------------------------------
Traceback (most recent call last):
  File "dask_mpi_test.py", line 6, in <module>
    client = Client()
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/distributed/client.py", line 640, in __init__
    self.start(timeout=timeout)
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/distributed/client.py", line 763, in start
    sync(self.loop, self._start, **kwargs)
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/distributed/utils.py", line 321, in sync
    six.reraise(*error[0])
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/six.py", line 693, in reraise
    raise value
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/distributed/utils.py", line 306, in f
    result[0] = yield future
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/distributed/client.py", line 851, in _start
    yield self._ensure_connected(timeout=timeout)
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/distributed/client.py", line 892, in _ensure_connected
    self._update_scheduler_info())
  File "/glade/work/abanihi/softwares/miniconda3/envs/analysis/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
tornado.util.Timeout
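The client on rank 1 times out while connecting to the scheduler on rank 0, and the workers never log anything past "Waiting to connect", which points at the ranks not reaching each other over TCP rather than at dask itself. A quick sanity check, assuming mpi4py (a dask-mpi dependency) is importable, is to confirm the MPI launch itself is healthy; a minimal sketch, saved as a hypothetical mpi_check.py:

from mpi4py import MPI

# Expect one line per rank; with -np 4 you should see ranks 0 through 3.
comm = MPI.COMM_WORLD
print("rank %d of %d on %s" % (comm.Get_rank(), comm.Get_size(), MPI.Get_processor_name()))

Run it the same way: mpirun -np 4 python mpi_check.py.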
$ conda list dask
# packages in environment at /glade/work/abanihi/softwares/miniconda3/envs/analysis:
#
# Name                    Version                   Build    Channel
dask                      1.2.0                     py_0     conda-forge
dask-core                 1.2.0                     py_0     conda-forge
dask-jobqueue             0.4.1+28.g5826abe         pypi_0   pypi
dask-labextension         0.3.3                     pypi_0   pypi
dask-mpi                  1.0.2                     py37_0   conda-forge
$ conda list tornado
# packages in environment at /glade/work/abanihi/softwares/miniconda3/envs/analysis:
#
# Name                    Version                   Build               Channel
tornado                   5.1.1                     py37h14c3975_1000   conda-forge
$ conda list distributed
# packages in environment at /glade/work/abanihi/softwares/miniconda3/envs/analysis:
#
# Name                    Version                   Build    Channel
distributed               1.27.0                    py37_0   conda-forge
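Before bisecting package versions, it may also be worth ruling out a merely slow scheduler startup by lengthening the client's connect timeout (the default is on the order of 10 seconds). A minimal sketch using options that exist in distributed 1.27; the 60-second value is arbitrary and purely illustrative:

import dask
from distributed import Client

# Raise the connect timeout before creating the Client, so a slow-to-bind
# scheduler does not surface as tornado.util.Timeout.
dask.config.set({'distributed.comm.timeouts.connect': '60s'})
client = Client(timeout=60)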
Is anyone aware of a change in a recent update to dask or distributed that could have caused dask-mpi to break?

Cc'ing @kmpaul