`Enable distributed deep learning using Intel® Optimization for Horovod and TensorFlow*` Sample

The `Enable distributed deep learning using Intel® Optimization for Horovod and TensorFlow*` sample shows how to run inference and training workloads across multiple cards using Intel® Optimization for Horovod and TensorFlow* on Intel® discrete GPUs (dGPUs).

Area	Description
What you will learn	How to enable distributed deep learning using Intel® Optimization for Horovod and TensorFlow*
Time to complete	10 minutes
Category	Code Optimization

Purpose

By implementing an end-to-end deep learning example, this sample demonstrates an important concept:

The performance benefits of distributing a deep learning workload among multiple dGPUs
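To make the idea of distributing a workload concrete, the following is a minimal conceptual sketch in plain Python. It makes no Horovod or TensorFlow calls; it only simulates two ideas that data-parallel frameworks rely on: each worker processes its own shard of the inputs, and per-worker results (such as gradients) are averaged across workers. The function names `shard` and `allreduce_mean`, the worker count, and the numbers are illustrative assumptions, not part of this sample.

```python
# Conceptual simulation only: no Horovod or TensorFlow calls are made here.

def shard(items, rank, size):
    """Return the slice of items that worker `rank` out of `size` would process."""
    return items[rank::size]


def allreduce_mean(per_worker_values):
    """Average element-wise across workers, as an allreduce-average would."""
    n_workers = len(per_worker_values)
    return [sum(column) / n_workers for column in zip(*per_worker_values)]


samples = list(range(8))   # stand-in for a dataset of 8 inputs
num_workers = 2            # hypothetical: one worker per GPU card

# Each worker sees a disjoint part of the data.
shards = [shard(samples, rank, num_workers) for rank in range(num_workers)]
print(shards)

# Pretend each worker computed a 2-element gradient on its own shard.
local_gradients = [[1.0, 4.0], [3.0, 8.0]]
averaged = allreduce_mean(local_gradients)
print(averaged)
```

With two workers, each one handles half of the inputs, which is where the potential speedup comes from; the averaging step is what keeps the workers consistent during training.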

Prerequisites

Optimized for	Description
OS	Linux; Ubuntu* 18.04 or newer
Hardware	Intel® Data Center GPU Max/Flex Series
Software	AI Tools

For Local Development Environments

You will need to download and install the following toolkits, tools, and components to use the sample.

AI Tools

You can get the AI Tools from Intel® oneAPI Toolkits.
See Get Started with the AI Tools for Linux* for AI Tools installation information and post-installation steps and scripts.
Jupyter Notebook

Install using pip: `pip install notebook`.
Alternatively, see Installing Jupyter for detailed installation instructions.

For Intel® Developer Cloud (Beta)

All necessary tools and components are already installed in the environment, except for the intel-optimization-for-horovod package. See Intel® Developer Cloud for oneAPI for more information.

Key Implementation Details

Jupyter Notebook

Notebook	Description
`tensorflow_distributed_inference_with_horovod.ipynb`	Enabling Multi-Card Inference/Training with Intel® Optimizations for Horovod
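
Horovod workloads are commonly started with the `horovodrun` launcher, which runs one process per card. Whether the notebook launches its workload this way is not stated here, so treat the following as a sketch under that assumption. The helper `horovodrun_command` and the script name `inference.py` are hypothetical; the helper only builds and prints the command and does not execute it.

```python
import shlex


def horovodrun_command(num_processes, script, python="python"):
    """Build a `horovodrun` argument list that starts `num_processes` processes."""
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")
    return ["horovodrun", "-np", str(num_processes), python, script]


# Hypothetical script name; substitute a script from this sample.
command = horovodrun_command(2, "inference.py")

# Print the command as it would be typed in a shell.
print(shlex.join(command))
```

In an environment where `horovodrun` is installed, the resulting list could be passed to `subprocess.run(command, check=True)`.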

Run the distributed inference sample using Intel® Optimization for Horovod and TensorFlow*:

On Linux*

Set up the oneAPI environment by running the setvars.sh script.

Default installation: `source /opt/intel/oneapi/setvars.sh`

Any other installation location: `source /path/to/oneapi/setvars.sh`

Set up conda environment.

conda create --name tensorflow_xpu --clone tensorflow-gpu
conda activate tensorflow_xpu

Install dependencies. If you haven't already done so, install Jupyter Notebook and Intel® Optimization for Horovod:
```
pip install intel-optimization-for-horovod
```
```
pip install notebook
```
Launch Jupyter Notebook.
```
jupyter notebook --ip=0.0.0.0
```
Follow the instructions to open the URL with the token in your browser.

Locate and select the Notebook.

tensorflow_distributed_inference_with_horovod.ipynb

Change your Jupyter Notebook kernel to tensorflow_xpu.
Run every cell in the Notebook in sequence.
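
If a cell fails on an import, a quick pre-flight check run in the selected kernel can help narrow down the cause. The sketch below uses only the Python standard library. It rests on two assumptions: that the packages installed in the steps above are importable as `tensorflow`, `horovod`, and `notebook`, and that sourcing setvars.sh exports an `ONEAPI_ROOT` environment variable. The helper name `missing_modules` is illustrative.

```python
import importlib.util
import os


def missing_modules(names):
    """Return the module names that cannot be found in this Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages this README installs.
required = ["tensorflow", "horovod", "notebook"]

# Assumption: setvars.sh exports ONEAPI_ROOT; its absence suggests the
# script was not sourced in the shell that started Jupyter.
oneapi_ready = "ONEAPI_ROOT" in os.environ

print("oneAPI environment detected:", oneapi_ready)
print("missing modules:", missing_modules(required))
```

An empty list of missing modules only shows that the packages can be located; it does not confirm that a GPU is visible to TensorFlow.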

Run the Sample on Intel® Developer Cloud (Optional)

If you do not already have an account, request one by following the instructions in Setup an Intel® Developer Cloud Account.
On a Linux* system, open a terminal.
SSH into Intel® Developer Cloud.
```
ssh idc
```
Run the oneAPI setvars script: `source /opt/intel/oneapi/setvars.sh`
Activate the prepared tensorflow_xpu environment.
```
conda activate tensorflow_xpu
```

Install Intel® Optimization for Horovod:

pip install intel-optimization-for-horovod

Follow the instructions here to launch a Jupyter Notebook on the Intel® Developer Cloud.

Locate and select the Notebook.

tensorflow_distributed_inference_with_horovod.ipynb

Change the kernel to tensorflow_xpu.
Run every cell in the Notebook in sequence.

Troubleshooting

If you receive an error message, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the Diagnostics Utility for Intel® oneAPI Toolkits User Guide for more information on using the utility.

License

Code samples are licensed under the MIT license. See License.txt for details.

Third-party program licenses can be found in third-party-programs.txt.
