
# tags: `AMD` `MLCommons` `ROCm` `benchmark`

In this guide, we try to run MLCommons benchmarks on AMD hardware using ROCm and document the problems and results that occur in the process.

To enable support for ROCm docker containers, you need the ROCm drivers installed on your machine. Follow the official ROCm installation guidelines for your operating system.

There are several docker images available with ROCm libraries already installed. These include basic OS images (Ubuntu, CentOS) and the more useful images with support for popular DL frameworks, like TensorFlow (v1 and v2) and PyTorch. More information about installing DL frameworks with ROCm support can be found in the ROCm documentation.

Make sure the ROCm drivers are available from inside the container, in the same way as described by the ROCm installation tutorial. The check is supposed to give you some information about the available AMD GPU.

The general strategy for porting the MLCommons benchmarks to use AMD GPUs via ROCm is to:

1. Locate the Dockerfile in each benchmark
2. Substitute the base image with the one from ROCm that supports the appropriate DL framework
3. Run the container and try to launch the training process

The image classification benchmark was one of the first, and doesn't seem to have been updated since ~2019. The Readme files are outdated, with several broken links regarding the base model used and the preparation of the dataset (Imagenet2014). Nevertheless, even without the Imagenet dataset, we can verify that an AMD GPU can be used for this benchmark.

Go to `image_classification/tensorflow/` and inspect the Dockerfile there. Change the base image from `nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04` to `rocm/tensorflow:rocm4.1-tf1.15-dev`, remove the installation of `tf-nightly-gpu`, since TensorFlow is already provided with the ROCm image, and add `mlperf-compliance` instead.

Next, build the image and run the container with

```
$ sudo docker run -it --network=host --device=/dev/kfd \
  --device=/dev/dri --ipc=host --shm-size 16G \
  --group-add video --cap-add=SYS_PTRACE --rm \
```

In the container, you can directly run the `./run_and_time.sh` script. Even without the actual dataset present, the TensorFlow log messages show that your AMD GPU is recognized.

**AMD ROCm support can be enabled, but the benchmark itself is troublesome to run with a large dataset and outdated documentation and framework used**

We had more success training the Resnet50 model from the official TensorFlow repository.

In the `official/vision/image_classification` directory of the repo, you can find the Resnet training script `classifier_trainer.py`, which will be used to start the training process after some modifications to enable AMD ROCm support. There are 3 issues that need to be addressed in order to utilize an AMD GPU:

1. The data channel order has to be forced to `data_format = 'channels_last'` at lines #234-235, outside of the if/else clause.
2. The number of GPUs doesn't propagate correctly from the entry flags to the actual config. You need to add a line `'num_gpus':flags_obj.num_gpus` to the `flag_overrides` dictionary on line #153.
3. The default batch size of 128 may be too large (which was the case for a single AMD Vega 64 with 8 GB), so the size needs to be set manually in `flag_overrides` with `'batch_size':32` (or another number according to your card).

These changes were committed to a fork of the official repo.

In order to train Resnet50 with AMD ROCm support, launch the `rocm/tensorflow:rocm4.1.1-tf2.4-dev` docker image, clone the repository, and export the python path with `$ export PYTHONPATH="$PYTHONPATH://models"`. Then you can start training with 1 GPU by passing `--params_override="runtime.distribution_strategy=one_device"` to `classifier_trainer.py`.

The Mask R-CNN benchmark is based on the PyTorch framework and uses the Mask R-CNN model created and published by Facebook. Unfortunately, in order to run the benchmark, you need to set up and build the Mask R-CNN model as a python package, which requires CUDA to be installed. A possible solution is to dig into the Mask R-CNN code and find out how to change this dependency, but that is beyond the scope of a simple conversion.

**AMD ROCm support can't be enabled at this time, since the base model installation process requires CUDA**
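
The Dockerfile change described in this guide for the image classification benchmark (swap the CUDA base image for the ROCm one, drop `tf-nightly-gpu`, add `mlperf-compliance`) can also be scripted. The following is only a minimal sketch: the helper name `port_dockerfile` and the sample Dockerfile fragment are hypothetical, and a real Dockerfile may need manual edits beyond these two substitutions.

```python
# Image name taken from this guide.
ROCM_IMAGE = "rocm/tensorflow:rocm4.1-tf1.15-dev"


def port_dockerfile(text, rocm_image=ROCM_IMAGE):
    """Return Dockerfile text with the base image and the pip package swapped.

    Hypothetical helper: it rewrites every FROM line and replaces
    `tf-nightly-gpu` with `mlperf-compliance`, nothing more.
    """
    ported = []
    for line in text.splitlines():
        if line.strip().upper().startswith("FROM "):
            # Replace the CUDA base image with the ROCm TensorFlow image.
            line = f"FROM {rocm_image}"
        # TensorFlow already ships with the ROCm image, so install
        # mlperf-compliance in place of tf-nightly-gpu.
        line = line.replace("tf-nightly-gpu", "mlperf-compliance")
        ported.append(line)
    return "\n".join(ported) + "\n"


# Hypothetical fragment, not the benchmark's actual Dockerfile.
SAMPLE = """FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04
RUN pip install tf-nightly-gpu
"""

if __name__ == "__main__":
    print(port_dockerfile(SAMPLE))
```

Reviewing the result by hand is still advisable, since other CUDA-specific steps in a Dockerfile are not touched by this sketch.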

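The three changes to `classifier_trainer.py` described in this guide can be illustrated with a small standalone sketch. It does not reproduce the repository code: the function names and the nesting of the dictionary (`runtime`, `train_dataset`, `validation_dataset`) are assumptions made for illustration, and only the keys `'num_gpus'` and `'batch_size'` and the value `'channels_last'` come from the guide.

```python
from types import SimpleNamespace


def pick_data_format():
    """Fix 1: force the channel order instead of choosing it in an if/else."""
    return 'channels_last'


def build_flag_overrides(flags_obj, batch_size=32):
    """Build an override dict in the spirit of `flag_overrides`.

    The nesting used here is an assumption, not the repo's exact layout.
    """
    return {
        'runtime': {
            # Fix 2: propagate the GPU count from the entry flags.
            'num_gpus': flags_obj.num_gpus,
        },
        # Fix 3: lower the batch size so it fits into GPU memory.
        'train_dataset': {'batch_size': batch_size},
        'validation_dataset': {'batch_size': batch_size},
    }


if __name__ == "__main__":
    # Stand-in for the parsed command-line flags object.
    flags = SimpleNamespace(num_gpus=1)
    print(pick_data_format())
    print(build_flag_overrides(flags, batch_size=32))
```

A batch size of 32 worked for the 8 GB card mentioned in the guide; other cards may allow a larger value.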