#
# INSTALLATION FROM LOCAL REPOSITORY
#

The cinnamon installation package is provided as an unsigned local repository
and contains the following packages:

  md5sum                            package name
  d6c997de4516026714df97f1cf5baecc  atmi-0.3.7-51-gb4f479d-Linux.rpm
  b89ac596e5ec4d590e9c007ffa59a03c  hcc-1.2.18194-Linux.rpm
  52a6f23372b8215c780d36ae1914cc14  hip_base-1.5.18231.rpm
  d2ed0ec14bef6e74f30303445396b0ef  hipblas-0.11.3.1-Linux.rpm
  09155c3e7ad0dd88111e7bfe06011c8d  hip_doc-1.5.18231.rpm
  8ccd3b9fd20ef445febabe7499685fc6  hip_hcc-1.5.18231.rpm
  ea4c93b834ae4fa63217234db726512a  hip_nvcc-1.5.18231.rpm
  90486696287586c01712c521ac5ffafb  hip_samples-1.5.18231.rpm
  de101052aba29bd37f21b85709e8798b  hsa-amd-aqlprofile-1.0.0-Linux.rpm
  d458ba20671c4588d23dd15b0e67b7c7  hsa-ext-rocr-dev-1.1.8-13-g2b0acc2-Linux.rpm
  64aa3a2822e6159ee57c6161bff22309  hsakmt-roct-1.0.7-32-g6e77f65-Linux.rpm
  975b1bde5a0c14af4c37d3060f658b94  hsakmt-roct-dev-1.0.7-32-g6e77f65-Linux.rpm
  ecd40427f4f4facc0f8193fc15a0741b  hsa-rocr-dev-1.1.8-13-g2b0acc2-Linux.rpm
  3e78a4a56d19bea9a0d805b26c15266b  llvm-amdgpu-3.9.dev-1.x86_64.rpm
  069203fd2b2c3c0103ecc25f0e026976  rocblas-0.13.3.5-Linux.rpm
  5b4671763a87c4298a4e39d471c08baf  rock-dkms-1.8-7.el7.noarch.rpm
  3f8198772c97d48fec27327a1857b652  rocm_bandwidth_test-1.0.0-Linux.rpm
  bb7f9e62857c823ac65d0f9aff107c53  rocm-clang-ocl-0.3.0-c1b678e-Linux.rpm
  b18d67ff48305e45c24f8b501655f322  rocm-cmake-0.2.0-5e74c90-Linux.rpm
  917a07e12cd041f8b9c7f51e049b7fd2  rocm-dev-0.0.cinnamon-D9-Linux.rpm
  b33d2235ed259f75f8f69b8910a12cce  rocm-device-libs-0.0.1-Linux.rpm
  f0ec5d91773badf89da2ac59479c78ea  rocm-dkms-0.0.cinnamon-D9-Linux.rpm
  988d8261879699853c4e8dda96dbed4f  rocminfo-1.0.0-Linux.rpm
  4d1e132e20991f28b229b2166b49e0b5  rocm-libs-0.0.cinnamon-D9-Linux.rpm
  b8c7a2933b952ce3ad33e0870713bc8b  rocm-opencl-1.2.0-2018061153.x86_64.rpm
  c3e0ee6954cb08c9e849fe656a32e256  rocm-opencl-devel-1.2.0-2018061153.x86_64.rpm
  4219068e01c598e01b814bb755b8c911  rocm-smi-1.0.0_45_g1389662-1.x86_64.rpm
  3c6186207f4594a63910107642e1a11f  rocm-utils-0.0.cinnamon-D9-Linux.rpm
  e6b1a86ff71507e42103048f1c95ebfb  rocprofiler-dev-1.0.0-Linux.rpm

The yum_cinnamon-d9.tar.bz2 archive has the following hash:

  md5sum                            package name
  876de364bf8020c48ce4fb2dabc14ba7  yum_cinnamon-d9.tar.bz2

#
# The devtoolset 7 environment supporting ROCm on CentOS 7.4 and RHEL 7.4
#

Support for ROCm on CentOS/RHEL 7 requires a special runtime environment and
additional dkms support packages to properly install and run. The native
install of CentOS/RHEL 7 uses a 4.8.x version of gcc. Many of the components
used in ROCm require a newer gcc than this. On CentOS/RHEL 7 this means
additional support is needed both for compiling the packages and at runtime.
This support is provided by the devtoolset environment from Software
Collections.

#
# Preparing RHEL 7 for installation
#

RHEL is a subscription-based operating system, and several external
repositories must be enabled before the devtoolset 7 environment and the DKMS
support files can be installed. These steps are not required for CentOS.

  1. The subscription for RHEL must be enabled and attached to a pool id.
     Please see the "Obtaining an RHEL image and license" page for
     instructions on registering your system with the RHEL subscription
     server and attaching to a pool id.

  2. Enable the following repositories:

     $ sudo subscription-manager repos --enable rhel-7-server-rhscl-rpms
     $ sudo subscription-manager repos --enable rhel-7-server-optional-rpms
     $ sudo subscription-manager repos --enable rhel-7-server-extras-rpms

  3.
     Enable additional repositories by downloading and installing the
     epel-release-latest-7 repository RPM:

     $ sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

#
# Instructions on the install and setup of devtoolset 7
#

To set up the devtoolset-7 environment, follow the instructions on this page:

  https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/

#
# Preparing CentOS/RHEL 7 for DKMS Install
#

Installing the kernel drivers on CentOS/RHEL requires dkms support to be
installed:

  $ sudo yum install -y epel-release
  $ sudo yum install -y dkms kernel-headers

#
# Enable CMA support
#

Enable driver support for contiguous memory access:

  $ echo "options amdkfd cma_enable=1" | sudo tee -a /etc/modprobe.d/amd.conf

At this point the system can install ROCm using the DKMS drivers.

#
# Installing ROCm on the system
#

At this point ROCm can be installed on the target system. Start by downloading
and extracting the cinnamon d3 archive:

  $ cd /tmp && wget http://repo.radeon.com/rocm/archive/cinnamon/centos/yum_cinnamon-d3.tar.bz2 && tar -xvf yum_cinnamon-d3.tar.bz2

Then, create a /etc/yum.repos.d/rocm.repo file with the following contents:

  [ROCm]
  name=ROCm
  baseurl=file:///tmp/yum_cinnamon-d3/
  enabled=1
  gpgcheck=0

The repo's baseurl should point to the directory that holds the repository's
repodata database.

Install ROCm components using these commands:

  $ sudo yum install rocm-dkms
  $ sudo yum install rocblas

After installation, the system must be rebooted. After rebooting, the amdgpu
and amdkfd modules should be loaded and the /dev/kfd device should be
available.

#
# Compiling applications using hcc, hip, etc.
#

To compile applications or samples, please use gcc-7.2 provided by the
devtoolset 7 environment. To do this, compile all applications after running
this command:

  $ scl enable devtoolset-7 bash

See the devtoolset documentation on the Software Collections website for more
information.
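
The archive check and the repo-file step described above can be sketched as
two small shell helpers. This is only an illustration: the function names
verify_md5 and write_rocm_repo are hypothetical and are not part of ROCm, and
the sketch assumes the extracted archive directory contains a repodata/
subdirectory, as the baseurl note above suggests.

```shell
# Sketch only: helper names are hypothetical, not part of ROCm.

# verify_md5 ARCHIVE EXPECTED_MD5
# Succeeds when the md5sum of ARCHIVE equals EXPECTED_MD5.
verify_md5() {
    local archive="$1" expected="$2" actual
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 2; }
    actual="$(md5sum "$archive" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $archive"
    else
        echo "MISMATCH: $archive (got $actual, expected $expected)" >&2
        return 1
    fi
}

# write_rocm_repo REPO_DIR OUT_FILE
# Writes a yum .repo file whose baseurl points at REPO_DIR.
# REPO_DIR must be an absolute path; the function refuses to write
# when REPO_DIR has no repodata/ directory (assumed layout).
write_rocm_repo() {
    local repo_dir="$1" out_file="$2"
    [ -d "$repo_dir/repodata" ] || {
        echo "no repodata/ under $repo_dir" >&2
        return 1
    }
    cat > "$out_file" <<EOF
[ROCm]
name=ROCm
baseurl=file://$repo_dir/
enabled=1
gpgcheck=0
EOF
}
```

A possible use, writing the repo file to a scratch location first and then
copying it into place:

  $ verify_md5 /tmp/yum_cinnamon-d9.tar.bz2 876de364bf8020c48ce4fb2dabc14ba7
  $ write_rocm_repo /tmp/yum_cinnamon-d3 /tmp/rocm.repo
  $ sudo cp /tmp/rocm.repo /etc/yum.repos.d/rocm.repo

Note that the hash listed in this document is for yum_cinnamon-d9.tar.bz2, so
it should not be expected to match a different archive such as the d3 one.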
#
# VERIFYING INSTALLATION
#

After rebooting, the installation can be verified using the
/opt/rocm/bin/rocminfo command. Run this command and make sure that the gfx906
device is detected and available.

  $ /opt/rocm/bin/rocminfo
  =====================
  HSA System Attributes
  =====================
  Runtime Version:         1.1
  System Timestamp Freq.:  1000.000000MHz
  Sig. Max Wait Duration:  18446744073709551615 (number of timestamp)
  Machine Model:           LARGE
  System Endianness:       LITTLE

  ==========
  HSA Agents
  ==========
  *******
  Agent 1
  *******
    Name:                    Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
    Vendor Name:             CPU
    Feature:                 None specified
    Profile:                 FULL_PROFILE
    Float Round Mode:        NEAR
    Max Queue Number:        0
    Queue Min Size:          0
    Queue Max Size:          0
    Queue Type:              MULTI
    Node:                    0
    Device Type:             CPU
  ......
  *******
  Agent 2
  *******
    Name:                    gfx906
    Vendor Name:             AMD
    Feature:                 KERNEL_DISPATCH
    Profile:                 BASE_PROFILE
    Float Round Mode:        NEAR
    Max Queue Number:        128
    Queue Min Size:          4096
    Queue Max Size:          131072
    Queue Type:              MULTI
    Node:                    1
    Device Type:             GPU
  ......
  *** Done ***

#
# ROCM BANDWIDTH TEST
#

Source code for the rocm_bandwidth_test is available here:

  https://github.com/RadeonOpenCompute/rocm_bandwidth_test

However, the test has been included in the cinnamon repository and can be
installed using the following command:

  $ sudo yum install rocm_bandwidth_test

The test can be used to measure bandwidth between GPU peers and the host
system:

  $ /opt/rocm/bin/rocm_bandwidth_test
  ......
  ....
  RocmBandwidthTest Version: 1.0.0

  Device: 0, Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
  Device: 1, Device 67df

  Device Access

  D/D       0           1
  0         1           1
  1         1           1

  Device Numa Distance

  D/D       0           1
  0         0           N/A
  1         20          0

  Unidirectional peak bandwidth GB/s

  D/D       0           1
  0         N/A         5.586775
  1         5.685909    177.160329

  Bdirectional peak bandwidth GB/s

  D/D       0           1
  0         N/A         8.263104
  1         8.271742    N/A

#
# ROCBLAS TESTING
#

There are 2 ways to use rocBLAS.

  1. The rocBLAS and/or hipBLAS library packages are available and can be
     installed by the user.
     A user's application that needs gemm or other BLAS functionality can
     then link with the rocBLAS library and use it.

  2. rocBLAS can be built from public source code on GitHub. In addition to
     the library, it comes with samples, tests and benchmark programs.

The following are instructions for doing #2 above.

Please make sure the compilers can be found. You may have to append to PATH,
like this:

  $ export PATH=$PATH:/opt/rocm/bin:/opt/rocm/hcc/bin:/opt/rocm/hip/bin

Assuming you have all D1 Debian packages available to you for Ubuntu, make
sure to follow the other instructions to install all the base packages and
HCC & HIP.

If you want to build everything yourself so that you can run rocblas-test or
rocblas-bench on the Vega20 device, here are the steps:

  $ git clone -b develop-gfx9 https://github.com/ROCmSoftwarePlatform/rocBLAS.git rocBLAS-gfx9
  $ cd rocBLAS-gfx9
  $ ./install.sh -dc 2>&1 | tee make.out
  #
  # If everything looks good, the test & benchmark clients will have been built.
  #
  $ pushd build/release/clients/staging
  #
  # You can run all tests if you like (no arguments to ./rocblas-test), but
  # there are known failures in half-precision gemm: you would see 600 failures
  # in the *gemm*half* tests. To run just the dgemm tests instead (the filter
  # is quoted so the shell does not expand the wildcards):
  #
  $ ./rocblas-test --gtest_filter='*gemm*double*'
  #
  # At the end of the above test you should see something like:
  #
  [==========] 901 tests from 6 test cases ran. (48499 ms total)
  [  PASSED  ] 901 tests.
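
When the test run is scripted, the summary lines shown above can be checked
automatically. The helper below is a minimal sketch, not part of rocBLAS or
Google Test: the name gtest_all_passed is hypothetical, and it assumes the
usual Google Test summary layout (a "tests ... ran." total line, a PASSED
line, and FAILED lines only when something failed).

```shell
# Sketch only: gtest_all_passed is a hypothetical helper name.
# Reads Google Test output on stdin and succeeds only if the PASSED count
# equals the total number of tests run and no FAILED line is present.
gtest_all_passed() {
    awk '
        # Total line: "[==========] 901 tests from 6 test cases ran. (...)"
        /^\[=+\] [0-9]+ tests? from .* ran\./ { total = $2 }

        # Passed line: "[  PASSED  ] 901 tests." (spacing may vary)
        /^\[ *PASSED *\]/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) { passed = $i; break }
        }

        # Any FAILED line means at least one test did not pass
        /^\[ *FAILED *\]/ { failed = 1 }

        END { exit !(total != "" && passed == total && !failed) }
    '
}
```

For example, the dgemm run could be checked like this:

  $ ./rocblas-test --gtest_filter='*gemm*double*' 2>&1 | tee test.out
  $ gtest_all_passed < test.out && echo "all dgemm tests passed"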
  #
  # On Vega20, you can run a benchmark, e.g. double-precision GEMM with
  # M=N=K=5760:
  #
  $ ./rocblas-bench --lda 5760 --ldb 5760 --ldc 5760 -m 5760 -n 5760 -k 5760 --transposeA N --transposeB T -f gemm -r d
  #
  # You should see something like:
  #
  Query device success: there are 1 devices
  Device ID 0 : Device 66a0
  ------------------------------------------------------
  with 34.3 GB memory, clock rate 100MHz @ computing capability 3.0
  maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
  -------------------------------------------------------------------------
  transA,transB,M,N,K,alpha,lda,ldb,beta,ldc,rocblas-Gflops,us
  N,T,5760,5760,5760,1,5760,5760,0,5760,4370.37,874539
  #
  # From the output above, the Gflops reported is: 4370.37
  #
  #
  # Run the help option to get usage details for rocblas-bench:
  #
  $ ./rocblas-bench --help
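
To pull the Gflops figure out of the benchmark output without reading it by
eye, the comma-separated header and result lines shown above can be parsed.
The following is a minimal sketch; extract_gflops is a hypothetical helper
name, and it assumes the result row immediately follows the header row that
contains the rocblas-Gflops column, as in the sample output.

```shell
# Sketch only: extract_gflops is a hypothetical helper name.
# Reads rocblas-bench output on stdin and prints the value in the
# rocblas-Gflops column of the row that follows the CSV header.
extract_gflops() {
    awk -F',' '
        # col is set once the header has been seen, so this rule fires
        # on the first line after the header: print that column and stop.
        col { print $col; exit }

        # Look for the header and remember the rocblas-Gflops column index.
        {
            for (i = 1; i <= NF; i++)
                if ($i == "rocblas-Gflops") col = i
        }
    '
}
```

For example:

  $ ./rocblas-bench --lda 5760 --ldb 5760 --ldc 5760 -m 5760 -n 5760 -k 5760 --transposeA N --transposeB T -f gemm -r d | extract_gflops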