THRUST

From HPCC Wiki
Revision as of 20:11, 27 October 2022 by James

Thrust provides a rich collection of data parallel primitives such as scan, sort, and reduce, which can be combined together to implement complex algorithms with concise, readable source code. By describing your computation in terms of these high-level abstractions you provide Thrust with the freedom to select the most efficient implementation automatically. As a result, Thrust can be utilized in rapid prototyping of CUDA applications, where programmer productivity matters most, as well as in production, where robustness and absolute performance are crucial.
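
As a rough illustration of how these primitives combine, the following sketch sorts a small vector on the device, sums it with reduce, and computes running totals with an inclusive scan. The `Summary` struct and the `sort_reduce_scan` function are hypothetical names introduced here for illustration only; the Thrust calls themselves (`thrust::sort`, `thrust::reduce`, `thrust::inclusive_scan`, `thrust::copy`) are standard library algorithms.

```cuda
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/scan.h>
#include <thrust/copy.h>

#include <iostream>
#include <vector>

// Hypothetical result type (not part of Thrust): holds the outputs on the host.
struct Summary {
    std::vector<int> sorted;   // input values in ascending order
    int total;                 // sum of all values
    std::vector<int> running;  // running totals of the sorted values
};

// Hypothetical helper (not part of Thrust): runs sort, reduce, and scan on the device.
Summary sort_reduce_scan(const std::vector<int>& input)
{
    // Copy the input to the device
    thrust::device_vector<int> d(input.begin(), input.end());

    // Sort in place on the device
    thrust::sort(d.begin(), d.end());

    // Sum all elements, starting from 0
    int total = thrust::reduce(d.begin(), d.end(), 0);

    // Inclusive prefix sum of the sorted values
    thrust::device_vector<int> prefix(d.size());
    thrust::inclusive_scan(d.begin(), d.end(), prefix.begin());

    // Copy the results back to host containers
    Summary s;
    s.total = total;
    s.sorted.resize(d.size());
    s.running.resize(prefix.size());
    thrust::copy(d.begin(), d.end(), s.sorted.begin());
    thrust::copy(prefix.begin(), prefix.end(), s.running.begin());
    return s;
}

int main(void)
{
    Summary s = sort_reduce_scan({38, 14, 46, 20});

    std::cout << "total = " << s.total << std::endl;
    for (std::size_t i = 0; i < s.sorted.size(); i++)
        std::cout << "sorted[" << i << "] = " << s.sorted[i]
                  << ", running[" << i << "] = " << s.running[i] << std::endl;

    return 0;
}
```

For the input 38, 14, 46, 20 the sorted values are 14, 20, 38, 46, the total is 118, and the running totals are 14, 34, 72, 118. None of the three steps requires writing a CUDA kernel by hand.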

More detail on the Thrust library is available here [1]. There is a collection of example codes here [2]. The Thrust manual is available here [3].

Here is a basic C++ example that creates and fills a vector on the host, resizes it, copies it to the device, modifies it there, and prints the modified values.

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

#include <iostream>

int main(void)
{
    // H has storage for 4 integers
    thrust::host_vector<int> H(4);

    // initialize individual elements
    H[0] = 14;
    H[1] = 20;
    H[2] = 38;
    H[3] = 46;
    
    // H.size() returns the size of vector H
    std::cout << "H has size " << H.size() << std::endl;

    // print contents of H
    for(int i = 0; i < H.size(); i++)
        std::cout << "H[" << i << "] = " << H[i] << std::endl;

    // resize H
    H.resize(2);
    
    std::cout << "H now has size " << H.size() << std::endl;

    // Copy host_vector H to device_vector D
    thrust::device_vector<int> D = H;
    
    // elements of D can be modified
    D[0] = 99;
    D[1] = 88;
    
    // print contents of D
    for(int i = 0; i < D.size(); i++)
        std::cout << "D[" << i << "] = " << D[i] << std::endl;

    // H and D are automatically deleted when the function returns
    return 0;
}

Assuming this source file is called 'vectcopy.cu', it can be compiled on PENZIAS with nvcc:

nvcc -o vectcopy.exe vectcopy.cu

Once compiled, the 'vectcopy.exe' executable can be run using the following SLURM script:

#!/bin/bash
# Request 1 node, 1 CPU core, and 1 GPU in the production_gpu partition
# (the partition name is site-specific)
#SBATCH --partition=production_gpu
#SBATCH --job-name=THRUST_vcopy
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:1
#SBATCH --export=ALL

# Find out which compute node the job is using
echo ""
echo -n "Running job on compute node ... " 
hostname

echo ""
echo -n "SLURM allocated node list is ... "
echo $SLURM_JOB_NODELIST
echo ""

# Change to the directory the job was submitted from
cd $SLURM_SUBMIT_DIR

# Running executable on a single, gpu-enabled
# compute node using 1 CPU and 1 GPU.
echo "CUDA job is starting ... "
echo ""

./vectcopy.exe

echo ""
echo "CUDA job is done!"
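
A note on the vectcopy example above: reading D[i] from host code, as its print loop does, performs a separate device-to-host transfer for each element (the Thrust documentation notes that each such access calls cudaMemcpy). For more than a handful of elements it is usually cheaper to copy the whole vector back once and print from the host. The following is a minimal sketch of that variation; `modify_on_device` is a hypothetical helper name introduced here for illustration and is not part of Thrust.

```cuda
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/copy.h>

#include <iostream>

// Hypothetical helper (not part of Thrust): copies h to the device,
// overwrites the first two elements there, and returns the result
// to the host with a single bulk copy.
thrust::host_vector<int> modify_on_device(const thrust::host_vector<int>& h)
{
    // One host-to-device copy of the whole vector
    thrust::device_vector<int> d = h;

    // Each element assignment from host code is its own small transfer
    if (d.size() >= 2) {
        d[0] = 99;
        d[1] = 88;
    }

    // One device-to-host copy of the whole vector
    thrust::host_vector<int> result(d.size());
    thrust::copy(d.begin(), d.end(), result.begin());
    return result;
}

int main(void)
{
    // Same starting data as the vectcopy example
    thrust::host_vector<int> H(4);
    H[0] = 14;
    H[1] = 20;
    H[2] = 38;
    H[3] = 46;
    H.resize(2);

    thrust::host_vector<int> out = modify_on_device(H);

    // Printing now reads host memory only
    for (std::size_t i = 0; i < out.size(); i++)
        std::cout << "out[" << i << "] = " << out[i] << std::endl;

    return 0;
}
```

This can be compiled and submitted the same way as vectcopy.cu. For a two-element vector the difference is negligible; the pattern matters once vectors grow large.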