# CUDA Functions

*CUDALink* is a built-in Wolfram Language package that provides a simple and powerful interface for using CUDA within the Wolfram Language's streamlined workflow.

*CUDALink* provides you with carefully tuned linear algebra, discrete Fourier transforms, and image processing algorithms. You can also write your own *CUDALink* modules with minimal effort. Using *CUDALink* from within the Wolfram Language gives you access to the Wolfram Language's features, including visualization, import/export, and programming capabilities.

This section discusses the built-in *CUDALink* functions and presents a handful of applications.

## List Processing

*CUDALink* list processing functions are designed to mimic the existing Wolfram Language functions, and, while less general than the Wolfram Language's implementation, they do provide the most commonly used functions. *CUDALink* implements the following list processing functions.

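| Function | Description |
| --- | --- |
| CUDAMap | map a function over each element of an input list |
| CUDAFold | given an initial value and a function, return the final result of repeatedly applying the function to the running result and the next list element |
| CUDAFoldList | given an initial value and a function, return the list of successive intermediate results of that repeated application |
| CUDASort | sort a given list |
| CUDATotal | find the total value of a given list |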

*CUDALink*'s list processing functions.

These functions can be used like any other Wolfram Language function. To use them, first load the *CUDALink* application.

Once loaded, the above functions can be used. This maps the function Cos to a random list.

Computations can be chained together. Here you can find the total of the above list using CUDAFold (an operation known as reduction in GPU programming).
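The map and fold steps above can be sketched on the CPU in plain Python. This is only an illustration of the semantics, not the *CUDALink* API: the helper names `fold` and `fold_list` are hypothetical, and the input data is an arbitrary seeded random list.

```python
import math
import random
from functools import reduce
from itertools import accumulate


def fold(f, x, values):
    """Return f(...f(f(x, v0), v1)..., vn), the final result of the fold."""
    return reduce(f, values, x)


def fold_list(f, x, values):
    """Return every intermediate fold result, beginning with the initial value x."""
    return list(accumulate(values, f, initial=x))


random.seed(0)                                   # fixed seed so the run is repeatable
data = [random.random() for _ in range(10)]      # a random input list
mapped = [math.cos(v) for v in data]             # map: apply cos to each element
total = fold(lambda a, b: a + b, 0.0, mapped)    # fold with addition: a reduction
```

The last entry of `fold_list` always equals the result of `fold` with the same arguments, which is why a fold is often described as a scan that keeps only its final value.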

### Applications

The list operators above are central to many algorithms. A few are discussed here.

#### Line of Sight

Given a height map, the line of sight problem finds all points on the height map that are visible from a single point. It does so by first transforming the height map to an angular map, and then performing a Max fold on the angular map to obtain a list of running maxima, the maxAngle list. The maxAngle list can then be used to determine whether each point is visible.

This generates a sample height map.

This computes the angular map. Here you use the first point in the height map as a reference angle.

The maxAngle list is computed using CUDAFoldList.

A point $i$ is marked as visible if $\mathrm{angularMap}_i > \mathrm{maxAngle}_i$.

This displays all points visible from the reference point.
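The line of sight steps can be sketched in plain Python as follows. This is a CPU illustration under stated assumptions, not the *CUDALink* code: it uses a one-dimensional height profile with unit spacing, takes the first point as the observer, and compares each angle against the running maximum of the angles strictly before it. The function name `visible_points` is hypothetical.

```python
import math
from itertools import accumulate


def visible_points(heights):
    """Return one boolean per point: is it visible from the first point?

    Assumes a 1D profile with unit spacing; the observer stands on heights[0].
    """
    ref = heights[0]
    # Angular map: elevation angle of each later point as seen from the observer.
    angular_map = [math.atan2(h - ref, i) for i, h in enumerate(heights) if i > 0]
    # Max scan: running maximum of all angles strictly before each point.
    running = list(accumulate(angular_map, max, initial=-math.inf))
    max_angle = running[:-1]
    # A point is visible when its angle exceeds every angle in front of it.
    # Ties count as hidden because the comparison is strict.
    return [True] + [a > m for a, m in zip(angular_map, max_angle)]


profile = [0, 1, 1, 4, 2, 10]
visibility = visible_points(profile)
```

The only sequential dependency is the running maximum, which is exactly the part a fold-list (scan) primitive provides.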

#### Random Walk

The random walk is a common tool in many applications, such as the analysis of Brownian motion in physics. This shows a random walk in one dimension using CUDAFoldList.

Choosing discrete random numbers, the walk can be performed on a lattice.
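A random walk is a running sum of random steps, so it maps directly onto a fold-list with addition. The following plain-Python sketch shows both the continuous and the lattice variants; the step distributions, the seed, and the name `random_walk` are assumptions made for illustration.

```python
import random
from itertools import accumulate


def random_walk(steps, start=0):
    """Return the positions visited, beginning at start, as a running sum of steps."""
    return list(accumulate(steps, initial=start))


rng = random.Random(42)                                         # seeded for repeatability
continuous_steps = [rng.uniform(-1.0, 1.0) for _ in range(100)]
lattice_steps = [rng.choice((-1, 1)) for _ in range(100)]       # discrete steps of +1 or -1

continuous_walk = random_walk(continuous_steps, 0.0)
lattice_walk = random_walk(lattice_steps)                       # stays on integer positions
```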

#### Histogram

Histograms are commonly used in many applications to place elements in bins. Here, you can use CUDASort to simplify the histogram calculation. This sorts the input image.

Once the list is sorted, counting the occurrences of a value only requires scanning the sorted list until the value changes. For example, the count for the first value is the number of elements that precede the first element not equal to it.

The resulting histogram is plotted using ListLinePlot.
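The sort-then-scan idea can be sketched in plain Python. This is a CPU illustration only; the function name `histogram_from_sorted` is hypothetical, and the input is a flat list of values rather than an image.

```python
def histogram_from_sorted(values):
    """Return (value, count) pairs by sorting and then scanning runs of equal values."""
    ordered = sorted(values)
    bins = []
    start = 0  # index where the current run of equal values begins
    for i in range(1, len(ordered) + 1):
        # A run ends at the end of the list or where the value changes.
        if i == len(ordered) or ordered[i] != ordered[start]:
            bins.append((ordered[start], i - start))
            start = i
    return bins


counts = histogram_from_sorted([3, 1, 2, 3, 1, 3])
```

Because equal values are adjacent after sorting, each count is just the length of a run, and no per-value lookup table is needed.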

#### Dot Product

This finds the dot product of two vectors.
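A dot product decomposes into an elementwise multiplication followed by a sum reduction, which is why it fits the list operators discussed in this section. The plain-Python sketch below shows that decomposition; it is an assumption about how the list operators would be combined, not the source's own code, and the name `dot` is hypothetical.

```python
from functools import reduce
from operator import add, mul


def dot(u, v):
    """Dot product as a map (pairwise multiply) followed by a fold (sum)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    products = map(mul, u, v)        # map step: multiply matching elements
    return reduce(add, products, 0)  # fold step: add up the products


result = dot([1, 2, 3], [4, 5, 6])
```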

## Image Processing

The functions in the *CUDALink* image processing module fall into three categories. The first is convolution, which is optimized for CUDA. The second is morphology, which includes operations such as erosion, dilation, opening, and closing. The third is the binary operators: image multiplication, division, subtraction, and addition. All operations work on either images or lists.

| Function | Description |
| --- | --- |
| CUDAImageConvolve | convolve the input with the specified kernel |
| CUDABoxFilter | convolve the input with a BoxMatrix kernel |
| CUDAErosion | perform morphological erosion |
| CUDADilation | perform morphological dilation |
| CUDAOpening | perform morphological opening |
| CUDAClosing | perform morphological closing |
| CUDAClamp | clamp the values to a given range |
| CUDAColorNegate | invert the values of the input |
| CUDAImageAdd | add two inputs |
| CUDAImageSubtract | subtract two inputs |
| CUDAImageMultiply | multiply two inputs |
| CUDAImageDivide | divide two inputs |

*CUDALink* Image Base Operations.

To use any of these functions, load the *CUDALink* application if you have not already done so.

*CUDALink*'s image processing functions, like the Wolfram Language's, accept images as input. Here you can find the gradient of an input image.
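One common way to obtain an image gradient is to convolve the image with a small difference kernel. The plain-Python sketch below illustrates that idea on a nested list of pixel values. It is not the *CUDALink* implementation: the kernel `dx_kernel` is a hypothetical choice (the source does not say which kernel it uses), the helper name `convolve2d_valid` is made up, and the output covers only the region where the kernel fits entirely inside the image.

```python
def convolve2d_valid(image, kernel):
    """2D convolution over the region where the kernel fits entirely inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += flipped[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical horizontal-difference kernel used as a simple gradient estimate.
dx_kernel = [[-1, 0, 1]]
step_image = [[0, 0, 1, 1],
              [0, 0, 1, 1]]
gradient_x = convolve2d_valid(step_image, dx_kernel)
```

How the borders are treated (cropped, padded, or wrapped) is a design choice that differs between libraries, so results near the edges should not be compared across implementations without checking that setting.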

Since the CUDA image processing functions behave like Wolfram Language functions, you can combine them with existing Wolfram Language functions. Here, you can apply CUDAImageMultiply to all combinations of a set of images.

The CUDA image processing functions work with the Wolfram Language's dynamic evaluators, such as Manipulate, Dynamic, and Animate. Here, you can use Animate to create an animation of how an image changes as it is convolved with GaussianMatrix kernels of different radii.

### Applications

#### Creating New Image Processing Operators

*CUDALink*'s image processing operators are building blocks for more complicated operators. Here, you can define the CUDADarker operator, which is similar to the Darker operator in the Wolfram Language.

The function can then be used.
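The idea of composing a new operator from a primitive can be sketched in plain Python. The definition below is an assumption for illustration only: it treats darkening as scaling every pixel toward black with a multiply primitive, and the names `image_multiply` and `darker` as well as the default fraction are hypothetical rather than taken from the source.

```python
def image_multiply(image, factor):
    """Primitive: multiply every pixel of a nested-list image by a scalar."""
    return [[pixel * factor for pixel in row] for row in image]


def darker(image, fraction=0.5):
    """Derived operator: scale every pixel toward black by the given fraction."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must lie between 0 and 1")
    return image_multiply(image, 1 - fraction)


darkened = darker([[1.0, 0.5], [0.25, 0.0]], 0.5)
```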

#### Input Smoothing

Many algorithms require the input to be smoothed before processing. This defines a random input list.

This plots the results, showing the input to be very noisy.

You can use the fact that the image processing functions also operate on lists to smooth out the input list.
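Smoothing a list with a box filter amounts to replacing each value by the mean of its neighbors. The plain-Python sketch below shows this on a one-dimensional list. It is a CPU illustration, not the *CUDALink* call: the name `box_smooth` is hypothetical, and clipping the window at the ends of the list is one of several possible boundary treatments.

```python
def box_smooth(values, radius):
    """Replace each value by the mean of its neighbors within radius.

    The averaging window is clipped at both ends of the list.
    """
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out


noisy = [0, 3, 0, 3, 0]
smoothed = box_smooth(noisy, 1)
```

A larger radius averages over more neighbors and removes more noise, at the cost of blurring genuine features in the data.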

#### Geographical Data Processing

Since all image processing functions are also list processing functions, you can process any data that can be represented by a Wolfram Language list. In this example, you can use CUDAClamp to process geographic elevation data by clamping values in the elevation map.

This loads the data from the Wolfram servers.

This creates an interface that allows the user to vary the clamp parameters.
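The clamp operation itself is simple: every value below the lower bound is raised to it, and every value above the upper bound is lowered to it. The plain-Python sketch below shows this on a small made-up elevation grid; the name `clamp` and the sample numbers are hypothetical, and the real elevation data is not reproduced here.

```python
def clamp(data, low, high):
    """Limit every value of a nested-list grid to the interval [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return [[min(max(value, low), high) for value in row] for row in data]


# Made-up elevation samples in meters, for illustration only.
elevation = [[-50, 120, 900],
             [300, 0, 501]]
clamped = clamp(elevation, 0, 500)
```

Varying `low` and `high` interactively has the effect of flattening everything outside the elevation band of interest.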

#### Acquired Image Processing

The following example requires a web camera. CurrentImage returns an error if no camera is detected.

This creates an interface where the user can process input images from the web camera in real time.

## Linear Algebra and Fourier Transforms

*CUDALink* provides specialized data types that download data to the GPU and support basic linear algebra operations carried out with CUDA-enhanced functions. These operations also work with general lists and CUDAMemory objects. The new data types are typically preferred because they automatically reclaim their memory when the expression is no longer used, and they are simpler to work with because they always reside on the GPU. However, they are not yet as widely supported as general lists or CUDAMemory objects.

| Data type | Description |
| --- | --- |
| CUDAVector | a vector of data that resides on the GPU |
| CUDAMatrix | a matrix of data that resides on the GPU |
| CUDASparseVector | a sparse vector of data that resides on the GPU |
| CUDASparseMatrix | a sparse matrix of data that resides on the GPU |

Data types that can work with data stored on a CUDA-enabled GPU.

| Function | Description |
| --- | --- |
| CUDADot | give the product of vectors and matrices |
| CUDAArgMaxList | give the index of the element with maximum absolute value |
| CUDAArgMinList | give the index of the element with minimum absolute value |
| CUDAFourier | find the Fourier transform |
| CUDAInverseFourier | find the inverse Fourier transform |

Linear algebra and Fourier transform operations using CUDA.

If you have not done so already, load the *CUDALink* application.

Here, a matrix of data that lives on the GPU is created.

This multiplies the two vectors.

This extracts the data, returning a NumericArray.

The stored data can be viewed with another application of Normal.

Operations on the GPU can be significantly faster. This creates a large matrix and also makes a GPU version.

This carries out matrix multiplication on the CPU.

The GPU version is much faster.

Another example of faster operations comes from CUDAFourier used on a CUDAVector.

The GPU version is much faster.

Typically operations on the GPU are faster than on the CPU for large amounts of data.

### Applications

Linear algebra and Fourier analysis have many applications that are beyond the scope of this tutorial. Here is a simple example of the kind of operations made possible by these *CUDALink* features.

#### Image Transformation

This transposes an input image.
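Transposing an image is the same operation as transposing the matrix of its pixel values: rows become columns. The plain-Python sketch below shows this on a nested list; it only illustrates the operation and does not reflect how the source performs it on the GPU. The name `transpose` is hypothetical.

```python
def transpose(image):
    """Swap rows and columns of a rectangular nested-list image."""
    return [list(column) for column in zip(*image)]


pixels = [[1, 2, 3],
          [4, 5, 6]]
flipped = transpose(pixels)
```

Applying the function twice returns the original image, which is a quick sanity check for any transpose implementation.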

## Examples

Along with these useful functions, *CUDALink* bundles many examples that showcase the capabilities of programming with *CUDALink*. The source of these examples is bundled with the Wolfram System.

| Function | Description |
| --- | --- |
| CUDAFluidDynamics | compute and render a fluid dynamics simulation |
| CUDAVolumetricRender | render volumetric data |

Example applications of *CUDALink*.

### Fluid Dynamics

This approximates the solution of the Navier–Stokes equations on a torus.