Wolfram Language & System Documentation Center

CUDAMemoryLoad

CUDALink`

CUDAMemoryLoad[list]

registers list into the CUDALink memory manager.

CUDAMemoryLoad[img]

registers img into the CUDALink memory manager.

Details and Options

The CUDALink application must be loaded using Needs["CUDALink`"].
Possible types for CUDAMemoryLoad, given as an optional second argument, are:

Integer	Real	Complex
"Byte"	"Bit16"	"Integer"
"Byte[2]"	"Bit16[2]"	"Integer32[2]"
"Byte[3]"	"Bit16[3]"	"Integer32[3]"
"Byte[4]"	"Bit16[4]"	"Integer32[4]"
"UnsignedByte"	"UnsignedBit16"	"UnsignedInteger"
"UnsignedByte[2]"	"UnsignedBit16[2]"	"UnsignedInteger[2]"
"UnsignedByte[3]"	"UnsignedBit16[3]"	"UnsignedInteger[3]"
"UnsignedByte[4]"	"UnsignedBit16[4]"	"UnsignedInteger[4]"
"Double"	"Float"	"Integer64"
"Double[2]"	"Float[2]"	"Integer64[2]"
"Double[3]"	"Float[3]"	"Integer64[3]"
"Double[4]"	"Float[4]"	"Integer64[4]"

The following options can be given:

"Device"	$CUDADevice	CUDA device used in computation
"TargetPrecision"	Automatic	precision used in computation

Examples

Basic Examples (1)

First, load the CUDALink application:

Wolfram Language code: Needs["CUDALink`"]

This loads a list into CUDALink memory:

Wolfram Language code: mem = CUDAMemoryLoad[{1, 2, 3}]

Information about memory can be retrieved using CUDAMemoryInformation:

Wolfram Language code: CUDAMemoryInformation[mem]

Memory added must be freed with CUDAMemoryUnload:

Wolfram Language code: CUDAMemoryUnload[mem]
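
As a minimal sketch of the full life cycle, assuming a CUDA-capable device is available, the steps above can be combined into a single round trip. The names data and mem are illustrative only:

Wolfram Language code:

Needs["CUDALink`"]
data = Range[10];
mem = CUDAMemoryLoad[data];    (* register the list with the memory manager *)
CUDAMemoryGet[mem] === data    (* the retrieved list should match the original *)
CUDAMemoryUnload[mem]          (* release the handle once it is no longer needed *)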

Scope (4)

When memory is added as Real or Complex, the concrete type is chosen based on whether the device supports double precision:

Wolfram Language code: CUDAMemoryLoad[{1.0, 2.0, 3.0}, Real]

In this case, the CUDA device has double-precision support:

Wolfram Language code: CUDAInformation[$CUDADevice, "Compute Capabilities"]

The type can be overridden by setting the "TargetPrecision" option:

Wolfram Language code: CUDAMemoryLoad[{1.0, 2.0, 3.0}, Real, "TargetPrecision" -> "Single"]

Memory added must be freed with CUDAMemoryUnload:

Wolfram Language code: CUDAMemoryUnload[%]
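
To check which concrete type was chosen, the two loads above can be kept under separate names and compared. This is only a sketch: the names memDefault and memSingle are illustrative, and the exact fields reported by CUDAMemoryInformation may differ between CUDALink versions:

Wolfram Language code:

memDefault = CUDAMemoryLoad[{1.0, 2.0, 3.0}, Real];
memSingle = CUDAMemoryLoad[{1.0, 2.0, 3.0}, Real, "TargetPrecision" -> "Single"];
CUDAMemoryInformation /@ {memDefault, memSingle}    (* compare the reported types *)
Scan[CUDAMemoryUnload, {memDefault, memSingle}]     (* free both handles *)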

Images can be added with type "UnsignedByte":

Wolfram Language code: mem = CUDAMemoryLoad[[image]]

Getting the memory returns an image with the same properties as the original. Memory is retrieved using CUDAMemoryGet:

Wolfram Language code: CUDAMemoryGet[mem]

The "TypeInformation" contains the image information:

Wolfram Language code: CUDAMemoryInformation[mem]

Memory added must be freed with CUDAMemoryUnload:

Wolfram Language code: CUDAMemoryUnload[mem]
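
The image round trip can also be checked without an external file. The following sketch assumes a small synthetic byte image, here called testImg, standing in for the image used above, and compares pixel data before and after loading:

Wolfram Language code:

testImg = Image[RandomInteger[255, {4, 4}], "Byte"];    (* hypothetical 4x4 grayscale test image *)
mem = CUDAMemoryLoad[testImg];
ImageData[CUDAMemoryGet[mem], "Byte"] === ImageData[testImg, "Byte"]    (* pixel values should be unchanged *)
CUDAMemoryUnload[mem]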

Images can be added with a specified type:

Wolfram Language code: CUDAMemoryLoad[[image], "Float"]

Getting the memory returns an image:

Wolfram Language code: CUDAMemoryGet[%]

When adding graphics objects, the object is rasterized:

Wolfram Language code: mem = CUDAMemoryLoad[[3D graphics]]

When getting the memory, an image is returned:

Wolfram Language code: CUDAMemoryGet[mem]//Head

Memory added must be freed with CUDAMemoryUnload:

Wolfram Language code: CUDAMemoryUnload[mem]

Options (1)

"TargetPrecision" (1)

When memory is added as Real or Complex, the concrete type is chosen based on whether the device supports double precision:

Wolfram Language code: CUDAMemoryLoad[{1.0, 2.0, 3.0}, Real]

This can be overridden by setting the "TargetPrecision" option to either "Single" or "Double". In this case, the machine has double-precision hardware, but "Float" is used because "TargetPrecision" is set to "Single":

Wolfram Language code: CUDAMemoryLoad[{1.0, 2.0, 3.0}, Real, "TargetPrecision" -> "Single"]

If the type is always single precision, then setting the type to "Float" or "ComplexFloat" may be more readable:

Wolfram Language code: CUDAMemoryLoad[{1.0, 2.0, 3.0}, "Float"]
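
The "Device" option from the options table is passed in the same way. As a hedged sketch, assuming the default device $CUDADevice is the intended target (on a multi-GPU system another device index could be substituted), both options can be combined in one call:

Wolfram Language code:

mem = CUDAMemoryLoad[{1.0, 2.0, 3.0}, Real, "Device" -> $CUDADevice, "TargetPrecision" -> "Single"];
CUDAMemoryInformation[mem]    (* inspect the resulting type *)
CUDAMemoryUnload[mem]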

Applications (2)

This adds two to each element of an input list:

Wolfram Language code: srcf = FileNameJoin[{$CUDALinkPath, "SupportFiles", "addTwo.cu"}]

Wolfram Language code: addTwo = CUDAFunctionLoad[{srcf}, "addTwo", {{_Integer}, _Integer}, 32]

Wolfram Language code: res = addTwo[CUDAMemoryLoad[Range[100]], 100]

Wolfram Language code: CUDAMemoryGet[First@res]

Wolfram Language code: CUDAMemoryUnload[First@res]
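
The same computation can be written with a named memory handle, which makes it easier to see what is loaded, read back, and freed. This is a sketch that assumes addTwo has been loaded as above; the name mem is illustrative:

Wolfram Language code:

mem = CUDAMemoryLoad[Range[100]];    (* explicit handle to the loaded list *)
addTwo[mem, 100];                    (* the kernel updates the memory behind the handle *)
CUDAMemoryGet[mem]                   (* read the result back *)
CUDAMemoryUnload[mem]                (* free the memory when done *)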

When the input to a CUDAFunction is not already CUDAMemory, the function loads it internally and unloads it after the run. To demonstrate, this color-negates an input image:

Wolfram Language code:

src = "__global__ void imageColorNegate(mint * img, mint width, mint height, mint channels) {
    mint ii;
    /* pixel coordinates handled by this thread */
    mint xIndex = threadIdx.x + blockIdx.x*blockDim.x;
    mint yIndex = threadIdx.y + blockIdx.y*blockDim.y;
    /* offset of the first channel of this pixel */
    mint index = channels*(xIndex + yIndex*width);
    if (xIndex < width && yIndex < height) {
        for (ii = 0; ii < channels; ii++)
            img[index+ii] = 255 - img[index+ii];
    }
}";

This loads the function using CUDAFunctionLoad:

Wolfram Language code: colorNegate = CUDAFunctionLoad[src, "imageColorNegate", {{_Integer}, _Integer, _Integer, _Integer}, {16, 16}]

This defines input parameters:

Wolfram Language code:

{width, height} = ImageDimensions[[image]]
channels = ImageChannels[[image]]

This runs the CUDAFunction:

Wolfram Language code: colorNegate[[image], width, height, channels]
