ImFusion SDK 4.3
#include <ImFusion/CL/ClReduction.h>
OpenCL-based reduction based on the NVIDIA OpenCL reduction SDK sample.
For small arrays (n <= 128) the GPU code is not used, because the block size is fixed in the kernel. Setting the block size dynamically would require recompiling the kernel, which is unlikely to be efficient, so the sum is computed on the CPU in these cases.
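The small-array fallback described above can be pictured with the following minimal sketch. It is not the SDK's implementation: `kCpuThreshold`, `usesCpuPath` and `cpuSum` are hypothetical names introduced for illustration, and accumulating in `double` is an assumption of this sketch rather than documented behavior.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical constant mirroring the documented limit: n <= 128 is summed on the CPU.
constexpr int kCpuThreshold = 128;

// Returns true when an input of this size would take the CPU path.
inline bool usesCpuPath(int size)
{
    return size <= kCpuThreshold;
}

// Plain CPU sum. Accumulating in double is a choice made for this sketch only.
inline float cpuSum(const float* data, int size)
{
    assert(size >= 0);
    const double acc = std::accumulate(data, data + size, 0.0);
    return static_cast<float>(acc);
}
```

A caller would check `usesCpuPath(size)` first and only dispatch to the GPU kernels when it returns false.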
Public Member Functions | |
| ClReduction (ClEnvironment *env=nullptr) | |
| Create OpenCL environment. | |
| ~ClReduction () | |
| Use existing context and command queue. | |
| float | reduce (float *data, int size) |
| float | reduce (const ClImage &data, int size, ClImage *outputBuffer=nullptr) |
| Reduce contents of data. | |
| Eigen::VectorXd | reduce (const ClImage &data) |
Protected Attributes | |
| ClEnvironment * | m_env |
| ClProgram * | m_program |
| ClKernel * | m_kernelReduction |
| ClKernel * | m_kernelFinal |
| ClImage * | m_dataOut |
| float * | m_dataHost |
| bool | m_ownContext |
| int | m_maxBlocks |
| int | m_maxThreadsReduce |
| int | m_maxThreadsFinal |
| unsigned int | m_dataType |
Reduce contents of data.
data has to be a 1D buffer of floats. If outputBuffer is provided, it has to be a 1D buffer of 64 floats. The result of the reduction is then obtained by summing over those 64 entries, and the return value is always zero. If outputBuffer is not provided, the buffer is downloaded and reduced internally.
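The final step described above, summing the 64 entries of the output buffer, might look like the sketch below once the buffer contents are available on the host. Everything here is an assumption for illustration: `kNumPartials`, `partialSums` and `finalSum` are hypothetical names, and `partialSums` is only a CPU stand-in that produces 64 partial sums with a strided assignment; how the actual kernel distributes elements over the 64 entries is not documented here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of partial results, taken from the documented output buffer size.
constexpr std::size_t kNumPartials = 64;

// CPU stand-in for the GPU stage (assumed layout): entry i collects
// elements i, i + 64, i + 128, ... of the input.
inline std::array<float, kNumPartials> partialSums(const std::vector<float>& data)
{
    std::array<float, kNumPartials> out{};  // zero-initialized
    for (std::size_t i = 0; i < data.size(); ++i)
        out[i % kNumPartials] += data[i];
    return out;
}

// Final step the caller performs when outputBuffer is supplied:
// sum the 64 partial results.
inline float finalSum(const std::array<float, kNumPartials>& partials)
{
    double acc = 0.0;
    for (float p : partials)
        acc += p;
    return static_cast<float>(acc);
}
```

Keeping the last 64 additions on the caller's side lets the partial results stay on the device until they are actually needed, which is presumably why the return value is zero in that mode.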