Mehran Maghoumi

Author's posts

ArrayFire is Now Open Source!

To my surprise, the CUDA library ArrayFire is now open source and licensed under the BSD 3-Clause License, which means that commercial use is permitted! ArrayFire is a production-oriented library that greatly reduces CUDA application development time. The repository is hosted on GitHub.

Tutorial: Use CUDA and C++11 Code in MATLAB

As it turns out, incorporating CUDA code in MATLAB is easy! 🙂 MATLAB provides functionality for loading arbitrary dynamic libraries and invoking their functions, which makes it especially easy to call C/C++ code from a MATLAB program. This is done through so-called MEX functions. Introduction: MEX functions can be created with the …
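As a rough illustration of what a MEX function looks like, here is a minimal sketch in C++. The names scale_array and scale_mex are made up for this example and do not come from the tutorial. The gateway is wrapped in MATLAB_MEX_FILE, which MATLAB's mex command defines, so the plain C++ part can also be compiled and tested outside MATLAB.

```cpp
#include <cassert>
#include <cstddef>

// Plain C++ core (hypothetical example): writes in[i] * factor to out[i].
// It uses no MATLAB types, so it can be tested without MATLAB installed.
void scale_array(const double* in, double* out, std::size_t n, double factor) {
    assert(n == 0 || (in != nullptr && out != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}

#ifdef MATLAB_MEX_FILE  // defined when the file is built with MATLAB's `mex`
#include "mex.h"

// Gateway that MATLAB invokes for: y = scale_mex(x, factor)
// (assumes the compiled MEX file is named scale_mex and x is a 2-D matrix)
void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[]) {
    if (nrhs != 2 || nlhs > 1) {
        mexErrMsgTxt("Usage: y = scale_mex(x, factor)");
    }
    if (!mxIsDouble(prhs[0]) || mxIsComplex(prhs[0])) {
        mexErrMsgTxt("x must be a real double array");
    }
    const double factor = mxGetScalar(prhs[1]);
    // The output has the same 2-D shape as the input.
    plhs[0] = mxCreateDoubleMatrix(mxGetM(prhs[0]), mxGetN(prhs[0]), mxREAL);
    scale_array(mxGetPr(prhs[0]), mxGetPr(plhs[0]),
                mxGetNumberOfElements(prhs[0]), factor);
}
#endif
```

Keeping the computation in a MATLAB-free function is a design choice of this sketch: the same core could later call into CUDA code without touching the gateway.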

CImg and NVIDIA’s NPP Interop

Apparently, NPP depends on the pixel order of its input arrays: they need to be interleaved. If you are planning to use CImg with NPP, be sure to check this post out before attempting to do so. Failing to permute the CImg image axes will result in incorrect filtered values for color images.

CImg does not store pixels in the interleaved format

It took me hours before I found and read the documentation. CImg stores pixels in a planar format (RRRR…GGGG…BBBB). For most tasks in CUDA, it’s much better to store the pixel values in the interleaved format (RGBRGBRGB…). To do that, just call the permute_axes method of the CImg object: CImg<unsigned char> image("image.jpg"); image.permute_axes("cxyz"); IMPORTANT: After …
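To make the difference between the two layouts concrete, here is a small library-free sketch that converts a planar buffer into an interleaved one by hand. The function name planar_to_interleaved is hypothetical. It only illustrates the index arithmetic and is not how CImg implements permute_axes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts a planar buffer (all of channel 0, then all of channel 1, ...)
// into an interleaved one (every channel of pixel 0, then pixel 1, ...).
std::vector<unsigned char> planar_to_interleaved(
        const std::vector<unsigned char>& planar,
        std::size_t width, std::size_t height, std::size_t channels) {
    const std::size_t pixels = width * height;
    assert(planar.size() == pixels * channels);

    std::vector<unsigned char> interleaved(planar.size());
    for (std::size_t c = 0; c < channels; ++c) {
        for (std::size_t p = 0; p < pixels; ++p) {
            // Planar index: channel-major. Interleaved index: pixel-major.
            interleaved[p * channels + c] = planar[c * pixels + p];
        }
    }
    return interleaved;
}
```

For a 2x1 RGB image stored as R0 R1 G0 G1 B0 B1, the result is R0 G0 B0 R1 G1 B1.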

Error: “incorrect inclusion of a cudart header file”

If you receive this error while compiling a CUDA program, it means that you have included a CUDA header file containing CUDA-specific qualifiers (such as __device__) in a *.cpp file. CUDA header files with such qualifiers should ONLY be included in *.cu files. This happened to me when I had #include <common_functions.h> in my *.cpp …
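For your own headers that must be visible to both *.cu and *.cpp files, one common pattern (not necessarily the fix this post goes on to describe) is to hide the qualifiers behind a macro that expands to nothing when nvcc is not the compiler. The macro name HOST_DEVICE and the helper clamp_int below are made up for this sketch.

```cpp
#include <cassert>

// __CUDACC__ is defined only while nvcc compiles the file, so a host compiler
// that reaches this header from a *.cpp file never sees the CUDA qualifiers.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper shared by host and device code: clamps v to [lo, hi].
HOST_DEVICE inline int clamp_int(int v, int lo, int hi) {
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}
```

This only helps with headers you control. A CUDA toolkit header that itself uses such qualifiers still belongs in a *.cu file.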

Enable C++11 Support for CUDA Compiler (NVCC) – CUDA 6.5+

To enable support for C++11 in nvcc, just add the switch -std=c++11 to the nvcc command line. If you are using Nsight Eclipse, right-click on your project, go to Properties > Build > Settings > Tool Settings > NVCC Compiler, and in the “Command line prompt” section add -std=c++11. The C++11 code should then compile successfully with nvcc. Nsight’s C++ …
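A quick way to check that the switch took effect is to compile a file that only builds in C++11 mode. The sketch below is host-only code with made-up function names. Saved under a hypothetical name such as smoke.cu, it should be accepted by nvcc once -std=c++11 is passed, and rejected in pre-C++11 mode because of the lambda and the brace initialization.

```cpp
#include <cassert>
#include <vector>

// Uses a lambda bound with `auto` and a range-based for loop (both C++11).
int sum_of_squares(const std::vector<int>& values) {
    auto square = [](int v) { return v * v; };
    int total = 0;
    for (int v : values) {
        total += square(v);
    }
    return total;
}

// Hypothetical self-check: returns true when the features behave as expected.
bool cpp11_smoke_test() {
    const std::vector<int> values{1, 2, 3};  // brace initialization (C++11)
    assert(values.size() == 3);
    return sum_of_squares(values) == 14;     // 1 + 4 + 9
}
```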

Automount NTFS Partitions with All Permissions

Some things really need to be burned onto the inside of my skull, since I forget them ALL the time. This is especially true for Linux commands for trivial tasks. Automounting NTFS partitions with execution permission in Linux is one of those things for me. Here’s how to do it in Linux Mint (or probably any other …

NPP’s Convolution with Border Control Only Partially Implemented

One thing I discovered yesterday is that the image convolution filters with border control in NPP (such as nppiFilterBorder_8u) are only partially implemented! This family of functions is claimed to provide border control for the convolution, thus serving as a robust alternative to the regular image convolution functions in NPP (such as nppiFilter_8u). The catch is that the …

NPP’s Box Filter (nppiFilterBox) is Broken

Surprisingly, the box filter function (nppiFilterBox_8u) that ships with CUDA as part of the NPP library is broken! It is the same function that is used in the “Box Filter with NPP” sample. If you import this sample from the CUDA SDK and try it with masks of size 13 and above, the filter …
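When a library filter looks suspicious, a tiny CPU reference makes it easy to see what the output should be. The sketch below is a plain single-channel 8-bit box filter over the region where the mask fits entirely inside the image. The function name is hypothetical, and the sketch does not model NPP's anchor, ROI, or rounding conventions, so those would need to be matched before comparing values pixel by pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference box filter, "valid" region only: output pixel (x, y) is the
// rounded mean of the mask_w x mask_h window whose top-left corner is (x, y).
// The output is (width - mask_w + 1) x (height - mask_h + 1) pixels.
std::vector<unsigned char> box_filter_reference(
        const std::vector<unsigned char>& src,
        std::size_t width, std::size_t height,
        std::size_t mask_w, std::size_t mask_h) {
    assert(src.size() == width * height);
    assert(mask_w >= 1 && mask_h >= 1 && mask_w <= width && mask_h <= height);

    const std::size_t out_w = width - mask_w + 1;
    const std::size_t out_h = height - mask_h + 1;
    const std::size_t area = mask_w * mask_h;

    std::vector<unsigned char> dst(out_w * out_h);
    for (std::size_t y = 0; y < out_h; ++y) {
        for (std::size_t x = 0; x < out_w; ++x) {
            std::size_t sum = 0;
            for (std::size_t j = 0; j < mask_h; ++j) {
                for (std::size_t i = 0; i < mask_w; ++i) {
                    sum += src[(y + j) * width + (x + i)];
                }
            }
            // Round to nearest instead of truncating.
            dst[y * out_w + x] =
                static_cast<unsigned char>((sum + area / 2) / area);
        }
    }
    return dst;
}
```

A constant image is a handy first check because the expected result does not depend on anchor or border conventions: every output pixel must equal the input value, whatever the mask size.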

Blog Created

Seeing how often programmers struggle with the same issue twice, I decided to start this blog. I will try to note here the problems that I encounter while coding, so that when I, or other programmers, run into them again, the solution is already available somewhere. I will note the issues that required more …
