To enable C++11 support in nvcc, just add the switch -std=c++11 to the nvcc command line.
If you are using Nsight Eclipse, right-click on your project, go to Properties > Build > Settings > Tool Settings > NVCC Compiler, and in the “Command line prompt” section add -std=c++11.
Your C++11 code should then compile successfully with nvcc, and Nsight’s C++ indexer will also work fine.
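A quick way to check that the switch took effect is to compile a small piece of host code that only builds in C++11 mode. The sketch below is an illustration, not something from the toolkit: the function name sum_of_squares is made up, and it assumes you save it in a .cu file and compile it with something like nvcc -std=c++11 -c.

```cpp
#include <vector>

// Host-only helper that relies on C++11 features:
// a lambda, auto, a range-based for loop, and static_assert.
// (sum_of_squares is a hypothetical name used for illustration.)
int sum_of_squares(const std::vector<int>& values) {
    static_assert(sizeof(int) >= 2, "static_assert itself requires C++11");
    auto square = [](int x) { return x * x; };  // lambda + auto
    int total = 0;
    for (auto v : values) {                     // range-based for
        total += square(v);
    }
    return total;
}
```

If the switch is missing, older nvcc versions should reject the lambda and the range-based for loop, so a clean compile is a reasonable sign that C++11 mode is on.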
Some things really need to be burned onto the inside of my skull, since I forget them ALL the time. This is especially true for Linux commands for trivial tasks. Automounting NTFS partitions with execution permission in Linux is one of those things for me. Here’s how to do it in Linux Mint (or probably any other Debian-based Linux distro).
1) Find the UUID of your partition, for example by running sudo blkid
2) Add the following line to the file /etc/fstab
UUID=<xxxxx> /media/[whatever] ntfs rw,auto,users,exec,nls=utf8,umask=000,gid=46,uid=1000 0 0
3) Verify that everything is working fine, for example by running sudo mount -a, which mounts everything listed in /etc/fstab and reports any errors
You can verify the uid for your user by running, for example, id -u
Note the option umask=000. It masks out no permission bits, so every file on the partition ends up readable, writable, and executable by all users.
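The effect of the umask option is easier to see as arithmetic. The sketch below assumes the driver computes permissions as 0777 with the umask bits cleared, which is how the option is usually described for NTFS mounts; the helper names are made up for illustration.

```cpp
// Effective permission bits, assuming the driver starts from rwxrwxrwx (0777)
// and clears every bit that is set in the umask.
// (effective_mode and executable_by_all are hypothetical helper names.)
unsigned effective_mode(unsigned umask_bits) {
    return 0777u & ~umask_bits;
}

// True if the owner, the group, and everyone else all have the execute bit.
bool executable_by_all(unsigned mode) {
    return (mode & 0111u) == 0111u;
}
```

With umask=000 nothing is cleared and the mode stays 0777, while a more restrictive value such as umask=022 would give 0755 (still executable, but writable only by the owner).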
One thing I discovered yesterday is that the border-controlled image convolution filters in NPP (such as nppiFilterBorder_8u) are only partially implemented! This family of functions is claimed to provide border control for the convolution, thus serving as a robust alternative to the regular image convolution functions in NPP (such as nppiFilter_8u). The catch is that the border control only partially works.
The documentation on these functions is scarce. These functions expect an argument of type NppiBorderType to define their border treatment. Possible options are:
NPP_BORDER_NONE: no border treatment
NPP_BORDER_CONSTANT: (probably) assume constant values at out of bounds pixels
NPP_BORDER_REPLICATE: replicate edge pixels and use them as values for out of bounds pixels
NPP_BORDER_WRAP: round-robin treatment of borders
My experiments showed that the only working option is NPP_BORDER_REPLICATE. Any other option results in the NppStatus error code -9999 (equivalent to NPP_NOT_SUPPORTED_MODE_ERROR, for which I have, again, not found any documentation).
Since the border-controlled convolutions perform worse than the box filter function for large mask sizes, my assumption is that NPP_BORDER_REPLICATE uses the nppiCopyConstBorder_8u function to implement its border control.
If behaviors other than replication are desired, one option is to implement the border control manually.
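As a rough idea of what manual border control involves, here is a minimal CPU-side sketch of the three treatments listed above. It is not NPP code: the Border enum and read_pixel function are hypothetical names, and the image is assumed to be a row-major, single-channel 8-bit buffer. The same index mapping could be moved into a CUDA kernel, or used to pad the image before calling a regular filter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical border modes for illustration; this is not NppiBorderType.
enum class Border { Constant, Replicate, Wrap };

// Reads pixel (x, y) from a row-major 8-bit image. Coordinates outside the
// image are resolved according to the chosen border mode.
std::uint8_t read_pixel(const std::vector<std::uint8_t>& img, int width, int height,
                        int x, int y, Border mode, std::uint8_t constant = 0) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);
    const bool inside = x >= 0 && x < width && y >= 0 && y < height;
    if (!inside) {
        switch (mode) {
            case Border::Constant:
                // Out-of-bounds pixels all take the same fixed value.
                return constant;
            case Border::Replicate:
                // Clamp to the nearest edge pixel.
                x = x < 0 ? 0 : (x >= width ? width - 1 : x);
                y = y < 0 ? 0 : (y >= height ? height - 1 : y);
                break;
            case Border::Wrap:
                // Wrap around to the opposite side (modulo that handles negatives).
                x = ((x % width) + width) % width;
                y = ((y % height) + height) % height;
                break;
        }
    }
    return img[static_cast<std::size_t>(y) * width + x];
}
```

A convolution loop would then call read_pixel for every tap of the kernel instead of indexing the buffer directly, which makes the border behavior a single, easily swapped decision.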
Surprisingly, the box filter function (nppiFilterBox_8u) that is shipped with CUDA as a part of the NPP library is broken! It is the same function that is used in the “Box Filter with NPP” sample.
If you import this sample from the CUDA SDK and try it with masks of size 13 and above, the filter produces garbage output (tested with CUDA 6.5). At this point, I have no idea why this is happening or why such a simple filter may not work for larger mask sizes. An alternative would be to use the convolution filters (such as nppiFilter_8u).
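The reason a convolution filter can stand in for the box filter is that a box filter is just a convolution with an all-ones kernel divided by the mask area. The sketch below shows that equivalence on the CPU. It assumes an interface that takes integer weights plus an integer divisor, which is my understanding of how nppiFilter_8u is parameterized (check the NPP documentation for the exact signature); BoxKernel, make_box_kernel, and apply_kernel are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// An all-ones kernel plus the divisor that turns the weighted sum into a mean.
// (BoxKernel is a hypothetical type for illustration, not an NPP type.)
struct BoxKernel {
    std::vector<std::int32_t> weights;  // mask_w * mask_h entries, all 1
    std::int32_t divisor;               // mask_w * mask_h
};

BoxKernel make_box_kernel(int mask_w, int mask_h) {
    assert(mask_w > 0 && mask_h > 0);
    BoxKernel k;
    k.weights.assign(static_cast<std::size_t>(mask_w) * mask_h, 1);
    k.divisor = mask_w * mask_h;
    return k;
}

// CPU reference for one output pixel: weighted sum over a patch that has the
// same size as the kernel, divided by the divisor (integer division truncates;
// a GPU library may round differently).
std::int32_t apply_kernel(const std::vector<std::uint8_t>& patch, const BoxKernel& k) {
    assert(patch.size() == k.weights.size());
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < patch.size(); ++i) {
        sum += k.weights[i] * static_cast<std::int32_t>(patch[i]);
    }
    return sum / k.divisor;
}
```

A CPU reference like this is also handy for spotting the garbage output: a 13x13 kernel applied to a uniform patch should return the patch value unchanged.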
EDIT (12/5/2014): I reported this bug to NVIDIA, and today I received an email indicating that the bug has been fixed and that the fix will be available in the next version of the CUDA toolkit.
Seeing how often programmers struggle with the same issue twice, I decided to start this blog. I will try to note here the problems that I encounter while coding, so that when I, or other programmers, encounter them again the solution is already available somewhere.
I will note the issues that required more than a simple Google search to solve.
Never get stuck on the same issue twice! 🙂