Mar 10

Align Depth and Color Frames – Depth and RGB Registration

By Mehran Maghoumi in Computer Vision, MATLAB, Tutorial

Sometimes it is necessary to create a point cloud from a given depth and color (RGB) frame. This is especially the case when a scene is captured using depth cameras such as Kinect. The process of aligning the depth and the RGB frame is called “registration”. It is very easy to do, yet the algorithm’s pseudo-code is surprisingly hard to find with a simple Google search! 😀

To perform registration, you need 4 pieces of information:

1. The depth camera intrinsics:
   1. Focal lengths f_xd and f_yd (in pixel units)
   2. Optical centers (sometimes called image centers) C_xd and C_yd
2. The RGB camera intrinsics:
   1. Focal lengths f_xrgb and f_yrgb (in pixel units)
   2. Optical centers (sometimes called image centers) C_xrgb and C_yrgb
3. The extrinsics relating the depth camera to the RGB camera. This is a 4×4 matrix containing rotation and translation values.
4. (Obviously) the depth and the RGB frames. Note that they do not have to have the same resolution. Applying the intrinsics takes care of the resolution issue. With cameras such as Kinect, the depth values should usually be in meters. The unit of the depth values is very important: using incorrect units will result in a registration in which the colors and the depth values are clearly misaligned.
   Also, note that some data sets apply a scale and a bias to the depth values in the depth frame. Make sure to account for this scaling and offsetting before proceeding. In other words, make sure no scale remains applied to the depth values of your depth frame (this is what depthScale in the code below is for).
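As a minimal sketch of the extrinsics and depth-unit points above, the snippet below packs a rotation and a translation into a 4×4 matrix and converts raw depth values to meters. The names R, t, rawDepth, scale and bias, all numeric values, and the convention raw = scale * meters + bias are assumptions made for illustration; check your own data set's documentation for the actual convention.

```matlab
% Hypothetical calibration output: rotation (3x3) and translation (3x1, in meters)
% mapping points from the depth camera frame to the RGB camera frame.
R = eye(3);                 % assumed: the two cameras are perfectly parallel
t = [0.025; 0; 0];          % assumed: 2.5 cm offset along x

% Pack them into a single homogeneous 4x4 matrix
extrinsics = [R, t; 0 0 0 1];

% Hypothetical raw depth frame stored as uint16 millimeters
rawDepth = uint16([1000 1500; 2000 0]);
scale = 1000;               % assumed: raw units per meter
bias  = 0;                  % assumed: no offset

% Undo the scale and bias to obtain meters; keep 0 (no reading) as 0
depthMeters = (single(rawDepth) - bias) / scale;
depthMeters(rawDepth == 0) = 0;

% Sanity check: a point 1 m straight ahead of the depth camera
p = extrinsics * [0; 0; 1; 1];   % p = [0.025; 0; 1; 1]
```

With the depth already converted this way, the scale used inside the registration code would simply be 1.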

Let depthData contain the depth frame and rgbData contain the RGB frame. The pseudo-code for registration in MATLAB is as follows:

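function [aligned] = ...
                    depth_rgb_registration(depthData, rgbData,...
                    fx_d, fy_d, cx_d, cy_d,...
                    fx_rgb, fy_rgb, cx_rgb, cy_rgb,...
                    extrinsics, depthScale)
    % depthScale converts raw depth values to meters (pass 1 if the depth
    % frame is already in meters)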

    depthHeight = size(depthData, 1);
    depthWidth = size(depthData, 2);
    rgbHeight = size(rgbData, 1);
    rgbWidth = size(rgbData, 2);
    
    % Aligned will contain X, Y, Z, R, G, B values in its planes
    aligned = zeros(depthHeight, depthWidth, 6);

    for v = 1 : (depthHeight)
        for u = 1 : (depthWidth)
            % Apply depth intrinsics
            z = single(depthData(v,u)) / depthScale;
            x = single((u - cx_d) * z) / fx_d;
            y = single((v - cy_d) * z) / fy_d;
            
            % Apply the extrinsics
            transformed = (extrinsics * [x;y;z;1])';
            aligned(v,u,1) = transformed(1);
            aligned(v,u,2) = transformed(2);
            aligned(v,u,3) = transformed(3);
        end
    end

    for v = 1 : (depthHeight)
        for u = 1 : (depthWidth)
            % Apply RGB intrinsics
            x = (aligned(v,u,1) * fx_rgb / aligned(v,u,3)) + cx_rgb;
            y = (aligned(v,u,2) * fy_rgb / aligned(v,u,3)) + cy_rgb;
            
            % "x" and "y" are indices into the RGB frame, but they may contain
            % invalid values (which correspond to the parts of the scene not
            % visible to the RGB camera, or to pixels with no depth reading).
            % Do we have a valid index?
            if (x > rgbWidth || y > rgbHeight ||...
                x < 1 || y < 1 ||...
                isnan(x) || isnan(y))
                continue;
            end
            
            % Need some kind of interpolation. I just did it the lazy way
            x = round(x);
            y = round(y);

            aligned(v,u,4) = single(rgbData(y, x, 1));
            aligned(v,u,5) = single(rgbData(y, x, 2));
            aligned(v,u,6) = single(rgbData(y, x, 3));
        end
    end    
end
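For readers who want something that runs as-is, here is a minimal, self-contained sketch of the same two steps on synthetic data, vectorized with meshgrid instead of nested loops. It does not call the function above. All intrinsics, the identity extrinsics and the tiny frame sizes are made-up values, and the depth is assumed to already be in meters.

```matlab
% Synthetic inputs (every value below is made up for illustration)
depthData = ones(4, 6, 'single') * 2;         % flat wall 2 m away, 4x6 depth frame
rgbData   = uint8(randi([0 255], 8, 12, 3));  % random 8x12 RGB frame
fx_d = 5;    fy_d = 5;    cx_d = 3.5;  cy_d = 2.5;   % assumed depth intrinsics
fx_rgb = 10; fy_rgb = 10; cx_rgb = 6;  cy_rgb = 4;   % assumed RGB intrinsics
extrinsics = eye(4);                          % assumed: both cameras coincide

[depthHeight, depthWidth] = size(depthData);
[rgbHeight, rgbWidth, ~]  = size(rgbData);

% Back-project every depth pixel to a 3D point (depth intrinsics)
[u, v] = meshgrid(1:depthWidth, 1:depthHeight);
z = depthData;
x = (u - cx_d) .* z / fx_d;
y = (v - cy_d) .* z / fy_d;

% Move all points into the RGB camera frame (extrinsics)
pts = extrinsics * [x(:)'; y(:)'; z(:)'; ones(1, numel(z))];

% Project into the RGB image (RGB intrinsics) and round to the nearest pixel
col = round(pts(1,:) * fx_rgb ./ pts(3,:) + cx_rgb);
row = round(pts(2,:) * fy_rgb ./ pts(3,:) + cy_rgb);
valid = col >= 1 & col <= rgbWidth & row >= 1 & row <= rgbHeight;  % NaN fails every test

% Look up colors for valid points; invalid ones stay black
idx = sub2ind([rgbHeight, rgbWidth], row(valid), col(valid));
colors = zeros(numel(z), 3, 'single');
for c = 1:3
    channel = rgbData(:, :, c);
    colors(valid, c) = single(channel(idx));
end

% Same layout as in the loop version: X, Y, Z, R, G, B planes
aligned = cat(3, reshape(pts(1:3,:)', depthHeight, depthWidth, 3), ...
                 reshape(colors, depthHeight, depthWidth, 3));
```

Replacing the synthetic inputs with frames loaded from disk (for example via imread) and with your own calibration values would be the natural next step.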

A few things to note here:

- The indices x and y in the second group of for loops may be invalid, which indicates that the corresponding 3D point is not visible to the RGB camera.
- Some kind of interpolation may be necessary when using x and y. I just did rounding.
- This code can be readily used with the savepcd function to save the point cloud into a PCL-compatible format.
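As an illustration of the interpolation note, the sketch below compares the rounding lookup with a bilinear lookup using MATLAB's interp2 on a single color channel. The 3×3 image and the sample location are made up, and bilinear sampling is only one possible choice, not necessarily the best one for every use.

```matlab
% Hypothetical 3x3 single-channel image and a fractional sample location
channel = single([10 20 30; 40 50 60; 70 80 90]);
x = 1.5;   % column coordinate (between columns 1 and 2)
y = 2.0;   % row coordinate

% Nearest-neighbor lookup, as in the rounding approach
nearest = channel(round(y), round(x));       % round(1.5) is 2, so this picks 50

% Bilinear lookup: interp2 takes the column (x) query first, then the row (y)
% query, and returns NaN for queries outside the image
bilinear = interp2(channel, x, y, 'linear'); % halfway between 40 and 50
```

Because interp2 returns NaN outside the image, the explicit bounds check would still be needed (or the NaN results filtered afterwards).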

The registration formulas were obtained from the paper “On-line Incremental 3D Human Body Reconstruction for HMI or AR Applications” by Almeida et al. (2011). The same formulas can be found here. Hope this helps 🙂

3d-computer-vision, registration

23 comments


- ramine on August 26, 2016 at 1:57 PM
Hello
Dear Mehran
Your code is useful for me.
Do you know the camera parameters for Kinect version 1?
1. Mehran Maghoumi (Author) on August 27, 2016 at 6:16 PM
  Glad you found it useful! Unfortunately, I don’t have any calibration values for a Kinect 1. You may want to calibrate yourself…
  1. Vishnu Teja Yalakuntla on May 21, 2017 at 4:39 AM
    The device comes with calibration parameters. For example, in pylibfreenect2, we can access them through
    
    fn.openDevice(serial, pipeline=pipeline).getColorCameraParams()
    
    or
    
    fn.openDevice(serial, pipeline=pipeline).getIrCameraParams()
    
    But I don’t know how accurate they are.
    
    Cheers
    1. Mehran Maghoumi (Author) on May 21, 2017 at 8:19 AM
      Not very accurate. See my other post here: https://www.codefull.net/2017/04/practical-kinect-calibration/
- Seereen Noorwali on September 27, 2017 at 2:18 PM
Thanks for sharing the pseudo-code … do you have MATLAB code for register depth and color image for Kinect V2?
1. Mehran Maghoumi (Author) on September 27, 2017 at 2:46 PM
  It would be literally the exact same code. The only difference in terms of application is that the depth and color cameras on Kinect v2 have different resolutions compared to Kinect v1. So those values need to be set accordingly in the code I posted.
  1. Seereen Noorwali on October 3, 2017 at 4:06 PM
    Thank you for your reply … I do not know how to get these values from Kinect v2? Any idea? … All I need is to get the registered image from the RGB and depth images.
    1. Mehran Maghoumi (Author) on October 4, 2017 at 1:09 PM
      Yes, you’d need to calibrate for both the extrinsics and the intrinsics.
      Take a look at this post: https://www.codefull.net/2017/04/practical-kinect-calibration/
- Sayed on November 28, 2017 at 8:23 PM
What is depthScale? can you explain?
1. Mehran Maghoumi (Author) on November 29, 2017 at 1:22 AM
  This is explained in the last paragraph of #4 above!
- N on January 16, 2018 at 1:08 PM
Hello, nice post, but i have a silly question, im using Kinect V2 and for the calibration im using GML C++ Camera Calibration Toolbox, i get the intrinsic parameters of the color and ir images individually, but how do you relate the extrinsics of the depth and rgb camera?
1. Mehran Maghoumi (Author) on January 17, 2018 at 11:48 AM
  For that, you need to do an extrinsic calibration (in other words, find the rotation and translation between the IR and the color camera). I believe GML can also do extrinsic calibration. MATLAB can also do this for you. Take a look at https://www.mathworks.com/help/vision/ref/extrinsics.html?requestedDomain=true
  1. N on January 19, 2018 at 3:42 PM
    Thank you so much! you’re the best (:
- amin__ on March 6, 2018 at 3:14 PM
Are those intrinsic and extrinsic parameters specific to a Kinect v2 sensor? Or should they be the same for all Kinect v2 sensors? The situation I am having is, I have RGB and depth data of some videos. I want to extract the foreground from those data (by using the depth map). I don’t have the parameters for the Kinect v2 sensor used to collect those data. Is that possible?
1. Mehran Maghoumi (Author) on March 7, 2018 at 3:35 AM
  Each Kinect has a different calibration. While the values are generally close between various devices, it’s usually best to calibrate the device before using it.
  You could certainly use the values you find online, but be advised that you may not get accurate registration results.
  Take a look at this post (https://www.codefull.net/2017/04/practical-kinect-calibration/) to get some insight into obtaining good Kinect calibration values.
  1. amin__ on March 7, 2018 at 5:47 PM
    I have dataset from another group. They did not publish calibration params. They only published depth, rgb and skeleton data. That’s why I have no way to get params from the Kinect they used to collect data. I will try to use calibration params from online. Thanks
    1. Mehran Maghoumi (Author) on March 7, 2018 at 6:17 PM
      Usually, whoever creates a dataset will also include calibration values for the dataset (otherwise the dataset will be useless :D)
      Try to see if that dataset you’re using has anything shipped with it. What dataset is it anyway?
- Dave on July 17, 2018 at 5:25 PM
Is there a working example of your code?

and how did u got depthData and rgbData into the code?
- D on July 17, 2018 at 8:00 PM
hey, can you show a working example of your code?

how did you feed depthData and rgbData into the code?
- Ankit Jaiswal on November 7, 2018 at 8:20 AM
Hi,
I am very new to this and just started with MATLAB.

Before using the above function, I will have to define all input parameters right.
I cant use the code as it is and just run it and get the output right?
- Ting on December 2, 2018 at 4:41 PM
Shouldn’t the extrinsics matrix be a 3×4 matrix instead of 4×4?

Also, it looks like matlab function extrinsics can obtain the rotation and translation matrices for RGB and depth camera separately… How do I get the rotation and translation matrices between RGB and depth?

Sorry that I do not know much about image processing, please forgive me if I am asking stupid questions….

Thank you so much for your help!
1. Mehran Maghoumi (Author) on December 2, 2018 at 8:58 PM
  Although the extrinsics can be represented as 3×4, it’s usually convenient to represent them as a 4×4 because you can easily multiply them together in cases where you have more than 2 cameras.
  
  Also, obtaining the extrinsics itself requires camera calibration. Read more about it here: https://www.codefull.net/2017/04/practical-kinect-calibration
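A minimal sketch of both points in this exchange, under stated assumptions: if calibration gives each camera's pose relative to the same checkerboard as a 4×4 world-to-camera matrix in the column-vector convention (p_cam = T * p_world), the depth-to-RGB extrinsics follow from one matrix division, and further cameras chain by multiplication. The names T_depth, T_rgb and T_rgb_to_cam3 and all values are hypothetical. Some calibration tools use a row-vector convention instead, in which case the matrices need transposing first.

```matlab
% Hypothetical world-to-camera transforms (column-vector convention, meters)
T_depth = [eye(3), [0; 0; 1.0]; 0 0 0 1];      % board 1 m in front of the depth camera
T_rgb   = [eye(3), [0.025; 0; 1.0]; 0 0 0 1];  % same board as seen by the RGB camera

% Depth-to-RGB extrinsics: go depth -> world, then world -> RGB
extrinsics = T_rgb / T_depth;                  % same as T_rgb * inv(T_depth)

% With 4x4 matrices, a third (hypothetical) camera chains by plain multiplication
T_rgb_to_cam3   = [eye(3), [0; 0.1; 0]; 0 0 0 1];
T_depth_to_cam3 = T_rgb_to_cam3 * extrinsics;

% The 3x4 form is just the top three rows of the 4x4 form
extrinsics3x4 = extrinsics(1:3, :);
```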
- Ilayda on October 28, 2022 at 1:56 AM
Hi!

Thank you very much for the code, I have been using it in my research and it has been very useful.

When I was using your code, I faced the following problem: the created color image became just white (with a black frame where the values are invalid). The solution is to use “im2single” instead of the function “single” when casting the image.

I realize this is only a pseudo code but I thought it would be useful for others.

I am also wondering if it is possible to share the modified code I have created using your code, in github. I am creating a dataset and I thought it would be useful to collect everything in github so anyone can go and verify my results. Of course, I will give a link to this page and make it clear that I used this code as reference where applicable.

Kind regards,
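The effect described in this comment can be reproduced in a small sketch. For uint8 input, im2single rescales values to the range [0, 1], which is equivalent to the explicit division below, whereas single only changes the data type. The two-pixel "image" is made up for illustration.

```matlab
% Hypothetical 1x2 uint8 "image": one black and one white pixel
rgbPixel = uint8([0 255]);

asSingle = single(rgbPixel);        % values stay in [0, 255]
scaled   = single(rgbPixel) / 255;  % values rescaled to [0, 1]

% Display functions such as imshow treat floating-point images as lying in
% [0, 1], so every nonzero value in asSingle is shown as white, while scaled
% displays with the intended brightness.
```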
