De CAPE et d'Opés

Correct and optimize the code of a publication

Certainly! Here is the translation:

CAPE (Cylinder And Plane Extraction) is an extremely efficient method for extracting planes and cylinders in RGB-D images, based on an AHC (Agglomerative Hierarchical Clustering) method. Despite the effectiveness of this method, the C++ implementation of the paper has many flaws.

In this post, I will explain how I fixed most of these issues.

CAPE is an efficient, reliable, and fast method (less than 4ms for a 640 * 480 pixel image) based on the PEAC method. CAPE improves the PEAC method by speeding up plane detection and adding cylinder detection using a RANSAC algorithm. While the method presented in the paper is impressive, its implementation has many flaws, especially in memory management.

A Brief Presentation of AHC

AHC (Agglomerative Hierarchical Clustering) is a method introduced by PEAC in 2014, which involves analyzing RGB-D images by dividing them into an NxM grid of rectangles. Pixels in these rectangles are then analyzed individually to determine if they belong to a 3D plane. If so, the relative position of these rectangles in the image is used to merge nearby planes and build a graph of all the planes.

This method has the advantage of treating a 3D image as a 2D image, making it much faster than analyzing a point cloud. However, it is currently limited to analyzing RGB-D images.

At the time of its publication, PEAC was the fastest 3D plane detection method by far, with performance equivalent to the state-of-the-art 3D plane detection algorithms.

Fixing CAPE

The CAPE implementation provided by the authors offers impressive speed and reliability. It can handle the plane detection step in less than 4 milliseconds, which is more than sufficient for real-time applications.

However, the code is poorly organized and has many implementation flaws, particularly in memory management and class copies in vectors. Fixing these errors slightly improves the performance of this method.

I won’t specify the correction of common errors such as the absence of ifndef in .hpp files or the lack of const and const& for function and class input parameters.

Class CAPE

Unnecessary Allocations

In the constructor, the Grid vector, containing a list of N pointers to a class (PlaneSegment*), is initialized with a loop that fills it with nullptr (using the reserve method would have been more appropriate). The pointers in Grid are then reassigned in the process function, called for each new RGB-D image, via the line Grid[id] = new PlaneSeg(...);. Using new to replace a nullptr is okay, but on the second call to the process function, a new memory area will be allocated without deallocating the old one, potentially causing a small memory leak.

But wait a minute… Grid uses the vector class with a constant number of elements but never uses the properties of vector. Why use a vector?

I replaced it with an array of unique_ptr initialized via new unique_ptr<PlaneSegment>[N], then added a basic constructor to the PlaneSegment class, called in the constructor, as well as an init function to avoid N allocations with each call to the process function. I also added a destructor to the CAPE class to free Grid. A leak plugged, and a (minor) performance improvement by avoiding N allocations per call.

An Avoidable Reset

In the process function, called for each image analysis, a variable H of the Histogram class is allocated, containing two vectors, each with a constant size of 20 elements. Declaring it in the function block prevents memory leaks but requires adding 40 allocations per call to the process function. So, I made H a member variable of the CAPE class, replaced the vectors of Histogram with arrays, and added a call to fill_n in the initHistogram function to initialize the elements of these arrays to 0. Don’t forget a small destructor to deallocate these arrays.

An Indexing Error

The loop for at line 126 aims to find the index of the cell with the minimum error. The value of minMSE is compared to Grid[seed_candidate]->MSE, and if it is greater, it is assigned min_MSE = Grid[i]->MSE;. So, I corrected the index selection to min_MSE = Grid[seed_candidate]->MSE;. I also modified all classes so that members are private and added getter functions.

A Background Copy

PlaneSeg new_ps = *Grid[seed_id];. At first glance, it’s quite innocent. What happens in the background is a bit less innocent: it’s a call to a copy constructor. If we don’t have a copy constructor declared in our class, the compiler will create a default one, whose behavior may not be desired.

A good practice to avoid such default copies is to declare the default constructor as well as the operator= as private in our class. The compiler will warn us about future copies with a good error. Adding these methods also triggers another error message I didn’t see coming: vector.push_back(class) calls the class’s copy constructor.

In my opinion, the best way to work around this behavior is to use vectors of pointers, especially shared_ptr in this specific case.

Another Indexing Error

At line 238, the distance to the plane is calculated by the dot product of the normal of one plane by the mean value of another plane. However, the first term plane_segments[r].normal[0] should be plane_segments[plane_id].normal[0]. To avoid copy-paste errors, I replaced the normal and mean arrays of the PlaneSegment class with Eigen::Vector3d to use the Vector3d.dot(Vector3d) function directly. Another method to continue using double arrays would have been to define a function double dot(const double& vecA[3], const double& vecB[3]) to avoid redundant code, or to define a member function getDistance in the PlaneSegment class, which has the advantage of not using getters.

Redundant Loops

The loop from line 434 to 445 is used to copy the result of cylinder analysis into the vector with the final cylinders. Its existence is justified for code clarity, but a small optimization can be made by copying the final results in the previous loop (towards line 348).

Similarly, now that we have replaced Grid with a fixed, non-reallocated array at each function call, the loop from 449 to 455 should also be eliminated.

In the getConnectedComponents function, another loop can be removed. It is the double loop from line 476 to 480, which makes the planes_association_matrix matrix symmetric. It is sufficient to directly copy the symmetric values at lines 471 and 472.

Results

I verified my work using valgrind and confirmed the memory leaks were fixed, with a negligible improvement (around half a millisecond) in performance without affecting the output. The next step will be to look for ways to improve the overall performance of the algorithm.

I took advantage of my modifications to add evaluation metrics, inspired by PEAC, in place of certain constants.


See also