

where

*D* = Difference-of-Gaussian (DoG) cloud
*G* = Gaussian-convolved point cloud

**3.3.3** **Descriptor Construction**

With all the important keypoints detected, local descriptors were then constructed to describe the local geometry around each keypoint. In this project, two methods were used to develop the descriptors: the Point Feature Histogram (PFH) and the Signature of Histograms of OrienTations (SHOT).

The Point Feature Histogram (PFH) method was proposed by Rusu, et al. (2008). The algorithm was implemented using the *pcl::PFHEstimation* module available in the Point Cloud Library (PCL) (2020a). This method mainly required two inputs: the xyz data and the surface normals computed in the previous section. Therefore, the computed surface normals had to be of high quality, or the performance of the descriptor would suffer. Basically, a PFH descriptor uses the differences between surface normal directions to collect the geometrical information of a targeted keypoint from its neighbourhood.

After feeding the keypoint clouds of both the model and the scene, together with their computed surface normals, into the algorithm, a k-neighbourhood of each point p*i* was created by setting the radius r of the sphere used to search for neighbours. Next, point pairing was performed. According to Grupo De Robotica (2015a), a targeted point was not only paired with its k neighbours; the neighbours were also paired among themselves. For each point pair p*s* and p*t*, a coordinate frame containing three unit vectors was computed from their normals n*s* and n*t*, as shown in Equation 3.4. Using this coordinate frame, the difference between the normals of the point pair was expressed as three angular features, as shown in Equation 3.5; together with the point pair's Euclidean distance *d*, these form the four main features of a pair. In the PCL implementation, the three angular features were each divided into five subdivisions and binned into a histogram with 5³ = 125 bins, so each keypoint had its own 125-bin histogram. The final PFH descriptor was the combination of all these histograms. An empty kD tree was built to perform the neighbour searching process.

u = n*s*
v = u × (p*t* − p*s*) / d
w = u × v (3.4)

where

d = ‖p*t* − p*s*‖, the Euclidean distance between p*s* and p*t*

α = v · n*t*
φ = u · (p*t* − p*s*) / d
θ = arctan(w · n*t*, u · n*t*) (3.5)
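The pair-feature computation of Equations 3.4 and 3.5 can be sketched as follows. This is an illustrative Python sketch using the standard Darboux-frame definitions, not the PCL implementation; it assumes unit-length normals, and the bin boundaries (five subdivisions per angular feature, giving 125 bins) follow the description above rather than PCL's exact internals.

```python
import math

def pfh_pair_features(ps, ns, pt, nt):
    """Darboux frame (Eq. 3.4) and angular features (Eq. 3.5) for one point pair.

    ps, pt: 3-D points; ns, nt: their (assumed unit) surface normals."""
    def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
    def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])
    def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)

    diff = sub(pt, ps)
    d = math.sqrt(dot(diff, diff))        # Euclidean distance of the pair
    u = ns                                # first axis: source normal
    v = cross(u, scale(diff, 1.0 / d))    # second axis (Equation 3.4)
    w = cross(u, v)                       # third axis completes the frame
    alpha = dot(v, nt)                    # angular features (Equation 3.5)
    phi = dot(u, scale(diff, 1.0 / d))
    theta = math.atan2(dot(w, nt), dot(u, nt))
    return alpha, phi, theta, d

def pfh_bin_index(alpha, phi, theta, subdivisions=5):
    """Map the three angular features to one of 5^3 = 125 histogram bins."""
    def idx(value, lo, hi):
        i = int((value - lo) / (hi - lo) * subdivisions)
        return min(max(i, 0), subdivisions - 1)   # clamp to valid bins
    a = idx(alpha, -1.0, 1.0)             # alpha and phi are cosines in [-1, 1]
    b = idx(phi, -1.0, 1.0)
    c = idx(theta, -math.pi, math.pi)     # theta is an angle in [-pi, pi]
    return (a * subdivisions + b) * subdivisions + c
```

For a pair lying in a flat patch (identical normals, unit spacing) all three angular features are zero and the pair falls into the central bin, which is what makes PFH discriminative: only non-flat local geometry spreads mass across the other bins.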

As mentioned by Rusu, Blodow and Beetz (2009), PFH has a very high computational complexity of *O(k²)* per keypoint, which makes it expensive for real-time descriptor computation. At first, PFH was used to generate the descriptors for the keypoints. However, it required too much time: more than two hours to construct the descriptors for a larger scene point cloud. Therefore, to increase the efficiency of the descriptor construction process, the Signature of Histograms of OrienTations (SHOT) was implemented using the readily available *pcl::SHOTEstimation* module in PCL (Point Cloud Library (PCL), 2018b). This method was proposed by Tombari, Salti and Stefano (2010a) and combines both signatures and histograms.

In the algorithm, a weighted covariance matrix *M* was first built for a targeted keypoint p from its neighbouring points p*i* within a fixed radius, as shown in Equation 3.6. Then, the Local Reference Frame (LRF), or local coordinate system, was computed for the keypoint through the eigenvalue decomposition of this matrix. Next, a 3D isotropic spherical grid centred on the targeted keypoint was created as the signature structure and aligned with the corresponding LRF, similar to the 3D shape context method proposed by Frome, et al. (2004). In PCL, the spherical grid consists of 32 bins created from 2 radial divisions, 8 azimuth divisions and 2 elevation divisions.

Next, a one-dimensional histogram in each bin was obtained by accumulating geometric details of the neighbourhood, namely the cosine of the angle between the normal of each neighbouring point and the z-axis of the LRF at the keypoint. Lastly, the final SHOT descriptor was formed by combining all the histograms. The main parameter to be set and adjusted in the algorithm was the radius defining which neighbouring keypoints were involved and described. Besides, a kD tree was provided to assist the radius search for the nearest keypoints for both the model and scene clouds.

M = (1 / Σ (R − d*i*)) Σ (R − d*i*)(p*i* − p)(p*i* − p)ᵀ, for all i with d*i* ≤ R (3.6)

where

*M* = weighted covariance matrix
d*i* = ‖p*i* − p‖, the distance of neighbour p*i* from the keypoint p
R = radius of the sphere
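The two SHOT building blocks above can be sketched in Python. This is a minimal illustration, not the PCL implementation: the covariance follows Equation 3.6 directly, and the 32-bin index assumes the neighbour has already been transformed into the keypoint's LRF (the eigen-decomposition that produces the LRF is omitted). The exact bin boundaries are an assumption based on the 2 × 8 × 2 grid described in the text.

```python
import math

def shot_weighted_covariance(p, neighbours, R):
    """Weighted covariance matrix M of Equation 3.6: neighbours closer to the
    keypoint p receive larger weights (R - d_i). Returns M as a 3x3 nested list."""
    M = [[0.0] * 3 for _ in range(3)]
    total_w = 0.0
    for q in neighbours:
        diff = [q[k] - p[k] for k in range(3)]
        d = math.sqrt(sum(c * c for c in diff))
        if d > R:
            continue                      # only points inside the sphere count
        w = R - d
        total_w += w
        for r in range(3):
            for c in range(3):
                M[r][c] += w * diff[r] * diff[c]
    if total_w > 0.0:
        for r in range(3):
            for c in range(3):
                M[r][c] /= total_w        # normalise by the total weight
    return M

def shot_bin_index(local_pt, R):
    """Bin index in the 32-bin spherical grid (2 radial x 8 azimuth x 2
    elevation) for a neighbour already expressed in the keypoint's LRF."""
    x, y, z = local_pt
    d = math.sqrt(x * x + y * y + z * z)
    radial = 0 if d < R / 2.0 else 1                            # 2 radial divisions
    azimuth = int((math.atan2(y, x) + math.pi) / (2.0 * math.pi) * 8) % 8
    elevation = 0 if z < 0.0 else 1                             # 2 elevation divisions
    return (radial * 8 + azimuth) * 2 + elevation
```

The (R − d*i*) weighting is the design point worth noting: it makes the LRF, and therefore the whole descriptor, less sensitive to clutter at the edge of the support sphere than an unweighted covariance would be.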

**3.3.4** **Feature Matching**

Next, once the descriptors of the keypoints for both the model and the scene were computed, the scene and model keypoints were matched through their descriptors to compute a set of point-to-point correspondences.

In this project, the scene descriptor cloud was used as the reference cloud, and the model descriptor cloud was set as the input cloud for matching. The descriptors of the scene functioned as keys to match themselves against all the descriptors of the model in order to establish identical model-scene point pairs. The reason for this direction of matching was to account for the possible existence of several model hypotheses: if the model's descriptors were instead matched against the scene's descriptors, the model instances would not be found. Each descriptor in the scene cloud was matched and compared with the model descriptors using the *pcl::KdTreeFLANN* module available in the Point Cloud Library (PCL) (2013). The system then computed the Euclidean distance between the model and scene descriptors. A distance threshold d was set to determine the similarity between the scene's keypoints and the model's keypoints. Point-to-point correspondences were obtained and added to a correspondence group if their distances were within the threshold, meaning the descriptors were similar. All poor correspondences with distances larger than the threshold were eliminated. The threshold should not be too large, or the number of outliers or mismatched keypoints would increase. On the other hand, if a very low threshold was set, the number of correspondences would be too small to obtain the model instances in the hypothesis generation step.
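The matching step above can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions: a brute-force nearest-neighbour search stands in for the kD-tree (FLANN) search that PCL performs, and descriptors are plain tuples of numbers; the function name and threshold value are hypothetical.

```python
import math

def match_descriptors(scene_desc, model_desc, threshold):
    """For each scene descriptor, find the nearest model descriptor by Euclidean
    distance; keep the pair as a correspondence only if that distance is within
    the threshold. Brute force stands in for PCL's kD-tree (FLANN) search."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    correspondences = []
    for i, s in enumerate(scene_desc):
        best_j, best_d = None, float("inf")
        for j, m in enumerate(model_desc):
            d = dist(s, m)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d <= threshold:
            # (scene index, model index, descriptor distance)
            correspondences.append((i, best_j, best_d))
    return correspondences
```

As the text notes, the threshold trades outliers against coverage: a scene descriptor whose nearest model descriptor is farther than the threshold simply produces no correspondence, rather than a bad one.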