These tutorials require packages that are not open to the public. If you want to run the packages, please write to fmw@ipa.fhg.de .

Hardware Requirements

To use this package you need a stereo camera system or an RGBD device that delivers point clouds with color information. Alternatively you can use a simulated version without any hardware, see cob_gazebo.

ROS API

The cob_object_detection package provides a configurable node for detecting textured objects based on local feature point detection and 3D model matching.

cob_object_detection

The cob_object_detection node takes in sensor_msgs/PointCloud2 messages generated e.g. from a stereo camera pair or an RGBD camera device and performs 6 DOF object detection based on previously trained object models. Object detection and training is triggered using services or actions. Visualization and the detection results are published on topics.

Action Goal

/object_detection/acquire_object_image/goal (cob_object_detection/AcquireObjectImageActionGoal)
  • Goal for training procedure to acquire object
/object_detection/detect_object/goal (cob_object_detection/DetectObjectsActionGoal)
  • Goal for executing object detection procedure
/object_detection/train_object/goal (cob_object_detection/TrainObjectsActionGoal)
  • Goal for training object including the object model name

Action Result

/object_detection/acquire_object_image/result (cob_object_detection/AcquireObjectImageActionResult)
  • empty
/object_detection/detect_object/result (cob_object_detection/DetectObjectsActionResult)
  • List of detect object instances including a confidence and pose estimate
/object_detection/train_object/result (cob_object_detection/TrainObjectActionResult)
  • empty

Action Feedback

/object_detection/acquire_object_image/feedback (cob_object_detection/AcquireObjectImageActionFeedback)
  • Feedback holds number to indicate the approximate progress
/object_detection/detect_object/feedback (cob_object_detection/DetectObjectsActionFeedback)
  • Feedback holds number to indicate the approximate progress
/object_detection/train_object/feedback (cob_object_detection/TrainObjectActionFeedback)
  • Feedback holds number to indicate the approximate progress

Subscribed Topics

/sensor_fusion/stereo/reprojection_matrix (cob_msgs/ReprojectionMatrix)
  • Reprojection matrix to project 3D coordinates given by sensor_msgs/PointCloud2 to the 2D camera image plane. The matrix is necessary to render the detection results with a 2D bounding box.

Published Topics

/object_detection/point_cloud_2 (sensor_msgs/PointCloud2)
  • Point cloud visualization of the training process and the incoming 3D data for training and detection
/object_detection/image (sensor_msgs/Image)
  • 2D visualization of the training images and the object detection results by rendering the bounding boxes of the detected objects

Services

/object_detection/detect_object (cob_object_detection/DetectObjects)
  • Executes object detection procedure
/object_detection/train_object (cob_object_detection/TrainObject)
  • Executes object training procedure
/object_detection/acquire_object_image (cob_object_detection/AcquireObjectImage)
  • Service is internally called by autonomous object training procedure

Services Called

/sensor_fusion/stereo/get_colored_pc (cob_srvs/GetPointCloud2)
  • Point cloud representing a single view on the current scene image. RGB data of point cloud must represent a 2D image in order to be able to perform feature extraction.

Parameters

~object_model_type (string)
  • Specifies the applied object model (MODEL_BAYES or MODEL_6D_SLAM). MODEL_BAYES applies bundle adjustment for 3D model generation and creates feature point clusters using mean shift on all affine transformations (natural and artificial) of all detected feature points. This should be your method of choice. MODEL_6D_SLAM applies FastSLAM for 3D model generation and does not do any clustering of feature points. It is the original object detection algorithm from which MODEL_BAYES emerged.
~visualization (bool)
  • Specifies if visualisation is calculated published on the corresponding topics. Visualisation lets you observe the object model generation process in 3D by drawing the generated object model that has been created after incorporating successive viewpoints of the object. It also gives you an impression of the detected object models in 2D by reprojecting the bounding
~logging (bool)
  • Specifies if log files are created for the training process (logs the number of features per viewpoint and the remaining number of features after clustering) and for the detection process (logs the number of detected features per object as well as the different confidence values)
~use_STAR_detector (bool)
  • Specifies if STAR detector is used for feature detection. For now, this value is not working, yet, and must always be false. However it is intended to integrate an arbitrary selection of feature point detector and descriptor for each object model.
~train_learning_pos_x (float)
  • Specifies the x position in camera coordinates of the training centre. Defining a training centre helps the training procedure to reduce the amount of superfluous data that has nothing to do with the training object and accelerates therefore the training procedure.
~train_learning_pos_y (float)
  • Specifies the y position in camera coordinates of the training centre. Defining a training centre helps the training procedure to reduce the amount of superfluous data that has nothing to do with the training object and accelerates therefore the training procedure.
~train_learning_pos_z (float)
  • Specifies the z position in camera coordinates of the training centre. Defining a training centre helps the training procedure to reduce the amount of superfluous data that has nothing to do with the training object and accelerates therefore the training procedure.
~train_learning_radius (float)
  • Specifies the cube length in meter measured from the training centre, to determine the image are that contains the object. e.g. a value of 0.1 will generate a 3D cube with edge length 0.2 meter and centre specified by the train_learning_pos_x, train_learning_pos_y and train_learning_pos_z variables. All 3D points within this cube are potentially relevant for object model creation. All other points are discarded.
~train_affine_fp_transforms (bool)
  • Determines if each training image should be artificially transformed to generate more views for object model training. This generates in general better detection results, however at the cost of a longer training time (approx. by factor 10).
~train_min_cluster_size (integer)
  • After extracting 2D feature points from all image views, many feature points describe one and the same point in 3D space. To reduce the object model size, the feature points are clustered based on 3D space distance and 2D descriptor distance. The parameter specifies the minimal number of feature points that must constitute a cluster e.g. if you chose 10 degree rotational steps for object model training, each image point is approximately seen from at least 4 different viewpoints. Therefore a cluster size of 3 is reasonable to filter out weak feature points.
~detect_ransac_norm_inlier_dist_thresh (float)
  • RanSaC is used for 3D model fitting. In order evaluate a pose estimate, RanSaC calculates the distance of detected feature points from their predited location based on the pose estimate. The parameters specifies the maximal difference in meter that is acceptable.
~detect_ransac_local_neighbor_dist (float)
  • RanSaC is used for 3D model fitting. In order prevent random drawing of feature point matchings from the whole image, RanSac is limited to draw matchings from a local neighbourhood region. This parameters specifies the size of this region. Larger values increase computation time, but may give better detection results. Too large values may prevent RanSaC to determine a valid pose.
~detect_min_ransac_reward (float)
  • RanSaC is used for 3D model fitting. For each pose estimate it calculates a reward for detection quality estimation. This value determines the minimal RanSaC reward, that is necessary to count as detection hit. The value is evaluated in conjunction with detect_max_sqr_stdv and detect_min_detect_quality by the OR operator.
~detect_max_sqr_stdv (float)
  • RanSaC is used for 3D model fitting. For each pose estimate it calculates the standard deviation of all inliers relative to their estimated position based on the detection results. This value determines the maximal allowed standard deviation value, that is necessary to count as detection hit. The value is evaluated in conjunction with detect_min_ransac_reward and detect_min_detect_quality by the OR operator.
~detect_min_detect_quality (float)
  • Determines the minimal percentage of potentially visible feature points (depending on the current viewpoint) that must be seen to count as a detection hit. The value is evaluated in conjunction with detect_min_ransac_reward and detect_max_sqr_stdv by the OR operator.
~detect_min_inlier_size (integer)
  • RanSaC is used for 3D model fitting. This value determines the minimal number of inliers that must be determined to count as a detection hit. This threshold value must be met for any detection result and presents the first choice to increase in case of false positive detections.

Usage/Examples

The package can be used via a launch file which loads all parameters from a .yaml file and starts the object_detection node.

roslaunch cob_object_detection object_detection.launch

For including the object_detection in your overall launch file use

<include file="$(find cob_object_detection)/ros/launch/object_detection.launch" />

A sample parameter file could look like this

##########################################
# Object model type
##########################################

# MODEL_BAYES or MODEL_6D_SLAM
object_model_type: MODEL_BAYES
# Visualize detection results in 3-D
visualization: true
# Write log files
logging: true
# Feature point detector
use_STAR_detector: false

##########################################
# Parameters for object model training
##########################################

# X Position of object during training in camera coordinates
# Unit: [m]
train_learning_pos_x: -0.02
# Y Position of object during training in camera coordinates
# Unit: [m]
train_learning_pos_y: 0.0
# Z Position of object during training in camera coordinates
# Unit: [m]
train_learning_pos_z: 0.6
# Radius of training area for object segmentation during trainng
# Unit: [m]
train_learning_radius: 0.1
# Apply affine transformation to source images before feature extraction
train_affine_fp_transforms: false
# Minimal cluster size for feature points to be valid
train_min_cluster_size: 3

##########################################
# Parameters for object model detection
##########################################

# Minimal cluster size for feature points to be valid
detect_ransac_norm_inlier_dist_thresh: 0.014
# Neighborhood size for Ransac sample extraction
detect_ransac_local_neighbor_dist: 0.07
# RanSaC Reward qualifying detection quality
detect_min_ransac_reward: 2
# Standard deviation of inlier points
detect_max_sqr_stdv: 300
# Standard deviation of inlier points
detect_min_inlier_size: 10
# Minimal fraction of detected points relative to maximum number of detectable points
detect_min_detect_quality: 0.5

Definition of object coordinate system

The frame of the object is defined like the following picture.

cob_object_detection/frame.png

The front side of the object is the native object front like you would place an object into a shelf. The the coordinate system is oriented to the specific object. Normally the front side show in the negative Y-direction. The Z-direction show vertical to the top.

Tutorials

Tutorials can be found on the Tutorials page.

Wiki: cob_object_detection (last edited 2011-12-02 10:21:30 by MatthiasFreier)