Object detection and segmentation using discriminative learning

Last Modified
  • March 21, 2019
  • Zhang, Jingdan
    • Affiliation: College of Arts and Sciences, Department of Computer Science
  • Object detection and segmentation algorithms rely on prior knowledge of an object's shape and appearance to guide them toward correct solutions. A promising way of obtaining this prior knowledge is to learn it directly from expert annotations using machine learning techniques. Previous work commonly uses generative learning for this purpose. In this dissertation, I propose a series of discriminative learning algorithms based on boosting principles to learn prior knowledge from image databases with expert annotations. The learned knowledge improves the performance of detection and segmentation, leading to fast and accurate solutions. For object detection, I present a learning procedure called a Probabilistic Boosting Network (PBN), which is suitable for real-time object detection and pose estimation. Based on the law of total probability, PBN integrates evidence from two building blocks: a multiclass classifier for pose estimation and a detection cascade for object detection. Both the classifier and the detection cascade employ boosting. By inferring the pose parameter, I avoid the exhaustive scan over pose parameters that hampers real-time detection. I implement PBN using a graph-structured network that alternates between the two tasks of object detection and pose estimation in an effort to reject negative cases as quickly as possible. Compared with previous approaches, PBN achieves higher accuracy in object localization and pose estimation with noticeably reduced computation. For object segmentation, I cast deformable object segmentation as optimizing the conditional probability density function p(C|I), where I is an image and C is a vector of model parameters describing the object shape. I propose a regression approach to learn the density p(C|I) discriminatively based on boosting principles. The learned density p(C|I) has the desired unimodal, smooth shape, which optimization algorithms can exploit to estimate a solution efficiently.
To handle the challenges of learning in a high-dimensional space, I propose a multi-level approach and a gradient-based sampling strategy to learn regression functions efficiently. I show that the regression approach consistently outperforms state-of-the-art methods on a variety of testing datasets. Finally, I present a comparative study of how to apply three discriminative learning approaches - classification, regression, and ranking - to deformable shape segmentation. I discuss how to extend the idea behind the regression approach to build discriminative models using classification and ranking. I propose sampling strategies to collect training examples from a high-dimensional model space for the classification and ranking approaches. I also propose a ranking algorithm based on RankBoost to learn a discriminative model for segmentation. Experimental results on left ventricle and left atrium segmentation from ultrasound images and on facial feature localization demonstrate that the discriminative models outperform generative models and energy minimization methods by a large margin.
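The abstract says that PBN combines a pose classifier and a detector through the law of total probability. The Python sketch below is only an illustration of that identity, p(object | patch) = sum over poses of p(object | pose, patch) * p(pose | patch), and is not the dissertation's implementation. The function name, the dictionary-based interface, and the `prune_below` threshold are all hypothetical, and the boosted classifier and cascade are replaced with plain callables.

```python
from typing import Callable, Dict, Sequence


def detection_probability(
    pose_posterior: Dict[str, float],
    detector_by_pose: Dict[str, Callable[[Sequence[float]], float]],
    patch: Sequence[float],
    prune_below: float = 0.05,
) -> float:
    """Combine pose and detection evidence with the law of total probability.

    pose_posterior   : p(pose | patch), e.g. from a multiclass classifier
                       (hypothetical interface).
    detector_by_pose : pose-specific detectors returning p(object | pose, patch).
    prune_below      : assumed threshold; poses less likely than this are
                       skipped rather than evaluated exhaustively.
    """
    total = 0.0
    for pose, p_pose in pose_posterior.items():
        if p_pose < prune_below:
            continue  # skip unlikely poses instead of scanning all of them
        total += detector_by_pose[pose](patch) * p_pose
    return total


if __name__ == "__main__":
    # Toy numbers chosen purely for illustration.
    posterior = {"left": 0.7, "frontal": 0.28, "right": 0.02}
    detectors = {
        "left": lambda patch: 0.9,
        "frontal": lambda patch: 0.5,
        "right": lambda patch: 0.1,
    }
    print(detection_probability(posterior, detectors, patch=[0.0]))
```

Skipping low-probability poses is one simple way to avoid evaluating every pose-specific detector. The abstract describes a graph-structured network that alternates detection and pose estimation, which this sketch does not attempt to reproduce.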
Date of publication
Resource type
Rights statement
  • In Copyright
  • McMillan, Leonard
Degree granting institution
  • University of North Carolina at Chapel Hill
  • Open access
