You are here

Sparse Coding and Compressed Sensing: Locally Competitive Algorithms and Random Projections

Download pdf | Full Screen View

Date Issued:
2016
Summary:
For an 8-bit grayscale image patch of size n x n, the number of distinguishable signals is 256(n2). Natural images (e.g.,photographs of a natural scene) comprise a very small subset of these possible signals. Traditional image and video processing relies on band-limited or low-pass signal models. In contrast, we will explore the observation that most signals of interest are sparse, i.e. in a particular basis most of the expansion coefficients will be zero. Recent developments in sparse modeling and L1 optimization have allowed for extraordinary applications such as the single pixel camera, as well as computer vision systems that can exceed human performance. Here we present a novel neural network architecture combining a sparse filter model and locally competitive algorithms (LCAs), and demonstrate the networks ability to classify human actions from video. Sparse filtering is an unsupervised feature learning algorithm designed to optimize the sparsity of the feature distribution directly without having the need to model the data distribution. LCAs are defined by a system of di↵erential equations where the initial conditions define an optimization problem and the dynamics converge to a sparse decomposition of the input vector. We applied this architecture to train a classifier on categories of motion in human action videos. Inputs to the network were small 3D patches taken from frame di↵erences in the videos. Dictionaries were derived for each action class and then activation levels for each dictionary were assessed during reconstruction of a novel test patch. We discuss how this sparse modeling approach provides a natural framework for multi-sensory and multimodal data processing including RGB video, RGBD video, hyper-spectral video, and stereo audio/video streams.
Title: Sparse Coding and Compressed Sensing: Locally Competitive Algorithms and Random Projections.
263 views
153 downloads
Name(s): Hahn, William E., author
Barenholtz, Elan, Thesis advisor
Florida Atlantic University, Degree grantor
Charles E. Schmidt College of Science
Center for Complex Systems and Brain Sciences
Type of Resource: text
Genre: Electronic Thesis Or Dissertation
Date Created: 2016
Date Issued: 2016
Publisher: Florida Atlantic University
Place of Publication: Boca Raton, Fla.
Physical Form: application/pdf
Extent: 287 p.
Language(s): English
Summary: For an 8-bit grayscale image patch of size n x n, the number of distinguishable signals is 256(n2). Natural images (e.g.,photographs of a natural scene) comprise a very small subset of these possible signals. Traditional image and video processing relies on band-limited or low-pass signal models. In contrast, we will explore the observation that most signals of interest are sparse, i.e. in a particular basis most of the expansion coefficients will be zero. Recent developments in sparse modeling and L1 optimization have allowed for extraordinary applications such as the single pixel camera, as well as computer vision systems that can exceed human performance. Here we present a novel neural network architecture combining a sparse filter model and locally competitive algorithms (LCAs), and demonstrate the networks ability to classify human actions from video. Sparse filtering is an unsupervised feature learning algorithm designed to optimize the sparsity of the feature distribution directly without having the need to model the data distribution. LCAs are defined by a system of di↵erential equations where the initial conditions define an optimization problem and the dynamics converge to a sparse decomposition of the input vector. We applied this architecture to train a classifier on categories of motion in human action videos. Inputs to the network were small 3D patches taken from frame di↵erences in the videos. Dictionaries were derived for each action class and then activation levels for each dictionary were assessed during reconstruction of a novel test patch. We discuss how this sparse modeling approach provides a natural framework for multi-sensory and multimodal data processing including RGB video, RGBD video, hyper-spectral video, and stereo audio/video streams.
Identifier: FA00004713 (IID)
Degree granted: Dissertation (Ph.D.)--Florida Atlantic University, 2016.
Collection: FAU Electronic Theses and Dissertations Collection
Note(s): Includes bibliography.
Subject(s): Artificial intelligence
Expert systems (Computer science)
Image processing -- Digital techniques -- Mathematics
Sparse matrices
Held by: Florida Atlantic University Libraries
Sublocation: Digital Library
Links: http://purl.flvc.org/fau/fd/FA00004713
Persistent Link to This Record: http://purl.flvc.org/fau/fd/FA00004713
Use and Reproduction: Copyright © is held by the author, with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Use and Reproduction: http://rightsstatements.org/vocab/InC/1.0/
Host Institution: FAU
Is Part of Series: Florida Atlantic University Digital Library Collections.