convolution.h File Reference

This file contains some fairly low-level utilities for implementing convolutional neural networks and related methods such as TDNNs. These utilities are mostly used in nnet-convolutional-component.h.

#include "base/kaldi-common.h"
#include "util/common-utils.h"
#include "itf/options-itf.h"
#include "matrix/matrix-lib.h"
#include "cudamatrix/cu-matrix-lib.h"
#include "nnet3/nnet-common.h"
#include <iostream>


Classes

struct  ConvolutionModel
 This comment explains the basic framework used for everything related to time-height convolution.
 
struct  ConvolutionModel::Offset
 
struct  ConvolutionComputation
 This struct represents the structure of a convolution computation.
 
struct  ConvolutionComputation::ConvolutionStep
 
struct  ConvolutionComputationOptions
 This struct contains options for compiling the convolutional computation.
 
struct  ConvolutionComputationIo
 

Namespaces

 kaldi
 This code computes Goodness of Pronunciation (GOP) and extracts phone-level pronunciation feature for mispronunciations detection tasks, the reference:
 
 kaldi::nnet3
 
 kaldi::nnet3::time_height_convolution
 

Functions

void CheckModelAndIo (const ConvolutionModel &model, const ConvolutionComputationIo &io, bool allow_extra_input=false)
 Check that this model and this I/O request are compatible in terms of required context, etc., and crash if not.
 
void CompileConvolutionComputation (const ConvolutionModel &model, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, const ConvolutionComputationOptions &opts, ConvolutionComputation *computation, std::vector< Index > *input_indexes_modified, std::vector< Index > *output_indexes_modified)
 This function does the compilation for a convolution computation; it's a wrapper for the functions below, which should not have to be called by the end user.
 
void ConvolveForward (const ConvolutionComputation &conv_comp, const CuMatrixBase< BaseFloat > &input, const CuMatrixBase< BaseFloat > &params, CuMatrixBase< BaseFloat > *output)
 This does the forward computation of convolution.
 
void ConvolveBackwardData (const ConvolutionComputation &conv_comp, const CuMatrixBase< BaseFloat > &params, const CuMatrixBase< BaseFloat > &output_deriv, CuMatrixBase< BaseFloat > *input_deriv)
 This does the part of the backward derivative computation of convolution that propagates derivatives back to the input data.
 
void ConvolveBackwardParams (const ConvolutionComputation &conv_comp, const CuMatrixBase< BaseFloat > &input, const CuMatrixBase< BaseFloat > &output_deriv, BaseFloat alpha, CuMatrixBase< BaseFloat > *params_deriv)
 This does the part of the backward derivative computation of convolution that computes derivatives w.r.t. the parameters.
 
void GetComputationIo (const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, ConvolutionComputationIo *io)
 This function takes lists of input and output indexes to a computation and writes a description of their structure to 'io'.
 
void GetIndexesForComputation (const ConvolutionComputationIo &io, const std::vector< Index > &orig_input_indexes, const std::vector< Index > &orig_output_indexes, std::vector< Index > *input_indexes, std::vector< Index > *output_indexes)
 This function computes the reordered and possibly padded indexes corresponding to the computation in 'io'.
 
void PadComputationInputTime (const ConvolutionModel &model, ConvolutionComputationIo *io)
 This function extends the set of input indexes that the computation has, to account for any required zero-padding in the time dimension.
 
void PadModelHeight (const ConvolutionModel &model, ConvolutionModel *model_padded)
 This function takes a model that might require zero padding in the height dimension and outputs a model accepting a possibly-larger input dimension which does not require zero padding.
 
void UnPadModelHeight (const ConvolutionComputationOptions &opts, const ConvolutionModel &model, const ConvolutionModel &model_padded, ConvolutionComputation *computation)
 This function modifies, if necessary, a computation that has been built for the model 'model_padded', so that it can work for the original model 'model'.
 
void AppendInputFrames (const ConvolutionModel &model, ConvolutionComputationIo *io, ConvolutionModel *model_appended, ConvolutionComputationIo *io_appended)
 This function takes an input model and I/O specification, and it modifies both of them if necessary to ensure that the output 'io_appended' object has the same input and output time strides.
 
void MakeComputation (const ConvolutionModel &model, ConvolutionComputationIo &io, const ConvolutionComputationOptions &opts, ConvolutionComputation *computation)
 

Detailed Description

This file contains some fairly low-level utilities for implementing convolutional neural networks and related methods such as TDNNs. These utilities are mostly used in nnet-convolutional-component.h.

This would not necessarily be suitable as a general-purpose, self-contained setup for convolution, as it is closely tied to the overall framework of the nnet3 library (although the underlying ideas might be reusable).

We have chosen to implement this here, rather than using cuDNN, because we realized that it was quite easy to efficiently implement CNNs in the nnet3 framework in a way that would support both GPUs and CPUs, at least for the typical setups that have small patch dimensions (like 1x1 or 3x3). In a typical 3x3 convolution, the entire convolution can be done using 3 matrix multiplies (and 3 corresponding CopyColsFromMat calls).
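
The "3 matrix multiplies" idea can be illustrated with a small standalone sketch. It is not the code in this file: it uses plain std::vector matrices instead of CuMatrix, assumes one input channel and no zero-padding, and every name in it (Matrix, Zeros, AddMatMat as a free function, ConvolveDirect, ConvolveByMatMul, SelfCheck, and the k... constants) is hypothetical. For each of the 3 time offsets it gathers the height neighbourhoods into a temporary matrix (the role played by the column copy) and then accumulates one matrix product into the output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain row-major matrix; a stand-in for CuMatrix in this sketch.
using Matrix = std::vector<std::vector<float>>;

// Hypothetical sizes: 3x3 patch, one input channel, no zero-padding.
constexpr int kPatch = 3, kFilters = 2;
constexpr int kTimeOut = 2, kHeightOut = 3;
constexpr int kTimeIn = kTimeOut + kPatch - 1;
constexpr int kHeightIn = kHeightOut + kPatch - 1;

Matrix Zeros(int rows, int cols) {
  return Matrix(rows, std::vector<float>(cols, 0.0f));
}

// c += a * b (naive triple loop).
void AddMatMat(const Matrix &a, const Matrix &b, Matrix *c) {
  assert(!a.empty() && !b.empty() && a[0].size() == b.size());
  for (std::size_t i = 0; i < a.size(); i++)
    for (std::size_t k = 0; k < b.size(); k++)
      for (std::size_t j = 0; j < b[k].size(); j++)
        (*c)[i][j] += a[i][k] * b[k][j];
}

// Reference: direct 3x3 convolution. params[dt] is a kPatch x kFilters
// matrix indexed [height offset][filter]; output row is t * kHeightOut + h.
Matrix ConvolveDirect(const Matrix &in, const std::vector<Matrix> &params) {
  Matrix out = Zeros(kTimeOut * kHeightOut, kFilters);
  for (int t = 0; t < kTimeOut; t++)
    for (int h = 0; h < kHeightOut; h++)
      for (int f = 0; f < kFilters; f++)
        for (int dt = 0; dt < kPatch; dt++)
          for (int dh = 0; dh < kPatch; dh++)
            out[t * kHeightOut + h][f] += in[t + dt][h + dh] * params[dt][dh][f];
  return out;
}

// Same result using one gather plus one matrix multiply per time offset.
Matrix ConvolveByMatMul(const Matrix &in, const std::vector<Matrix> &params) {
  Matrix out = Zeros(kTimeOut * kHeightOut, kFilters);
  for (int dt = 0; dt < kPatch; dt++) {
    // Gather step: one row per (t, h) output position, one column per
    // height offset, read from the input frame at time t + dt.
    Matrix temp = Zeros(kTimeOut * kHeightOut, kPatch);
    for (int t = 0; t < kTimeOut; t++)
      for (int h = 0; h < kHeightOut; h++)
        for (int dh = 0; dh < kPatch; dh++)
          temp[t * kHeightOut + h][dh] = in[t + dt][h + dh];
    AddMatMat(temp, params[dt], &out);  // one of the 3 matrix multiplies
  }
  return out;
}

// Builds small integer-valued test data and compares both versions.
bool SelfCheck() {
  Matrix in = Zeros(kTimeIn, kHeightIn);
  for (int t = 0; t < kTimeIn; t++)
    for (int h = 0; h < kHeightIn; h++)
      in[t][h] = static_cast<float>(t * kHeightIn + h + 1);
  std::vector<Matrix> params(kPatch, Zeros(kPatch, kFilters));
  for (int dt = 0; dt < kPatch; dt++)
    for (int dh = 0; dh < kPatch; dh++)
      for (int f = 0; f < kFilters; f++)
        params[dt][dh][f] = static_cast<float>(dt * kPatch + dh - 4 + f);
  return ConvolveDirect(in, params) == ConvolveByMatMul(in, params);
}
```

Calling SelfCheck() compares the two versions on small integer-valued data. The sketch only shows why the number of matrix multiplies scales with the number of time offsets rather than with the patch area; how the real computation lays out its steps is described by ConvolutionComputation.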

Definition in file convolution.h.