Research Topics - Detlev Marpe

 
Context-Based Adaptive Binary Arithmetic Coding (CABAC)


Overview and Historical Perspective

The algorithm of context-based adaptive binary arithmetic coding (CABAC) has been developed within the joint standardization activities of ITU-T and ISO/IEC for the design and specification of H.264/AVC as the latest member in the family of international video coding standards. In a first preliminary version, the new entropy-coding method of CABAC was introduced as a standard contribution [VCEG-L13] to the ITU-T VCEG meeting in January 2001. From that time until completion of the final standard specification of H.264/AVC (Version 1) in May 2003, the CABAC algorithm underwent a series of revisions and further refinements.

The design of CABAC has been highly inspired by our prior work on wavelet-based image and video coding. In comparison to these studies, however, additional aspects that had previously been largely ignored were taken into account during the development of CABAC. These aspects are mostly related to implementation complexity and to additional requirements in terms of conformity and applicability. As a consequence of these naturally important criteria within any standardization effort, additional constraints have been imposed on the design of CABAC, with the result that some of its original algorithmic components, like the binary arithmetic coding engine, have been completely redesigned. Other components, needed to alleviate potential losses in coding efficiency when using small-sized slices (as further described below), were added at a later stage of the development. Support of additional coding tools such as interlaced coding, variable block-size transforms (as considered for Version 1 of H.264/AVC), and the later re-introduced, simplified use of 8x8 transforms has also been integrated at a time when the core design of CABAC had already reached a level of technical maturity.

At that time, and also at a later stage when the scalable extension of H.264/AVC was designed, another feature of CABAC proved to be very useful. In contrast to entropy-coding schemes based on VLCs, the CABAC coding approach offers an additional advantage in terms of extensibility: support of newly added syntax elements can be achieved in a much simpler manner. Usually, the addition of syntax elements also affects the distribution of the already available syntax elements, which, for a VLC-based entropy-coding approach, may in general require re-optimizing the VLC tables of the given syntax elements rather than just adding a suitable VLC code for the new syntax element(s). A redesign of VLC tables is, however, a far-reaching structural change which may not be justified for the addition of a single coding tool, especially if it relates to an optional feature only. Since CABAC guarantees an inherent adaptivity to the actually given (conditional) probabilities, no further structural adjustments are needed beyond the choice of a binarization and context model which, as a first approximation, can be made in a canonical way by using the prototypes already specified in the CABAC design.

CABAC has been adopted as a normative part of the H.264/AVC standard; it is one of two alternative methods of entropy coding in the new video coding standard. The other method specified in H.264/AVC is a low-complexity entropy-coding technique based on the usage of context-adaptively switched sets of variable-length codes, so-called Context-Adaptive Variable-Length Coding (CAVLC). Compared to CABAC, CAVLC offers reduced implementation costs at the price of lower compression efficiency. For TV signals in standard- or high-definition resolution, CABAC typically provides bit-rate savings of 10-20% relative to CAVLC at the same objective video quality.


CABAC block diagram

High-Level Summary of Algorithmic Features

The design of CABAC involves the key elements of binarization, context modeling, and binary arithmetic coding. These elements are illustrated as the main algorithmic building blocks of the CABAC encoding block diagram, as shown above.

Binarization: The coding strategy of CABAC is based on the finding that a very efficient coding of syntax-element values in a hybrid block-based video coder, like components of motion vector differences or transform-coefficient level values, can be achieved by employing a binarization scheme as a kind of preprocessing unit for the subsequent stages of context modeling and binary arithmetic coding. In general, a binarization scheme defines a unique mapping of syntax element values to sequences of binary decisions, so-called bins, which can also be interpreted in terms of a binary code tree. The design of binarization schemes in CABAC is based on a few elementary prototypes whose structure enables simple online calculation and which are adapted to some suitable model-probability distributions.
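
As a small illustration of such elementary prototypes, the following sketch shows unary and truncated unary binarization, two of the basic schemes used in CABAC. It is an illustrative example only and does not reproduce the normative binarization tables of H.264/AVC.

#include <stdio.h>

/* Illustrative sketch of two elementary binarization prototypes:
 * unary and truncated unary codes. A value is mapped to a string of
 * bins, written here as '0'/'1' characters. Not the normative tables. */

/* Unary: v ones followed by a terminating zero, e.g. 3 -> "1110". */
static void unary_binarize(unsigned v, char *out)
{
    unsigned i;
    for (i = 0; i < v; i++)
        out[i] = '1';
    out[v] = '0';
    out[v + 1] = '\0';
}

/* Truncated unary with maximum value c_max: as unary, but the terminating
 * zero is omitted when v == c_max, e.g. c_max = 3: 3 -> "111". */
static void truncated_unary_binarize(unsigned v, unsigned c_max, char *out)
{
    unsigned i;
    for (i = 0; i < v; i++)
        out[i] = '1';
    if (v < c_max) {
        out[v] = '0';
        out[v + 1] = '\0';
    } else {
        out[v] = '\0';
    }
}

int main(void)
{
    char bins[64];
    unary_binarize(3, bins);
    printf("U(3)    = %s\n", bins);   /* 1110 */
    truncated_unary_binarize(3, 3, bins);
    printf("TU(3,3) = %s\n", bins);   /* 111 */
    return 0;
}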

Coding-Mode Decision and Context Modeling: By decomposing each syntax element value into a sequence of bins, further processing of each bin value in CABAC depends on the associated coding-mode decision, which can be chosen as either the regular or the bypass mode. The latter is chosen for bins related to sign information or for lower significant bins, which are assumed to be uniformly distributed and for which, consequently, the whole regular binary arithmetic encoding process is simply bypassed. In the regular coding mode, each bin value is encoded by using the regular binary arithmetic-coding engine, where the associated probability model is either determined by a fixed choice, without any context modeling, or adaptively chosen depending on the related context model. As an important design decision, the latter case is generally applied to the most frequently observed bins only, whereas the other, usually less frequently observed bins are treated using a joint, typically zero-order probability model. In this way, CABAC enables selective context modeling on a sub-symbol level and hence provides an efficient instrument for exploiting inter-symbol redundancies at significantly reduced overall modeling or learning costs. For the specific choice of context models, four basic design types are employed in CABAC, two of which, as further described below, are applied to the coding of transform-coefficient levels only. The design of these four prototypes is based on a priori knowledge about the typical characteristics of the source data to be modeled, and it reflects the aim to find a good compromise between the conflicting objectives of avoiding unnecessary modeling-cost overhead and exploiting the statistical dependencies to a large extent.
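
To make the dispatch between the two coding modes concrete, the following sketch encodes a short sequence of bins, sending context-coded bins to a stub regular engine and uniformly distributed bins to a stub bypass engine. All names, context indices, and stub functions are illustrative and are not taken from the standard or from any reference implementation.

#include <stdio.h>

/* Illustrative sketch of the per-bin coding-mode decision in a CABAC-style
 * encoder. A bin either carries the index of an adaptive context model and
 * goes through the regular engine, or is flagged as bypass (assumed
 * uniformly distributed) and skips context modeling entirely. */

typedef struct {
    unsigned char value;   /* bin value, 0 or 1 */
    int ctx_idx;           /* index of the associated context model, or -1 */
    int bypass;            /* nonzero: use the bypass branch of the engine */
} Bin;

/* Stubs standing in for the two branches of the arithmetic-coding engine. */
static void encode_regular(unsigned char bin_val, int ctx_idx)
{
    printf("regular bin %u, context model %d\n", bin_val, ctx_idx);
}

static void encode_bypass(unsigned char bin_val)
{
    printf("bypass  bin %u (no context model)\n", bin_val);
}

static void encode_bin(const Bin *b)
{
    if (b->bypass)
        encode_bypass(b->value);               /* e.g. sign bins */
    else
        encode_regular(b->value, b->ctx_idx);  /* context-coded bins */
}

int main(void)
{
    /* example bin sequence with made-up context indices */
    Bin bins[] = { {1, 27, 0}, {0, 28, 0}, {1, -1, 1} };
    size_t i;
    for (i = 0; i < sizeof(bins) / sizeof(bins[0]); i++)
        encode_bin(&bins[i]);
    return 0;
}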

Pre-Coding of Transform-Coefficient Levels: Coding of residual data in CABAC involves specifically designed syntax elements that differ from those used in the traditional run-length pre-coding approach. For each block with at least one nonzero quantized transform coefficient, a sequence of binary significance flags, indicating the positions of significant (i.e., nonzero) coefficient levels within the scanning path, is produced in a first step. Interleaved with these significance flags, a sequence of so-called last flags (one for each significant coefficient level) is generated for signaling the position of the last significant level within the scanning path. This so-called significance information is transmitted as a preamble of the regarded transform block, followed by the magnitude and sign information of the nonzero levels in reverse scanning order. The context-modeling specifications for coding of binarized level magnitudes are based on the number of previously transmitted level magnitudes greater than or equal to 1 within the reverse scanning path, which is motivated by the observation that levels with magnitude equal to 1 are statistically dominant at the end of the scanning path. For context-based coding of the significance information, each significance/last flag is conditioned on its position within the scanning path, which can be interpreted as a frequency-dependent context modeling. Furthermore, for each of the different transforms (16x16, 4x4, and 2x2) in H.264/AVC (Version 1), as well as for the luma and chroma components, a different set of contexts, denoted as a context category, is employed. This allows the discrimination of statistically different sources, with the result of a significantly better adaptation to the individual statistical characteristics.
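
The interleaving of significance and last flags can be illustrated with a small sketch that derives the flags for one scanned 4x4 block of quantized levels. The naming and the handling of all-zero blocks are simplified for illustration and do not follow the normative residual syntax in detail.

#include <stdio.h>

/* Illustrative sketch: derive the interleaved significance map of one
 * scanned 4x4 block. For each scanning position up to the last nonzero
 * level, a significance flag is produced; for each significant position,
 * a last flag indicates whether any further nonzero level follows. */

#define NUM_COEFF 16

static void print_significance_map(const int level[NUM_COEFF])
{
    int last = -1;
    int i;

    /* find the last significant position along the scanning path */
    for (i = 0; i < NUM_COEFF; i++)
        if (level[i] != 0)
            last = i;
    if (last < 0)
        return;  /* all-zero block: signaled at a higher level, nothing to send */

    for (i = 0; i < NUM_COEFF; i++) {
        int sig = (level[i] != 0);
        printf("pos %2d: significance_flag = %d", i, sig);
        if (sig) {
            int is_last = (i == last);
            printf(", last_flag = %d", is_last);
            if (is_last) {
                printf("\n");
                break;  /* remaining positions are known to be zero */
            }
        }
        printf("\n");
    }
    /* magnitudes and signs of the nonzero levels would follow here,
     * transmitted in reverse scanning order */
}

int main(void)
{
    /* example quantized levels along the scanning path */
    int level[NUM_COEFF] = { 7, 0, -2, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
    print_significance_map(level);
    return 0;
}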

Probability Estimation and Binary Arithmetic Coding: At the lowest level of processing in CABAC, each bin value enters the binary arithmetic encoder, either in regular or in bypass coding mode. For the latter, a fast branch of the coding engine with considerably reduced complexity is used, while for the former, encoding of the given bin value depends on the actual state of the associated adaptive probability model that is passed along with the bin value to the M coder, a term that has been chosen for the novel table-based binary arithmetic coding engine in CABAC. The specific features and the underlying design principles of the M coder are described on a separate page. In the following, we present some important aspects of probability estimation in CABAC that are not intimately tied to the M coder design.

FSM-based probability estimator in the M coder

Probability estimation in CABAC is based on a table-driven estimator using a finite-state machine (FSM) approach with transition rules as illustrated above. Each probability model in CABAC can take one out of 128 different states with associated probability values p ranging in the interval [0.01875, 0.98125]. Note that with the distinction between the least probable symbol (LPS) and the most probable symbol (MPS), it is sufficient to specify each state by means of the corresponding LPS-related probability p_LPS ∈ [0.01875, 0.5]. The method for designing the FSM transition rules was borrowed from Howard and Vitter, using a model of "exponential aging" with the following transition rule from time instance t to t+1:
p_LPS(t+1) = α · p_LPS(t),               if an MPS occurs (black solid lines in the graph above),
p_LPS(t+1) = α · p_LPS(t) + (1 − α),     if an LPS occurs (red dashed lines in the graph above).
According to this relationship, the scaling factor α is determined by p_min = 0.5 · α^(N−1), with the choice of p_min = 0.01875 and N = 128 / 2 = 64. Note, however, that the actual transition rules, as tabulated in CABAC and as shown in the graph above, were determined to be only approximately equal to those derived by this exponential aging rule.
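
For illustration, the following sketch computes the scaling factor α from the constants given above and applies the exponential-aging update to a few observed symbols. It reproduces the idealized rule only; the tabulated transitions of the standard are, as noted, only approximately equal to these values.

#include <math.h>
#include <stdio.h>

/* Sketch of the idealized exponential-aging estimator behind the FSM
 * transition rules, using p_min = 0.01875 and N = 64 LPS-probability
 * states as given in the text. Link with -lm. */

#define P_MIN    0.01875
#define N_STATES 64

int main(void)
{
    /* scaling factor derived from p_min = 0.5 * alpha^(N-1) */
    double alpha = pow(P_MIN / 0.5, 1.0 / (N_STATES - 1));
    double p_lps = 0.5;   /* start at the state of a uniform distribution */
    int t;

    printf("alpha = %.5f\n", alpha);   /* roughly 0.949 */

    /* a few update steps for the observation sequence MPS, MPS, LPS */
    const int observed_lps[] = { 0, 0, 1 };
    for (t = 0; t < 3; t++) {
        if (observed_lps[t])
            p_lps = alpha * p_lps + (1.0 - alpha);   /* an LPS occurred */
        else
            p_lps = alpha * p_lps;                   /* an MPS occurred */
        printf("t = %d: p_LPS = %.5f\n", t + 1, p_lps);
    }
    return 0;
}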

Typically, without any prior knowledge of the statistical nature of the source, each model would be initialized with the state corresponding to a uniform distribution (p = 0.5). However, in cases where the amount of data available for adapting to the true underlying statistics is comparatively small, it is useful to provide more appropriate initialization values for each probability model in order to better reflect its typically skewed nature. This is the purpose of the initialization process for context models in CABAC, which operates on two levels. On the lower level, there is the quantization-parameter-dependent initialization, which is invoked at the beginning of each slice. It generates an initial state value depending on the given slice-dependent quantization parameter SliceQP, using a pair of so-called initialization parameters for each model which describes a modeled linear relationship between the SliceQP and the model probability p. As an extension of this low-level pre-adaptation of probability models, CABAC provides two additional pairs of initialization parameters for each model that is used in predictive (P) or bi-predictive (B) slices. Since the encoder can choose between the corresponding three tables of initialization parameters and signal its choice to the decoder, an additional degree of pre-adaptation is achieved, especially when using small slices at low to medium bit rates.
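
A sketch of how such a linear, SliceQP-dependent initialization can be realized is given below. The clipping bounds and the mapping of the intermediate value to a state index and an MPS value follow the general form of the H.264/AVC initialization rule, but the (m, n) parameter pair used in the example is a placeholder and is not taken from the standard's tables.

#include <stdio.h>

/* Sketch of a SliceQP-dependent context-model initialization. Each model
 * carries a pair (m, n) describing a linear relationship between the slice
 * quantization parameter and an initial probability state. The (m, n)
 * values in main() are placeholders for illustration only. */

static int clip3(int lo, int hi, int x)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

typedef struct {
    int state;     /* probability state index, 0..63 */
    int val_mps;   /* value of the most probable symbol */
} ContextModel;

static void init_context(ContextModel *ctx, int m, int n, int slice_qp)
{
    /* linear law, clipped to the valid range of intermediate states */
    int pre_state = clip3(1, 126, ((m * clip3(0, 51, slice_qp)) >> 4) + n);

    if (pre_state <= 63) {        /* initial MPS value 0 */
        ctx->state = 63 - pre_state;
        ctx->val_mps = 0;
    } else {                      /* initial MPS value 1 */
        ctx->state = pre_state - 64;
        ctx->val_mps = 1;
    }
}

int main(void)
{
    ContextModel ctx;
    init_context(&ctx, 10, 30, 26);   /* placeholder (m, n), SliceQP = 26 */
    printf("state = %d, MPS = %d\n", ctx.state, ctx.val_mps);
    return 0;
}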

Related overview paper:

D. Marpe, H. Schwarz, and T. Wiegand, "Context-Based Adaptive Binary Arithmetic Coding in the H.264/AVC Video Compression Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 620-636, July 2003 (invited paper). [PDF]

Erratum: On p. 631, left column below Table VI, "max(...)" must be replaced by "min(...)" in the two expressions specifying context index increments for bins related to absolute values of transform coefficients.

Related standard contributions (in chronological order):

[VCEG-L13], [VCEG-M59], [VCEG-O18], [JVT-B100], [JVT-B101], [JVT-C060], [JVT-C061], [JVT-D019], [JVT-D020], [JVT-D021], [JVT-E059], [JVT-E086], [JVT-E154], [JVT-F039]
