Differences between the core technologies of the AVS video standard and H.264

One of the most important developments in video coding technology in the past few years was the H.264 / MPEG-4 AVC standard, developed by the Joint Video Team (JVT) of ITU and ISO / IEC. During its development, the industry used many different names for the new standard. ITU began work on H.26L (long-term) in 1997 using important new coding tools, and the results were encouraging, so ISO decided to form the JVT with ITU and adopt a common standard. This is why the standard is sometimes called JVT, although that is not a formal name. ITU approved the new H.264 standard in May 2003. ISO approved it in October 2003 under the name MPEG-4 Part 10, Advanced Video Coding (AVC).

Improvements achieved by H.264 create new market opportunities

H.264 / AVC made a major breakthrough in compression efficiency, generally achieving about twice the compression efficiency of MPEG-2 and MPEG-4 Simple Profile. In the official tests conducted by the JVT, H.264 improved coding efficiency by a factor of 1.5 or more in 78% of the 85 test cases, by a factor of 2 or more in 77% of cases, and by a factor of up to 4 in some cases. These improvements have created new market opportunities. For example, VHS-quality video at 600 kbps makes video on demand over ADSL lines possible, and high-definition movies can fit on ordinary DVDs without the need for a new laser head.

At standardization, H.264 supported three profiles: Baseline, Main and Extended. A later amendment called the Fidelity Range Extensions (FRExt) introduced four additional profiles, known as the High profiles. In the early days, the Baseline and Main profiles attracted the most interest. The Baseline Profile reduces computing and system memory requirements and is optimized for low latency. It includes neither B frames, because of their inherent delay, nor CABAC, because of its computational complexity. The Baseline Profile is well suited to videophone applications and other applications that require low-cost real-time encoding.

The Main Profile provides the highest compression efficiency, but it requires much more processing power than the Baseline Profile, which makes it difficult to use for low-cost real-time encoding and low-latency applications. Broadcasting and content storage applications are the most interested in the Main Profile, because they want the highest video quality at the lowest possible bit rate.

Although H.264 uses the same main coding tools as the older standards, it also has many new features that set it apart from them and that together improve coding efficiency. The main differences are summarized as follows:

Intra prediction and coding: H.264 uses spatial intra prediction to predict the pixels of an intra macroblock (MB) from the neighboring pixels of adjacent blocks. It encodes the prediction residual and the prediction mode rather than the actual pixels of the block. This can significantly improve intra-frame coding efficiency.
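
The idea of coding a prediction mode plus a residual can be illustrated with a minimal sketch. It assumes only three simple 4×4 predictors (vertical, horizontal, DC) and picks the mode by sum of absolute differences; the function names and the selection criterion are illustrative assumptions, not the procedure laid down by the standard.

```python
# Illustrative sketch: three simple 4x4 intra predictors and mode selection.
# Names and the SAD-based choice are assumptions made for this example.

def predict_4x4(mode, top, left):
    """Build a 4x4 prediction from the 4 pixels above and the 4 pixels to the left."""
    if mode == "vertical":        # copy the row above downwards
        return [list(top) for _ in range(4)]
    if mode == "horizontal":      # copy the left column rightwards
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":              # flat block at the rounded mean of the neighbors
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unknown mode: {mode}")

def residual(block, pred):
    """Difference between the actual block and its prediction."""
    return [[block[r][c] - pred[r][c] for c in range(4)] for r in range(4)]

def best_mode(block, top, left):
    """Return (mode, residual) for the predictor with the smallest SAD."""
    best = None
    for mode in ("vertical", "horizontal", "dc"):
        res = residual(block, predict_4x4(mode, top, left))
        sad = sum(abs(v) for row in res for v in row)
        if best is None or sad < best[0]:
            best = (sad, mode, res)
    return best[1], best[2]

top = [10, 20, 30, 40]
left = [12, 12, 12, 12]
block = [[10, 20, 30, 40] for _ in range(4)]   # vertical stripes continuing the row above
mode, res = best_mode(block, top, left)
print(mode, res)
```

Here the block continues the row above it, so the vertical predictor leaves an all-zero residual, which is far cheaper to code than the 16 original pixel values.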

Inter prediction and coding: Inter-frame coding in H.264 uses the main tools of the older standards while adding flexibility and functionality, including multiple block size options for motion compensation, quarter-pixel motion compensation, multiple reference frames, generalized bidirectional prediction and adaptive loop deblocking.

Variable motion vector block size: Motion compensation can be performed with different block sizes. A separate motion vector can be transmitted for blocks as small as 4×4, so with bidirectional prediction up to 32 motion vectors can be transmitted for a single MB. Block sizes of 16×8, 8×16, 8×8, 8×4 and 4×8 are also supported. Smaller blocks handle fine motion detail better, which improves subjective quality, including the elimination of large blocking artifacts.
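
The figure of 32 motion vectors follows from simple arithmetic, sketched below. The helper is hypothetical and assumes the whole 16×16 MB is split uniformly into blocks of one size, with one vector per block and per prediction direction.

```python
def vectors_per_macroblock(block_w, block_h, bidirectional=False):
    """Motion vectors needed for a 16x16 MB split uniformly into block_w x block_h blocks."""
    blocks = (16 // block_w) * (16 // block_h)
    return blocks * (2 if bidirectional else 1)

for w, h in [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]:
    print(f"{w}x{h}: {vectors_per_macroblock(w, h)} vectors, "
          f"{vectors_per_macroblock(w, h, bidirectional=True)} if bidirectional")
```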

Quarter-pixel motion estimation: Motion compensation is improved by allowing motion vectors with half-pixel and quarter-pixel resolution.
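
Sub-pixel motion vectors require sample values between the integer pixel positions. The sketch below works on a single row of luma samples and uses the 6-tap filter (1, -5, 20, 20, -5, 1)/32 for half-pixel positions and a rounded average for quarter-pixel positions. The text above does not describe the interpolation filter, so treat the details here, including the edge handling by sample repetition and the function names, as assumptions of this one-dimensional example.

```python
def clip255(x):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, x))

def half_pel(row, i):
    """Half-sample value between row[i] and row[i + 1] using a 6-tap filter."""
    taps = (1, -5, 20, 20, -5, 1)
    # Repeat the edge samples when the filter window leaves the row.
    window = [row[min(max(i - 2 + k, 0), len(row) - 1)] for k in range(6)]
    acc = sum(t * p for t, p in zip(taps, window))
    return clip255((acc + 16) >> 5)   # divide by 32 with rounding

def quarter_pel(row, i):
    """Quarter-sample value: rounded average of row[i] and the half-sample to its right."""
    return (row[i] + half_pel(row, i) + 1) >> 1

ramp = [0, 10, 20, 30, 40, 50, 60, 70]
print(half_pel(ramp, 3))      # 35, halfway between 30 and 40
print(quarter_pel(ramp, 3))   # 33
```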

Multiple reference frame prediction: Up to 16 different reference frames can be used for inter-frame coding, which can improve both subjective video quality and coding efficiency. Providing multiple reference frames also helps improve the error resilience of the H.264 bitstream. Note that this feature increases the memory requirements of both the encoder and the decoder, because multiple reference frames must be kept in memory.

Adaptive loop deblocking filter: H.264 uses an adaptive deblocking filter that operates on horizontal and vertical block edges inside the prediction loop to remove artifacts caused by block prediction errors. The filtering is usually based on 4×4 block boundaries, where up to 3 pixels on each side of the boundary can be updated by a 4-tap filter.

Integer transform: Earlier standards that used the DCT had to define tolerance ranges for rounding errors in fixed-point implementations of the inverse transform. Drift caused by IDCT precision mismatch between encoder and decoder was a source of quality loss. H.264 solves this problem by using a 4×4 integer spatial transform that approximates the DCT. The small 4×4 block size also helps reduce blocking and ringing artifacts.
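
A short sketch shows why an integer transform avoids mismatch. It uses the 4×4 core matrix commonly given for H.264, whose rows are mutually orthogonal with squared norms 4, 10, 4 and 10. In a real codec the resulting scaling is folded into quantization; this example instead inverts the transform with exact fractions, a simplification chosen only to demonstrate that no rounding drift occurs. The function names are illustrative.

```python
from fractions import Fraction

# 4x4 forward core transform matrix (integer approximation of the DCT).
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_4x4(x):
    """Y = CF * X * CF^T, computed entirely in integers."""
    return matmul(matmul(CF, x), transpose(CF))

def inverse_4x4(y):
    """Exact inverse: CF * CF^T = diag(4, 10, 4, 10), so scale by the row norms first."""
    norms = [sum(v * v for v in row) for row in CF]
    scaled = [[Fraction(y[i][j], norms[i] * norms[j]) for j in range(4)]
              for i in range(4)]
    x = matmul(matmul(transpose(CF), scaled), CF)
    return [[int(v) for v in row] for row in x]

residual_block = [[5, 11, 8, 10],
                  [9, 8, 4, 12],
                  [1, 10, 11, 4],
                  [19, 6, 15, 7]]
coeffs = forward_4x4(residual_block)
print(coeffs[0][0])                            # DC term: the sum of all 16 samples
print(inverse_4x4(coeffs) == residual_block)   # True: reconstruction is exact
```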

Quantization and transform coefficient scanning: Transform coefficients are quantized by scalar quantization with no widened dead zone. As in previous standards, a different quantization step size can be chosen for each MB, but the step size grows at a compound rate of approximately 12.5% rather than by a fixed increment. A finer quantization step size can also be used for the chrominance components, especially when the luminance coefficients are coarsely quantized.
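
The effect of compound growth can be checked numerically. The sketch below takes the 12.5% rate quoted above at face value and starts from an arbitrary step size of 1.0; both the starting value and the helper name are assumptions for illustration, not values read from the standard's tables.

```python
def step_sizes(first_step, count, growth=0.125):
    """Quantization step sizes that grow by a fixed percentage per index."""
    steps = [first_step]
    for _ in range(count - 1):
        steps.append(steps[-1] * (1 + growth))
    return steps

steps = step_sizes(1.0, 13)
ratio = steps[6] / steps[0]
print([round(s, 3) for s in steps])
print(round(ratio, 3))   # close to 2: the step size roughly doubles every 6 indices
```

With compound growth, the relative spacing between neighboring step sizes stays constant, so the quantizer offers the same proportional control at low and high bit rates, which a fixed increment cannot do.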

Entropy coding: Unlike previous standards, which provide multiple static VLC tables depending on the type of data involved, H.264 uses context-adaptive VLC (CAVLC) for transform coefficients and a single Universal VLC (UVLC) method for all other symbols. The Main Profile also supports the new context-adaptive binary arithmetic coder (CABAC). CAVLC outperforms previous VLC implementations, but at a higher cost than static VLC.

CABAC uses a probability model at both the encoder and the decoder to process all syntax elements, including transform coefficients and motion vectors. To improve the coding efficiency of arithmetic coding, the underlying probability model adapts to the changing statistics within a video frame through a process called context modeling. Context modeling provides conditional probability estimates for the symbols being coded. With suitable context models, the coder can switch between different probability models according to the already coded symbols surrounding the symbol to be coded, and thus exploit the redundancy between symbols. Each syntax element can maintain its own model (for example, motion vectors and transform coefficients have different models). Compared with the VLC entropy coding methods (UVLC / CAVLC), CABAC can save an additional 10% in bit rate.
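
The benefit of context modeling can be illustrated without implementing a full arithmetic coder. The sketch below is not CABAC: it uses simple adaptive frequency counts and measures the ideal code length, -log2(p), that an arithmetic coder could approach. The function name, the counting scheme and the choice of "previous bit" as the context are all assumptions made for the example.

```python
import math

def adaptive_cost(bits, context_of):
    """Ideal coding cost in bits, with one adaptive probability model per context."""
    counts = {}      # context -> [number of 0s seen, number of 1s seen], starting at [1, 1]
    total = 0.0
    for i, b in enumerate(bits):
        ctx = context_of(bits, i)
        c = counts.setdefault(ctx, [1, 1])
        p = c[b] / (c[0] + c[1])     # current estimate for this symbol
        total += -math.log2(p)
        c[b] += 1                    # adapt the model after coding the symbol
    return total

# Long runs of equal bits: each bit usually repeats the previous one.
bits = ([0] * 8 + [1] * 8) * 16

single_model = adaptive_cost(bits, lambda s, i: 0)
by_previous_bit = adaptive_cost(bits, lambda s, i: s[i - 1] if i > 0 else 0)
print(len(bits), round(single_model, 1), round(by_previous_bit, 1))
```

A single model sees zeros and ones equally often and gains nothing, whereas conditioning on the previous bit captures the redundancy between neighboring symbols and lowers the cost substantially.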

Weighted prediction: A weighted sum of the forward and backward predictions is used to form the prediction of a bidirectionally interpolated macroblock, which can improve coding efficiency when the scene changes, especially during fades.
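
A small numeric sketch shows why weighting helps during a fade. It assumes integer weights that sum to a power of two and a current frame lying a quarter of the way between the two references; the sample values, weights and function names are made up for illustration.

```python
def bipred(fwd, bwd, w_fwd, w_bwd, shift):
    """Weighted sum of two predictions; the integer weights must sum to 1 << shift."""
    rounding = 1 << (shift - 1)
    return [(w_fwd * f + w_bwd * b + rounding) >> shift for f, b in zip(fwd, bwd)]

def sad(a, b):
    """Sum of absolute differences between two sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

fwd = [200, 120, 80]       # samples from the earlier reference (before the fade)
bwd = [40, 40, 40]         # samples from the later reference (faded out)
current = [160, 100, 70]   # current frame, one quarter of the way through the fade

plain = bipred(fwd, bwd, 1, 1, 1)      # ordinary average of both references
weighted = bipred(fwd, bwd, 3, 1, 2)   # weights 3/4 and 1/4
print(plain, sad(current, plain))
print(weighted, sad(current, weighted))
```

The plain average leaves a sizeable residual, while weights that match the position of the frame within the fade predict it exactly in this example.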

Fidelity Range Extensions: In July 2004, a new amendment called the Fidelity Range Extensions (FRExt) [11] was added to the H.264 standard. This extension adds a full set of tools to H.264 and allows additional color spaces, video formats and bit depths. Support for lossless inter-frame coding and stereoscopic video was also added. The FRExt amendment introduces 4 new profiles in H.264:

• High Profile (HP): standard 4:2:0 chroma sampling with 8-bit color per component. This profile introduces new tools, described below.

• High 10 Profile (Hi10P): standard 4:2:0 chroma sampling with 10-bit color, for higher-definition video displays.

• High 4:2:2 Profile (H422P): 4:2:2 chroma sampling with 10-bit color, for source editing functions.

• High 4:4:4 Profile (H444P): 4:4:4 chroma sampling with 12-bit color, for the highest-quality source editing and color fidelity, with support for lossless coding of video regions and a new integer color space transform (from RGB to YUV and back).

Among new application areas, H.264 HP is particularly beneficial for broadcasting and DVD. Some experiments show H.264 HP performing three times better than MPEG-2. The main additional tools introduced in H.264 HP are described below.

Adaptive residual block size and 8×8 integer transform: The residual block used for transform coding can be switched between 8×8 and 4×4. A new 16-bit integer transform is introduced for 8×8 blocks. Smaller blocks can still use the previous 4×4 transform.

8×8 luma intra prediction: 8 modes have been added. In addition to the previous 16×16 and 4×4 blocks, luma intra macroblocks can also perform intra prediction on 8×8 blocks.

Quantization weighting: New quantization weighting matrices are provided for quantizing the 8×8 transform coefficients.

Monochrome: Supports black-and-white video coding.

AVS video standard

In 2002, the Audio Video coding Standard (AVS) Working Group, established by China's Ministry of Information Industry, announced that it was preparing a national standard for mobile multimedia, broadcasting, DVD and other applications. The video standard is called AVS [14] and consists of two related parts: AVS-M for mobile video applications and AVS1.0 for broadcast and DVD. The AVS standard is similar to H.264.

AVS1.0 supports both interlaced and progressive scanning. P frames in AVS can use up to 2 forward reference frames, while B frames can use one frame before and one frame after. In interlaced mode, 4 fields can be used as references. Frame / field coding in interlaced mode can be selected only at the frame level, unlike H.264, which allows this choice to be adapted at the MB level. AVS has a loop filter similar to that of H.264, and it can be turned off at the frame level. In addition, B frames do not require a loop filter. Intra prediction is performed on 8×8 blocks. Motion compensation (MC) allows 1/4-pixel accuracy for luminance blocks. The block size for motion estimation (ME) can be 16×16, 16×8, 8×16 or 8×8. The transform is a 16-bit 8×8 integer transform (similar to WMV9). VLC is based on context-adaptive 2D run / level coding and uses 4 different Exp-Golomb codes. The code used for each quantized coefficient adapts to the previous symbols in the same 8×8 block. Because Exp-Golomb tables are parameterized, the tables are small. For progressive video sequences, the video quality of AVS 1.0 is slightly inferior to that of H.264 Main Profile at the same bit rate.

AVS-M is mainly aimed at mobile video applications and overlaps with the H.264 Baseline Profile. It supports only progressive video with I and P frames, not B frames. The main AVS-M coding tools include 4×4 block-based intra prediction, 1/4-pixel motion compensation, integer transform and quantization, context-adaptive VLC, and a highly simplified loop filter. As in the H.264 Baseline Profile, the motion vector block size in AVS-M goes down to 4×4, so an MB can have up to 16 motion vectors. Multi-frame prediction is used, but only 2 reference frames are supported. In addition, AVS-M defines a subset of the H.264 HRD / SEI messages. The coding efficiency of AVS-M is about 0.3 dB lower than that of the H.264 Baseline Profile under the same settings, but decoder complexity is reduced by about 20%.

Background of H.264 and AVS

H.264 / MPEG-4 AVC is a new-generation video coding standard jointly developed by ITU-T's VCEG (Video Coding Experts Group) and ISO / IEC's MPEG (Moving Picture Experts Group). Applications include videophones, video conferencing and so on. The main feature of H.264 is a greatly improved compression rate, more than double the compression efficiency of MPEG-2 and MPEG-4. The core of H.264 is the same as in previous standards: it still uses a hybrid coding framework based on prediction and transform. The details of the implementation differ considerably, however, and it is these detailed improvements that produce the large gain in compression efficiency. H.264 also has good network adaptability and error resilience.

The birth of AVS can be described as a historic opportunity. Faced with high royalties for standards such as H.264 and MPEG-2, China's digital video industry is under serious pressure. In addition, China is committed to improving the core competitiveness of its domestic digital audio and video industry. The Science and Technology Department of the Ministry of Information Industry approved the establishment of the "Digital Audio and Video Coding Technology Standard Working Group" in June 2002. The working group brought together domestic research institutions and enterprises engaged in digital audio and video codec research and development and, to meet the needs of China's audio and video industry, proposed a source coding standard with independent Chinese intellectual property rights: the "Information Technology: Advanced Coding of Audio and Video" series of standards, abbreviated AVS. The independent AVS standard is at an internationally advanced level in technology and performance. If this opportunity is seized, China may gain the initiative across the whole industrial chain of technology, patents, standards, chips, systems and industry.

Analysis and comparison of H.264 and AVS core technologies

Like previous standards, H.264 is still a hybrid coding framework. The AVS video standard uses a technical framework similar to that of H.264, including transform, quantization, entropy coding, intra prediction, inter prediction, loop filtering and other modules. The differences in their core technologies include the following:

1. Transformation and Quantization

H.264 applies block-based transform coding to the residual data to remove spatial redundancy in the original image, so that the image energy is concentrated in a small number of coefficients. The DC coefficient is generally the largest. This improves the compression ratio and strengthens resistance to interference. Previous standards generally used the DCT. Its disadvantages are that mismatch can occur, so the data recovered after the transform and inverse transform differs from the original, and that it requires a large amount of computation. H.264 uses an integer transform based on 4×4 blocks.

AVS uses an 8×8 integer transform, which can be implemented without mismatch on 16-bit processors. For high-resolution video, it decorrelates the image more effectively than the 4×4 transform. AVS adopts 64-level quantization, which can adapt to the bitstream and quality requirements of different applications and services.

2. Intra prediction

Both H.264 and AVS use intra prediction: the current block is predicted from adjacent pixels, using multiple prediction modes that represent spatial textures. H.264 luma prediction works on two block sizes: 4×4 blocks and 16×16 blocks. For 4×4 blocks, there are 9 prediction modes in total: directional predictions from -135 degrees to +22.5 degrees plus a DC prediction. For 16×16 blocks, there are 4 prediction modes. Chroma prediction works on 8×8 blocks and has 4 prediction modes, similar to the 4 modes of intra 16×16 prediction, where DC is mode 0, horizontal is mode 1, vertical is mode 2 and plane is mode 3.

3. Inter prediction

H.264 inter prediction uses previously coded video frames and block-based motion compensation. It differs from inter prediction in previous standards in its wider range of block sizes, its use of sub-pixel motion vectors and its use of multiple reference frames.

H.264 has 7 macroblock and sub-macroblock partitions: 16×16, 16×8, 8×16, 8×8, 8×4, 4×8 and 4×4. AVS has only 4 macroblock partitions: 16×16, 16×8, 8×16 and 8×8.

H.264 supports predicting inter-coded macroblocks and slices from multiple different reference frames. In AVS, P frames can use up to 2 forward reference frames, and B frames use one reference frame before and one after.

4. Entropy coding

H.264 defines two entropy coding methods based on information content. One uses universal variable length coding (UVLC) for all symbols to be coded; the other uses context-based adaptive binary arithmetic coding (CABAC), which greatly reduces the redundancy between coded blocks and improves coding efficiency. UVLC has relatively low computational complexity and is aimed mainly at applications with strict coding time limits. Its disadvantages are low efficiency and a high bit rate. CABAC is a very efficient entropy coding method, and its coding efficiency is 50% higher than that of UVLC.

AVS entropy coding uses adaptive variable length coding technology. In the AVS entropy coding process, all syntax elements and residual data are mapped into a binary bit stream in the form of exponential Golomb codes.

The advantages of exponential Golomb codes are, on the one hand, that their hardware complexity is relatively low and the codes can be parsed with a closed-form formula, without table lookups; on the other hand, the order K of the exponential Golomb code can be chosen flexibly according to the probability distribution of the elements being coded. If K is chosen properly, the coding efficiency can approach the information entropy.
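
The closed-form nature of these codes is easy to see in a minimal sketch of a K-th order exponential Golomb encoder and decoder for non-negative integers. Bit strings are used instead of a packed bitstream for readability, and the function names are illustrative; the mapping of actual AVS syntax elements onto non-negative integers is not covered here.

```python
def exp_golomb_encode(n, k=0):
    """K-th order Exp-Golomb codeword for a non-negative integer, as a bit string."""
    value = n + (1 << k)
    prefix_zeros = value.bit_length() - k - 1
    return "0" * prefix_zeros + format(value, "b")

def exp_golomb_decode(bits, k=0, pos=0):
    """Decode one codeword starting at pos; return (value, position after the codeword)."""
    zeros = 0
    while bits[pos + zeros] == "0":      # count the zero prefix
        zeros += 1
    start = pos + zeros
    length = zeros + k + 1               # the prefix length fixes the codeword length
    value = int(bits[start:start + length], 2)
    return value - (1 << k), start + length

print([exp_golomb_encode(n, 0) for n in range(4)])   # ['1', '010', '011', '00100']
print([exp_golomb_encode(n, 1) for n in range(4)])   # ['10', '11', '0100', '0101']

stream = "".join(exp_golomb_encode(n, 1) for n in (5, 0, 12))
pos, decoded = 0, []
while pos < len(stream):
    n, pos = exp_golomb_decode(stream, 1, pos)
    decoded.append(n)
print(decoded)   # [5, 0, 12]
```

A larger K spends more bits on small values and fewer on large ones, which is why matching K to the symbol distribution matters.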

The transform coefficients of the prediction residual block are scanned to form (level, run) pairs. Level and run are not independent events but are strongly correlated. In AVS, level and run are coded jointly in two dimensions, and the order of the exponential Golomb code is changed adaptively according to the probability distribution trends of the current level and run.
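
Forming (level, run) pairs from an already scanned coefficient list can be sketched as follows. The example assumes that run counts the zeros preceding each nonzero level and that trailing zeros are implied by the block length; the scan order and the adaptive choice of code order used by AVS are outside the scope of this sketch, and the function names are illustrative.

```python
def run_level_pairs(scanned):
    """Convert scanned coefficients into (level, run) pairs; run = zeros before the level."""
    pairs, run = [], 0
    for coeff in scanned:
        if coeff == 0:
            run += 1
        else:
            pairs.append((coeff, run))
            run = 0
    return pairs          # trailing zeros are dropped; the block length implies them

def expand_pairs(pairs, length):
    """Rebuild the scanned coefficient list from (level, run) pairs."""
    out = []
    for level, run in pairs:
        out.extend([0] * run + [level])
    return out + [0] * (length - len(out))

scanned = [9, 0, 0, -2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
pairs = run_level_pairs(scanned)
print(pairs)   # [(9, 0), (-2, 2), (1, 0), (1, 4)]
print(expand_pairs(pairs, len(scanned)) == scanned)
```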

In addition, AVS has no SI or SP frames. It can be said that AVS was developed on the basis of H.264 and absorbed its essence, but to avoid patent problems it had to give up some of the core algorithms of H.264. The trade-off is that complexity is greatly reduced at the cost of a slight reduction in coding efficiency.

AVS is a standard with independent Chinese intellectual property rights. It has not yet been used on a large scale and is still in its infancy. Most enterprises are taking a wait-and-see attitude, without large capital investment, and the standard faces many difficulties. Its broad prospects cannot be ignored, however, and with strong national support it will certainly continue to mature.
