Analysis of the adaptive quantization mode (AQ mode) in x264 rate control

AQ mode

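Adaptive Quantization mode (AQ mode) adjusts the quantization parameter of each macroblock (MB) according to that MB's complexity. This distributes bits across macroblocks more effectively, which improves visual quality and compression efficiency.
The x264 parameters involved are i_aq_mode and f_aq_strength.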

i_aq_mode

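i_aq_mode takes one of X264_AQ_NONE (0), X264_AQ_VARIANCE (1), X264_AQ_AUTOVARIANCE (2), or X264_AQ_AUTOVARIANCE_BIASED (3). The validate_parameters function checks the value and clamps it to [0, 3]. 0 disables AQ, 1 selects variance AQ (complexity mask), 2 selects auto-variance AQ, and 3 selects auto-variance AQ with a bias toward dark scenes.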


x264_param_default sets it to X264_AQ_VARIANCE (1). In param_apply_preset, preset=ultrafast sets it to 0. In param_apply_tune, tune=psnr sets it to X264_AQ_NONE (0) and tune=ssim sets it to X264_AQ_AUTOVARIANCE (2).

f_aq_strength

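f_aq_strength ranges from 0.0 to 3.0 and controls the strength of adaptive quantization, that is, how strongly AQ shifts bits toward low-detail macroblocks. The validate_parameters function checks the value and clamps it to [0.0, 3.0].
x264_param_default sets it to 1.0. In param_apply_tune, tune=animation sets it to 0.6, tune=grain sets it to 0.5, tune=stillimage sets it to 1.2, and tune=touhou sets it to 1.3.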
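The clamping described for both parameters can be sketched as below. This is a minimal illustration, not the body of validate_parameters: the names aq_params, clamp_int, clamp_float, and sanitize_aq_params are hypothetical, and only the ranges [0, 3] and [0.0, 3.0] come from the text above.

```c
#include <assert.h>

/* Mode values as listed in the text. */
enum { AQ_NONE = 0, AQ_VARIANCE = 1, AQ_AUTOVARIANCE = 2, AQ_AUTOVARIANCE_BIASED = 3 };

/* Hypothetical container for the two AQ parameters (not an x264 type). */
typedef struct
{
    int   aq_mode;
    float aq_strength;
} aq_params;

/* Hypothetical helpers: clamp v into [lo, hi]. */
static int clamp_int( int v, int lo, int hi )
{
    assert( lo <= hi );
    return v < lo ? lo : v > hi ? hi : v;
}

static float clamp_float( float v, float lo, float hi )
{
    assert( lo <= hi );
    return v < lo ? lo : v > hi ? hi : v;
}

/* Clamp the mode to [0, 3] and the strength to [0.0, 3.0]. */
static aq_params sanitize_aq_params( int aq_mode, float aq_strength )
{
    aq_params p;
    p.aq_mode     = clamp_int( aq_mode, AQ_NONE, AQ_AUTOVARIANCE_BIASED );
    p.aq_strength = clamp_float( aq_strength, 0.0f, 3.0f );
    return p;
}
```

For example, sanitize_aq_params( 7, 5.0f ) would yield mode 3 and strength 3.0, while in-range values pass through unchanged.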

The origin of AQ

Why is AQ needed? An x264 developer wrote the following after analyzing VP8:

For quantization, the core process is basically the same among all MPEG-like video formats, and VP8 is no exception. The primary ways that video formats tend to differentiate themselves here is by varying quantization scaling factors. There are two ways in which this is primarily done: frame-based offsets that apply to all coefficients or just some portion of them, and macroblock-level offsets. VP8 primarily uses the former; in a scheme much less flexible than H.264′s custom quantization matrices, it allows for adjusting the quantizer of luma DC, luma AC, chroma DC, and so forth, separately. The latter (macroblock-level quantizer choice) can, in theory, be done using its “segmentation map” features, albeit very hackily and not very efficiently.
The killer mistake that VP8 has made here is not making macroblock-level quantization a core feature of VP8. Algorithms that take advantage of macroblock-level quantization are known as “adaptive quantization” and are absolutely critical to competitive visual quality. My implementation of variance-based adaptive quantization (before, after) in x264 still stands to this day as the single largest visual quality gain in x264 history. Encoder comparisons have showed over and over that encoders without adaptive quantization simply cannot compete.
Thus, while adaptive quantization is possible in VP8, the only way to implement it is to define one segment map for every single quantizer that one wants and to code the segment map index for every macroblock. This is inefficient and cumbersome; even the relatively suboptimal MPEG-style delta quantizer system would be a better option. Furthermore, only 4 segment maps are allowed, for a maximum of 4 quantizers per frame.
Verdict on Quantization: Lack of well-integrated adaptive quantization is going to be a killer when the time comes to implement psy optimizations. Overall, much worse.

How AQ works

Logic

The AQ mode logic is implemented in x264_adaptive_quant_frame, which fills f_qp_offset and f_qp_offset_aq according to the values of i_aq_mode and f_aq_strength.
These offsets are applied in x264_ratecontrol_mb_qp, where each macroblock's QP is the base QP plus the offset computed by x264_adaptive_quant_frame.
f_qp_offset and f_qp_offset_aq are also used by the mbtree module; macroblock_tree_finish further adjusts f_qp_offset.

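Source code analysis

Flow

x264_adaptive_quant_frame is called from x264_encoder_encode.
AQ disabled, i.e. i_aq_mode = X264_AQ_NONE or f_aq_strength = 0: when i_aq_mode is non-zero but f_aq_strength is 0, f_qp_offset and f_qp_offset_aq are still initialized for mbtree, copied from quant_offsets if the caller supplied them and zeroed otherwise. After that, if weighted prediction is enabled, ac_energy_mb is called for every MB to collect variance data; otherwise the function returns.
AQ enabled, step 1: branch on i_aq_mode. When i_aq_mode = 2 or 3, ac_energy_mb is called for every MB, and the variables strength, avg_adj, and bias_strength are derived from the results. When i_aq_mode = 1, strength is f_aq_strength multiplied by a constant (1.0397).
AQ enabled, step 2: for every MB, compute qp_adj using the formula that matches i_aq_mode. If the caller supplied quant_offsets, the corresponding entry is added to each MB's qp_adj.
AQ enabled, step 3: store qp_adj in both f_qp_offset and f_qp_offset_aq.
Finally, remove the mean from the SSD calculation.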

源码

void x264_adaptive_quant_frame( x264_t *h, x264_frame_t *frame, float *quant_offsets )
{
    /* Initialize frame stats */
    for( int i = 0; i < 3; i++ )
    {
        frame->i_pixel_sum[i] = 0;
        frame->i_pixel_ssd[i] = 0;
    }

    /* Degenerate cases */
    if( h->param.rc.i_aq_mode == X264_AQ_NONE || h->param.rc.f_aq_strength == 0 )
    {
        /* Need to init it anyways for MB tree */
        if( h->param.rc.i_aq_mode && h->param.rc.f_aq_strength == 0 )
        {
            if( quant_offsets )
            {
                for( int mb_xy = 0; mb_xy < h->mb.i_mb_count; mb_xy++ )
                    frame->f_qp_offset[mb_xy] = frame->f_qp_offset_aq[mb_xy] = quant_offsets[mb_xy];
                if( h->frames.b_have_lowres )
                    for( int mb_xy = 0; mb_xy < h->mb.i_mb_count; mb_xy++ )
                        frame->i_inv_qscale_factor[mb_xy] = x264_exp2fix8( frame->f_qp_offset[mb_xy] );
            }
            else
            {
                memset( frame->f_qp_offset, 0, h->mb.i_mb_count * sizeof(float) );
                memset( frame->f_qp_offset_aq, 0, h->mb.i_mb_count * sizeof(float) );
                if( h->frames.b_have_lowres )
                    for( int mb_xy = 0; mb_xy < h->mb.i_mb_count; mb_xy++ )
                        frame->i_inv_qscale_factor[mb_xy] = 256;
            }
        }
        /* Need variance data for weighted prediction */
        if( h->param.analyse.i_weighted_pred )
        {
            for( int mb_y = 0; mb_y < h->mb.i_mb_height; mb_y++ )
                for( int mb_x = 0; mb_x < h->mb.i_mb_width; mb_x++ )
                    ac_energy_mb( h, mb_x, mb_y, frame );
        }
        else
            return;
    }
    /* Actual adaptive quantization */
    else
    {
        /* constants chosen to result in approximately the same overall bitrate as without AQ.
         * FIXME: while they're written in 5 significant digits, they're only tuned to 2. */
        float strength;
        float avg_adj = 0.f;
        float bias_strength = 0.f;
        if( h->param.rc.i_aq_mode == X264_AQ_AUTOVARIANCE || h->param.rc.i_aq_mode == X264_AQ_AUTOVARIANCE_BIASED )
        {
            float bit_depth_correction = 1.f / (1 << (2*(BIT_DEPTH-8)));
            float avg_adj_pow2 = 0.f;
            for( int mb_y = 0; mb_y < h->mb.i_mb_height; mb_y++ )
                for( int mb_x = 0; mb_x < h->mb.i_mb_width; mb_x++ )
                {
                    uint32_t energy = ac_energy_mb( h, mb_x, mb_y, frame );
                    float qp_adj = powf( energy * bit_depth_correction + 1, 0.125f );
                    frame->f_qp_offset[mb_x + mb_y*h->mb.i_mb_stride] = qp_adj;
                    avg_adj += qp_adj;
                    avg_adj_pow2 += qp_adj * qp_adj;
                }
            avg_adj /= h->mb.i_mb_count;
            avg_adj_pow2 /= h->mb.i_mb_count;
            strength = h->param.rc.f_aq_strength * avg_adj;
            avg_adj = avg_adj - 0.5f * (avg_adj_pow2 - 14.f) / avg_adj;
            bias_strength = h->param.rc.f_aq_strength;
        }
        else
            strength = h->param.rc.f_aq_strength * 1.0397f;

        for( int mb_y = 0; mb_y < h->mb.i_mb_height; mb_y++ )
            for( int mb_x = 0; mb_x < h->mb.i_mb_width; mb_x++ )
            {
                float qp_adj;
                int mb_xy = mb_x + mb_y*h->mb.i_mb_stride;
                if( h->param.rc.i_aq_mode == X264_AQ_AUTOVARIANCE_BIASED )
                {
                    qp_adj = frame->f_qp_offset[mb_xy];
                    qp_adj = strength * (qp_adj - avg_adj) + bias_strength * (1.f - 14.f / (qp_adj * qp_adj));
                }
                else if( h->param.rc.i_aq_mode == X264_AQ_AUTOVARIANCE )
                {
                    qp_adj = frame->f_qp_offset[mb_xy];
                    qp_adj = strength * (qp_adj - avg_adj);
                }
                else
                {
                    uint32_t energy = ac_energy_mb( h, mb_x, mb_y, frame );
                    qp_adj = strength * (x264_log2( X264_MAX(energy, 1) ) - (14.427f + 2*(BIT_DEPTH-8)));
                }
                if( quant_offsets )
                    qp_adj += quant_offsets[mb_xy];
                frame->f_qp_offset[mb_xy] =
                frame->f_qp_offset_aq[mb_xy] = qp_adj;
                if( h->frames.b_have_lowres )
                    frame->i_inv_qscale_factor[mb_xy] = x264_exp2fix8(qp_adj);
            }
    }

    /* Remove mean from SSD calculation */
    for( int i = 0; i < 3; i++ )
    {
        uint64_t ssd = frame->i_pixel_ssd[i];
        uint64_t sum = frame->i_pixel_sum[i];
        int width  = 16*h->mb.i_mb_width  >> (i && CHROMA_H_SHIFT);
        int height = 16*h->mb.i_mb_height >> (i && CHROMA_V_SHIFT);
        frame->i_pixel_ssd[i] = ssd - (sum * sum + width * height / 2) / (width * height);
    }
}

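How AQ mode is applied

x264_ratecontrol_mb_qp picks either f_qp_offset (when the frame is kept as a reference) or f_qp_offset_aq (otherwise), stores it in the local variable qp_offset, and adds it to qp to produce the macroblock's QP.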

int x264_ratecontrol_mb_qp( x264_t *h )
{
    x264_emms();
    float qp = h->rc->qpm;
    if( h->param.rc.i_aq_mode )
    {
        /* MB-tree currently doesn't adjust quantizers in unreferenced frames. */
        float qp_offset = h->fdec->b_kept_as_ref ? h->fenc->f_qp_offset[h->mb.i_mb_xy] : h->fenc->f_qp_offset_aq[h->mb.i_mb_xy];
        /* Scale AQ's effect towards zero in emergency mode. */
        if( qp > QP_MAX_SPEC )
            qp_offset *= (QP_MAX - qp) / (QP_MAX - QP_MAX_SPEC);
        qp += qp_offset;
    }
    return x264_clip3( qp + 0.5f, h->param.rc.i_qp_min, h->param.rc.i_qp_max );
}
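The arithmetic in x264_ratecontrol_mb_qp can be tried in isolation with the sketch below. The function mb_qp and its parameters are hypothetical, and the constants 51 and 69 are assumed values of QP_MAX_SPEC and QP_MAX for 8-bit encoding; the article does not show their definitions, so check them against the x264 headers before relying on them.

```c
#include <assert.h>

/* Assumed 8-bit values; the real definitions live in the x264 sources. */
#define SKETCH_QP_MAX_SPEC 51.0f
#define SKETCH_QP_MAX      69.0f

/* Hypothetical helper: combine a frame-level QP with a per-MB AQ offset,
 * round to nearest, and clip to [qp_min, qp_max]. */
static int mb_qp( float frame_qp, float qp_offset, int qp_min, int qp_max )
{
    assert( qp_min <= qp_max );
    float qp = frame_qp;

    /* Above the spec maximum, fade the AQ offset linearly towards zero. */
    if( qp > SKETCH_QP_MAX_SPEC )
        qp_offset *= ( SKETCH_QP_MAX - qp ) / ( SKETCH_QP_MAX - SKETCH_QP_MAX_SPEC );
    qp += qp_offset;

    int rounded = (int)( qp + 0.5f );
    return rounded < qp_min ? qp_min : rounded > qp_max ? qp_max : rounded;
}
```

For instance, a frame QP of 26 with an offset of -3.2 gives 23, and a frame QP of 60 halves the offset before it is added, since 60 sits midway between the two assumed limits.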

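The mbtree module is also tied to f_qp_offset. If mbtree changes the quantizers, the frame cost has to be recalculated, but lookahead does not need to be re-run: slicetype_frame_cost_recalculate uses f_qp_offset_aq and f_qp_offset to compute macroblock costs. macroblock_tree_finish further modifies f_qp_offset, and this is handled inside the macroblock_tree function.
mbtree in turn affects the VBV module, as can be seen in vbv_frame_cost.
AQ, mbtree, and VBV therefore all influence the final bitrate and quality, and together they form an important part of rate control.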