Layer Normalization - EXPLAINED (in Transformer Neural Networks)
0~4min: what multi-head attention is
5~7min: layer norm illustrated with a diagram
7~9min: the layer norm formula, with a worked example
9:54-end: layer norm code example
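
As a companion to the formula and code segments noted above, here is a minimal sketch of layer normalization in plain Python. It is not the code from the video: the function name `layer_norm`, the parameters `gamma`, `beta`, and `eps`, and the sample `tokens` values are all assumptions made for illustration. It applies the standard formula y = gamma * (x - mean) / sqrt(var + eps) + beta, where the mean and variance are computed over the features of a single token.

```python
import math

def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one token's feature vector across its features.

    gamma (scale) and beta (shift) stand in for the learnable parameters;
    they default to 1 and 0, which leaves the normalized values unchanged.
    """
    n = len(x)
    gamma = gamma if gamma is not None else [1.0] * n
    beta = beta if beta is not None else [0.0] * n

    mean = sum(x) / n
    # Biased variance (divide by n), as is usual for layer norm.
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)

    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]

# Hypothetical input: two tokens, each with three features.
tokens = [[0.2, 0.1, 0.3], [0.5, 0.1, 0.1]]

# Each token is normalized independently of the other tokens.
normalized = [layer_norm(t) for t in tokens]
```

Because the statistics are computed per token, the result does not depend on the batch size or on the other tokens in the sequence, which is the property that sets layer norm apart from batch norm.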
group norm
- YK's YouTube walkthrough: Group Normalization (Paper Explained)
- Paper: Group Normalization