Official documentation
Official architecture diagram
InnoDB features
Memory
Buffer pool
Uses an optimized LRU algorithm (midpoint insertion):
- 3/8 of the buffer pool is devoted to the old sublist.
- The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist.
- When InnoDB reads a page into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A page can be read because it is required for a user-initiated operation such as an SQL query, or as part of a read-ahead operation performed automatically by InnoDB.
- Accessing a page in the old sublist makes it “young”, moving it to the head of the new sublist. If the page was read because it was required by a user-initiated operation, the first access occurs immediately and the page is made young. If the page was read due to a read-ahead operation, the first access does not occur immediately and might not occur at all before the page is evicted.
- As the database operates, pages in the buffer pool that are not accessed “age” by moving toward the tail of the list. Pages in both the new and old sublists age as other pages are made new. Pages in the old sublist also age as pages are inserted at the midpoint. Eventually, a page that remains unused reaches the tail of the old sublist and is evicted.
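The midpoint insertion strategy described above can be sketched in a few lines of Python. This is a toy model, not InnoDB's implementation: the class name `MidpointLRU`, the method names, and the 8-page capacity are made up for illustration, and the boundary handling is simplified (the new sublist is just capped at 5/8 of the capacity, and accessing a page already in the new sublist always moves it to the head).

```python
class MidpointLRU:
    """Toy LRU list split into a new (young) sublist and an old sublist."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        old_target = (capacity * 3) // 8        # 3/8 of the pool is "old"
        self.new_limit = capacity - old_target  # the remainder is "new"
        self.new = []  # index 0 = head of the new sublist
        self.old = []  # index 0 = midpoint, last element = tail of the list

    def _insert_at_midpoint(self, page):
        if len(self.new) + len(self.old) >= self.capacity:
            victims = self.old or self.new      # normally the old sublist
            victims.pop()                       # evict from the tail
        self.old.insert(0, page)

    def _make_young(self, page):
        self.old.remove(page)
        self.new.insert(0, page)
        # Pages pushed past the end of the new sublist age into the old one.
        while len(self.new) > self.new_limit:
            self.old.insert(0, self.new.pop())

    def read_ahead(self, page):
        """Prefetch: the page enters at the midpoint and is not accessed."""
        if page not in self.new and page not in self.old:
            self._insert_at_midpoint(page)

    def access(self, page):
        """User-initiated read: load if needed, then make the page young."""
        if page in self.new:
            self.new.remove(page)
            self.new.insert(0, page)
            return
        if page not in self.old:
            self._insert_at_midpoint(page)
        self._make_young(page)


lru = MidpointLRU(capacity=8)
for page in (1, 2, 3, 4, 5):
    lru.access(page)          # hot pages end up in the new sublist
for page in range(100, 111):
    lru.read_ahead(page)      # a large scan only churns the old sublist
print(lru.new)
print(lru.old)
```

The point of the sketch is the last loop: eleven prefetched pages pass through the old sublist and are evicted from its tail, while the five hot pages in the new sublist are untouched.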
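Adaptive hash index
Change buffer
Log buffer
Redo log buffer
The redo log records changes made to data pages.
Insert buffer
Writing and reading data
- Rows are versioned through MVCC (one row record can have multiple versions); the same holds for rows once they are loaded into memory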
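Disk
How a data write is guaranteed to succeed
- Doublewrite
Doublewrite has two parts: (1) the doublewrite buffer in memory, 2 MB in size; (2) two consecutive extents (128 pages, also 2 MB) in the shared tablespace on disk.
Dirty pages are first copied into the doublewrite buffer, which is then written to the shared tablespace on disk in two writes of 1 MB each. Only after the doublewrite copy is on disk are the pages written to their own tablespaces.
- Redo log
The redo log records changes to data pages. If a data page itself is corrupted (for example by a partial write), the redo log alone cannot recover it; doublewrite is what guarantees that an intact copy of the page exists.
- Checkpoint LSN
In-memory pages, the redo log, and on-disk pages all carry an LSN value, which is used to determine whether an in-memory page has already been flushed to disk.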
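A minimal sketch of why the doublewrite copy helps with torn pages. Everything here is a simplification chosen for illustration: pages are 16-byte bodies plus a CRC32 checksum, the "disk" is a pair of dictionaries, and the function names `make_page`, `is_intact`, `flush`, and `recover` are hypothetical. The order of operations (doublewrite area first, data file second, restore from the copy on recovery) is the part that mirrors the notes.

```python
import zlib

BODY_SIZE = 16  # toy page body; a real InnoDB page is 16 KB


def make_page(payload: bytes) -> bytes:
    """Build a page: fixed-size body followed by a 4-byte checksum."""
    body = payload.ljust(BODY_SIZE, b"\0")[:BODY_SIZE]
    return body + zlib.crc32(body).to_bytes(4, "big")


def is_intact(page: bytes) -> bool:
    """A page is intact if its stored checksum matches its body."""
    if len(page) != BODY_SIZE + 4:
        return False
    body, stored = page[:BODY_SIZE], page[BODY_SIZE:]
    return zlib.crc32(body).to_bytes(4, "big") == stored


def flush(page_no, page, doublewrite, datafile, crash_mid_write=False):
    """Write the page to the doublewrite area first, then to the data file."""
    doublewrite[page_no] = page
    if crash_mid_write:
        # Simulate a torn write: only the first half of the new page lands.
        old = datafile.get(page_no, bytes(len(page)))
        half = len(page) // 2
        datafile[page_no] = page[:half] + old[half:]
        return
    datafile[page_no] = page


def recover(doublewrite, datafile):
    """Restore every damaged data page from its intact doublewrite copy."""
    repaired = []
    for page_no, copy in doublewrite.items():
        if not is_intact(datafile.get(page_no, b"")) and is_intact(copy):
            datafile[page_no] = copy
            repaired.append(page_no)
    return repaired


dw_area, data_file = {}, {}
flush(7, make_page(b"row v1"), dw_area, data_file)
flush(7, make_page(b"row v2"), dw_area, data_file, crash_mid_write=True)
print(is_intact(data_file[7]))
print(recover(dw_area, data_file))
```

Once the page is intact again, redo log records can be applied on top of it; without the intact copy there would be no consistent base page to apply them to.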
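Redo log
- Redo log
- LSN
- InnoDB has at least one redo log group, and each group has at least two files
- Before InnoDB 1.2 the total redo log size was capped at 4 GB; from 1.2 onward the cap is 512 GB
- Log block: the redo log is written in blocks of 512 bytes, the same size as a disk sector. Because the sector is the smallest unit of a disk write, a log block write is atomic and does not need doublewrite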
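The role of the LSN during recovery can be illustrated with a small sketch. It assumes a heavily simplified model: each redo record carries the full new value of a page, pages are a dictionary of `(page_lsn, value)` pairs, and the names `RedoRecord` and `apply_redo` are invented for this example. The comparison rule is the point: a record is replayed only if it is newer than both the checkpoint and the page it targets.

```python
from dataclasses import dataclass


@dataclass
class RedoRecord:
    lsn: int       # log sequence number of this change
    page_no: int   # page the change applies to
    value: str     # toy payload: the page content after the change


def apply_redo(pages, redo_log, checkpoint_lsn):
    """Replay redo records that are not yet reflected in the pages.

    pages maps page_no -> (page_lsn, value) and is updated in place.
    Returns the LSNs that were actually applied.
    """
    applied = []
    for rec in sorted(redo_log, key=lambda r: r.lsn):
        if rec.lsn <= checkpoint_lsn:
            continue  # already flushed before the checkpoint
        page_lsn, _ = pages.get(rec.page_no, (0, None))
        if rec.lsn > page_lsn:
            pages[rec.page_no] = (rec.lsn, rec.value)
            applied.append(rec.lsn)
    return applied


pages = {1: (100, "a"), 2: (150, "b")}
log = [
    RedoRecord(90, 1, "a0"),
    RedoRecord(120, 1, "a1"),
    RedoRecord(150, 2, "b"),
    RedoRecord(160, 2, "b1"),
]
print(apply_redo(pages, log, checkpoint_lsn=100))
```

In this run the record at LSN 90 is skipped because of the checkpoint, and the record at LSN 150 is skipped because page 2 already carries that LSN.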
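Logical storage structure
- Tablespace
- See the architecture diagram
- Segment
- Data segment
- Index segment
- Rollback segment
- Extent
- Made up of contiguous pages; each extent is 1 MB, so with 16 KB pages an extent is 64 contiguous pages
- InnoDB requests 4-5 extents at a time
- Page
- Default size is 16 KB; once set it cannot be changed unless the data is dumped and reloaded
- Page types
- Data page
- Undo page
- System page
- Transaction data page
- Insert buffer bitmap page
- Insert buffer free list page
- Uncompressed BLOB page
- Compressed BLOB page
- Page structure
- File Header: 38 bytes; LSN, pointers to the previous and next page, the space the page is stored in
- Page Header: 56 bytes; page state information, bytes occupied by deleted records, record count, position of the last insert, largest transaction ID in the page, the page's level in the index
- Infimum + Supremum records: created when the page is created and never deleted. They are virtual boundary records: Infimum is smaller than any primary key value in the page and Supremum is larger than any
- User Records: the actual rows, organized by the B+ tree index
- Free Space: unused space in the page, kept as a linked list; when a record is deleted its space is added to the list
- Page Directory: stores the relative positions of records within the page
- File Trailer: used to detect whether the page was written to disk completely
- Row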
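The size figures in this outline fit together with some simple arithmetic, sketched below. The 8-byte File Trailer size is not given in these notes and is an assumption here, as is the idea that a page's byte offset in a tablespace file is `page_no * page_size`; the helper names are illustrative.

```python
KB = 1024
PAGE_SIZE = 16 * KB        # default page size from the notes
EXTENT_SIZE = 1024 * KB    # 1 MB extent from the notes
FILE_HEADER_SIZE = 38      # bytes, from the notes
PAGE_HEADER_SIZE = 56      # bytes, from the notes
FILE_TRAILER_SIZE = 8      # bytes, ASSUMPTION: not stated in the notes


def pages_per_extent(page_size=PAGE_SIZE, extent_size=EXTENT_SIZE):
    """Number of contiguous pages that make up one extent."""
    return extent_size // page_size


def extent_of(page_no, page_size=PAGE_SIZE):
    """Index of the extent that contains the given page number."""
    return page_no // pages_per_extent(page_size)


def page_offset(page_no, page_size=PAGE_SIZE):
    """Byte offset of a page in a tablespace file (assumed layout)."""
    return page_no * page_size


def bytes_after_fixed_headers(page_size=PAGE_SIZE):
    """Bytes left for records, free space, and the page directory."""
    return page_size - FILE_HEADER_SIZE - PAGE_HEADER_SIZE - FILE_TRAILER_SIZE


print(pages_per_extent())            # pages in one 1 MB extent
print(2 * pages_per_extent())        # pages in the two doublewrite extents
print(bytes_after_fixed_headers())   # bytes left after the fixed headers
```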
Indexes
- Clustered index
- Secondary index
- Full-text index
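Locks
- Row-level locks
- Shared lock (S)
- Exclusive lock (X)
- Table-level locks
- Intention shared lock (IS)
- Intention exclusive lock (IX)
X is incompatible with every other lock; S and IX are incompatible with each other.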
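The compatibility rule above can be written out as a full matrix. The table below follows the table-level lock compatibility matrix in the MySQL reference manual; the function names `compatible` and `can_grant` are made up for this sketch.

```python
# COMPAT[requested][held] is True when the requested lock can coexist
# with a lock of mode `held` already granted to another transaction.
COMPAT = {
    "IS": {"IS": True,  "IX": True,  "S": True,  "X": False},
    "IX": {"IS": True,  "IX": True,  "S": False, "X": False},
    "S":  {"IS": True,  "IX": False, "S": True,  "X": False},
    "X":  {"IS": False, "IX": False, "S": False, "X": False},
}


def compatible(requested, held):
    """True if `requested` does not conflict with one held lock."""
    return COMPAT[requested][held]


def can_grant(requested, held_locks):
    """A lock is granted only if it conflicts with none of the held locks."""
    return all(compatible(requested, held) for held in held_locks)


print(can_grant("IX", ["IS", "IX"]))  # intention locks coexist
print(can_grant("S", ["IX"]))         # S conflicts with IX
print(can_grant("X", ["IS"]))         # X conflicts with everything
```

If the request conflicts with any held lock, the requesting transaction waits until the conflicting lock is released.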
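- Three information_schema tables
- innodb_trx table
- trx_id: transaction ID
- trx_state: transaction state
- trx_started: transaction start time
- trx_requested_lock_id: ID of the lock the transaction is waiting for; set only when trx_state is LOCK WAIT
- trx_wait_started: time the transaction started waiting
- trx_weight: transaction weight
- trx_mysql_thread_id: thread ID
- trx_query: SQL statement the transaction is running
- innodb_locks
- lock_id: lock ID
- lock_trx_id: transaction ID
- lock_mode: lock mode
- lock_type: lock type (table or row)
- lock_table: table being locked
- lock_index: index being locked
- lock_space: space_id of the locked object
- lock_page: page number of the locked record; NULL for a table lock
- lock_rec: number of the locked record within the page; NULL for a table lock
- lock_data: primary key value of the locked record; NULL for a table lock
- innodb_lock_waits
- requesting_trx_id: ID of the transaction requesting the lock
- requesting_lock_id: ID of the requested lock
- blocking_lock_id: ID of the blocking lock
- blocking_trx_id: ID of the blocking transaction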
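- Consistent nonlocking read
- READ COMMITTED: reads the latest snapshot of the locked row
- REPEATABLE READ: reads the row version as of the start of the transaction
- Consistent locking read
- SELECT ... FOR UPDATE (X lock)
- SELECT ... LOCK IN SHARE MODE (S lock)
- Auto-increment lock
- Three row-lock algorithms
- Record lock: locks a single index record
- Gap lock: locks a range, excluding the record itself
- Next-key lock: locks the record itself plus the range before it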
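The difference between the three row-lock algorithms is easiest to see as intervals over the index. The sketch below assumes an index holding the keys 10, 13, and 20 and asks which values a lock placed on record 13 would cover. The function `covers` and its string lock names are illustrative only, and the model ignores unique-index optimizations and the supremum pseudo-record.

```python
import math


def lower_neighbor(keys, record):
    """Largest existing key below `record`, or -inf if there is none."""
    ordered = sorted(keys)
    i = ordered.index(record)
    return ordered[i - 1] if i > 0 else -math.inf


def covers(kind, keys, record, value):
    """Does a lock of `kind` placed on `record` cover `value`?"""
    lower = lower_neighbor(keys, record)
    if kind == "record":
        return value == record            # the record only
    if kind == "gap":
        return lower < value < record     # open interval (lower, record)
    if kind == "next_key":
        return lower < value <= record    # half-open (lower, record]
    raise ValueError(f"unknown lock kind: {kind}")


keys = [10, 13, 20]
for kind in ("record", "gap", "next_key"):
    print(kind, [v for v in range(9, 15) if covers(kind, keys, 13, v)])
```

With these keys, a next-key lock on 13 covers (10, 13], so an insert of 11 or 12 into the gap would have to wait, while a plain record lock on 13 would not block it.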
Transactions
- Implementation
- Redo log: durability
- Undo log
- Insert undo log
- Update undo log
- Transaction isolation levels
- READ UNCOMMITTED
- READ COMMITTED
- REPEATABLE READ
- SERIALIZABLE
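How READ COMMITTED and REPEATABLE READ differ for a consistent nonlocking read can be sketched with a toy version store. This is not how InnoDB implements MVCC (it uses read views over transaction IDs and rebuilds old versions from the undo log); the integer timestamps and the `Row` and `Reader` classes are assumptions made only to show which version each isolation level sees.

```python
class Row:
    """A row that keeps every committed version as (commit_ts, value)."""

    def __init__(self, value):
        self.versions = [(0, value)]

    def commit(self, ts, value):
        self.versions.append((ts, value))

    def read_as_of(self, ts):
        """Newest version committed at or before `ts`."""
        visible = [v for v in self.versions if v[0] <= ts]
        return max(visible, key=lambda v: v[0])[1]


class Reader:
    """A reading transaction with a fixed isolation level."""

    def __init__(self, isolation, start_ts):
        self.isolation = isolation
        self.snapshot_ts = start_ts  # snapshot taken at transaction start

    def read(self, row, now):
        if self.isolation == "READ COMMITTED":
            return row.read_as_of(now)           # fresh snapshot per read
        if self.isolation == "REPEATABLE READ":
            return row.read_as_of(self.snapshot_ts)
        raise ValueError(f"not modelled: {self.isolation}")


row = Row("v0")
rc = Reader("READ COMMITTED", start_ts=1)
rr = Reader("REPEATABLE READ", start_ts=1)
row.commit(2, "v1")           # another transaction commits a new version
print(rc.read(row, now=3))    # sees the newly committed version
print(rr.read(row, now=3))    # still sees the version from its snapshot
```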