一、背景
MySQL从5.6开始引入了Online DDL,alter操作不再阻塞dml。在MySQL 8.0中,针对Online DDL做了进一步优化,alter table加列操作支持INSTANT算法,意思就是使用这个算法进行加列操作只需要修改表的元数据信息,操作瞬间就完成了。在MySQL 8.0.30以后,instant算法支持加列加到表的任一位置,并且也支持删列、重命名表等DDL操作。实际DDL中支持Online DDL的操作默认都会使用 ALGORITHM=INSTANT。
二、问题
那么既然现在MySQL的DDL这么快,我们是不是随便什么时候都可以去数据库中对表进行DDL呢?其实不是的,即使是Online DDL也要在业务低峰期进行。如果在对表进行Online DDL的时候刚好这个表有个慢查询在执行,那么DDL语句将等待这个查询的元数据锁(metadata_lock),后续对这个表的所有DML语句都将被这个DDL阻塞,进而很容易造成连环堵塞和CPU飙升的状况,对业务系统产生极大的影响。
三、实验
下面,将实际演示一下Online DDL引发的阻塞问题:
会话A:开启一个事务,执行一条select不提交,那这个事务将一直持有表notest的元数据锁。
mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> select * from test.notest where id=1; Empty set (0.00 sec)
会话B:对表进行加列的DDL操作,可以看到该操作被堵塞,其实就是在等待会话A的元数据锁。
mysql> alter table test.notest add age int;
会话C、会话D:再开启两个会话,对该表进行简单的查询,该查询也在等待锁,没有返回结果。
mysql> select * from test.notest;
查看processlist,可以看到ddl和dml都在等待表的元数据锁:
mysql> show processlist; +-------+-----------------+--------------------+------+------------------+-------+-----------------------------------------------------------------+-------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +-------+-----------------+--------------------+------+------------------+-------+-----------------------------------------------------------------+-------------------------------------+ | 5 | event_scheduler | localhost | NULL | Daemon | 15986 | Waiting on empty queue | NULL | | 33 | repl | 10.2.111.193:33644 | NULL | Binlog Dump GTID | 15964 | Source has sent all binlog to replica; waiting for more updates | NULL | | 17805 | root | localhost | test | Query | 64 | Waiting for table metadata lock | alter table test.notest add age int | | 17814 | root | localhost | test | Sleep | 346 | | NULL | | 17973 | root | localhost | NULL | Sleep | 368 | | NULL | | 18370 | root | localhost | NULL | Query | 42 | Waiting for table metadata lock | select * from test.notest | | 18393 | root | localhost | NULL | Query | 24 | Waiting for table metadata lock | select * from test.notest | | 18418 | root | localhost | NULL | Query | 0 | init | show processlist | +-------+-----------------+--------------------+------+------------------+-------+-----------------------------------------------------------------+-------------------------------------+ 8 rows in set (0.00 sec)
查看元数据锁监控表performance_schema.metadata_locks表信息,可以看到当前数据库中存在的元数据锁以及元数据锁的对象和锁类型(在MySQL中,为了提高数据库的并发度,元数据锁被细分为了11种类型)。可以看到DDL语句给表带来的元数据锁类型为EXCLUSIVE,元数据EXCLUSIVE锁被持有期间任何其他的元数据锁都不能被授予,所以就阻塞了后续对表的所有DML操作,也包括select。
mysql> select * from performance_schema.metadata_locks; +-------------+--------------------+----------------+-------------+-----------------------+---------------------+---------------+-------------+--------------------+-----------------+----------------+ | OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | COLUMN_NAME | OBJECT_INSTANCE_BEGIN | LOCK_TYPE | LOCK_DURATION | LOCK_STATUS | SOURCE | OWNER_THREAD_ID | OWNER_EVENT_ID | +-------------+--------------------+----------------+-------------+-----------------------+---------------------+---------------+-------------+--------------------+-----------------+----------------+ | TABLE | test | notest | NULL | 140139226332208 | SHARED_READ | TRANSACTION | GRANTED | sql_parse.cc:6085 | 18049 | 5 | | GLOBAL | NULL | NULL | NULL | 132890928 | INTENTION_EXCLUSIVE | STATEMENT | GRANTED | sql_base.cc:5476 | 17881 | 21 | | BACKUP LOCK | NULL | NULL | NULL | 123469776 | INTENTION_EXCLUSIVE | TRANSACTION | GRANTED | sql_base.cc:5483 | 17881 | 21 | | SCHEMA | test | NULL | NULL | 125839424 | INTENTION_EXCLUSIVE | TRANSACTION | GRANTED | sql_base.cc:5463 | 17881 | 21 | | TABLE | test | notest | NULL | 125839520 | SHARED_UPGRADABLE | TRANSACTION | GRANTED | sql_parse.cc:6085 | 17881 | 21 | | TABLESPACE | NULL | test/notest | NULL | 130194048 | INTENTION_EXCLUSIVE | TRANSACTION | GRANTED | lock.cc:808 | 17881 | 21 | | TABLE | test | #sql-5246_458d | NULL | 124845680 | EXCLUSIVE | STATEMENT | GRANTED | sql_table.cc:17024 | 17881 | 21 | | TABLE | test | notest | NULL | 126124176 | EXCLUSIVE | TRANSACTION | PENDING | mdl.cc:3754 | 17881 | 22 | | TABLE | test | notest | NULL | 140138743169920 | SHARED_READ | TRANSACTION | PENDING | sql_parse.cc:6085 | 18446 | 3 | | TABLE | performance_schema | metadata_locks | NULL | 140138825181536 | SHARED_READ | TRANSACTION | GRANTED | sql_parse.cc:6085 | 17890 | 26 | | TABLE | test | notest | NULL | 140139414229984 | SHARED_READ | TRANSACTION | PENDING | sql_parse.cc:6085 | 18469 | 3 | +-------------+--------------------+----------------+-------------+-----------------------+---------------------+---------------+-------------+--------------------+-----------------+----------------+ 11 rows in set (0.00 sec)
同时,可以查询sys.schema_table_lock_waits视图,该视图显示了当前元数据锁的锁等待信息。可以看到等待和阻塞的会话ID,并且kill掉阻塞会话的语句也直接在sql_kill_blocking_connection这一列给出来了,非常方便。
mysql> select * from sys.schema_table_lock_waits; +---------------+-------------+-------------------+-------------+-----------------+-------------------+-----------------------+-------------------------------------+--------------------+-----------------------------+-----------------------------+--------------------+--------------+------------------+--------------------+------------------------+-------------------------+------------------------------+ | object_schema | object_name | waiting_thread_id | waiting_pid | waiting_account | waiting_lock_type | waiting_lock_duration | waiting_query | waiting_query_secs | waiting_query_rows_affected | waiting_query_rows_examined | blocking_thread_id | blocking_pid | blocking_account | blocking_lock_type | blocking_lock_duration | sql_kill_blocking_query | sql_kill_blocking_connection | +---------------+-------------+-------------------+-------------+-----------------+-------------------+-----------------------+-------------------------------------+--------------------+-----------------------------+-----------------------------+--------------------+--------------+------------------+--------------------+------------------------+-------------------------+------------------------------+ | test | notest | 17881 | 17805 | root@localhost | EXCLUSIVE | TRANSACTION | alter table test.notest add age int | 53 | 0 | 0 | 18049 | 17973 | root@localhost | SHARED_READ | TRANSACTION | KILL QUERY 17973 | KILL 17973 | | test | notest | 18446 | 18370 | root@localhost | SHARED_READ | TRANSACTION | select * from test.notest | 47 | 0 | 0 | 18049 | 17973 | root@localhost | SHARED_READ | TRANSACTION | KILL QUERY 17973 | KILL 17973 | | test | notest | 18469 | 18393 | root@localhost | SHARED_READ | TRANSACTION | select * from test.notest | 44 | 0 | 0 | 18049 | 17973 | root@localhost | SHARED_READ | TRANSACTION | KILL QUERY 17973 | KILL 17973 | | test | notest | 17881 | 17805 | root@localhost | EXCLUSIVE | TRANSACTION | alter table test.notest add age int | 53 | 0 | 0 | 17881 | 17805 | root@localhost | SHARED_UPGRADABLE | TRANSACTION | KILL QUERY 17805 | KILL 17805 | | test | notest | 18446 | 18370 | root@localhost | SHARED_READ | TRANSACTION | select * from test.notest | 47 | 0 | 0 | 17881 | 17805 | root@localhost | SHARED_UPGRADABLE | TRANSACTION | KILL QUERY 17805 | KILL 17805 | | test | notest | 18469 | 18393 | root@localhost | SHARED_READ | TRANSACTION | select * from test.notest | 44 | 0 | 0 | 17881 | 17805 | root@localhost | SHARED_UPGRADABLE | TRANSACTION | KILL QUERY 17805 | KILL 17805 | +---------------+-------------+-------------------+-------------+-----------------+-------------------+-----------------------+-------------------------------------+--------------------+-----------------------------+-----------------------------+--------------------+--------------+------------------+--------------------+------------------------+-------------------------+------------------------------+ 6 rows in set (0.01 sec)
四、结论
通过以上的实验,可以得出结论,如果在对表进行Online DDL时,该表上存在元数据锁,那么DDL将一直等待元数据锁释放,直到超过参数lock_wait_timeout的超时时间,并且该DDL会阻塞后续对该表的所有操作。因此,即使现在支持instant算法的Online DDL可以秒加列,也要在业务低峰期进行,并且执行DDL前,最好查询一下performance_schema.metadata_locks,检查是否存在元数据锁。
文章评论