MySQL knowledge map
This MySQL knowledge map covers the MySQL logical architecture, storage data structures, storage engines, SQL execution order, index optimization, SQL optimization, and more.
Edited at 2023-12-16 22:43:28
MySQL Advanced
sql_mode: basic syntax and validation rules
ONLY_FULL_GROUP_BY
For a GROUP BY aggregation, a column in the SELECT list that does not appear in the GROUP BY clause (and is not inside an aggregate function) makes the statement illegal.
NO_AUTO_VALUE_ON_ZERO
This mode affects inserts into auto-increment columns. By default, inserting 0 or NULL generates the next auto-increment value. Enable this option if you want to store a literal 0 in an auto-increment column; with it set, only NULL generates the next value.
STRICT_ALL_TABLES
STRICT_TRANS_TABLES
For transactional tables the two modes behave identically: if a value is missing or invalid, MySQL raises an error, and the statement is aborted and rolled back.
NO_ZERO_IN_DATE
In strict mode, dates whose month or day part is zero (for example '2010-00-01') are not allowed.
NO_ZERO_DATE
With this mode set, MySQL does not allow the zero date '0000-00-00' to be inserted; in strict mode the insert raises an error instead of a warning.
ERROR_FOR_DIVISION_BY_ZERO
In strict mode, division by zero in an INSERT or UPDATE raises an error. Without this mode, division by zero silently returns NULL.
NO_ENGINE_SUBSTITUTION
If the requested storage engine is disabled or does not exist, an error is raised. When this mode is not set, the default storage engine is used instead, with a warning.
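As a minimal sketch of how these modes are inspected and set, the statements below work at session level. The table `emp` and its columns are hypothetical, and the mode list is only an illustrative selection, not a recommended configuration.

```sql
-- Show the modes active for the current session.
SELECT @@SESSION.sql_mode;

-- Enable a selection of modes for this session only (illustrative choice).
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION';

-- Hypothetical table used only for this example.
CREATE TABLE emp (
  id   INT PRIMARY KEY,
  dept VARCHAR(20),
  name VARCHAR(20)
);

-- Rejected under ONLY_FULL_GROUP_BY: name is neither grouped nor aggregated.
-- SELECT dept, name FROM emp GROUP BY dept;

-- Accepted: every selected column is grouped or aggregated.
SELECT dept, COUNT(*) AS headcount FROM emp GROUP BY dept;
```

Setting the mode with SET SESSION keeps the experiment local to one connection; SET GLOBAL or the configuration file would be needed for a server-wide change.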
MySQL logical architecture
Overview of the logical architecture
Connection layer
The top layer consists of clients and connection services, including local socket communication and most client/server communication over TCP/IP or similar protocols.
It handles connection processing, authentication and authorization, and related security measures. A thread pool is introduced at this layer to provide threads for clients that have passed authentication.
Service layer
The second layer provides most of the core services: the SQL interface, the query cache, SQL parsing and optimization, and the execution of built-in functions.
At this layer the server parses the query, builds the internal parse tree, and optimizes it, for example by deciding the order in which tables are read and whether to use indexes, and finally generates the execution plan.
Core services
Management Services & Utilities: system management and control tools
SQL Interface
Accepts SQL commands from users and returns the requested results. For example, a SELECT ... FROM statement goes through the SQL Interface.
Parser
When a SQL command reaches the parser, it is validated and parsed.
Optimizer: query optimizer
Before a SQL statement is executed, the query optimizer optimizes it.
Cache and Buffer: query cache
If the query cache holds a matching result, the statement can fetch the data directly from the cache.
The caching mechanism is made up of a series of smaller caches, for example the table cache, record cache, key cache, and privilege cache.
MySQL 8.0 removed the query cache entirely.
Engine layer
The storage engine is what actually stores and retrieves data in MySQL; the server communicates with the storage engine through an API. Different storage engines offer different features, so choose one according to actual needs.
Storage layer
The data storage layer stores data on the file system, which runs on the raw device, and handles the interaction with the storage engine.
SQL execution cycle
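Turn on the SQL query cache
Method 1: Modify the configuration file and add the configuration entry
On Windows the file is my.ini
On Linux the file is my.cnf
Method 2: Use a mysql command
set global query_cache_type = 1;
Check whether the query cache is enabled
show variables like '%query_cache%';
Check whether profiling is enabled (profiling shows the stages of the SQL execution cycle)
show variables like '%profiling%';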
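To look at the execution cycle of a single statement, session profiling can be used roughly as follows. This is a sketch: the Query_ID value 1 is an assumption, and SHOW PROFILE is deprecated in recent MySQL versions in favor of the Performance Schema, although the statements still exist in 5.7 and 8.0.

```sql
-- Turn profiling on for the current session.
SET profiling = 1;

-- Run the statement to be examined (any query will do).
SELECT 1;

-- List the profiled statements with their Query_ID and duration.
SHOW PROFILES;

-- Show the time spent in each stage of one statement.
-- Query_ID 1 is an assumption; use the id reported by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;
```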
MySQL storage data structure
Pursuing efficient query speed: B+tree
B-tree
Time complexity
O(log_2 N)
Non-leaf nodes store data
Each disk page holds fewer keys, so more disk I/O operations are needed.
A query must be matched level by level, and a range query has to traverse the tree.
Deleting a node causes the tree to be rearranged.
B+tree
Time complexity
O(log_3 N)
for a tree of order 3
Non-leaf nodes do not store data
Each disk page holds many keys, which reduces disk I/O: one I/O operation retrieves more index information.
Row data is found only in the leaf nodes. The leaf nodes are linked by bidirectional pointers into a linked list, which supports range queries.
Deleting a node does not rearrange the B+tree.
To avoid rearranging the B+tree index, MySQL does not delete non-leaf nodes when index entries are deleted; it marks the data on the leaf nodes as invalid instead. As more and more index entries on the table are deleted, a large number of index holes appear: the index information in the non-leaf nodes can still be found, but the data on the leaf nodes is invalid.
Note: this is the cause of index holes. Continuous deletion leaves holes in the index; the fix is to rebuild the index (drop the old index and create a new one).
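Rebuilding an index to remove holes could look like the sketch below. The table `orders` and the index `idx_customer` are hypothetical names; option 2 is an alternative not mentioned in the source, which rebuilds the whole table rather than one index.

```sql
-- Hypothetical table with a secondary index.
CREATE TABLE orders (
  id          INT PRIMARY KEY,
  customer_id INT,
  KEY idx_customer (customer_id)
) ENGINE = InnoDB;

-- Option 1: drop and recreate a single index in one statement.
ALTER TABLE orders
  DROP INDEX idx_customer,
  ADD INDEX idx_customer (customer_id);

-- Option 2: rebuild the whole table and all of its indexes.
OPTIMIZE TABLE orders;
```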
MySQL storage engines
InnoDB engine
InnoDB has been the default engine since MySQL 5.5. It is MySQL's default transactional engine, designed to handle large numbers of short-lived transactions, and it guarantees complete commit and rollback of transactions.
Clustered index (must exist, and there is exactly one)
Non-clustered index
MyISAM storage engine
MyISAM was the default storage engine before MySQL 5.5. It provides many features, including full-text indexing, compression, and spatial functions (GIS), but it does not support transactions or row-level locks, and it cannot recover safely after a crash.
Archive engine
Supports only INSERT and SELECT operations, which suits logging and data-collection (archive) applications. Archive tables are approximately 75% smaller than MyISAM tables and approximately 83% smaller than transactional InnoDB tables.
Memory engine
A Memory table is useful when you need fast access to data that will not be modified and that can be lost on restart. The Memory engine is the fastest and suits temporary data. The data is held in memory: when the server restarts the data is lost, but the table structure is kept. The hash index used by Memory supports neither range queries nor sorting.
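A short sketch of choosing an engine per table follows. The tables `account` and `session_cache` are hypothetical and only illustrate a transactional table next to an in-memory one.

```sql
-- List the engines this server supports and which one is the default.
SHOW ENGINES;

-- Transactional data: InnoDB.
CREATE TABLE account (
  id      INT PRIMARY KEY,
  balance DECIMAL(10, 2)
) ENGINE = InnoDB;

-- Temporary lookup data that may be lost on restart: MEMORY.
CREATE TABLE session_cache (
  token   CHAR(32) PRIMARY KEY,
  user_id INT
) ENGINE = MEMORY;

-- Check which engine each table ended up with.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('account', 'session_cache');
```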
SQL execution order
from -> on -> join -> where -> group by -> having -> select -> distinct -> order by -> limit
Mnemonic (a Chinese phonetic wordplay on the clause names that does not carry over into English): Buddha told me to work. Speed ol
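The logical order can be read off an annotated query. This is a sketch with hypothetical tables `team` and `worker`; the numbers in the comments follow the order listed above, not the order in which the clauses are written.

```sql
-- Hypothetical tables for the example.
CREATE TABLE team   (id INT PRIMARY KEY, name VARCHAR(20));
CREATE TABLE worker (id INT PRIMARY KEY, team_id INT, salary INT);

SELECT DISTINCT t.name, COUNT(*) AS headcount  -- 7. select, 8. distinct
FROM worker w                                  -- 1. from
JOIN team t                                    -- 3. join
  ON w.team_id = t.id                          -- 2. on
WHERE w.salary > 1000                          -- 4. where
GROUP BY t.name                                -- 5. group by
HAVING COUNT(*) > 1                            -- 6. having
ORDER BY headcount DESC                        -- 9. order by
LIMIT 10;                                      -- 10. limit
```

One consequence of this order is that the alias `headcount` is available in ORDER BY, which runs after SELECT, but not in WHERE, which runs before it.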
Index optimization
Index definition:
An index is a data structure that helps MySQL retrieve data efficiently. In essence, an index is a sorted data structure.
Storage location
Database indexes are stored on disk. InnoDB reads disk data in units of pages; the default InnoDB page size is 16 KB (16384 bytes).
Advantages and disadvantages of indexes
Advantages
Improve data retrieval efficiency and reduce I/O cost
Reduce the cost of sorting data and reduce CPU consumption
Disadvantages
An index is effectively a table of its own that stores the primary key and the indexed fields and points to the records of the base table, so indexes also take up space.
Although indexes speed up queries, they slow down table updates (INSERT, UPDATE, DELETE), because the index must be maintained as well.
MySQL index
Clustered and non-clustered indexes
Clustered index
How it is built
The table has a primary key: a B+tree is built on the primary key
The table has no primary key
If there is a unique index, the first unique index (on NOT NULL columns) is used as the clustered index
If there is neither a primary key nor a usable unique index, InnoDB builds the clustered index on a hidden row ID
A primary key index is always the clustered index, but the clustered index is not necessarily a primary key index: it may also be a unique index or the hidden row-ID index.
Tree structure: leaf nodes store the table's row data
Lookup: locate the clustered index key, then read the corresponding row from the leaf node (one I/O)
Every InnoDB table has exactly one clustered index. Currently only InnoDB supports clustered indexes; MyISAM does not.
Non-clustered index (secondary index)
How it is built: every index other than the clustered index is a non-clustered index.
Tree structure: leaf nodes store the key of the clustered index
Lookup: locate the non-clustered index key, obtain the corresponding clustered index key, then go to the clustered index to read the row by that key (a back-to-table lookup). Avoid back-to-table lookups where possible, because each one may cost an additional I/O operation.
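The difference between a lookup that stays inside the secondary index and one that goes back to the table can be sketched as follows. The table `user_account` and the index `idx_email` are hypothetical, and the plans actually shown by EXPLAIN depend on the server version and the data.

```sql
-- Hypothetical table: id is the clustered index, idx_email is a secondary index.
CREATE TABLE user_account (
  id    BIGINT PRIMARY KEY,
  email VARCHAR(100) NOT NULL,
  name  VARCHAR(50),
  KEY idx_email (email)   -- leaf entries hold email plus the primary key id
) ENGINE = InnoDB;

-- Both requested columns live in idx_email, so no back-to-table lookup is needed.
EXPLAIN SELECT id, email FROM user_account WHERE email = 'a@example.com';

-- name is not in idx_email: the id found there is used to read the row
-- from the clustered index (back-to-table lookup).
EXPLAIN SELECT name FROM user_account WHERE email = 'a@example.com';
```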
Index-related syntax
Create an index
create index index_xxx on table_name (column_name);
Create a prefix index
create index index_columnname_prefixlength on table_name (column_name(prefix_length));
Create a joint index
create index index_column1_columnn on table_name (column_1, column_n);
Delete an index
drop index index_name on table_name;
List the indexes of a table
show index from table_name;
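Filled in with concrete names, the statements above might look like this. The table `article` and all index names are hypothetical.

```sql
-- Hypothetical table for the example.
CREATE TABLE article (
  id         INT PRIMARY KEY,
  title      VARCHAR(200),
  author     VARCHAR(50),
  created_at DATETIME
);

-- Ordinary single-column index.
CREATE INDEX index_author ON article (author);

-- Prefix index on the first 20 characters of title.
CREATE INDEX index_title_20 ON article (title(20));

-- Joint index on two columns.
CREATE INDEX index_author_created ON article (author, created_at);

-- List the indexes, then drop one of them.
SHOW INDEX FROM article;
DROP INDEX index_author ON article;
```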
Index classification
Single-value index
An index on a single field
Unique index
Guarantees that the values of a column are unique. A column that must be unique and is often used in queries is a good candidate. The primary key index is a unique index.
Primary key index
When a table is created with a field specified as the primary key, MySQL creates an index on the primary key by default. If no primary key is specified, MySQL does not create a primary key index, but there is still an implicit index (built on the hidden row ID).
Both a unique index and a primary key index can serve as the clustered index; the primary key index takes precedence.
Composite index (joint index)
Uses multiple fields as one index. Proper use of joint indexes can avoid back-to-table lookups.
Leftmost prefix principle: the query must use the leftmost column, otherwise the joint index cannot be used.
If a column in the middle of the joint index is skipped, the columns after it cannot use the index.
In a range query, the index columns after a > or < condition cannot be used. Prefer >= or <= where possible.
Try to use covering indexes
Try to place the most selective fields at the far left of the joint index
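The leftmost prefix principle can be checked with EXPLAIN. The sketch below assumes a hypothetical table `staff` with a joint index on (name, age, city); the comments describe the behavior expected on a populated table, and the optimizer may choose differently on an empty or very small one.

```sql
-- Hypothetical table with a three-column joint index.
CREATE TABLE staff (
  id   INT PRIMARY KEY,
  name VARCHAR(20),
  age  INT,
  city VARCHAR(20),
  note VARCHAR(100),
  KEY idx_name_age_city (name, age, city)
);

-- All three index columns are usable.
EXPLAIN SELECT * FROM staff WHERE name = 'Ann' AND age = 30 AND city = 'Oslo';

-- age is skipped: only the name part of the index can be used.
EXPLAIN SELECT * FROM staff WHERE name = 'Ann' AND city = 'Oslo';

-- The leftmost column is missing: the joint index cannot be used for the lookup.
EXPLAIN SELECT * FROM staff WHERE age = 30 AND city = 'Oslo';
```

Comparing the key_len column across the three plans is one way to see how many index columns were actually used.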
When to create an index
Create an index
The primary key automatically gets a primary key index
Frequently queried fields
Fields used to join other tables, and fields used as foreign keys
Fields used for sorting
Fields used for grouping
Grouping sorts first and then removes duplicates
Do not create an index
Tables with too few records
Tables that are frequently inserted into, deleted from, or updated
Fields not used in WHERE conditions
Index design principles
Create indexes on tables that hold large amounts of data and are queried frequently
Index fields that are often used as query conditions, such as those in WHERE, GROUP BY, and ORDER BY
For a long string field, create a prefix index based on the characteristics of the string.
Prefer joint indexes over single-column indexes. A joint index can act as a covering index, which saves disk space, avoids back-to-table lookups, and improves efficiency.
Control the number of indexes: more is not better. Indexes take up disk space and take time to maintain, which slows down inserts, deletes, and updates.
If an indexed field cannot hold NULL values, declare it NOT NULL when creating the table. This helps the optimizer.
Do not use SELECT *. Select only the fields you need so that covering indexes can be used.
Index failure
Applying a function or an arithmetic operation to an indexed field invalidates the index.
A string field compared with a value that is not enclosed in single quotes triggers an implicit type conversion, which is equivalent to wrapping the field in a conversion function, so the index is not used.
In a fuzzy (LIKE) query, a leading % invalidates the index.
Both sides of an OR must be indexed. If either side has no index, no index is used.
Effect of data distribution: if MySQL estimates that a full table scan is faster than using the index, the index is not used.
In a joint index, the columns after a range condition (<, >, !=) cannot use the index.
IS NOT NULL tends to make the index unusable while IS NULL can use it; the actual choice also depends on the data distribution.
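A few of these failure cases can be reproduced with EXPLAIN. The table `customer` and its indexes are hypothetical, and the comments state what is typically seen on a populated table rather than a guaranteed plan.

```sql
-- Hypothetical table with single-column indexes.
CREATE TABLE customer (
  id    INT PRIMARY KEY,
  phone VARCHAR(20),
  name  VARCHAR(50),
  KEY idx_phone (phone),
  KEY idx_name (name)
);

-- Function applied to the indexed column: idx_phone is not used.
EXPLAIN SELECT * FROM customer WHERE SUBSTRING(phone, 1, 3) = '138';

-- String column compared with a number: implicit conversion, idx_phone is not used.
EXPLAIN SELECT * FROM customer WHERE phone = 13800000000;

-- Same comparison with quotes: idx_phone can be used.
EXPLAIN SELECT * FROM customer WHERE phone = '13800000000';

-- Leading wildcard: idx_name is not used for the lookup.
EXPLAIN SELECT * FROM customer WHERE name LIKE '%son';

-- Trailing wildcard only: idx_name can be used as a range.
EXPLAIN SELECT * FROM customer WHERE name LIKE 'John%';
```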
Index optimization analysis
Optimizer
EXPLAIN: view the execution plan
Syntax
explain SQL_statement
Efficiency of grouping versus sorting
Grouping has to sort first and then remove duplicates, so sorting alone is cheaper than grouping.
Result analysis
id
A unique ID that identifies one query (SELECT)
Rows with the same id are executed from top to bottom; rows with different ids are executed from the largest id to the smallest.
For example, if the plan has two rows with id 2 and one row with id 1, the two id-2 rows run first, from top to bottom, and the id-1 row runs after them.
Subqueries lead to multiple queries, so use as few subqueries as you can. The number of queries reflects the number of I/O operations.
The number of distinct id values equals the number of SELECTs in the statement (the maximum id gives the count); rows with the same id belong to the same query.
select_type
The type of the query in this row
SIMPLE: a simple query that uses no subqueries or unions.
PRIMARY: the outermost (primary) query
SUBQUERY: a subquery
UNION: a query that is part of a UNION
DERIVED: a derived table (a subquery in the FROM clause)
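The id and select_type columns can be observed with a subquery and a UNION. The tables `department` and `employee` are hypothetical; the values named in the comments are what such statements usually produce, and the optimizer is free to rewrite a subquery into a join, in which case the plan shows SIMPLE rows instead.

```sql
-- Hypothetical tables for the example.
CREATE TABLE department (id INT PRIMARY KEY, name VARCHAR(20));
CREATE TABLE employee   (id INT PRIMARY KEY, name VARCHAR(20), dept_id INT);

-- Typically two rows: id 1 with select_type PRIMARY (employee)
-- and id 2 with select_type SUBQUERY (department).
EXPLAIN
SELECT e.name
FROM employee e
WHERE e.dept_id = (SELECT d.id FROM department d WHERE d.name = 'Sales');

-- Typically PRIMARY for the first SELECT and UNION for the second.
EXPLAIN
SELECT name FROM employee
UNION
SELECT name FROM department;
```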
table
The table this row operates on
partitions
The partitions matched by the query
type
How MySQL accesses (joins) this table
system
The table has only one row (a system table); a special case of const.
const
The query matches at most one row and reads it in one step, through a primary key or unique index compared with a constant.
eq_ref
An index is used to join the table, and for each row of the driving table exactly one row is found in this table.
ref: reference
eq: equal
eq_ref: equal-value reference; used in multi-table joins where the rows of A and B are related one to one
ref
A multi-valued reference. An index is used, in a lookup or a join, and each value from the driving table can match multiple rows in this table.
range
A range query on an indexed field
between ... and ...
in (...)
>= and <=
index
A full index scan: only the index tree is traversed.
The selected fields must all be in the index (non-clustered index fields plus clustered index fields)
all
Full table scan
Access performance (from best to worst)
system -> const -> eq_ref -> ref -> index_merge -> range -> index -> all
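The following sketch pairs simple statements with the access type they usually produce. The table `emp_t` is hypothetical, and the type noted in each comment is what is commonly seen on a populated table, not a guarantee.

```sql
-- Hypothetical table: primary key on id, secondary index on dept_id.
CREATE TABLE emp_t (
  id      INT PRIMARY KEY,
  dept_id INT,
  salary  INT,
  KEY idx_dept (dept_id)
);

EXPLAIN SELECT * FROM emp_t WHERE id = 1;                 -- usually const
EXPLAIN SELECT * FROM emp_t WHERE dept_id = 3;            -- usually ref
EXPLAIN SELECT * FROM emp_t WHERE id BETWEEN 10 AND 20;   -- usually range
EXPLAIN SELECT dept_id FROM emp_t;                        -- usually index
EXPLAIN SELECT * FROM emp_t WHERE salary > 1000;          -- usually ALL
```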
possible_keys
Indexes that could be used
key
The index actually used by the query
key_len
Length of the index actually used, in bytes
Calculation formula (3 * n assumes a character set with 3 bytes per character, such as utf8)
int
NULL allowed: 4 + 1
NULL is not allowed: 4
bigint
NULL allowed: 8 + 1
NULL is not allowed: 8
char(n)
NULL allowed: 3 * n + 1
NULL is not allowed: 3 * n
varchar(n)
NULL allowed: 3 * n + 2 (2 bytes reserved for the variable length) + 1
NULL is not allowed: 3 * n + 2
text(n)
NULL allowed: 3 * n + 2 (2 bytes reserved for the variable length) + 1
NULL is not allowed: 3 * n + 2
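Applying the formulas to a concrete table gives the values below. The table `kl_demo` is hypothetical and uses the 3-byte utf8 character set to match the 3 * n factor; the key_len values in the comments are computed from the formulas and apply when the optimizer chooses the named index.

```sql
-- Hypothetical table; utf8 stores up to 3 bytes per character.
CREATE TABLE kl_demo (
  id   INT NOT NULL PRIMARY KEY,
  age  INT NULL,
  code CHAR(10) NOT NULL,
  name VARCHAR(20) NULL,
  KEY idx_age (age),
  KEY idx_code (code),
  KEY idx_name (name)
) DEFAULT CHARSET = utf8;

-- idx_age:  int, NULL allowed          -> 4 + 1          = 5
EXPLAIN SELECT * FROM kl_demo WHERE age = 30;

-- idx_code: char(10), NOT NULL         -> 3 * 10         = 30
EXPLAIN SELECT * FROM kl_demo WHERE code = 'A001';

-- idx_name: varchar(20), NULL allowed  -> 3 * 20 + 2 + 1 = 63
EXPLAIN SELECT * FROM kl_demo WHERE name = 'Ann';
```

With utf8mb4 the factor becomes 4 bytes per character, so the same columns would give 40 and 83.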
ref
Shows which column of which table (or which constant) is compared with the index. The kind of reference is given by type (ref, eq_ref).
rows
Estimated number of rows MySQL must examine in this table; the fewer the better
filtered
The percentage of the rows found through the access method given by type that remain after the query conditions are applied.
Extra
Additional analysis information that MySQL provides for this query
Using filesort
A file sort is used: the ORDER BY fields are not indexed, or an index exists but cannot be used (in that case, index the sort fields)
Using where
The query filters rows with a WHERE condition
Using temporary
A temporary table is used
Using index
Covering index
All the data can be obtained from the index alone, without going back to the table (every selected field is an indexed field, so the conditions are satisfied without a back-to-table lookup)
The queried fields are exactly the keys of the non-clustered index and the key of the clustered index.
Backward index scan
The index is read in descending order
Using index condition
Index condition pushdown:
While traversing a joint index (a secondary, non-primary-key index), the conditions on all fields contained in the index are checked first, and records that fail them are filtered out before going back to the table, which effectively reduces the number of back-to-table lookups.
For example, index pushdown is used when filtering on the fields of a secondary index.
Note: index pushdown was introduced in MySQL 5.6
Impossible WHERE
MySQL determines that the WHERE condition of this statement can never be satisfied
For example, where id = 1 and id = 2 can never be satisfied.
Using join buffer
The join fields have no index. When this appears, consider adding indexes on the fields used in the join.
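Several of these Extra values can be provoked on purpose. The table `person` and its joint index are hypothetical; the Extra value in each comment is what is commonly reported, and the exact text can vary with version and data.

```sql
-- Hypothetical table with a joint index on (name, age).
CREATE TABLE person (
  id   INT PRIMARY KEY,
  name VARCHAR(20),
  age  INT,
  city VARCHAR(20),
  KEY idx_name_age (name, age)
);

-- city is not indexed: commonly Using filesort.
EXPLAIN SELECT * FROM person ORDER BY city;

-- Every selected column is in idx_name_age: commonly Using index.
EXPLAIN SELECT name, age FROM person WHERE name = 'Ann';

-- The age condition can be checked inside the index before the table lookup:
-- commonly Using index condition.
EXPLAIN SELECT * FROM person WHERE name LIKE 'A%' AND age = 30;

-- Contradictory condition: Impossible WHERE.
EXPLAIN SELECT * FROM person WHERE id = 1 AND id = 2;
```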
SQL optimization
insert optimization in three aspects
For batch inserts, use insert into table_name values (...), (...) rather than many single-row inserts in sequence; with MyBatis, use the dynamic SQL <foreach> element.
Commit transactions manually. Do not auto-commit after every inserted row: frequently creating and committing transactions costs performance.
Insert rows in primary key order: sequential primary key inserts are faster than out-of-order inserts.
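The three points can be combined in one pattern, sketched here with a hypothetical table `t_log`: multi-row inserts, one explicit transaction, and ascending primary keys.

```sql
-- Hypothetical table for the example.
CREATE TABLE t_log (
  id  INT PRIMARY KEY,
  msg VARCHAR(50)
);

-- One transaction around the whole batch instead of one commit per row.
START TRANSACTION;

-- Multi-row inserts, with primary keys in ascending order.
INSERT INTO t_log (id, msg) VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO t_log (id, msg) VALUES (4, 'd'), (5, 'e'), (6, 'f');

COMMIT;
```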
Primary key optimization
Two page phenomena
Page split
When data page one and data page two are both full and a row whose primary key falls in the middle of one of them has to be inserted out of order, InnoDB creates a new page, moves the rows after the 50% point of the full page to it, and inserts the out-of-order row. The new page is then linked between data page one and data page two in the doubly linked list. The purpose is to keep the leaf nodes in order.
Page merge
When index data is deleted in InnoDB, the data on the leaf nodes is marked as deleted (see the index hole problem). When the deleted data reaches 50% of a page, InnoDB checks whether the page can be merged with the page before or after it. The purpose is to save space.
Three guidelines for primary key optimization
Keep the primary key short. A long key takes up space, produces more pages, and causes more I/O.
Insert primary keys in order to avoid page splits.
Do not use a UUID or a natural key such as an ID card number as the primary key
A UUID is random, so the primary keys arrive out of order and cause page splits.
A UUID is long and takes up space
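One way to follow these guidelines is a short auto-increment primary key, with the business identifier kept in a separate unique column. This is a sketch; the table `order_log` and its columns are hypothetical.

```sql
-- Short, monotonically increasing primary key: new rows append to the last page.
CREATE TABLE order_log (
  id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  order_no CHAR(36) NOT NULL,          -- business identifier (for example a UUID)
  UNIQUE KEY uk_order_no (order_no)    -- still searchable, but not the clustered key
) ENGINE = InnoDB;

-- The id is generated in order; the random value only affects the secondary index.
INSERT INTO order_log (order_no) VALUES (UUID());
```

The random values still cause splits in the secondary index `uk_order_no`, but that index holds only the key and the primary key, not the full rows.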
order by optimization
Avoid Using filesort and aim for Using index
Index the sort fields
Do not use select * with order by; query only the required fields.
Single-pass and two-pass sorting
Single-pass sort
Reads all the fields of the qualifying rows at once and sorts them in the sort buffer.
Fast, but uses more memory
Two-pass sort
First reads only the sort fields and the row ID that locates each row, and sorts them in the sort buffer. After sorting, it goes back to the table to fetch the other required fields.
Slow, but saves memory
group by optimization
Index the grouped fields
Sorting and grouping optimization
No filter, no index
Without a WHERE clause the index may not be used; try adding LIMIT
Wrong order means a sort
If the ORDER BY fields are not in the order of the joint index, the whole result must be sorted.
Mixed directions mean a sort
All fields sorted through a joint index must use the same direction, either all ascending or all descending; otherwise the whole result must be sorted.
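The order and direction rules can be checked against a joint index. The table `emp_sort` is hypothetical; the comments describe the Extra values usually reported for an ascending index on (age, salary).

```sql
-- Hypothetical table with an ascending joint index on (age, salary).
CREATE TABLE emp_sort (
  id     INT PRIMARY KEY,
  age    INT,
  salary INT,
  name   VARCHAR(20),
  KEY idx_age_salary (age, salary)
);

-- Same order, same direction as the index: no filesort expected.
EXPLAIN SELECT age, salary FROM emp_sort ORDER BY age, salary;

-- Both descending: the index can be read backwards, no filesort expected.
EXPLAIN SELECT age, salary FROM emp_sort ORDER BY age DESC, salary DESC;

-- Mixed directions: Using filesort expected.
EXPLAIN SELECT age, salary FROM emp_sort ORDER BY age ASC, salary DESC;

-- Columns in the wrong order: Using filesort expected.
EXPLAIN SELECT age, salary FROM emp_sort ORDER BY salary, age;
```

MySQL 8.0 also supports descending index columns, so an index declared as (age ASC, salary DESC) would be an option for the mixed-direction case.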
limit paging optimization
Optimize with covering indexes combined with subqueries or range queries
For example: select * from t limit 1900000,10
Method 1
select id from t order by id limit 1900000,10
First query the primary keys of the page, using a covering index
select t.* from t, (select id from t order by id limit 1900000,10) t1 where t.id = t1.id
Then look up the rows by primary key (eq_ref) through the subquery
Method 2
select id from t order by id limit 1900000,1
First fetch the first primary key id of the page through the index
select t.* from t, (select id from t order by id limit 1900000,1) t1 where t.id >= t1.id order by t.id limit 0,10
Then fetch the page with a range query on the primary key
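Method 1 can also be written with an explicit JOIN, and when the last id of the previous page is known, a plain range condition avoids the large offset altogether. The second statement is keyset pagination, a technique related to but not the same as the methods above; the table definition and the variable `@last_id` are assumptions for illustration.

```sql
-- Hypothetical table; id is the primary key.
CREATE TABLE t (
  id      INT PRIMARY KEY,
  payload VARCHAR(100)
);

-- Method 1 as an explicit join: find the ten ids through the index,
-- then read only those ten rows.
SELECT t.*
FROM t
JOIN (SELECT id FROM t ORDER BY id LIMIT 1900000, 10) AS page
  ON t.id = page.id
ORDER BY t.id;

-- Keyset pagination: continue after the last id seen on the previous page.
SET @last_id = 1900000;   -- assumed value remembered by the application
SELECT * FROM t WHERE id > @last_id ORDER BY id LIMIT 10;
```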
count optimization
The MyISAM engine stores the total row count of a table on disk, so count(*) returns the total directly (when there is no WHERE condition).
InnoDB has to read the rows from the engine and count them one by one. An alternative is to maintain the count yourself, in Redis or in a separate counter table: increment it on every insert and decrement it on every delete.
count(primary key)
InnoDB traverses the whole table, takes the primary key of each row, and returns it to the service layer, which accumulates the count row by row.
count(field)
Without a NOT NULL constraint
InnoDB traverses the whole table, takes the field of each row, and returns it to the service layer. The service layer checks whether the value is NULL and counts only the non-NULL rows.
With a NOT NULL constraint
InnoDB traverses the whole table, takes the field of each row, and returns it to the service layer, which accumulates the count row by row.
count(number)
InnoDB traverses the whole table without reading any value. The service layer puts the given number into each row and accumulates the count row by row.
count(*)
InnoDB traverses the whole table without reading any value. The service layer has a special optimization for this case and accumulates the count row by row.
count(number) and count(*) are the fastest because no value has to be read
When counting a field that is not indexed and has no NOT NULL constraint, remember that NULL values are not counted.
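The effect of NULL values on the different forms of count can be seen on a three-row table. The table `cnt_demo` is hypothetical.

```sql
-- Hypothetical table: nickname allows NULL.
CREATE TABLE cnt_demo (
  id       INT PRIMARY KEY,
  nickname VARCHAR(20) NULL
);

INSERT INTO cnt_demo (id, nickname) VALUES (1, 'a'), (2, NULL), (3, 'c');

-- count(*), count(1) and count(id) count every row;
-- count(nickname) skips the row where nickname is NULL.
SELECT COUNT(*)        AS all_rows,       -- 3
       COUNT(1)        AS ones,           -- 3
       COUNT(id)       AS ids,            -- 3
       COUNT(nickname) AS nicknames       -- 2
FROM cnt_demo;
```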
update optimization
update table_name set column_name = value where indexed_column = value
The field in the update condition must be indexed; otherwise InnoDB cannot lock individual rows and the row lock effectively becomes a table lock.
MySQL master-slave replication
Master-slave replication principle
The slave starts an I/O thread that reads the binlog (binary log) from the master and copies it into the relay log; the slave's SQL thread then reads the relay log and replays its contents.
Basic principles of replication
Each slave can have only one master
Each slave must have a unique server ID
Each master can have multiple slaves
Master configuration
Modify the core configuration file my.cnf
All master and slave configuration items go under the [mysqld] section, in lowercase letters.
binlog_format
STATEMENT (the default before MySQL 5.7.7; ROW is the default from 5.7.7 on)
Statement level
The binlog records every statement that performs a write. This saves space compared with ROW mode but may cause data inconsistency.
For example, the master executes update tt set create_date = sysdate(). The slave executes the statement at a different time, so the data differs. (now() itself is replicated safely, because the binlog records the statement's timestamp.)
Advantages: saves space
Disadvantages: may cause data inconsistency.
ROW
Row level
The binlog records the change to each row after every operation.
Advantages: keeps the data strictly consistent, because whatever the SQL is and whatever functions it calls, only the result of the execution is recorded.
Disadvantages: takes up a lot of space.
MIXED
An upgraded form of STATEMENT that resolves, to some extent, the inconsistencies that statement mode causes in certain situations.
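After editing my.cnf and restarting, the effective settings can be checked from a client session. This is a small sketch that only reads standard system variables; the expected values depend on what was configured.

```sql
-- The server ID must be set and unique among master and slaves.
SHOW VARIABLES LIKE 'server_id';

-- ON means the binary log is enabled on this server.
SHOW VARIABLES LIKE 'log_bin';

-- STATEMENT, ROW or MIXED.
SHOW VARIABLES LIKE 'binlog_format';
```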
Create an account on the master and authorize the slave
grant all privileges on *.* to root@'%' identified by 'root'; # Create a root user and allow remote access (this GRANT ... IDENTIFIED BY form works up to MySQL 5.7)
flush privileges;
Query the status of the master: show master status;
Write down the values of File and Position.
After this step, do not perform further operations on the master MySQL server, so that its status values do not change.
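A narrower alternative to granting all privileges is a dedicated replication account. This is a sketch for MySQL 5.7 and 8.0; the user name `repl` and the password are placeholders, not values from the source.

```sql
-- Hypothetical replication account; choose your own name and password.
CREATE USER 'repl'@'%' IDENTIFIED BY 'choose-a-strong-password';

-- The slave only needs this privilege to read the master's binlog.
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- Note File and Position from the output for the slave configuration.
SHOW MASTER STATUS;
```

Separating CREATE USER from GRANT also works on MySQL 8.0, where GRANT can no longer create an account.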
Slave configuration
Modify the core configuration file my.cnf
[Required] Unique ID of the slave server: server-id=2
[Optional] Enable the relay log: relay-log=mysql-relay
Point the slave at the master (statement template):
CHANGE MASTER TO MASTER_HOST='master IP address', MASTER_USER='X', MASTER_PASSWORD='X', MASTER_LOG_FILE='File value from show master status', MASTER_LOG_POS=Position value from show master status;
Example:
CHANGE MASTER TO MASTER_HOST='mall_mysql_master',MASTER_PORT=3306,MASTER_USER='slave',MASTER_PASSWORD='123456',MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=0;
Start replication on the slave
start slave;
View the status
show slave status\G
Both of the following must be Yes:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
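From MySQL 8.0.23 on, the same steps are available under newer statement names. The sketch below reuses the host, user, password, and log coordinates from the example above; in practice the log file and position come from the master's status output.

```sql
-- Equivalent of CHANGE MASTER TO (MySQL 8.0.23 and later).
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'mall_mysql_master',
  SOURCE_PORT = 3306,
  SOURCE_USER = 'slave',
  SOURCE_PASSWORD = '123456',
  SOURCE_LOG_FILE = 'mysql-bin.000001',
  SOURCE_LOG_POS = 0;

-- Equivalent of START SLAVE.
START REPLICA;

-- Equivalent of SHOW SLAVE STATUS; the fields to check are
-- Replica_IO_Running and Replica_SQL_Running.
SHOW REPLICA STATUS;
```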
Restart MySQL on the master and the slave for the configuration to take effect.
systemctl restart mysqld
Turn off the firewall on both the master and the slave machines
systemctl stop firewalld
Stop replication on the slave and reconfigure master and slave
Run on the slave. Stops the I/O thread and the SQL thread.
mysql> stop slave;
Run on the slave. Deletes the slave's relay log files and starts a new relay log file.
mysql> reset slave;
Run on the master. Deletes all binlog files, clears the log index file, and starts a new log file. Used to initialize the master's binlog when building the master-slave setup for the first time.
mysql> reset master;