MySQL is a widely used relational database management system, and following best practices is essential to ensure performance, reliability, and security. Here are some tips and best practices for working with MySQL
Use normalized tables :
- First Normal Form (1NF): In the first normal form, each column must contain only one value and no table should store repeating groups of related data.
- Each column of your table should be single-valued.
- The values stored in each column must be of the same type.
- Each column in a table should have a unique name.
- You can store the data in the table in any order.
- Second Normal Form (2NF): In the second normal form, first the database must be in the first normal form, it should not store duplicate rows in the same table. And if there are duplicate values in the row, they should be stored in their own separate tables and linked to the table using foreign keys. The ideal way to a database in second normal form is to create one to many relationship tables.
- The table should be in first normal form (1NF).
- There should not be any partial dependency.
- Third Normal Form (3NF): In the third normal form, the database is already in the third form, if it is in the second normal form and every non-key column is mutually independent. Identify any columns in the table that are interdependent and break those columns into their own separate tables.
- The table should be in Second Normal Form (2NF)
- There should not be any transitive dependency for non-prime attributes.
- Boyce Codd Normal Form (BCNF): It is the highest form of the third normal form which deals with different types of anomalies that are not handled by the 3NF.
- The table should be in 3NF
- For dependency like a-> B, A should be a super key which means A cannot be a non-prime attribute if B is a prime attribute.
Refer : https://www.simplilearn.com/tutorials/sql-tutorial/what-is-normalization-in-sql
Always use proper datatype
- One of the most important MySQL best practices is to use datatypes based on the nature of data. Using irrelevant datatypes may consume more space or lead to errors.
- For example: Using varchar (20) instead of DATETIME datatype for storing date time values will lead to errors in date time-related calculations. Also, it is possible that invalid data will be stored.
- Use CHAR (1) over VARCHAR (1) - VARCHAR (1) takes extra bytes to store information, so if you string a single character, it better to use CHAR (1).
- Use the CHAR datatype to store only fixed length data - For example: If the length of the data is less than 1000, using char (1000) instead of varchar (1000) will consume more space.
- Avoid using regional date formats - When using DATETIME or DATE datatype, always use the YYYY-MM-DD date format or ISO date format suitable for your SQL Engine. Regional formats like DD-MM-YYYY or MM-DD-YYYY will not be stored properly.
- Use the smallest data types possible - avoid large char (255) text fields when a varchar or smaller char is enough. If you use the right data type, more records will fit in memory or index key block. This leads to fewer reads and faster performance.
- Use ENUM rather than VARCHAR
Best Practices for InnoDB Tables
- Specify a primary key for every table using the most frequently queried column or columns, or an auto-increment value if there is no obvious primary key.
- Use joins wherever data is pulled from multiple tables based on identical ID values from those tables. For fast join performance, define foreign keys on the join columns, and declare those columns with the same data type in each table. Adding foreign keys ensures that referenced columns are indexed, which can improve performance. Foreign keys also propagate deletes and updates to all affected tables, and prevent insertion of data in a child table if the corresponding IDs are not present in the parent table.
- Turn off autocommit. Committing hundreds of times a second puts a cap on performance (limited by the write speed of your storage device).
- Group sets of related DML operations into transactions by bracketing them with START TRANSACTION and COMMIT statements. While you don't want to commit too often, you also don't want to issue huge batches of INSERT, UPDATE, or DELETE statements that run for hours without committing.
- Do not use LOCK TABLES statements. InnoDB can handle multiple sessions all reading and writing to the same table at once without sacrificing reliability or high performance. To get exclusive write access to a set of rows, use the SELECT ... FOR UPDATE syntax to lock just the rows you intend to update.
- Enable the innodb_file_per_table variable or use general tablespaces to put the data and indexes for tables into separate files instead of the system tablespace. The innodb_file_per_table variable is enabled by default.
- Evaluate whether your data and access patterns benefit from the InnoDB table or page compression features. You can compress InnoDB tables without sacrificing read/write capability.
- Run the server with the --sql_mode=NO_ENGINE_SUBSTITUTION option to prevent tables from being created with storage engines that you do not want to use.
Optimizing SELECT Statements
The main considerations for optimizing queries are:
- To make a slow SELECT ... WHERE query faster, the first thing to check is whether you can add an index. Set up indexes on columns used in the WHERE clause, to speed up evaluation, filtering, and the final retrieval of results. To avoid wasted disk space, construct a small set of indexes that speed up many related queries used in your application.
- Indexes are especially important for queries that reference different tables, using features such as joins and foreign keys. You can use the EXPLAIN statement to determine which indexes are used for a SELECT. Optimizing Queries with EXPLAIN”.
- If a performance issue is not easily solved by one of the basic guidelines, investigate the internal details of the specific query by reading the EXPLAIN plan and adjusting your indexes, WHERE clauses, join clauses, and so on. (When you reach a certain level of expertise, reading the EXPLAIN plan might be your first step for every query.)
- Isolate and tune any part of the query, such as a function call, that takes excessive time. Depending on how the query is structured, a function could be called once for every row in the result set, or even once for every row in the table, greatly magnifying any inefficiency.
- Minimize the number of full table scans in your queries, particularly for big tables.
- Keep table statistics up to date by using the ANALYZE TABLE statement periodically, so the optimizer has the information needed to construct an efficient execution plan.
- Avoid transforming the query in ways that make it hard to understand, especially if the optimizer does some of the same transformations automatically.
- Deal with locking issues, where the speed of your query might be affected by other sessions accessing the tables at the same time.
Here are some types of optimizations MySQL knows how to do: -
- Reordering joins
- Converting OUTER JOINs to INNER JOINs
- Applying algebraic equivalence rules
- COUNT(), MIN(), and MAX() optimizations
- Evaluating and reducing constant expressions
- Use Covering indexes
- Subquery optimization
- Early termination
- Equality propagation
- IN() list comparisons
- Use SELECT * only if needed
- Use ORDER BY Clause only if needed
Few More :
- Partition Large Tables: For tables with a large number of rows, consider partitioning to improve query performance and management.
- Use Connection Pooling: Connection pooling helps manage and reuse database connections, reducing the overhead of opening and closing connections for each query.
- Cache Frequently Used Queries: Implement query caching mechanisms like MySQL Query Cache or use external caching systems like Memcached or Redis to reduce query load on the database server.
- Regularly Review and Tune: Regularly review and tune your MySQL setup based on the changing needs of your applications.