The values making most sense to monitor are:
Slave_running: If the system is a slave replication server, this is an indication of the slave's health.
Threads_connected: The number of clients currrently connected. This should be less than some preset value (like 200), but you can also monitor that it is larger than some value to ensure that clients are active.
Threads_running: If the database is overloaded you'll get an increased number of queries running. That also should be less than some preset value (20?). It is OK to have values over the limit for very short times. Then you can monitor some other values, when the Threads_running was more than the preset value and when it did not fall back in 5 seconds.
Aborted_clients: The number of clients that were aborted (because they did not properly close the connection to the MySQL server). For some applications this can be OK, but for some other applications you might want to track the value, as aborted connects may indicate some sort of application failure.
Questions: Number of queries you get per second. Also, it's total queries, not number per second. To get number per second, you must divide Questions by Uptime.
Handler_*: If you want to monitor low-level database load, these are good values to track. If the value of Handler_read_rnd_next is abnormal relative to the value that you normally would expect, it may indicate some optimization or index problems. Handler_rollback will show the number of queries that have been rolled back. You might want to wish to investigate them.
Opened_tables: Number of table cache misses. If the value is large, you probably need to increase table_cache. Typically you would want this to be less than 1 or 2 opened tables per second.
Select_full_join: Joins performed without keys. This should be zero. This is a good way to catch development errors, as just a few such queries can degrease the system's performance.
Select_scan: Number of queries that performed a full table scan. In some cases these are OK but their ratio to all queries should be constant. if you have the value growing it can be a problem with the optimizer, lack of indexes or some other problem
Slow_queries: Number of queries longer than --long-query-time or that are not using indexes. These should be a small fraction of all queries. If it grows, the system will have performance problems.
Threads_created: This should be low. Higher values may mean that you need to increase the value of thread_cache or you have the amount of connections increasing, which also indicates a potential problem.
Command: mysqladmin -u root -p processlist
You can get the number of threads connected and running by using other statistics, but this is a good way to check how long queries that are running take. If there are some very long-running queries (e.g. due to being badly formulated) the admin should be informed. You might also want to check how many queries are in "Locked" state - these are not counted as running but are inactive, i.e. a user is waiting on the database to respond.
Command: mysql -u root -p -e "SHOW INNODB STATUS"
This statement produces a great deal of information, from which you should extract the parts in which you are interested. The first thing you need to check is: "Per second averages calculated from the last xx seconds". InnoDB rounds stats each minute.
Pending normal aio reads: These are InnoDB IO request queue sizes. If they are bigger than 10-20 you might have some bottleneck.
reads/s, avg bytes/read, writes/s, fsyncs/s: These are IO statistics. Large values for reads/writes means the IO subsystem is being loaded. Proper values for these depend on your system configuration.
Buffer pool hit rate: The hit rate also depends a lot on your application. Check your hit rate, when there are problems.
inserts/s, updates/s, deletes/s, reads/s: These are low level row operations that InnoDB does. You might use these to check your load if it is in expected range.
You can examine the state of the cache from the mysql command line:
mysql> show status like 'qcache%';
+-------------------------+----------+
| Variable_name | Value |
+-------------------------+----------+
| Qcache_free_blocks | 9990 |
| Qcache_free_memory | 34431360 |
| Qcache_hits | 2165383 |
| Qcache_inserts | 461500 |
| Qcache_lowmem_prunes | 113692 |
| Qcache_not_cached | 1894 |
| Qcache_queries_in_cache | 28203 |
| Qcache_total_blocks | 66628 |
+-------------------------+----------+
8 rows in set (0.00 sec)
Here we can see that there have been 2 million cache hits and that about half of the allocated cache is still available (free_memory). The cache has been flushed (lowmem_prunes), so it might be useful to increase the value of cache_size in my.cnf slightly.
No comments:
Post a Comment