- Home
- MySQL & MariaDB
- Relational Databases
- Optimization to skip index dives with FORCE INDEX.
Optimization to skip index dives with FORCE INDEX.
Feed: Planet MySQL
;
Author: MySQL Server Dev Team
;
Index dives are performed during the optimization phase to help decide which index to use. Currently when user specifies FORCE INDEX, optimizer still always calculates cost using index dives. Under some circumstances it is possible to avoid the index dives and this could speed up execution.
With WL#6526 (present in mysql-8.0.3) optimizer skips index dives when the query has a FORCE INDEX.
What is an index dive?
As mentioned in Jorgen’s blog,
For as long as there have been a range access method in MySQL, the number of rows in a range has been estimated by diving down the index to find the start and end of the range and use these to count the number of rows between them. This technique is accurate, and is therefore a good basis to make the best possible execution plan.
For the below queries, two index dives will be performed for each range (one each for min and max values in the range).
SELECT * FROM t1 WHERE (c1 > 1 AND c1 < 10) OR (c1 > 10 AND c1 < 20) ;
SELECT * FROM t1 WHERE c1 IN (11, 22) ;
Index dives can be skipped when there is a FORCE INDEX because the choice of index is pre-decided and access method depends on the WHERE clause (ref-access is used only when a single equality condition is present, in all other cases range access is chosen).
Optimizer skips index dives when:
- Only a single table is accessed in the query.
- FORCE INDEX applies to a single index.
- No subquery is present.
- Fulltext Index is not involved.
- No GROUP-BY or DISTINCT clause.
- No ORDER-BY clause.
This optimization currently does not apply to multi-table queries.
Identify whether a query is using this optimization:
- EXPLAIN:
- EXPLAIN FORMAT=TRADITIONAL FOR CONNECTION will see the following changes:
- Number of “rows” will change to NULL for the table where index dive is skipped.
- The value for “filtered” will be NULL.
- EXPLAIN FORMAT=JSON FOR CONNECTION will see the following changes:
- “rows_examined_per_scan” and “rows_produced_per_join” will not be
mentioned when index dive is skipped. - Number of “rows” will change to NULL for the table where index dive is
skipped. - The value for “filtered” will be NULL.
- “rows_examined_per_scan” and “rows_produced_per_join” will not be
- No change in output of EXPLAIN FORMAT=TRADITIONAL and EXPLAIN FORMAT=JSON
- EXPLAIN FORMAT=TRADITIONAL FOR CONNECTION will see the following changes:
- Optimizer trace:
- Optimizer trace contains the tag “skipped_due_to_force_index”.
“range_scan_alternatives”: [{“index”: “c1_idx”,“ranges”: [“1 < c1 < 10",“10 < c1 < 20"],“index_dives_for_range_access”: “skipped_due_to_force_index”,“rowid_ordered”: false,“using_mrr”: false,“index_only”: true,“rows”: “not applicable”,“cost”: “not applicable”,“chosen”: true}],
- Optimizer trace contains the tag “skipped_due_to_force_index”.
So what kind of queries could benefit from this optimization?
- Queries that contain large number of values in IN clause OR
- Queries with a large number of range conditions. (see queries below).
In the example below the first query has no FORCE INDEX and hence this optimization doesn’t apply. The second query has FORCE INDEX and takes significantly less time for the “statistics” phase of the query execution.
mysql> TRUNCATE TABLE performance_schema.events_stages_history;
Query OK, 0 rows affected (0.00 sec)
mysql> TRUNCATE TABLE performance_schema.events_statements_history ;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT c1 FROM t1 WHERE ( c1 > 10 AND c1 < 11) OR ... <200 clauses>;
| c1 |
| 25 |
<more rows>
832 rows in set (0.02 sec)
mysql> SELECT c1 FROM t1 FORCE INDEX(c1_idx) WHERE ( c1 > 10 AND c1 < 11) OR ... <200 clauses>;
| c1 |
| 25 |
<more rows>
832 rows in set (0.01 sec)
mysql> select SQL_TEXT, stages.EVENT_NAME, stages.TIMER_WAIT/1000000000 “Time (ms)”
-> from performance_schema.events_stages_history AS stages JOIN performance_schema.events_statements_history AS statements
-> ON (stages.EVENT_ID > statements.EVENT_ID AND stages.EVENT_ID <= statements.END_EVENT_ID)
-> WHERE stages.EVENT_NAME LIKE ‘%statistics%’ AND SQL_TEXT LIKE ‘%SELECT c1 FROM t1%’ ;
+————————————————————————————————————–+———————————+——————+
| SQL_TEXT | EVENT_NAME | Time (ms) |
+————————————————————————————————————–+———————————+——————+
| SELECT c1 FROM t1 WHERE ( c1 > 10 AND c1 < 11) ... | stage/sql/statistics | 29.5890 |
| SELECT c1 FROM t1 FORCE INDEX(c1_idx) WHERE ( c1 > 10 AND c1 < 11) ... | stage/sql/statistics | 2.3026 |
+————————————————————————————————————–+———————————+——————+
2 rows in set (0.00 sec)
|
This is a feature request from Facebook.
The optimization applies by default starting from MySQL 8.0.3. You do not need to enable any settings or run any commands to turn it on, Please try it out, and let us know your results.
Thank you for using MySQL!
Recent Articles
- Data Analytics is a Core Element Creating Business Value
- Key customer benefits of the expanded SAP and Microsoft partnership
- Amazon Keyspaces (for Apache Cassandra) now supports JSON syntax to help you read and write data from other systems more easily
- New AWS Public Datasets available from the Illumina, the University of Alaska Fairbanks, IntelinAir, and others
- BrandPost: How a multicloud architecture can deliver flexibility
- 4 booming job domains to pursue in 2021
- Email design trends: What’s in store for 2021?
- Top tools to help you manage your remote software development team
- This Week in Neo4j – SKOS taxonomy, Asset Management System, Introduction to Liquigraph
- Running an R Script on a Schedule: Azure Functions (Serverless)
- SEM Time Series Modeling
- covidcast package for COVID-19-related data
- SwimmeR version 0.7.2 – Now Better than Ever
- The MySQL X DevApi: Working with NULL Values
- Ryan Lambert: Better OpenStreetMap places in PostGIS
- Amazon CloudWatch Agent now supports OpenTelemetry APIs and SDKs
- Reticulate webinar – R and Python – a happy union
- What we learned from SAP customers at AWS re:Invent 2020
- AWS App Mesh is now available in Africa (Cape Town) AWS Region
- Amazon GuardDuty enhances security incident investigation workflows through new integration with Amazon Detective
- Amazon CloudWatch Agent Now Supports macOS on Amazon EC2 Mac instances
- AWS Cloud Map Updates Service Level Agreement
- 5 data scientists offer advice on improving your visibility
- Amazon Kendra achieves ISO and PCI compliance
- How cash reconciliation automation can work (with real-world example)
- Capturing and Visualizing Multi-Tenant Metrics Inside a SaaS Application on AWS
- Launching Amazon Lex streaming conversation APIs for improved virtual agent conversational experiences
- Launching Amazon Lex V2 console and APIs for an improved bot building experience
- Announcing new Amazon EC2 T4g instances powered by AWS Graviton2 processors along with a T4g free trial in Asia Pacific (Sydney, Singapore), Europe (London), North Americas (Canada Central, San Francisco), and South Americas (Sao Paulo) regions
- Building a cost efficient, petabyte-scale lake house with Amazon S3 lifecycle rules and Amazon Redshift Spectrum: Part 2
Leave a Reply
You must be logged in to post a comment.