Lab 2 : Query Optimization

Part A. Creating statistics for cost models and using explain plan.

  1. Create a secondary index on the POPULATION field.
  2. Execute the statement "analyze table country compute statistics ", so that Oracle uses cost based optimization.
  3. Rerun SQL expressions from Lab 1 to verify creation of appropriate statistics about user-defined tables and indices. Submit results of queries showing the number of rows, blocks, average row length for each table. Submit the number of distinct values, number of nulls for each column. Submit the number of rows, leaf blocks, clustering factor, average leaf block s per key for each index.
  4. Execute the statement @$ORACLE_HOME/rdbms/admin/utlxplan. This will create a table called PLAN_TABLE that ORACLE will use to store the results of the plan analysis.
  5. Clean out current data in PLAN_TABLE:
    DELETE FROM PLAN_TABLE;
    
  6. Create execution plan and store it in PLAN_TABLE:
     
    EXPLAIN PLAN
    SET STATEMENT_ID = '<some-name>'
    FOR
    <select statement to be analyzed>;
    
  7. Review execution plan using
     
     SELECT id, parent_id, level, position, operation, options, object_name, cost, bytes
       FROM PLAN_TABLE
       CONNECT BY PRIOR ID = PARENT_ID and STATEMENT_ID = '<some-name>'
       START WITH ID=1 and STATEMENT_ID = '<some-name>'
       ORDER BY ID;
    
    or
     
    SELECT LPAD(' ', 2*LEVEL)||OPERATION||' '||OPTIONS||' '||OBJECT_NAME Query_Plan
    FROM PLAN_TABLE
    CONNECT BY PRIOR ID = PARENT_ID and STATEMENT_ID = '<some-name>'
    START WITH ID=1 and STATEMENT_ID = '<some-name>'
    ORDER BY ID;
    
  8. You may save this query via "SAVE <some-name>" command, recall it via "@<some-name>" and edit it via "EDIT <some-name>".
  9. Note: You may use "spool <some-name>" command to start recording a session and "spool off" to stop recording.
  10. Additional Information on Explain Plan is available from
    1. Pages 107-110 (1st Ed.) or 114-116 (2nd Ed.), supplemantary book
    2. Range query processing stratagies are discussed in section 18.2.2 of the text book.
    3. Oracle tutorials from MIT (tuning, dwh, java) ... web nerds series ,

Part B: Range Queries and Secondary Indices

  1. Write SQL statements for the POPULATION based queries which retrieve either very few tuples or a large fraction of all the tuples. Illustrative examples are given below and you may adjust the thresholds on population.
  2. Let Q1 be the POPULATION based query which retrieves a couple of tuples and Q2 be the POPULATION based query which retrieve most of the tuples. Create the explain plan for Q1 and Q2 using the instructions given above.
  3. Submit the SQL expressions and the results of the explain plan for the above two queries.
  4. Explain why the index on population is not always used for range queries based on population attribute. Derive an algebric condition on the selectivity of range query in terms of record size, block size, key-value size, pointer value size and depth of secondary index to quantify your arguement.

Part C: Aggregate Operation comparisons

  1. Recall that there is a secondary index on the POPULATION field.
  2. Consider following two equivalent SQL queries for the POPULATION based range queries.
  3. Let Q1 be the first queries in the above-mentioned list, Q2 to be the second. Create the explain plan for Q1, and Q2 using the instructions given above. Submit results of the explain plan for the two queries.
  4. Is there any difference in the execution plans for Q1 and Q2? Which query is more efficient to execute? Explain.

Part D: Range Query syntax comparisons

  1. Recall that there is a secondary index on the POPULATION field.
  2. Comment /*+ rule */ asks Oracle to use heuristic fixed-priority rule based plan selection instead of cost based plan selection. See page 111 (2nd Ed.) or page 105 (1st Ed.) of the supplementary book for details.
  3. Consider following two equivalent SQL queries for the POPULATION based range queries.
  4. Let Q1 be the first queries in the above-mentioned list, Q2 to be the second. Create the explain plan for Q1, and Q2 using the instructions given above. Submit results of the explain plan for the two queries.
  5. Is there any difference in the execution plans for Q1 and Q2? Which query is more efficient to execute? Explain.

Part E: Queries with join and selection

  1. Recall that there are indices on NAME and POPULATION.
  2. Consider following two equivalent SQL queries for retrieving names of countries with population larger than that of United Kingdom .
  3. Let Q1 be the first queries in the above-mentioned list, Q2 to be the second. Create the explain plan for Q1, and Q2 using the instructions given above. Submit results of the explain plan for the two queries.
  4. Is there any difference in the execution plans for Q1 and Q2? Which query is more efficient to execute? Explain.

Part F: String Matching and Indices

  1. Recall that there is a primary index on the NAME field.
  2. Consider following SQL queries for the NAME based range queries using LIKE operator.
  3. Let Q1 be the first queries in the above-mentioned list, Q2 to be the second and Q3 to be the third. Create the explain plan for Q1, Q2 and Q3 using the instructions given above. Submit results of the explain plan for the three queries.
  4. Compare execution plans for the three queries. Which query is cheapest? Why?
  5. Explain why the index on NAME is not always used for range queries based on NAME attribute.

The views and opinions expressed in this page are strictly those of the page author.
The contents of this page have not been reviewed or approved by the University of Minnesota.