Proactive Re-Optimization

Shivnath Babu, Pedro Bizarro, David DeWitt

SIGMOD 2005

(presented by Steve Blundy & Oleg Rekutin)

Overview

What’s wrong with reactive?

Proactive via 3 core techniques

Experiments

Reactive Re-optimization

select * from R, S where R.a=S.a and R.b>K1 and R.c>K2

[Figure: plans A and B over R and S; labels: buffer, σ(R) estimated, σ(R) actual; axis 0 to 550]

Single-Point Limitation

[Figure: plans A and B]

Limited Information for Re-opt

select * from R, S, T where R.a=S.a and S.b=T.b and R.c>K1 and R.d=K2

[Figure: R, S, T; labels: σ(R) estimated, σ(R) actual; axis 0 to 200]

Choosing a plan

1. Compute bounding boxes

2. Use them to generate robust plans and switchable plans

3. Use randomization to collect statistics

Bounding Boxes

“Representing Uncertainty in Statistics”

Bounding boxes are the upper and lower bounds for each estimated statistic
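A bounding box can be pictured as an interval around a point estimate. The sketch below is only an illustration, not the formula used in the paper: the `BoundingBox` class, the `make_box` helper, and the `low_factor` / `high_factor` widening values are hypothetical names and numbers chosen for this example. It also assumes that the "3 points" on the following slides are the lower bound, the estimate, and the upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Interval [low, high] around a single estimated statistic."""
    low: float
    estimate: float
    high: float

    def contains(self, actual: float) -> bool:
        # True when the observed value falls inside the box.
        return self.low <= actual <= self.high

    def points(self) -> tuple[float, float, float]:
        # Assumed here: the three points at which candidate plans are compared.
        return (self.low, self.estimate, self.high)


def make_box(estimate: float, low_factor: float = 0.5,
             high_factor: float = 2.0) -> BoundingBox:
    # Hypothetical widening rule: scale the estimate down and up.
    # The factors are illustrative assumptions, not values from the paper.
    return BoundingBox(estimate * low_factor, estimate, estimate * high_factor)


box = make_box(100.0)
print(box.points())         # (50.0, 100.0, 200.0)
print(box.contains(160.0))  # True
print(box.contains(310.0))  # False
```

A wider box tolerates larger estimation errors before a re-optimization is needed, at the price of fewer plans being optimal across the whole box.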

Bounding Boxes

Optimal Plan

One plan is optimal at all 3 points

The choice is easy

[Figure: Optimal Plan; axis 0 to 350]

Robust Plan

One plan is optimal, or close to optimal, at all 3 points

That one plan can be safely chosen

[Figure: Robust Plans; axis 0 to 350]

Switchable Plan

At each point there is a plan with close-to-optimal cost

Additional Requirements

The decision can be deferred

Actual statistics must lie within the bounding box

It is possible to switch between the plans

[Figure: Switchable; axis 0 to 350]
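The optimal, robust, and switchable cases can be sketched as a classification over plan costs at the three points. This is a rough illustration under stated assumptions: the `classify` function, the `tolerance` of 1.2 standing in for "close to optimal", and the caller-supplied `can_switch` predicate are all hypothetical, and the final "none" result is simply this sketch's label for the situation where no case applies.

```python
from itertools import combinations


def classify(costs, can_switch, tolerance=1.2):
    """costs maps plan name -> (cost at low, cost at estimate, cost at high)."""
    n_points = 3
    # Cheapest cost achievable at each point by any plan.
    best = [min(c[i] for c in costs.values()) for i in range(n_points)]

    def within(plan, factor):
        return all(costs[plan][i] <= factor * best[i] for i in range(n_points))

    # Optimal plan: a single plan is the cheapest at every point.
    for plan in costs:
        if within(plan, 1.0):
            return ("optimal", [plan])

    # Robust plan: a single plan is close to the cheapest at every point.
    for plan in costs:
        if within(plan, tolerance):
            return ("robust", [plan])

    # Switchable plan: the cheapest plan at each point, provided every
    # pair of them can be switched between at run time.
    members = sorted({min(costs, key=lambda p: costs[p][i])
                      for i in range(n_points)})
    if all(can_switch(a, b) for a, b in combinations(members, 2)):
        return ("switchable", members)

    return ("none", [])


always = lambda a, b: True  # assumption: any two plans are switchable
print(classify({"P1": (10, 20, 30), "P2": (12, 25, 40)}, always))
# ('optimal', ['P1'])
print(classify({"P1": (10, 22, 30), "P2": (11, 20, 33)}, always))
# ('robust', ['P1'])
print(classify({"P1": (10, 20, 90), "P2": (40, 45, 50)}, always))
# ('switchable', ['P1', 'P2'])
```

The sketch does not model the other two requirements (deferring the decision, and the actual statistics falling inside the box); those are run-time conditions rather than properties of the cost table.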

What is a “Switchable” Plan

“Any two members of a switchable plan are said to be switchable with each other.”

Collecting statistics

1. Each operator collects some % of its input in a buffer
2. The eos(f) is emitted & statistics are calculated
3. A plan is chosen from the switchable plan members, or re-optimization is run
4. Query processing proceeds
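The four steps can be sketched roughly as follows. This is a simplified illustration, not the system's implementation: `decide_after_sample`, its parameters, the `pick_member` callback, and the join names in the usage lines are hypothetical, and a shuffled in-memory list stands in for an operator reading a random sample before the end-of-sample signal.

```python
import random


def decide_after_sample(rows, predicate, box, pick_member,
                        sample_fraction=0.1, seed=0):
    """Return (decision, estimated_size) after looking at a random sample."""
    low, high = box

    # Step 1: read a random fraction of the input into a buffer.
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    k = max(1, int(len(shuffled) * sample_fraction))
    buffer = shuffled[:k]

    # Step 2: end of sample reached; compute statistics from the buffer.
    selectivity = sum(1 for r in buffer if predicate(r)) / k
    estimated_size = selectivity * len(rows)

    # Step 3: pick a switchable-plan member if the refined estimate lies
    # inside the bounding box, otherwise fall back to re-optimization.
    if low <= estimated_size <= high:
        decision = pick_member(estimated_size)
    else:
        decision = "re-optimize"

    # Step 4: the caller continues query processing with this decision.
    return decision, estimated_size


rows = list(range(1000))
pick = lambda size: "hash_join" if size > 800 else "nested_loop_join"

print(decide_after_sample(rows, lambda r: True, (500, 2000), pick))
# ('hash_join', 1000.0)
print(decide_after_sample(rows, lambda r: False, (500, 2000), pick))
# ('re-optimize', 0.0)
```

Because the decision is taken after only a fraction of the input has been read, switching between members costs little, while an estimate outside the box still triggers a full re-optimization.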

Questions

Prevalence of switchable plans vs. case 4

How good is Rio at preventing re-optimizations?

How is Rio affected by a large # of estimates?

Experiments

Traditional Optimizer (TRAD)

Validity-Ranges Optimizer (VRO)

2-Way Join Queries: Robust

[Figure: A, C; label: σ(A) est; axis 0 to 440]

2-Way Join Queries: Switchable

[Figure: A, C; labels: σ(A) est, σ(A) bounding box; axis 0 to 440]

3-Way Join Example

Shows the use of a Switchable Plan

Some re-optimization still necessary

Pt  |σ1(A)|  TRAD  VRO                               Rio                              Opt
A   6 MB     P17a  Inside range, P17a                Outside box, re-optimize, P17a   P17a
B   80 MB    P17a  Inside range, P17a                Inside box, P17a                 P17a
C   160 MB   P17a  Outside range, re-optimize, P17d  Inside box, P17d                 P17b
D   310 MB   P17a  Outside range, re-optimize, P17d  Outside box, re-optimize, P17b   P17b

Correlation-Based Mistakes

Query Complexity

Conclusion

Rio refines statistics and uses switchable plans to forestall re-optimizations and prevent partial data loss

Questions?