Monday, January 16, 2017

Oracle Parallel Execution Deep Dive Session

Here is a recording of a session I did a while ago, covering how to understand the essentials of Oracle Parallel Execution and how to read the corresponding execution plans.


7 comments:

Mayank said...

Excellent Post , please comment on below queries if possible thanks
1) why in SQL RTMON there is no info about TQ and IN-OUT
2) if i have 4 big table SQL query, which table is best contender of using parallel hint
3) If i use 4 parallel with all 4 tables it will together open 32 Parallel Slaves (4*4*2)
4) How to reduce Temp tablespace while using Parallel hints

Randolf said...

Hi Mayank,

1) why in SQL RTMON there is no info about TQ and IN-OUT

I don't know and therefore can't answer this question, but this is one of the reasons I've created the XPLAN_ASH script that - among other things - combines the formatted DBMS_XPLAN output with ASH / Real Time SQL Monitoring information, so you have all that information in a single place

2) if i have 4 big table SQL query, which table is best contender of using parallel hint

If all four tables are big you probably want to have all of them processed in parallel. If you haven't decorated them on object level as DEGREE n then you can simply go for the statement-level (AutoDOP based) PARALLEL hint, like PARALLEL(n), that - ignoring some subtle details - in principle attempts to run the whole statement at DOP n

3) If i use 4 parallel with all 4 tables it will together open 32 Parallel Slaves (4*4*2)

Generally speaking no, you should see only 4 * 2 PX slaves no matter how many tables are marked parallel. There are exceptions to this rule as outlined in this video and other tutorials I've published - if you end up with a parallel plan consisting of multiple so called DFO trees then nasty things can happen and you might end up with more than 4 * 2 PX slaves.

4) How to reduce Temp tablespace while using Parallel hints

Due to the limitation outlined in this video and also several times explained here on my blog (search for HASH JOIN BUFFERED if you're interested in more details) Parallel Execution typically has a much higher demand for PGA and TEMP than serial execution.

The only thing (apart from trying to tweak Oracle to use more PGA) you can do about this is trying to physically design your data so that data re-distributions (PX SEND / RECEIVE operations in the execution plan) are minimized. Depending on your actual data model you might be able to achieve that by equi-partitioning your data (typically by hash) and perform partition-wise operations, ideally full partition wise. These are the most efficient way to minimize these re-distributions and hence minimize PGA / TEMP usage.

Randolf

Mayank said...

Thanks Sir, for you clear detailed explanation
I am currently going through all your video series of XPLAN_ASH and its treasure worth more than any Performance book or Script i have ever used(sna.. etc) , now i am feeling more confident in checking Performance issues. One area i am trying to explore more is using Indexes(new or drop existing) to improve Performance in Complex Prod queries and what import points to note in INDEX_STATS and check for fragmentation if any and rebuild , I am trying to find if any of your video tutorial is having same explanation or can expect in future video blogs
Also hoping to find more on Hint PQ_DISTRIBUTE (HASH,BROADCAST,PARTITON usage) in any one of your videos

Regards,Mayank

Randolf said...

Hi Mayank,

you won't find much regarding index efficiency and potential for rebuilds in my video tutorials. The best source for that is very likely Jonathan Lewis' blog, search for "index rebuilds" and "index efficiency" there. Jonathan has published over the time a lot of notes covering that topic.

Of course you should also visit Richard Foote's blog, which is all about indexing in general.

Your point about parallel distribution methods is a good one, maybe I'll pick this up in the future for a separate video tutorial.

Randolf

Yuri Pedan said...

Hello Randolf,

Could you please help me to understand one thing about real-world plan (please see link https://goo.gl/wqdNbl).
My question is, where does DFO Q1,03 store data when it receives it from DFO Q1,01?

Thank you in advance

Randolf said...

Hi Yuri,

When the DFO marked as Q1,03 receives data from the DFO marked as Q1,01 it builds in each Parallel Slave the hash table for the data received. The corresponding PX SEND / RECEIVE pair is operation 12 and 11, the HASH JOIN OUTER operation 9 is the one that processes then the data received.

Does that answer your question?

Randolf

Yuri Pedan said...

Yes, Randolf. Thank you very much.