Tuesday, April 24, 2007

Performing PL/SQL functions in parallel which are not "pure"

If you find out that a particular process is CPU bound that involves calling user-defined PL/SQL functions, then one of the measures you might want to consider - if your hardware is powerful enough - is executing your statement in parallel and in particular execute the PL/SQL function in parallel, which might not always be the same.

It is possible in Oracle that you run a statement in parallel but the execution of the PL/SQL function is performed by the statement coordinator process. In that case you probably won't achieve those improvements that you expected from parallel execution, because it is very likely that the PL/SQL function is actually using most of the CPU time and by being executed by the statement coordinator process it is still being executed serially although there are parallel slaves performing some of the work, e.g. scanning a particular part of a table/partition segment, but those are just queued up waiting to deliver data to the coordinator process which is busy executing the PL/SQL function.

Oracle executes a user-defined PL/SQL function in parallel only if it is a "pure" function. "Pure" means that it does not depend on a package state (it does not read or write variables defined on package/session level, in addition a really "pure" function should not read or write the database either, but here we focus on the package/session state/variables). The "purity" of a function can be determined in several ways:
  • If it is a stand-alone PL/SQL function, Oracle is able to determine by itself (I think since Orace 8i) whether the function is pure or not. This does not apply to functions defined within a package
  • If it is a package function or you want to make sure that your function is "pure", you can use the compiler directive "pragma restrict_references" to declare the function as pure by specifying that it is not allowed to read or write the package state. If your function attempts to access variables defined on package level the compiler will show corresponding errors and your package will fail to compile (unless you use the option "TRUST" of the pragma).
  • You can tell Oracle to "trust" you by specifying "parallel_enable" as part of the function declaration in the package specification. In this case Oracle does not enforce anything, it will execute your function in parallel and you are responsible that the function does not screw up when being executed in parallel.

The reason for all these measures is that Oracle implements parallel execution by spreading the execution of the statement across separate sessions which do not share the package/session context, which means that a variable that has been initialised in the statement coordinator session with a particular value will not be available to the parallel slaves being executed as child processes of the statement coordinator process. So in case your user-defined function depends on variables defined on package scope they will run into trouble when executed in parallel since each parallel execution slave has its own non-shared session environment in which the variables at package level will be initialised separately from each other, so each parallel execution slaves potentially ends up with blank or different values of the package level variables.

Now what options do you have if you need to use a user-defined PL/SQL function that depends on information that can not be passed as parameters but needs to be defined in your session before the actual statement is executed in parallel?

First thing to consider is of course, if it is not possible to pass the required information as parameter, so you don't have the need to read additional information from anywhere else.

If that is not an option then two potential solutions are presented here. The first uses a database table to pass information to the parallel slaves, the second uses the concept of contexts.

Let's have a closer look at approach number one.

The key concept behind the first approach is that the information that is supposed to be made available to the user-defined PL/SQL functions executed in the parallel slaves is stored in a database table and the PL/SQL function then needs to be able to identify which particular set of data in the table it has been assigned to use.

This can be achieved by using the SID (or more generally the INSTANCE_ID plus SID for RAC environments) of the statement coordinator session as unique identifier in the table. So the PL/SQL function only needs to find out the SID and INSTANCE_ID of the parent process to determine which rows of the parameter table to use.

The parameter table could have the following generic layout:

CREATE TABLE PX_PARAMETER (
COORDINATOR_ID VARCHAR2(20) NOT NULL,
PARAMETER_NAME VARCHAR2(128) NOT NULL,
PARAMETER_VALUE VARCHAR2(4000),
CONSTRAINT PX_PARAMETER_PK PRIMARY KEY (COORDINATOR_ID, PARAMETER_NAME));

In order to find out what the coordinator id of the parallel slave's parent is, the following information can be used from V$SESSION (the following description has been taken from the Oracle documentation, "Reference" and applies to Oracle 9i and 10g):

Column OWNERID: "The column contents are invalid if the value is 2147483644. Otherwise, this column contains the identifier of the user who owns the migratable session.
For operations using Parallel Slaves, interpret this value as a 4-byte value. The low-order 2 bytes of which represent the session number, and the high-order bytes the instance ID of the query coordinator"

So by shifting the bits a bit around in OWNERID, the parallel slave should be able to determine the SID and INSTANCE_ID of its parent process and can use this information to query the parameter table accordingly.

Here is a sample script showing the whole stuff in action:

SQL>ִ
SQL>ִdropִtableִtest_source_dataִpurge;

Tableִdropped.

SQL>ִdropִtableִtest_dest_dataִpurge;

Tableִdropped.

SQL>ִdropִpackageִpk_px_parameter_test;

Packageִdropped.

SQL>ִdropִpackageִpk_px_parameter;

Packageִdropped.

SQL>ִdropִtableִpx_parameterִpurge;

Tableִdropped.

SQL>ִ
SQL>ִCREATEִTABLEִPX_PARAMETERִ(
ִִ2ִִCOORDINATOR_IDִVARCHAR2(20)ִNOTִNULL,
ִִ3ִִPARAMETER_NAMEִVARCHAR2(128)ִNOTִNULL,
ִִ4ִִPARAMETER_VALUEִVARCHAR2(4000),
ִִ5ִִCONSTRAINTִPX_PARAMETER_PKִPRIMARYִKEYִ(COORDINATOR_ID,ִPARAMETER_NAME));

Tableִcreated.

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִpk_px_parameterִas
ִִ2ִִִִ--ִutilityִfunctions
ִִ3ִִִִfunctionִget_my_sidִreturnִnumberִdeterministicִparallel_enable;
ִִ4ִִִִfunctionִget_my_inst_idִreturnִvarchar2ִdeterministicִparallel_enable;
ִִ5ִִִִ--ִcallִthisִprocedureִinִyourִstatementִcoordinatorִsessionִtoִsetִaִparameterִbefore
ִִ6ִִִִ--ִexecutingִaִstatementִinִparallel
ִִ7ִִִִ--ִitִdoesִnotִcommitִitsִwork,ִyouִneedִtoִdoִitִyourself
ִִ8ִִִִprocedureִset_px_parameter(in_s_parameter_nameִinִvarchar2,ִin_s_parameter_valueִinִvarchar2);
ִִ9ִִִִ--ִcallִthisִfunctionִinִyourִuser-definedִpl/sqlִfunctionִtoִgetִaִparameterִvalue
ִ10ִִִִ--ִshouldִworkִinִbothִparallelִorִserialִmode
ִ11ִִִִfunctionִget_px_parameter(in_s_parameter_nameִinִvarchar2)ִreturnִvarchar2ִparallel_enable;
ִ12ִִendִpk_px_parameter;
ִ13ִִ/

Packageִcreated.

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִbodyִpk_px_parameterִas
ִִ2ִִִִfunctionִget_my_sidִreturnִnumberִdeterministicִparallel_enableִis
ִִ3ִִ
ִִִִn_my_sidִnumber;
ִִ4ִִִִbegin
ִִ5ִִ ִ
ִִselectִsidִintoִn_my_sidִfromִv$mystatִwhereִrownumִ=ִ1;
ִִ6ִִ ִ
ִִreturnִn_my_sid;
ִִ7ִִִִendִget_my_sid;
ִִ8ִִ
ִִ9ִִִִfunctionִget_my_inst_idִreturnִvarchar2ִdeterministicִparallel_enableִis
ִ10ִִ ִ
ִִs_my_inst_idִvarchar2(30);
ִ11ִִִִbegin
ִ12ִִ ִ
ִִs_my_inst_idִ:=ִsys_context('USERENV',ִ'INSTANCE');
ִ13ִִ ִ
ִִreturnִs_my_inst_id;
ִ14ִִִִendִget_my_inst_id;
ִ15ִִ
ִ16ִִִִfunctionִget_my_idִreturnִvarchar2ִdeterministicִis
ִ17ִִ ִ
ִִn_my_sidִnumber;
ִ18ִִ ִ
ִִs_my_inst_idִvarchar2(30);
ִ19ִִ ִ
ִִs_my_idִvarchar2(50);
ִ20ִִִִbegin
ִ21ִִ ִ
ִִn_my_sidִ:=ִget_my_sid;
ִ22ִִ ִ
ִִs_my_inst_idִ:=ִget_my_inst_id;
ִ23ִִ ִ
ִִs_my_idִ:=ִs_my_inst_idִ||ִ'|'ִ||ִto_char(n_my_sid,ִ'TM');
ִ24ִִ
ִ25ִִ ִ
ִִreturnִs_my_id;
ִ26ִִִִendִget_my_id;
ִ27ִִ
ִ28ִִִִfunctionִget_coordinator_idִreturnִvarchar2ִdeterministicִis
ִ29ִִ ִ
ִִn_my_sidִnumber;
ִ30ִִ ִ
ִִn_owneridִnumber;
ִ31ִִ ִ
ִִn_parent_sidִnumber;
ִ32ִִ ִ
ִִn_parent_inst_idִnumber;
ִ33ִִ ִ
ִִs_coordinator_idִvarchar2(50);
ִ34ִִִִbegin
ִ35ִִ ִ
ִִn_my_sidִ:=ִget_my_sid;
ִ36ִִ ִ
ִִSELECTִownerid,
ִ37ִִ ִ
ִִround(bitand(ownerid,ִ65535))ִasִparent_session_sid,
ִ38ִִ ִ
ִִround(bitand(ownerid,ִ16711680)ִ/ִ65536)ִasִparent_session_instid
ִ39ִִ ִ
ִִintoִn_ownerid,ִn_parent_sid,ִn_parent_inst_id
ִ40ִִ ִ
ִִfromִv$sessionִwhereִsidִ=ִn_my_sid;
ִ41ִִ
ִ42ִִ ִ
ִִifִn_owneridִ=ִ2147483644ִthen
ִ43ִִ ִִִ
ִִ--ִnoִparallelִexecution,ִuseִcurrentִinst_idִandִsid
ִ44ִִ ִִִ
ִִs_coordinator_idִ:=ִget_my_inst_idִ||ִ'|'ִ||ִto_char(n_my_sid,ִ'TM');
ִ45ִִ ִ
ִִelse
ִ46ִִ ִִִ
ִִ--ִparallelִexecution,ִuseִparentִinfo
ִ47ִִ ִִִ
ִִs_coordinator_idִ:=ִto_char(n_parent_inst_id,ִ'TM')ִ||ִ'|'ִ||ִto_char(n_parent_sid,ִ'TM');
ִ48ִִ ִ
ִִendִif;
ִ49ִִ
ִ50ִִ ִ
ִִreturnִs_coordinator_id;
ִ51ִִִִendִget_coordinator_id;
ִ52ִִ
ִ53ִִִִ--ִcallִthisִprocedureִinִyourִstatementִcoordinatorִsessionִtoִsetִaִparameterִbefore
ִ54ִִִִ--ִexecutingִaִstatementִinִparallel
ִ55ִִִִ--ִitִdoesִnotִcommitִitsִwork,ִyouִneedִtoִdoִitִyourself
ִ56ִִִִprocedureִset_px_parameter(in_s_parameter_nameִinִvarchar2,ִin_s_parameter_valueִinִvarchar2)ִis
ִ57ִִ ִ
ִִs_my_idִvarchar2(50);
ִ58ִִִִbegin
ִ59ִִ ִ
ִִs_my_idִ:=ִget_my_id;
ִ60ִִ
ִ61ִִ ִ
ִִinsertִintoִpx_parameterִ(coordinator_id,ִparameter_name,ִparameter_value)ִvaluesִ(s_my_id,ִin_s_parameter_name,ִin_s_parameter_value);
ִ62ִִִִendִset_px_parameter;
ִ63ִִִִ--ִcallִthisִfunctionִinִyourִuser-definedִpl/sqlִfunctionִtoִgetִaִparameterִvalue
ִ64ִִִִ--ִshouldִworkִinִbothִparallelִorִserialִmode
ִ65ִִִִfunctionִget_px_parameter(in_s_parameter_nameִinִvarchar2)ִreturnִvarchar2ִparallel_enableִis
ִ66ִִ ִ
ִִs_coordinator_idִvarchar2(50);
ִ67ִִ ִ
ִִs_parameter_valueִvarchar2(4000);
ִ68ִִִִbegin
ִ69ִִ ִ
ִִs_coordinator_idִ:=ִget_coordinator_id;
ִ70ִִ
ִ71ִִ ִ
ִִselectִparameter_value
ִ72ִִ ִ
ִִintoִs_parameter_value
ִ73ִִ ִ
ִִfromִpx_parameter
ִ74ִִ ִ
ִִwhereִcoordinator_idִ=ִs_coordinator_id
ִ75ִִ ִ
ִִandִparameter_nameִ=ִin_s_parameter_name;
ִ76ִִ
ִ77ִִ ִ
ִִreturnִs_parameter_value;
ִ78ִִִִendִget_px_parameter;
ִ79ִִendִpk_px_parameter;
ִ80ִִ/

Packageִbodyִcreated.

SQL>ִ
SQL>ִcreateִtableִtest_source_dataִparallelִnologgingִasִselectִ*ִfromִall_objects;

Tableִcreated.

SQL>ִ
SQL>ִselectִcount(*)ִfromִtest_source_data;

ִִCOUNT(*)
----------
ִִִִִ98440

SQL>ִ
SQL>ִcreateִtableִtest_dest_dataִ(
ִִ2ִִthe_sidִnumberִnotִnull,
ִִ3ִִthe_test_dataִvarchar2(4000)ִnotִnull)
ִִ4ִִparallelִnologging;

Tableִcreated.

SQL>ִ
SQL>ִalterִsessionִenableִparallelִdml;

Sessionִaltered.

SQL>ִ
SQL>ִbeginִpk_px_parameter.set_px_parameter('THE_DATA',ִ'Thisִis
ִreadִbyִtheִparallelִslave!');ִend;
ִִ2ִִ/

PL/SQLִprocedureִsuccessfullyִcompleted.

SQL>ִ
SQL>ִcolumnִparameter_nameִformatִa20
SQL>ִcolumnִparameter_valueִformatִa40
SQL>ִ
SQL>ִselectִ*ִfromִpx_parameter;

COORDINATOR_IDִִִִִִִPARAMETER_NAMEִִִִִִִPARAMETER_VALUE
--------------------ִ--------------------ִ----------------------------------------
1|124ִִִִִִִִִִִִִִִִTHE_DATAִִִִִִִִִִִִִThisִis
ִreadִby theִparallelִslave!

SQL>ִ
SQL>ִcommit;

Commitִcomplete.

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִpk_px_parameter_testִis
ִִ2ִִִִfunctionִget_the_sidִreturnִnumberִdeterministicִparallel_enable;
ִִ3ִִִִfunctionִget_the_test_dataִreturnִvarchar2ִdeterministicִparallel_enable;
ִִ4ִִendִpk_px_parameter_test;
ִִ5ִִ/

Packageִcreated.

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִbodyִpk_px_parameter_testִis
ִִ2ִִִִg_n_the_sidִnumber;
ִִ3ִִִִg_s_the_test_dataִvarchar2(4000);
ִִ4ִִ
ִִ5ִִִִfunctionִget_the_sidִreturnִnumberִdeterministicִparallel_enableִis
ִִ6ִִִִbegin
ִִ7ִִ ִ
ִִifִg_n_the_sidִisִnullִthen
ִִ8ִִ ִִִ
ִִg_n_the_sidִ:=ִpk_px_parameter.get_my_sid;
ִִ9ִִ ִ
ִִendִif;
ִ10ִִ
ִ11ִִ ִ
ִִreturnִg_n_the_sid;
ִ12ִִִִendִget_the_sid;
ִ13ִִ
ִ14ִִִִfunctionִget_the_test_dataִreturnִvarchar2ִdeterministicִparallel_enableִis
ִ15ִִִִbegin
ִ16ִִ ִ
ִִifִg_s_the_test_dataִisִnullִthen
ִ17ִִ ִִִ
ִִg_s_the_test_dataִ:=ִ'I''mִsession:ִ'ִ||ִpk_px_parameter.get_my_sidִ||ִ'ִandִIִgotִthis:ִ'ִ||ִpk_px_parameter.get_px_parameter('THE_DATA');
ִ18ִִ ִ
ִִendִif;
ִ19ִִ
ִ20ִִ ִ
ִִreturnִg_s_the_test_data;
ִ21ִִִִend;
ִ22ִִendִpk_px_parameter_test;
ִ23ִִ/

Packageִbodyִcreated.

SQL>ִ
SQL>ִinsertִ/*+ִappendִparallel(d)ִ*/ִintoִtest_dest_dataִdִ(the_sid,ִthe_test_data)
ִִ2ִִselectִpk_px_parameter_test.get_the_sid,ִpk_px_parameter_test.get_the_test_data
ִִ3ִִfromִtest_source_data;

98440ִrowsִcreated.

SQL>ִ
SQL>ִcommit;

Commitִcomplete.

SQL>ִ
SQL>ִcolumnִthe_test_dataִformatִa40
SQL>ִ
SQL>ִselectִthe_sid,ִthe_test_data,ִcount(*)ִfromִtest_dest_data
ִִ2ִִgroupִbyִthe_sid,ִthe_test_data;

ִִִTHE_SIDִTHE_TEST_DATAִִִִִִִִִִִִִִִִִִִִִִִִִִִִִִCOUNT(*)
----------ִ----------------------------------------ִ----------
ִִִִִִִ117ִI'mִsession:ִ117ִandִIִgotִthis:ִThisִisִִִִִִ52910
ִִִִִִִִִִִread
ִbyִtheִparallelִslave!

ִִִִִִִ142ִI'mִsession:ִ142ִandִIִgotִthis:ִThisִisִִִִִִ45530
ִִִִִִִִִִִread
ִbyִtheִparallelִslave!


SQL>ִ
SQL>ִspoolִoff

As you can see, the parallel slaves were able to pick up the data which has been stored by the coordinator process prior to executing the parallel DML.

The provided package PK_PX_PARAMETER can be used as a starting point for your own development. It lacks features like exception handling, but it shows the basic functionality required to set and get parameters used in parallel slaves. It requires SELECT privileges on V$MYSTAT and V$SESSION at least. As always with AUTH_ID DEFINER packages, these privileges need to be granted directly and explicitly to the user owning the packages, granting a role won't be sufficient.

Additionally, the PX_PARAMETER_TEST package shows how to cache information obtained via PK_PX_PARAMETER in order to prevent the parallel execution from causing excessive latching/logical I/Os by executing the recursive SQL to read the data from the table PX_PARAMETER over and over again. Of course, in a real world application you need to ensure that you are able tell whether the data your cache holds is outdated or not in case the already established parallel slave sessions are going to be reused. One possible approach is to mark each of your operations with a unique "run_id" (could be stored in a context, see below for description how to use this), and if the current run_id is different from your cached run_id then you can discard your current cache and re-load the actual data into the cache.

Let's turn to the the second approach now.

A different approach which can be used if the data to be passed to the PL/SQL function is not too large is by using a context.

Since Version 8i Oracle offers the concept of contexts which is basically a user defined namespace where you can define an arbitrary number of string variables which can be set via DBMS_SESSION.SET_CONTEXT and accessed via the SYS_CONTEXT() function. Contexts are mainly used in conjunction with Row-Level Security, but they can be used on their own as well.

In 10g, the variables defined in the context namespace on statement coordinator level will be propagated to the parallel execution slaves, so all you need to do is set your variables and read them in your user-defined PL/SQL function being executed in parallel. It should be able to access the variables, although they have been defined in the parent session.

In 9i, the official documentation (Application Developer's Guide - Fundamentals, Chapter 12: "Implementing Application Security Policies") says the following:

"If SYS_CONTEXT is used inside a SQL function which is embedded in a parallel query, the function cannot pick up the application context. This is true because the application context exists only in the user session. To use these features in combination, you must call SYS_CONTEXT directly from the query."

This seems to be outdated resp. a documentation bug, because when testing the 10g version with 9.2.0.8, it worked exactly the same, so the context information was propagated to the parallel execution slaves. Note: Tests with 9.2.0.2 terminated with a "Parallel slave unexpectedly died" error message when accessing the context information in the parallel slave, so make sure that you run a current version of 9iR2 if you plan to use this.

Here is a sample script showing the second approach. It does not need a table to store the parameters, but of course requires more memory, since your parameter data is held in the variables of the namespace defined by the context.

SQL>ִ
SQL>ִdropִtableִtest_source_dataִpurge;

Tableִdropped.

SQL>ִdropִtableִtest_dest_dataִpurge;

Tableִdropped.

SQL>ִdropִpackageִpk_px_parameter_test;

Packageִdropped.

SQL>ִdropִpackageִpk_px_parameter;

Packageִdropped.

SQL>ִdropִcontextִct_px_parameter;

Contextִdropped.

SQL>ִ
SQL>ִcreateִorִreplaceִcontextִct_px_parameterִusingִpk_px_parameter;

Contextִcreated.

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִpk_px_parameterִas
ִִ2ִִִִ--ִutilityִfunctions
ִִ3ִִִִfunctionִget_my_sidִreturnִnumberִdeterministicִparallel_enable;
ִִ4ִִִִfunctionִget_my_inst_idִreturnִvarchar2ִdeterministicִparallel_enable;
ִִ5ִִִִ--ִcallִthisִprocedureִinִyourִstatementִcoordinatorִsessionִtoִsetִaִparameterִbefore
ִִ6ִִִִ--ִexecutingִaִstatementִinִparallel
ִִ7ִִִִ--ִitִdoesִnotִcommitִitsִwork,ִyouִneedִtoִdoִitִyourself
ִִ8ִִִִprocedureִset_px_parameter(in_s_parameter_nameִinִvarchar2,ִin_s_parameter_valueִinִvarchar2);
ִִ9ִִִִ--ִcallִthisִfunctionִinִyourִuser-definedִpl/sqlִfunctionִtoִgetִaִparameterִvalue
ִ10ִִִִ--ִshouldִworkִinִbothִparallelִorִserialִmode
ִ11ִִִִfunctionִget_px_parameter(in_s_parameter_nameִinִvarchar2)ִreturnִvarchar2ִparallel_enable;
ִ12ִִendִpk_px_parameter;
ִ13ִִ/

Packageִcreated.

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִbodyִpk_px_parameterִas
ִִ2ִִִִfunctionִget_my_sidִreturnִnumberִdeterministicִparallel_enableִis
ִִ3ִִ ִ
ִִn_my_sidִnumber;
ִִ4ִִִִbegin
ִִ5ִִ ִ
ִִselectִsidִintoִn_my_sidִfromִv$mystatִwhereִrownumִ=ִ1;
ִִ6ִִ ִ
ִִreturnִn_my_sid;
ִִ7ִִִִendִget_my_sid;
ִִ8ִִ
ִִ9ִִִִfunctionִget_my_inst_idִreturnִvarchar2ִdeterministicִparallel_enableִis
ִ10ִִ ִ
ִִs_my_inst_idִvarchar2(30);
ִ11ִִִִbegin
ִ12ִִ ִ
ִִs_my_inst_idִ:=ִsys_context('USERENV',ִ'INSTANCE');
ִ13ִִ ִ
ִִreturnִs_my_inst_id;
ִ14ִִִִendִget_my_inst_id;
ִ15ִִ
ִ16ִִִִfunctionִget_my_idִreturnִvarchar2ִdeterministicִis
ִ17ִִ ִ
ִִn_my_sidִnumber;
ִ18ִִ ִ
ִִs_my_inst_idִvarchar2(30);
ִ19ִִ ִ
ִִs_my_idִvarchar2(50);
ִ20ִִִִbegin
ִ21ִִ ִ
ִִn_my_sidִ:=ִget_my_sid;
ִ22ִִ ִ
ִִs_my_inst_idִ:=ִget_my_inst_id;
ִ23ִִ ִ
ִִs_my_idִ:=ִs_my_inst_idִ||ִ'|'ִ||ִto_char(n_my_sid,ִ'TM');
ִ24ִִ
ִ25ִִ ִ
ִִreturnִs_my_id;
ִ26ִִִִendִget_my_id;
ִ27ִִ
ִ28ִִִִfunctionִget_coordinator_idִreturnִvarchar2ִdeterministicִis
ִ29ִִ ִ
ִִn_my_sidִnumber;
ִ30ִִ ִ
ִִn_owneridִnumber;
ִ31ִִ ִ
ִִn_parent_sidִnumber;
ִ32ִִ ִ
ִִn_parent_inst_idִnumber;
ִ33ִִ ִ
ִִs_coordinator_idִvarchar2(50);
ִ34ִִִִbegin
ִ35ִִ ִ
ִִn_my_sidִ:=ִget_my_sid;
ִ36ִִ ִ
ִִSELECTִownerid,
ִ37ִִ ִ
ִִround(bitand(ownerid,ִ65535))ִasִparent_session_sid,
ִ38ִִ ִ
ִִround(bitand(ownerid,ִ16711680)ִ/ִ65536)ִasִparent_session_instid
ִ39ִִ ִ
ִִintoִn_ownerid,ִn_parent_sid,ִn_parent_inst_id
ִ40ִִ ִ
ִִfromִv$sessionִwhereִsidִ=ִn_my_sid;
ִ41ִִ
ִ42ִִ ִ
ִִifִn_owneridִ=ִ2147483644ִthen
ִ43ִִ ִִִ
ִִ--ִnoִparallelִexecution,ִuseִcurrentִinst_idִandִsid
ִ44ִִ ִִִ
ִִs_coordinator_idִ:=ִget_my_inst_idִ||ִ'|'ִ||ִto_char(n_my_sid,ִ'TM');
ִ45ִִ ִ
ִִelse
ִ46ִִ ִִִ
ִִ--ִparallelִexecution,ִuseִparentִinfo
ִ47ִִ ִִִ
ִִs_coordinator_idִ:=ִto_char(n_parent_inst_id,ִ'TM')ִ||ִ'|'ִ||ִto_char(n_parent_sid,ִ'TM');
ִ48ִִ ִ
ִִendִif;
ִ49ִִ
ִ50ִִ ִ
ִִreturnִs_coordinator_id;
ִ51ִִִִendִget_coordinator_id;
ִ52ִִ
ִ53ִִִִ--ִcallִthisִprocedureִinִyourִstatementִcoordinatorִsessionִtoִsetִaִparameterִbefore
ִ54ִִִִ--ִexecutingִaִstatementִinִparallel
ִ55ִִִִ--ִitִdoesִnotִcommitִitsִwork,ִyouִneedִtoִdoִitִyourself
ִ56ִִִִprocedureִset_px_parameter(in_s_parameter_nameִinִvarchar2,ִin_s_parameter_valueִinִvarchar2)ִis
ִ57ִִ ִ
ִִ-- s_my_idִvarchar2(50);
ִ58ִִִִbegin
ִ59ִִ ִ
ִִ-- s_my_idִ:=ִget_my_id;
ִ60ִִ
ִ61ִִ ִ
ִִdbms_session.set_context('CT_PX_PARAMETER',ִin_s_parameter_name,ִin_s_parameter_value);
ִ62ִִִִendִset_px_parameter;
ִ63ִִִִ--ִcallִthisִfunctionִinִyourִuser-definedִpl/sqlִfunctionִtoִgetִaִparameterִvalue
ִ64ִִִִ--ִshouldִworkִinִbothִparallelִorִserialִmode
ִ65ִִִִfunctionִget_px_parameter(in_s_parameter_nameִinִvarchar2)ִreturnִvarchar2ִparallel_enableִis
ִ66ִִ ִ
ִִ-- s_coordinator_idִvarchar2(50);
ִ67ִִ ִs_parameter_valueִvarchar2(4000);
ִ68ִִִִbegin
ִ69ִִ ִ
ִִ-- s_coordinator_idִ:=ִget_coordinator_id;
ִ70ִִ
ִ71ִִ ִ
ִִs_parameter_valueִ:=ִsys_context('CT_PX_PARAMETER',ִin_s_parameter_name);
ִ72ִִ
ִ73ִִ ִ
ִִreturnִs_parameter_value;
ִ74ִִִִendִget_px_parameter;
ִ75ִִendִpk_px_parameter;
ִ76ִִ/

Packageִbodyִcreated.

SQL>ִ
SQL>ִcreateִtableִtest_source_dataִparallelִnologgingִasִselectִ*ִfromִall_objects;

Tableִcreated.

SQL>ִ
SQL>ִselectִcount(*)ִfromִtest_source_data;

ִִCOUNT(*)
----------
ִִִִִ98445

SQL>ִ
SQL>ִcreateִtableִtest_dest_dataִ(
ִִ2ִִthe_sidִnumberִnotִnull,
ִִ3ִִthe_test_dataִvarchar2(4000)ִnotִnull)
ִִ4ִִparallelִnologging;

Tableִcreated.

SQL>ִ
SQL>ִalterִsessionִenableִparallelִdml;

Sessionִaltered.

SQL>ִ
SQL>ִbeginִpk_px_parameter.set_px_parameter('THE_DATA',ִ'Thisִis
ִreadִbyִtheִparallelִslave!');ִend;
ִִ2ִִ/

PL/SQLִprocedureִsuccessfullyִcompleted.

SQL>ִ
SQL>ִcolumnִparameter_valueִformatִa40
SQL>ִ
SQL>ִselectִsys_context('CT_PX_PARAMETER',ִ'THE_DATA')ִasִparameter_valueִfromִdual;

PARAMETER_VALUE
----------------------------------------
Thisִis
ִreadִbyִtheִparallelִslave!

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִpk_px_parameter_testִis
ִִ2ִִִִfunctionִget_the_sidִreturnִnumberִdeterministicִparallel_enable;
ִִ3ִִִִfunctionִget_the_test_dataִreturnִvarchar2ִdeterministicִparallel_enable;
ִִ4ִִendִpk_px_parameter_test;
ִִ5ִִ/

Packageִcreated.

SQL>ִ
SQL>ִcreateִorִreplaceִpackageִbodyִpk_px_parameter_testִis
ִִ2ִִִִg_n_the_sidִnumber;
ִִ3ִִִִg_s_the_test_dataִvarchar2(4000);
ִִ4ִִ
ִִ5ִִִִfunctionִget_the_sidִreturnִnumberִdeterministicִparallel_enableִis
ִִ6ִִִִbegin
ִִ7ִִ ִ
ִִifִg_n_the_sidִisִnullִthen
ִִ8ִִ ִִִ
ִִg_n_the_sidִ:=ִpk_px_parameter.get_my_sid;
ִִ9ִִ ִ
ִִendִif;
ִ10ִִ
ִ11ִִ ִ
ִִreturnִg_n_the_sid;
ִ12ִִִִendִget_the_sid;
ִ13ִִ
ִ14ִִִִfunctionִget_the_test_dataִreturnִvarchar2ִdeterministicִparallel_enableִis
ִ15ִִִִbegin
ִ16ִִ ִ
ִִifִg_s_the_test_dataִisִnullִthen
ִ17ִִ ִִִ
ִִg_s_the_test_dataִ:=ִ'I''mִsession:ִ'ִ||ִpk_px_parameter.get_my_sidִ||ִ'ִandִIִgotִthis:ִ'ִ||ִpk_px_parameter.get_px_parameter('THE_DATA');
ִ18ִִ ִ
ִִendִif;
ִ19ִִ
ִ20ִִ ִ
ִִreturnִg_s_the_test_data;
ִ21ִִִִend;
ִ22ִִendִpk_px_parameter_test;
ִ23ִִ/

Packageִbodyִcreated.

SQL>ִ
SQL>ִinsertִ/*+ִappendִparallel(d)ִ*/ִintoִtest_dest_dataִdִ(the_sid,ִthe_test_data)
ִִ2ִִselectִpk_px_parameter_test.get_the_sid,ִpk_px_parameter_test.get_the_test_data
ִִ3ִִfromִtest_source_data;

98445ִrowsִcreated.

SQL>ִ
SQL>ִcommit;

Commitִcomplete.

SQL>ִ
SQL>ִcolumnִthe_test_dataִformatִa40
SQL>ִ
SQL>ִselectִthe_sid,ִthe_test_data,ִcount(*)ִfromִtest_dest_data
ִִ2ִִgroupִbyִthe_sid,ִthe_test_data;

ִִִTHE_SIDִTHE_TEST_DATAִִִִִִִִִִִִִִִִִִִִִִִִִִִִִִCOUNT(*)
----------ִ----------------------------------------ִ----------
ִִִִִִִ116ִI'mִsession:ִ116ִandִIִgotִthis:ִThisִisִִִִִִ42005
ִִִִִִִִִִִread
ִbyִtheִparallelִslave!

ִִִִִִִ117ִI'mִsession:ִ117ִandִIִgotִthis:ִThisִisִִִִִִ56440
ִִִִִִִִִִִreadִby
ִtheִparallelִslave!


SQL>ִ
SQL>ִspoolִoff

This version of the script requires the CREATE ANY CONTEXT system privilege to work properly. If you want to be able to drop contexts, you need in addition the DROP ANY CONTEXT system privilege granted.

Note that this version of PK_PX_PARAMETER does not need a table to store the parameter information, but therefore consumes potentially more memory depending on the amount of data you attempt to pass and the way Oracle handles internally the propagation of the context to the parallel execution slaves.

In case you need to pass larger content I recommend using the database table based approach. If the data you need to pass is more of lightweight nature, you could use the context/namespace approach.

3 comments:

Anonymous said...

We're using the SYS_CONTEXT trick for implementing parametrized views on "Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit" - Solaris E25k. So this seems to work already below 9.2.0.8. Reason we needed parametrized views: aggregate in parallel on instance A, transmitting result only to instance B over DB link, and inserting stuff on instance B in parallel. Accessing table functions seem to not work (cannot access remote object type). Btw.: intersting blog!

Michael Reitsma said...

Hi,

I am looking around to find an answer to my problem and stumbled over your site.
Perhaps you can share me some of your insights...
I am not really looking for ready baked code, just a hint or answer if it is possible what i want.
Anyway ....

In my plsql function we currently do this:
-look at table A, group by Colom A
-for every row in group by set call another function, B.
-Function B just writes all entries in Table A (that match entry) to a file.
-Thus creating a file for every group by record conatining all records belonging to that group by record.

We use utl_file for this and buffer technique so that it is sped up the most.
Indeed the time to create a 12Mb file is roughly the same as it is for a 10K file.
However every file takes about 1 to 2 minutes to create.
When there are 70 files to be made it takes 1,5 hrs !

So this is my problem: can i speed this up by creating these files in parallel ?
Sort of like calling Function B in parallel ?

kind regards Mike

Randolf said...

Mike,

as long as you don't know where most of your time is spent, i.e. why it takes 1 to 2 minutes to create the file, everything is just speculation. This means even when running your process in parallel you still might need the same time if the jobs wait in parallel for the same bottleneck.

You can actually call your function in parallel, e.g. by simply submitting a job per group and making sure the database is allowed to spawn a sufficient number of job processes, e.g. using the JOB_QUEUE_PROCESSES parameter when using DBMS_JOB.

Search e.g. on AskTom for "Do it yourself parallelism" or e.g. this

Randolf