CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER
If you specify the partition information when building a local-partitioned index, the number of partitions must match the number of partitions in the table on which the partitioned index is built.
Oracle automatically keeps local index partitions in sync with the table partitions. You can’t explicitly add a partition to or drop a partition from a local index. When you add or drop a table partition, Oracle automatically performs the corresponding work for the local index. Oracle manages the local index partitions regardless of how the local indexes have been assigned to tablespaces.
Local indexes are common in data-warehouse and decision-support systems. If you query frequently by using the partitioned column(s), a local index is appropriate. This approach lets Oracle use the appropriate index and table partition to quickly retrieve the data.
There are two types of local indexes: local
prefixed
and local
nonprefixed
. A local-prefixed index is one in which the leftmost column of the index matches the table partition key. The previous example in this section is a local-prefixed index because its leftmost column (D_DATE_ID) is also the partition key for the table.
A nonprefixed-local index is one in which the leftmost column doesn’t match the partition key used to partition the corresponding table. For example, this is a local-nonprefixed index: SQL> create index f_sales_idx1 on f_sales(sales_amt) local;
The index is partitioned with the SALES_AMT column, which isn’t the partition key of the table, and is therefore a nonprefixed index. You can verify whether an index is considered prefixed by querying the ALIGNMENT column from USER_PART_INDEXES:
select
index_name
,table_name
,alignment
,locality
from user_part_indexes
where table_name = 'F_SALES';
Here’s some sample output:
INDEX_NAME TABLE_NAME ALIGNMENT LOCALI
-------------------- -------------------- ------------ ------
F_SALES_FK1 F_SALES PREFIXED LOCAL
F_SALES_IDX1 F_SALES NON_PREFIXED LOCAL
You may wonder why the distinction exists between prefixed and nonprefixed. A local-nonprefixed index means the index doesn’t include the partition key as a leading edge of its index definition. This can have performance implications, in that a range scan accessing a nonprefixed index may need to search every index partition. If there are a large number of partitions, this can result in poor performance.
You can choose to create all local indexes as prefixed by including the partition-key column in the leading edge of the index. For example, you can create the F_SALES_IDX2 index as prefixed as follows: SQL> create index f_sales_idx2 on f_sales(d_date_id, sales_amt) local; Is a prefixed index better than a nonprefixed index? It depends on how you query your tables. You have to generate explain plans for the queries you use and examine whether a prefixed index is able to better take advantage of partition
pruning
(eliminating partitions to search in) than a nonprefixed index.
Also keep in mind that a multicolumn prefixed local index consumes more space and resources than a nonprefixed local index.
302
CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER
operations on the underlying partitioned table require that the global index partitions be rebuilt. The following operations on a heap-organized table render a global index unusable:
• ADD (HASH)
• COALESCE (HASH)
• DROP
• EXCHANGE
• MERGE
• MOVE
• SPLIT
• TRUNCATE
Consider using the UPDATE INDEXES clause when you perform maintenance operations. Doing so keeps the global index available during the operation and eliminates the need for rebuilding. The downside of using UPDATE INDEXES is that the maintenance operation takes longer due to the indexes being maintained during the action.
Global indexes are useful for queries that retrieve a small set of rows via an index. In these situations, Oracle can eliminate (prune) any unnecessary index partitions and efficiently retrieve the data. For example, global range-partitioned indexes are useful in OLTP environments where you need efficient access to individual records.
Partition Pruning
Partition pruning can greatly improve the performance of queries executing against partitioned tables. If a SQL query specifically accesses a table on a partition key, Oracle only searches the partitions that contain data the query needs (and doesn’t access any partitions that don’t contain data that the query requires—pruning them, so to speak).
For example, say a partitioned table is defined as follows:
create table f_sales (
sales_id number
,sales_amt number
,d_date_id number)
tablespace p1_tbsp
partition by range(d_date_id)(
partition y10 values less than (20110101)
tablespace p1_tbsp
,partition y11 values less than (20120101)
tablespace p2_tbsp
,partition y12 values less than (20130101)
tablespace p3_tbsp
);
Additionally, you create a local index on the partition-key column:
SQL> create index f_sales_fk1 on f_sales(d_date_id) local;
For this example, insert some sample data:
304
CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER
SQL> insert into f_sales values(1,100,20090202);
SQL> insert into f_sales values(2,200,20110202);
SQL> insert into f_sales values(3,300,20120202);
To illustrate the process of partition pruning, enable the autotrace facility: SQL> set autotrace trace explain;
Now, execute a SQL statement that accesses a row based on the partition key: select
sales_amt
from f_sales
where d_date_id = '20110202';
Autotrace displays the explain plan. Some of the columns have been removed in order to fit the output on the page neatly:
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Pstart| Pstop|
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | |
| 1 | PARTITION RANGE SINGLE | | 1 | 2 | 2 |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| F_SALES | 1 | 2 | 2 |
|* 3 | INDEX RANGE SCAN | F_SALES_FK1 | 1 | 2 | 2 |
----------------------------------------------------------------------------------
In this output, Pstart shows that the starting partition accessed is number 2. Pstop shows that the last partition accessed is number 2. In this example, partition 2 is the only partition used to retrieve data; the other partitions in the table aren’t accessed at all by the query.
If a query is executed that doesn’t use the partition key, then all partitions are accessed. For example:
SQL> select * from f_sales;
Here’s the corresponding explain plan:
----------------------------------------------------------------
| Id | Operation | Name | Rows| Pstart| Pstop|
----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | | |
| 1 | PARTITION RANGE ALL| | 3 | 1 | 3 |
| 2 | TABLE ACCESS FULL | F_SALES | 3 | 1 | 3 |
----------------------------------------------------------------
Notice in this output that the starting partition is number 1 and the stopping partition is number 3.
This means partitions 1 through 3 are accessed by this query with no pruning of partitions.
This example is simple but demonstrates the concept of partition pruning. When you access the table by the partition key, you can drastically reduce the number of rows Oracle needs to inspect and process. This has huge performance benefits for queries that are able to prune partitions.
Summary
Oracle provides a partitioning feature that is critical for implementing large tables and indexes.
Partitioning is vital for building highly scalable and maintainable applications. This feature works on the concept of logically creating an object (table or index) but implementing the object as several separate 305
CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER
objects. A partitioned object allows you to build, load, maintain, and query on a partition-by-partition basis. Maintenance operations such as deleting, archiving, updating, and inserting data are manageable because you’re working on only a small subset of the large logical table.
If you work in data-warehouse environments or with large databases, you must be highly knowledgeable of partitioning concepts. As a DBA, you’re required to create and maintain partitioned objects. You have to make recommendations about table-partitioning strategies and where to use local and global indexes. These decisions have a huge impact on the usability and performance of the system.
After this chapter, the book focuses on utilities used to copy and move users, objects, and data from one environment to another. Oracle’s Data Pump and External Table feature are covered in the next two chapters.
306
CHAPTER 13 ■ DATA PUMP
Data Pump Architecture
Data Pump consists of the following components:
• expdp (Data Pump export utility)
• impdp (Data Pump import utility)
• DBMS_DATAPUMP PL/SQL package (Data Pump API)
• DBMS_METADATA PL/SQL package (Data Pump Metadata API)
The expdp and impdp utilities use the DBMS_DATAPUMP and DBMS_METADATA built-in PL/SQL packages when exporting and importing data and metadata. The DBMS_DATAPUMP package moves entire databases or subsets of data between database environments. The DBMS_METADATA package exports and imports information about database objects.
■
Note
You can call the DBMS_DATAPUMP and DBMS_METADATA packages independently (outside of expdp and impdp) from SQL*Plus. I rarely call these packages directly from SQL*Plus; but you may have a specific scenario where it’s desirable to interact directly with them. See the Oracle Database PL/SQL Packages and Types Reference guide (available on OTN) for more details.
When you start a Data Pump export or import job, a master operating-system process is initiated on the database server. This master process name has the format ora_dmNN_
oracle 14950 717 0 10:59:06 ? 0:10 ora_dm00_STAGE
Depending on the degree of parallelism and the work specified, a number of worker processes are also started. The master process coordinates the work between master and worker processes. The worker process names have the format ora_dwNN_
Also, when a user starts an export or import job, a database status table is created (owned by the user who starts the job). This table exists for the duration of the Data Pump job. The name of the status table is dependent on what type of job you’re running. The table is named with the format SYS_