Oracle Essentials Oracle Database 11g (58 page)

BOOK: Oracle Essentials Oracle Database 11g
7.79Mb size Format: txt, pdf, ePub

The Oracle Optimized Warehouse Initiative is a series of Oracle-based data warehousing reference configurations, with key platform and storage providers such as HP, IBM, Sun, and EMC/Dell. Several of Oracle’s platform partners have also announced Oracle Optimized Warehouses that are pretested hardware configurations with preinstalled Oracle databases. The reference configurations include a variety of node, system, and storage configurations offering a variety of upgrade paths. As a starting point, you should understand the complexity of queries in the workload, the amount of data, and the number of concurrent users. Oracle and its partners continue to update these configurations as hardware platforms are improved. For more information, search on “Optimized Warehouse” on Oracle’s main web site.

Disk Deployment Strategies

Disks are often directly attached to systems—more expensive systems offer faster disk controllers and I/O. As network bandwidth has improved, Network Attached Storage (NAS) and Storage Area Networks (SAN) have appeared as cost-effective alternatives. Disks are also configured in a variety of ways for redundancy, eliminating the possibility of single points of disk failure resulting in loss of access to data.

Disk is commonly deployed in arrays, the industry standard being RAID (Redundant Array of Inexpensive/Independent Disks). Youcan use RAID as a part of any of the configurations we’ve discussed to provide higher performance and reliability.

RAID disk arrays were introduced in this book in
Chapter 7
and discussed in the context of their use in high-availability scenarios in
Chapter 11.
Please refer to those chapters for more information about RAID disk arrays. In addition, since Oracle
Disk and Storage Technology

|

301

Database 10
g
, Automatic Storage Management (ASM) delivers much of the functionality of a RAID array, such as striping and mirroring, with a collection of commodity

disks. ASM is further described in Chapter 5.

Oracle9
i
introduced table compression in the database as a means of decreasing disk storage requirements primarily in data warehousing. Duplicate values in a data block are eliminated because values that are duplicated are stored in a symbol table at the beginning of the block, and all additional occurrences are replaced with a short reference to the symbol table. Oracle Database 11
g
also features an Advanced Compression Option for insert, update, and delete operations important in OLTP

operations. Data compression of 50 percent is commonly observed today. In addition to reducing disk storage, compressed data can also be advantageous for performance when it fits entirely into cache (instead of requiring disk access).

Since storage capacities are growing and disk is available at lower cost points, many organizations are now storing all relevant data online in disk storage for data warehousing and business intelligence implementations. Given that disks delivering the best performance are typically more expensive and of lower capacity, many now deploy such disks in combination with higher capacity but lower performing (and cheaper) disk for less frequently accessed data. Information Lifecycle Management (ILM) in the Oracle database, particularly the ILM Assistant, first available in 2007, provide the capability to manage such an environment.

Which Platform Deployment Solution?

In a world in which there was no limit to the amount of money you could spend on hardware, you could make a simple decision about the most appropriate hardware: simply choose the level of throughput and reliability you need, and go buy it! Unfortunately, we have yet to discover the location of this kind of world, so your choice of a hardware solution will often be a compromise. But since this book was first written, relative price points have continued to collapse making this selection much easier.

Platform Comparison

The most commonly implemented hardware platform for an Oracle server is the SMP system, which strikes a nice balance between power and price. SMP systems are popular for the following reasons:

• SMP systems offer more and simpler scalability options for the future than uniprocessor systems.

• 64-bit processors and operating systems with large memory support enable SMP

systems to handle the needs of very large databases (even containing dozens of terabytes of data).

302

|

Chapter 12: Oracle and Hardware Architecture

• SMP systems have a single operating system and a single Oracle instance to manage and maintain, unlike clusters.

• More applications are certified to run on SMP systems than clusters.

• SMP systems can be less expensive than NUMA, clusters, or grid configurations in similar CPU configurations because memory and I/O subsystems are not duplicated to the same degree

This is not to say that other configurations should not be considered. Certainly, if scalability demands exceed the capabilities of SMP machines, clusters or a grid may provide the only viable solution. Clusters can prove cheaper through use of

“commodity” nodes in RAC configurations. With careful planning and an enterprise-computing management style, such configurations do provide powerful and highly available solutions.

Today, one of the key tradeoffs in determining the type of system to deploy is the cost associated with deploying multicore CPUs versus CPUs consisting of single cores. This analysis extends beyond simply hardware costs since database vendors have adopted new pricing models to take this technology into account. Oracle’s pricing policy has changed in reaction to accepted industry practices in this regard.

Many organizations purchase CPU-based licenses of Oracle based on the number of CPUs in their platforms. However, where multicore CPUs are deployed, the incremental Oracle license price is not at a 1:1

ratio with the number of additional cores. This is because platform vendors and Oracle Corporation recognize there is overhead associated with multicore technology, so Oracle license prices increase incrementally based on expected performance gains. Of course,

technologies and industry pricing practices often change more frequently than major database releases, and that is one of the reasons why we don’t dwell on pricing in this book. To figure out current tradeoffs, youwill likely need the help of both your platform provider and Oracle.

Table 12-2
provides a comparison of the relative strengths of the different deployment platforms for scalability, manageability, and availability.

Table 12-2. Relative strengths of deployment platforms

Ranking

Scalability

Manageability

Availability

Best

Grid

Uniprocessor

Grid

Cluster

SMP

Cluster

SMP

Grid

SMP

Worst

Uniprocessor

Cluster

Uniprocessor

Which Platform Deployment Solution?

|

303

You should select a storage technology based on your performance and recovery requirements and budget. In general, more expensive solutions offer better performance and more flexible availability options. Be sure to consider throughput requirements as you choose storage.

Approaches to Choosing Platforms

When selecting a solution for deployment, most organizations choose systems that will meet anticipated performance and scalability needs for the near future, taking into account management and availability requirements. However, there are two additional approaches to be considered.

First is the truism with which we’re all familiar—the longer you wait, the cheaper computer hardware (and related components) get. According to Moore’s Law, credited by Intel to Gordon Moore in 1965 (and proven many times over since then), each chip will double in computing power every 18–24 months, each time providing huge leaps in performance. Today, such performance increases are driven by increased clock speeds and the introduction of more cores in the processors.

This continual reduction in price and increase in performance characteristics is an ongoing fact of life in the computer hardware industry. But how can you use this fact in planning deployment strategies for your organizational system architecture?

Buy what youneed, when youneed it, and plan for the obsolescence of hardware by recycling it into the organization when it no longer meets the needs of an individual application. For instance, today’s departmental server may turn into tomorrow’s web server. With grid deployment, youmight continue to leverage older hardware as part of the existing computing solution.

Second, remember to consider the effect of hardware upgrades, particularly CPU

upgrades, in nongrid solutions. SMP systems and nodes require that all CPUs be identical within them, so if youupgrade one youwill have to upgrade all of them. At some point the vendor will recommend a new system anyway because other internal features (e.g., memory and I/O bus technologies) will have improved, partly to match the increased capabilities of the new CPUs.

The grid is tempting to consider since new machine types can be added to the grid as they become available. Oracle’s self-tuning and advanced management capabilities available in Oracle Database 10
g
and further improved in Oracle Database 11
g
make grid computing more practical by eliminating difficult manual tuning efforts that formerly needed to take into account variations in systems.

304

|

Chapter 12: Oracle and Hardware Architecture

Chapter 13

CHAPTER 13

Oracle Distributed Databases and

Distributed Data13

Data in large and mid-sized companies can be spread over many different databases.

The data can be on different servers running different operating systems or even different database management systems. The data needed to answer any specific business question may need to be accessed from more than one server. A user may need to access this separate data on several servers simultaneously, or the data required for an answer may need to be moved to a local server. Inserts, updates, or deletions of data across these distributed servers may also be necessary.

There are two basic ways to deal with data in distributed databases: as part of a single distributed entity in which the distributed architecture is transparent, or by using a variety of replication techniques to create copies of the data in more than one location. This chapter examines both of these options and the technologies associated with each solution.

Accessing Multiple Databases As a Single Entity

Users sometimes need to query or manipulate data that resides in multiple Oracle databases or in a mixture of Oracle and non-Oracle databases. This section describes a number of techniques and architectures you can use to interact with data in a distributed environment.

Distributed Data Access Across Multiple Oracle Databases

For many years, Oracle has offered access to distributed data residing on multiple Oracle database servers on multiple systems or
nodes
. Users don’t need to know the location of the data in distributed databases. Data is accessed using a unique identifier to a specific table name. Administrators can create simple identifiers so that data in an Oracle table in a separate machine can appear to users to be part of a single logical database.

305

Developers can create connections between individual databases by creating database links in SQL. These connections form a distributed database. The statement: CREATE PUBLIC DATABASE LINK employees.northpole.bigtoyco.com creates a path to a remote database containing a table with Bigtoyco’s North Pole employees. Any application or user attached to a local employees database can access the remote North Pole database by using the global access name (employees.

northpole.bigtoyco.com) in SQL queries, inserts, updates, deletions, and other statements. Oracle Net (previously known as Net8 or SQL*Net in older database releases) handles the interaction with any network protocols used to communicate with the remote database transparently.

Although the database link makes data access transparent to users, Oracle still has to treat interactions over distributed databases differently. Let’s look briefly at how queries and updates issued for distributed Oracle databases differ from those issued for a single Oracle database. When using distributed data in a query, your primary concern is to properly optimize the retrieval of data for a query. Queries in a single Oracle database are optimized for performance, most frequently using the cost-based optimizer, as discussed in
Chapter 4.
Oracle7 added global cost-based optimization for the improvement of query performance across distributed databases as well. For example, the cost-based optimizer considers indexes on remote databases when choosing a plan, whereas the rule-based optimizer does not. The cost-based optimizer also considers statistics on remote databases. Improvements to the Oracle8
i
optimizer included optimizing for join and set operations to be performed on the nodes offering the best performance and also minimizing the amount of data sent between systems. Since Oracle Database 10
g
, the cost-based optimizer is Oracle’s only recommended optimizer for single and for distributed databases.

Other books

Flash by Jayne Ann Krentz
One Night Three Hearts by Adele Allaire
An Available Man by Hilma Wolitzer
The Blue Rose by Anthony Eglin
The Collective by Jack Rogan
The House at Midnight by Lucie Whitehouse
Bardelys the Magnificent by Rafael Sabatini