PDF

IBM Cognos Proven Practices: Using DB2 Row
Compression in Adaptive Warehouse Projects
Product(s): Adaptive Analytics Framework, Adaptive Warehouse,
Adaptive Analytics, DB2 LUW; Area of Interest: Performance
Greg Harris
Software Engineer
IBM
14 December 2010
This document describes implementation of DB2 row compression in an Adaptive Warehouse
generated data warehouse.
View more content in this series
Introduction
Purpose
This document describes implementation of DB2 row compression in an Adaptive Warehouse
generated data warehouse. DB2 row compression features can decrease disk storage
requirements while at the same time improve sequential read access of large queries which
perform full table scans.
Applicability
Adaptive Warehouse all versions using DB2 ESE LUW 9.1 with the row compression option or
greater as a target warehouse backing store.
Exclusions and Exceptions
None.
About DB2 Row Compression
DB2 row compression was designed to reduce disk storage space requirements of database
objects. It does this by creating a compression dictionary of symbols which replace repeating data
patterns at the row level. There is some CPU overhead in performing the dictionary translation,
however there is also an improvement in disk IO efficiency as more data can be retrieved per IO
unit. A data page also remains compressed in the DB2 buffer pool there by allowing more data to
be cached in memory for potential reuse.
© Copyright IBM Corporation 2010
IBM Cognos Proven Practices: Using DB2 Row Compression in
Adaptive Warehouse Projects
Trademarks
Page 1 of 6
developerWorks®
ibm.com/developerWorks/
Applications implemented on DB2 using the Adaptive Warehouse (AW) which contain large fact
tables are suitable candidates for implementing DB2 row compression. Report queries where
full table scans are performed can gain marked improvement in query performance when row
compression is applied.
Example Query Performance without and with Row Compression
Enabled
Following are DB2BATCH run sample summaries illustrating query execution times before
and after row compression has been enabled. These examples refer to the fact table
ORDER_ACTIVITY_MEASURES from the GOSales sample warehouse application. This report
query requires a full table scan of the fact table.
* Timestamp: Mon Jun 29 2009 12:08:05 Eastern Daylight Time
--------------------------------------------* SQL Statement Number 1:
SELECT "T0"."C0" "Yearkey"
,
"T0"."C1" "C__memberCaption" ,
"T0"."C0" "Ordinal0"
,
"T0"."C2" "Quarterkey"
,
"T0"."C3" "C__memberCaption0" ,
"T0"."C4" "Ordinal1"
,
"T0"."C5" "Monthkey"
,
"T0"."C6" "C__memberCaption1" ,
"T0"."C7" "Ordinal"
,
"T0"."C8" "C___PMT"
,
"T0"."C9" "C___SMT1"
FROM
(SELECT "ALL_TIME"."GREGORIAN_CALENDAR_YEAR"
"C0" ,
"ALL_TIME"."GREGORIAN_YEAR_CAPTION"
"C1" ,
"ALL_TIME"."GREGORIAN_CALENDAR_QUARTER"
"C2" ,
"ALL_TIME"."GREGORIAN_QUARTER_CAPTION"
"C3" ,
"ALL_TIME"."GREGORIAN_CALENDAR_YEAR" * 4
+ "ALL_TIME"."GREGORIAN_CALENDAR_QUARTER"
"C4" ,
"ALL_TIME"."GREGORIAN_CALENDAR_MONTH"
"C5" ,
"ALL_TIME"."GREGORIAN_MONTH_CAPTION"
"C6" ,
"ALL_TIME"."GREGORIAN_CALENDAR_YEAR" * 12
+ "ALL_TIME"."GREGORIAN_CALENDAR_MONTH"
"C7" ,
SUM("Order_Activity_Measures"."ORDER_NET_AMOUNT")
"C8" ,
SUM("Order_Activity_Measures"."ORDER_MARGIN_AMOUNT")
"C9"
FROM
"DB2ADMIN"."ALL_TIME" "ALL_TIME",
"DB2ADMIN"."ORDER_ACTIVITY_MEASURES" "Order_Activity_Measures"
WHERE
"ALL_TIME"."GREGORIAN_CALENDAR_YEAR" * 12
+ "ALL_TIME"."GREGORIAN_CALENDAR_MONTH" BETWEEN 24062 AND 24073
AND (
'Rolling Periods' = 'Rolling Periods'
OR 'Rolling Periods' = 'YTD'
)
AND "ALL_TIME"."CALENDAR_TYPE_NAME" IN ('Gregorian',
'-'
,
'Late Date',
'Early Date')
AND NOT "ALL_TIME"."GREGORIAN_CALENDAR_YEAR" IS NULL
AND "ALL_TIME"."CALENDAR_TYPE_NAME" IN ('Gregorian',
'-'
,
'Late Date',
'Early Date')
AND NOT "ALL_TIME"."GREGORIAN_CALENDAR_QUARTER" IS NULL
AND "ALL_TIME"."CALENDAR_TYPE_NAME" IN ('Gregorian',
'-'
,
IBM Cognos Proven Practices: Using DB2 Row Compression in
Adaptive Warehouse Projects
Page 2 of 6
ibm.com/developerWorks/
developerWorks®
'Late Date',
'Early Date')
AND NOT "ALL_TIME"."GREGORIAN_CALENDAR_MONTH" IS NULL
AND "Order_Activity_Measures"."ORDER_DATE_ALL_TIME_SID"
= "ALL_TIME"."ALL_TIME_SID"
GROUP BY "ALL_TIME"."GREGORIAN_CALENDAR_YEAR"
"ALL_TIME"."GREGORIAN_YEAR_CAPTION"
"ALL_TIME"."GREGORIAN_CALENDAR_QUARTER"
"ALL_TIME"."GREGORIAN_QUARTER_CAPTION"
"ALL_TIME"."GREGORIAN_CALENDAR_YEAR" * 4
+ "ALL_TIME"."GREGORIAN_CALENDAR_QUARTER",
"ALL_TIME"."GREGORIAN_CALENDAR_MONTH"
"ALL_TIME"."GREGORIAN_MONTH_CAPTION"
"ALL_TIME"."GREGORIAN_CALENDAR_YEAR" * 12
+ "ALL_TIME"."GREGORIAN_CALENDAR_MONTH"
) "T0" FOR
FETCH ONLY;
,
,
,
,
,
,
Yearkey
C__memberCaption
Ordinal0
Quarterkey C__memberCaption0
Ordinal1
Monthkey
C__memberCaption1
Ordinal
C___PMT
C___SMT1
----------- -------------------- ----------- ----------- -------------------- --------------------- -------------------- ----------- --------------------------------- -------------------------------2006 2006
2006
1 Q1/2006
8025
1 Jan/2006
24073
2704738640.30
1129361652.90
* 1 row(s) fetched, 1 row(s) output.
* Elapsed Time is:
12.079891 seconds
* Summary Table:
Type
Number
Repetitions Total Time (s) Min Time (s)
Max Time (s)
Arithmetic Mean Geometric Mean Row(s) Fetched Row(s) Output
--------- ----------- ----------- -------------- -------------- ---------------------------- -------------- -------------- ------------Statement
1
1
12.079891
12.079891
12.079891
12.079891
12.079891
1
1
* Total Entries:
1
* Total Time:
12.079891 seconds
* Minimum Time:
12.079891 seconds
* Maximum Time:
12.079891 seconds
* Arithmetic Mean Time:
12.079891 seconds
* Geometric Mean Time:
12.079891 seconds
--------------------------------------------* Timestamp: Mon Jun 29 2009 12:08:18 Eastern Daylight Time
After enabling compression the query executes nearly 30% more quickly, as shown below:
* 1 row(s) fetched, 1 row(s) output.
* Elapsed Time is:
8.547579 seconds
* Summary Table:
Type
Number
Repetitions Total Time (s) Min Time (s)
Max Time (s)
Arithmetic Mean Geometric Mean Row(s) Fetched Row(s) Output
--------- ----------- ----------- -------------- -------------- ---------------------------- -------------- -------------- ------------Statement
1
1
8.547579
8.547579
8.547579
8.547579
8.547579
1
1
* Total Entries:
* Total Time:
1
8.547579 seconds
IBM Cognos Proven Practices: Using DB2 Row Compression in
Adaptive Warehouse Projects
Page 3 of 6
developerWorks®
ibm.com/developerWorks/
* Minimum Time:
8.547579 seconds
* Maximum Time:
8.547579 seconds
* Arithmetic Mean Time:
8.547579 seconds
* Geometric Mean Time:
8.547579 seconds
--------------------------------------------* Timestamp: Mon Jun 29 2009 12:36:24 Eastern Daylight Time
Altering Adaptive Warehouse Fact Tables to enable Row
Compression
There are some differences in how row compression is implemented using DB2 version 9.1 and
later versions.
For DB2 9.1:
a. Run AW load management to deliver some initial data for a fact build. Before clicking the run
button, check the synchronization tab and note the name of any table objects ending with
“_MEASURES”. This will be the physical name of the fact table created in the warehouse.
Complete the data load.
b. Using the DB2cmd SQL tool, alter the AW fact tables noted in (a) to enable
row compression. For example to enable row compression for the table
MYSCHEMA.JOB_ACTION_MEASURES, enter the command:
ALTER TABLE MYSCHEMA.JOB_ACTION_MEASURES COMPRESS YES;
c. Reorganize the table to create a compression dictionary and compress the existing
contents.Eg.
REORG TABLE MYSCHEMA.JOB_ACTION_MEASURES RESETDICTIONARY;
d. Continue load management operations as you normally would. All subsequent data added to
the table will be compressed.
For later versions of DB2:
NaN From DB2 version 9.5 the database can dynamically create the compression dictionary as
data is loaded into the fact table. Using load management check the synchronization tab for
the fact build table objects, locate those with names ending with “_MEASURES” and note the
name and status.
NaN If the synchronization status is “will be created”, select “Script” and choose the “Create a
separate script for database synchronisation” option then click OK . Edit the generated SQL
script adding the COMPRESS YES clause to the CREATE TABLE statement for the noted
fact tables. Run the SQL script on the warehouse schema.
NaN If the synchronization status is “same as db”, alter the table using the DB2cmd SQL editor.
For example.
ALTER TABLE MYSCHEMA.JOB_ACTION_MEASURES COMPRESS YES;
NaN Continue load management operations as you would normally. All subsequent data added to
the table will be compressed.
Script to enable Row Compression across AW Fact Tables
The following script can be edited and run on your warehouse database to generate a script to
enable row compression across all AW created fact tables. The script below will scan for AW
IBM Cognos Proven Practices: Using DB2 Row Compression in
Adaptive Warehouse Projects
Page 4 of 6
ibm.com/developerWorks/
developerWorks®
fact tables named using the wildcard expression %_MEASURES and scripts ALTER TABLE and
REORG commands. Note the script runs the generated commands immediately so review the
script accordingly.
If running DB2 9.1 you will have had to run load management to initially deliver some data to your
fact tables. For later versions of DB2 you can remove or ignore the REORG statements or their
generation SQL, which are not required to be run.
A copy of the script is installed in the <AW root >\PW\scripts folder when installing the Adaptive
Application Framework, see the file example_row_compression.db2.
db2
db2
db2
db2
db2
db2
db2
db2
db2
db2
db2
db2
db2
db2
-- example_row_compression.db2
--- Description:
This script enables db2 row compression
-on all AAF described fact tables.
-Note that DB2 Enterprise Edition
-with the row compresson option or
-DB2 Data Warehouse Edition is
-require to support this.
-Ensure DB is idle before using and
-warehouse is populated with data.
--- Usage: Edit for proper <database_name> and
-<schema_name> . Run from DB2 CLIP window.
--
db2 connect to <database_name>
db2 -x -z comp1.sql "select distinct 'ALTER TABLE <schema_name>.'||a.name||'
COMPRESS YES;' from <schema_name>.pwd_physical_object a,
<schema_name>.pwd_physical_object_type b where a.type_code=b.type_code
and trim(b.type_name)='fact_detail'"
db2 -tvf comp1.sql
db2 commit
db2 -x -z comp2.sql "select distinct 'REORG TABLE <schema_name>.'||a.name||'
RESETDICTIONARY;' from <schema_name>.pwd_physical_object a,
<schema_name>.pwd_physical_object_type b where a.type_code=b.type_code
and trim(b.type_name)='fact_detail'"
db2 -tvf comp2.sql
db2 commit
db2 terminate
Sample script output created:
ALTER TABLE WH.ORDER_ACTIVITY_MEASURES COMPRESS YES;
ALTER TABLE WH.RETURN_ACTIVITY_MEASURES COMPRESS YES;
ALTER TABLE WH.SHIPPED_ORDERS_SUMM_MEASURES COMPRESS YES;
REORG TABLE WH.ORDER_ACTIVITY_MEASURES RESETDICTIONARY;
REORG TABLE WH.RETURN_ACTIVITY_MEASURES RESETDICTIONARY;
REORG TABLE WH.SHIPPED_ORDERS_SUMM_MEASURES RESETDICTIONARY;
IBM Cognos Proven Practices: Using DB2 Row Compression in
Adaptive Warehouse Projects
Page 5 of 6
developerWorks®
ibm.com/developerWorks/
About the author
Greg Harris
Greg Harris is a Database Administrator and ERP software specialist who has been a
member of the Adaptive Applications Framework development team since 2004.
© Copyright IBM Corporation 2010
(www.ibm.com/legal/copytrade.shtml)
Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)
IBM Cognos Proven Practices: Using DB2 Row Compression in
Adaptive Warehouse Projects
Page 6 of 6