Performance Modeling of Storage

Ariel Landau, Nava Aizikowitz, Daniel Fishkov, Bilha Mendelson
November 21, 2002
IBM Research Lab in Haifa


Background

- Application stack components: Application - Database - Storage
- Existing configuration tools
    - adjust each component separately
    - are complicated
    - require high expertise
    - focus on ease of administration rather than performance
- No tool exists for the application stack as a whole

[Figure: example stack components labeled WebSphere, DB2/Oracle, SAP, ESS/EMC]


Our Goal

- Address performance issues of system-level interactions in application stacks
    - Focus on I/O aspects
- Provide knowledge and tools for the development and integration of improved performance policies
- Expected interest
    - Performance-tuning experts
    - Marketing teams
    - Architects


How

- Storage performance model
    - Simulation-based
- Performance of system-level interactions
    - Feed storage model with I/O traces from real and synthetic hosts
    - Impact of I/O on performance
- Analyze sensitivity to configurations
    - Data placement


Related Work

- Building an actual environment for measurements
    - A model is more flexible and amenable to modifications
    - A modeling solution can address future directions
- Analytic queuing model for steady-state behavior
    - A simulation model captures dynamic behavior


Current Status

- Storage performance model: working prototype
    - Configured to represent ESS
    - Focused on Open System hosts, Fibre Channel attachments and RAID-5
- Stack analysis
    - Tracing of DB2 I/O on AIX
        - Through the AIX trace facility
    - Experimentation with
        - data placement
        - host configuration


Storage Performance Model

- Simulation-based queuing model
    - Built on top of the CSIM simulation engine
- Receives request attributes
    - read/write
    - target address
    - data amount
    - time stamp
    - channel
- Simulates transaction processing paths
    - read, fast write, prestage, destage
- Collects relevant statistics


Model Paths

- Read
    - Cache hit due to prestaging
    - Data staging due to cache miss
- Write
    - Fast write into non-volatile storage (NVS) and cache
- Prestage
    - Data staging triggered by a sequential read pattern
- Destage
    - Flushing of written data from cache to storage, triggered by NVS and cache thresholds

[Figure: modeled resources: channels, host adapters, upper buses, two 4-way SMPs, lower buses, DAs, device buses, RAID ranks, disks]


Model Output

- Detailed per-resource output
    - Utilization
    - Interarrival times
    - Service times
- Configurable statistical output
    - Transaction response time
    - Data throughput
    - Cache-hit rates


Model Flexibility

- May represent different storage subsystems (ESS, EMC)
    - Configurable collection of resources
    - Configurable time overheads for each resource


Example - Data Placement

- Assignment of Logical Volumes (LV) to RAID-5 Ranks (RR)
- An example
    - "blue" RRs are 6 + P + S
    - "black" RRs are 7 + P
    - "even" RRs are handled by cluster 0
    - "odd" RRs are handled by cluster 1

[Figure: RAID ranks numbered 0 through 9 attached to DAs]


Data Placement - Results

Semi-synthetic database trace with intensive I/O bursts

Configuration                                Average Read         Improvement (w.r.t.
                                             Response Time (ms)   previous configuration)
Empty-system read-miss                       11                   --
One LV, 6+P RR                               43                   --
One LV, 7+P RR                               35                   19%
Two LVs, two 7+P RRs, same cluster           23                   34%
Two LVs, two 7+P RRs, different clusters     21                   9%


Example - Join

- Join I/O requests

Timestamp  LV  Blk-Add  Amount
920433     0   BF168    1000 READ
922033     0   BF190    1000 READ
923916     0   BF198    3000 READ
924154     0   BF1B0    4000 READ
924531     0   BE200    1000 READ
925755     0   BE210    3000 READ


Example - Join

- Join I/O requests
    - if consecutive in space: space difference (SD) = 0

Timestamp  LV  Blk-Add  Amount      SD
920433     0   BF168    1000 READ
922033     0   BF190    1000 READ
923916     0   BF198    3000 READ   0
924154     0   BF1B0    4000 READ   0
924531     0   BE200    1000 READ   30
925755     0   BE210    3000 READ   8

Joined request:
924154     0   BF190    8000 READ


Example - Join

- Join I/O requests
    - if consecutive in space: space difference (SD) = 0, and
    - not too "close in time": timestamp difference (TD) > 10 us

Timestamp  LV  Blk-Add  Amount      SD   TD
920433     0   BF168    1000 READ
922033     0   BF190    1000 READ
923916     0   BF198    3000 READ   0    1883
924154     0   BF1B0    4000 READ   0    238
924531     0   BE200    1000 READ   30   377
925755     0   BE210    3000 READ   8    1224

Joined request:
924154     0   BF190    8000 READ


Join - Results

Semi-synthetic database trace with intensive I/O bursts

Configuration   Number     Average Read         Improvement (w.r.t.
                of Joins   Response Time (ms)   base configuration)
Base Trace      --         39                   --
All Joins       60         35                   9%
Safe Join       33         32                   18%
Best Join       50         29                   26%


Future Directions

- Extend storage model
- Extend analysis of application stack
- Integrate into network performance model
- Integrate with monitoring environment
- Develop modeling-based reasoning
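
The join rule from the "Example - Join" slides can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' tool. It assumes that Blk-Add is a hexadecimal block number, that Amount is a hexadecimal byte count, and that blocks are 512 bytes; the slides state none of these, although this reading reproduces the SD values 0, 0 and 8 and the joined request shown in the example. The names `Request`, `join_requests` and `min_td` are hypothetical.

```python
from dataclasses import dataclass

BLOCK_SIZE = 0x200  # assumption: 512-byte blocks (not stated in the slides)


@dataclass
class Request:
    timestamp: int  # trace timestamp (the slides give TD in us)
    lv: int         # logical volume
    block: int      # starting block address
    amount: int     # bytes transferred (assumed hexadecimal in the trace)
    op: str         # "READ" or "WRITE"


def end_block(req):
    """First block after the range covered by the request."""
    return req.block + req.amount // BLOCK_SIZE


def join_requests(trace, min_td=None):
    """Merge a request into its predecessor when SD == 0 and, if min_td
    is given, TD > min_td. The joined request keeps the start address of
    the first request and the timestamp of the last one."""
    joined = []
    for req in trace:
        prev = joined[-1] if joined else None
        if prev is not None and prev.lv == req.lv and prev.op == req.op:
            sd = req.block - end_block(prev)      # space difference
            td = req.timestamp - prev.timestamp   # timestamp difference
            if sd == 0 and (min_td is None or td > min_td):
                joined[-1] = Request(req.timestamp, prev.lv, prev.block,
                                     prev.amount + req.amount, prev.op)
                continue
        joined.append(req)
    return joined


# The six requests from the example slides.
trace = [
    Request(920433, 0, 0xBF168, 0x1000, "READ"),
    Request(922033, 0, 0xBF190, 0x1000, "READ"),
    Request(923916, 0, 0xBF198, 0x3000, "READ"),
    Request(924154, 0, 0xBF1B0, 0x4000, "READ"),
    Request(924531, 0, 0xBE200, 0x1000, "READ"),
    Request(925755, 0, 0xBE210, 0x3000, "READ"),
]

result = join_requests(trace, min_td=10)
for r in result:
    print(r.timestamp, r.lv, format(r.block, "X"), format(r.amount, "X"), r.op)
```

With `min_td=10` the second, third and fourth requests collapse into one 8000-byte (hexadecimal) read starting at BF190, matching the joined request on the slide. Passing `min_td=None` applies the space condition alone.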
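
The improvement column in "Data Placement - Results" can be checked with a short calculation. The formula below, the relative reduction in average read response time with respect to the previous configuration, is an assumption inferred from the column header; the slides do not spell it out. The function name `improvement` is hypothetical.

```python
def improvement(previous_ms, current_ms):
    """Relative reduction in response time, as a whole-number percentage."""
    return round(100 * (previous_ms - current_ms) / previous_ms)


# Average read response times (ms) for the four placement configurations,
# in slide order: one LV on 6+P, one LV on 7+P, two LVs on the same
# cluster, two LVs on different clusters.
placement_ms = [43, 35, 23, 21]

steps = [improvement(prev, cur)
         for prev, cur in zip(placement_ms, placement_ms[1:])]
print(steps)
```

Under this assumed formula the computed steps agree with the 19%, 34% and 9% reported in the table.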