IBM Research
Autonomic Computing: The First Decade
Jeff Kephart
IBM Thomas J Watson Research Center, Hawthorne, NY, USA
ICAC 2011 Keynote, June 15, 2011
© 2009 IBM Corporation

Outline
§ Birth
§ Formative Years
§ What Have We Accomplished?
– And what have we not?

In the beginning there was Chaos

Where it All Began: The Autonomic Computing Manifesto
§ IBM Senior Research VP Paul Horn first set forth the idea of Autonomic Computing in a keynote to the National Academy of Engineers
§ Harvard University, October 2001
§ Autonomic Computing Manifesto released immediately thereafter

Eight Key Elements of an Autonomic Computing System

This was soon boiled down to four …

Outline
§ Birth
§ Formative Years
§ What Have We Accomplished?
– And what have we not?

IBM’s Internal Realignment to Support AC
§ Created new Autonomic Computing group within Systems Management division
– Alan Ganek, VP of Autonomic Computing
– Autonomic Computing architecture board
§ Created a new Autonomic Computing department within Research Division in 2002
– Approximately 20 individuals
– Approximately 100 researchers working on AC across IBM
§ Created a new Joint Program to guide and fund AC Research
– Dave Kaminsky/Tom Corbi and Jeff Kephart

IBM wanted to help drive a new research agenda

Autonomic Computing Advisory Board
§ We established an AC Advisory Board in 2002
– Sponsors: IBM Research VPs Alfred Spector, Robert Morris, Tilak Agerwala
– Chair: Jeff Kephart
§ Mission
– Help define appropriate research agendas and curricula
– Contribute insights on what the relevant problems are
– Stimulate interest in AC issues of relevance within and across their respective fields
– Endorse and legitimize autonomic computing within industry and academia
§ We recruited 8 top academics and 5 key industry experts
– Professors of AI, Distributed Systems, Grid Computing
§ We presented IBM’s AC research and solicited
– Feedback on our research
• More on self-healing, self-protection, human interaction; deeper work on policy; clarify architecture; build system prototypes
– Advice on how to enlist academia to work on the great AC challenges

AC Advisory Board Recommendations for Recruiting Academia
1. Publish a well-placed, high-quality manifesto
2. Show that AC is radical, revolutionary, world-changing
a. Publicize IBM’s own high-quality research in AC
b. Target top academics
• Define problem in their specific terms
• If they write good papers, rest of field will follow
3. Demonstrate industry-wide interest in AC (not just IBM hype)
4. Organize, sponsor, and participate in workshops, conferences
a. International conferences and workshops
b. Special IBM AC workshops

AC Advisory Board Recommendations for Recruiting Academia
1. Publish a well-placed, high-quality manifesto
2. Show that AC is radical, revolutionary, world-changing
a. Publicize IBM’s own high-quality research in AC
b. Target top academics
• Define problem in their specific terms
• If they write good papers, rest of field will follow
Publications shown: IEEE Computer Cover Feature, January 2003; IBM Systems Journal Vol. 42, No. 1, 2003 (Autonomic Computing)

Targeting top professors; organizing conferences: Timeline
[Timeline figure, 2001 onward. Funding and outreach annotations, in extraction order:]
– 2001: AC Manifesto; start contacting universities
– University Days: UK, Germany, France
– 9 AC FAs awarded by Research (shown twice)
– Ga Tech; 13 FAs, equipment grant, 2 SURs; 2 Toronto CAS awds
– 25 profs supported via FA, CAS, SUR, equipment grant, etc.
– 15 profs supported via FA, CAS, IIG, etc.
– Supported ~100 faculty-years
Timeline events by year:
– 2002: AC Vision papers (IEEE Computer, IBM Sys Journal)
– 2003: AASMS03 (FCRC); AMS03 (HPDC); IJCAI 03 Wkshop on AC & AI; 5 AC workshops; 2 AC journals
– 2004: ICAC04; 26 AC confs/wkshops; 2 AC journals
– 2005: WRAC05; ICAC05; SelfMan05; 30 AC confs/wkshops; 3-4 AC journals
– 2006: ICAC06; HotAC06; SelfMan06; Latin American AC symposium; >40 AC confs/wkshops; >4 AC journals

Organize and sponsor workshops, conferences
§ Goals
– Establish an AC research community to work together to realize the vision of large-scale self-managing systems
– Develop and nurture the AC research community
§ AASMS 03: Algorithms & Architectures for Self-Managing Systems
– Federated Computing Research Conference, June 03, San Diego, CA
– Chase (Duke), Goldszmidt & Keeton (HP), Kephart & Tetzlaff (IBM)
§ AMS 03: Active Middleware Workshop on Autonomic Computing
– High Performance Distributed Computing, June 03, Seattle, WA
– Hariri (Arizona), Parashar (Rutgers)
§ ICAC 04: International Conference on Autonomic Computing
– May 17-18, 2004, New York
§ ICAC 05
– June 13-16, 2005, Seattle
– 10 demos; 3 tutorials
§ ICAC 06
– June 12-16, 2006, Dublin
– 12 demos; 4 tutorials; workshops

AC is catching on!
Excerpt from “Report to Paul Horn on Autonomic Computing as an Academic Discipline”, late 2006
“I’m flabbergasted!”
§ Initially spurred by our efforts
– Faculty awards, equipment grants
– Workshops, conferences (some IBM Academy)
– University visits
– Several classes taught by IBMers at Duke, UNC, St Andrews, Brazil
§ But increasingly on its own
– AC classes being taught around the world
• >30 universities have AC content in their curricula
• “Self-Managing Systems”, Shivnath Babu, Duke University
• “Autonomic Computing”, Omer F. Rana, Cardiff U., UK, ½ day seminar.
• “Parallel and Distributed Computing”, Manish Parashar, Rutgers.
– Government support: EPSRC in UK funds “Semantic Grid and Autonomic Computing Programme”
– Over a dozen AC workshops, conferences initiated by non-IBMers
– Publications
• IEEE Task Force on Autonomous and Autonomic Systems newsletter
• Special Issue of IEEE Internet Computing Jan 2007 on AC
• ACM Transactions on Complex Adaptive Systems
– Web site: www.autonomiccomputing.org

Outline
§ Birth
§ Formative Years
§ What Have We Accomplished?
– And what have we not?

Have we kept the momentum (five years later)?
§ Over 8000 papers on autonomic computing
– Approximately 160 ICAC papers (2% of literature)
§ Over 200 patents issued on autonomic computing
– >100 more under evaluation
§ Nearly 200 conferences or workshops solicit papers on autonomic computing
§ Government funding
– FP6: Situated autonomic communications
• ANA, BioNETS, CASCADAS, HAGGLE, ACCA
– FP7: Self-awareness in autonomic systems

Let’s take a closer look at how AC is doing as a field
§ Run Harzing’s Publish or Perish with queries “Autonomic Computing” and “International Conference on Autonomic Computing”
– Uses Google Scholar; finds top 1000 papers in terms of citation counts
§ Put structured data in spreadsheet
§ Cleanse the data
§ Identify interesting trends

Autonomic Computing Papers (2002)
Cites | Authors | Title | Venue
417 | D Patterson, A Brown, P Broadwell, G Candea… | Recovery-oriented computing (ROC): Motivation, definition, techniques, and case… | UC Berkeley Tech Report
168 | JP Bigus, DA Schlosnagle, JR Pilgrim… | ABLE: A toolkit for building multiagent autonomic systems | IBM Systems …
123 | N Zhong, J Liu… | In search of the wisdom web | COMPUTER-LOS ALAMITOS
84 | R Sterritt | Towards autonomic computing: effective event management | … Workshop, 2002. Proceedings. 27th An…
63 | A LaMarca, W Brunette, D Koizumi, M Lease… | Plantcare: An investigation in practical ubiquitous systems | UbiComp 2002: …
55 | M Satyanarayanan | A catalyst for mobile and ubiquitous computing | Pervasive Computing
52 | D Paulson | Computer system, heal thyself | Computer
44 | GM Lohman… | SMART: Making DB2 (more) autonomic | … of the 28th international conference on …
43 | IBM, A Computing | IBM's Perspective on the State of Information Technology | White Paper, information available at http:…
42 | SS Lightstone, G Lohman… | Toward autonomic computing with DB2 universal database | ACM SIGMOD Record
39 | S Elnaffar, P Martin… | Automatically classifying database workloads | Proceedings of the eleventh …
27 | E Mainsah | Autonomic computing: the next era of computing | Electronics & Communication Engineering
24 | RK Sahoo, M Bae, R Vilalta, J Moreira… | Providing persistent and consistent resources through event log analysis and prediction for large-scale computing syst… | Workshop on Self-…
21 | CH Crawford… | eModel: addressing the need for a flexible modeling framework in autonomic computing | Modeling, Analysis and Simulation of …
20 | DA Patterson | Recovery oriented computing: A new research agenda for a new century | Keynote address, HPCA
15 | WW Gibbs | Autonomic computing | Scientific American
14 | E Schwartz | IBM Offers a Peek at Self-Healing PCS: Autonomic computing initiative will lead to self-configuring desktops and notebo… | Date Alleged: Nov
14 | MN Huhns… | Robust software | Internet Computing, IEEE
12 | D Pescovitz | Helping computers help themselves | Spectrum, IEEE
10 | LD Paulson | IBM begins autonomic-computing project | Computer
9 | Y Tohma | Fault tolerance in autonomic computing environment | (none listed)
8 | A Wolfe | News analysis: IBM sets its sights on autonomic computing | IEEE Spectrum
6 | DJ Clancy | NASA challenges in autonomic computing | Almaden Institute
4 | JY Chung… | „Beyond e-Marketplace & Next Generation e-Business: Grid, Autonomic Computing & Web Services“ | 4th International Conference on Electronic …
4 | YS Tan, B Topol, V Vellanki… | Implementing service Grids with the service domain toolkit | IBM Corporation
3 | E Grishikashvili, N Badr, D Reilly… | Autonomic computing: A service-oriented framework to support the development and management of distributed applic… | Proceeding of 3rd …
3 | AZ Spector | Challenges and opportunities in autonomic computing | Proceedings of the 16th international confe…
2 | J Kephart | Technology challenges of autonomic computing | OOPSLA
2 | R Sterritt | Towards autonomic computing: effective event management | Software Engineering Workshop, 2002. Proceedings. 27th Annual NASA Goddard

Autonomic Computing Papers (2002)
[Chart: citation count (log scale, 1 to 1000) vs. paper rank (1 to 30).]

Autonomic Computing Papers (2003)
Cites | Authors | Title | Venue
2595 | JO Kephart… | The vision of autonomic computing | Computer
626 | AG Ganek… | The dawning of the autonomic computing era | IBM Systems Journal
147 | H Kreger | Fulfilling the Web services promise | Communications of the ACM
108 | R Sterritt… | Autonomic Computing-a means of achieving dependability? | Engineering of Computer-Based …
106 | J Appavoo, K Hui, CAN Soules… | Enabling autonomic behavior in systems software with hot swapping | IBM systems …
104 | G Kaiser, J Parekh, P Gross… | Kinesthetics extreme: An external infrastructure for monitoring distributed legacy systems | … Autonomic Computing …
104 | R Sterritt… | Towards an autonomic computing environment | Database and Expert Systems …
99 | AB Brown… | Undo for operators: Building an undoable e-mail store | Proceedings of the annual conference on …
96 | M Agarwal, V Bhat, H Liu… | Automate: Enabling autonomic applications on the grid | … Computing …
96 | H Cervantes… | Automating service dependency management in a service-oriented component model | Proceedings of CBSE
86 | DM Chess, CC Palmer… | Security in an autonomic computing environment | IBM Systems Journal
84 | DF Bantz, C Bisdikian, D Challener… | Autonomic personal computing | IBM Systems …
84 | Y Diao, JL Hellerstein, S Parekh… | Managing web server performance with autotune agents | IBM Systems Journal
82 | R Want, T Pering… | Comparing autonomic and proactive computing | IBM Systems Journal
80 | F Heylighen… | The meaning of self-organization in computing | Information Systems
78 | X Dong, S Hariri, L Xue, H Chen… | Autonomia: an autonomic computing environment | … Proceedings of the …
75 | A Leff, JT Rayfield… | Service-level agreements and commercial grids | IEEE Internet Computing
73 | V Markl, GM Lohman… | LEO: An autonomic query optimizer for DB2 | IBM Systems Journal
73 | D Capera | The AMAS theory for complex problem solving based on self-organizing cooperative agents | (none listed)
70 | C Sapuntzakis… | Virtual appliances in the collective: A road to hassle-free computing | Proceedings of the 9th conference on Hot …
67 | P Buhler, JM Vidal… | Adaptive workflow= web services+ agents | … of the International Conference on Web …
61 | A Dan, H Ludwig, G Pacifici | Web service differentiation with service level agreements | White Paper, IBM Corporation
57 | RJT Morris… | The evolution of storage systems | IBM Systems Journal
55 | EM Maximilien… | Agent-based architecture for autonomic web service selection | Workshop on Web Services and Agent-based …
53 | J Jann, LM Browning… | Dynamic reconfiguration: Basic building blocks for autonomic computing on IBM pSeries servers | IBM Systems Journal
53 | M Milenkovic, SH Robinson, RC Knauerhase… | Toward internet distributed computing | Computer
52 | C Boutilier, R Das, JO Kephart, G Tesauro… | Cooperative negotiation in autonomic systems using incremental utility elicitation | … on Uncertainty in …
47 | R Sterritt | Pulse monitoring: extending the health-check for the autonomic GRID | … Informatics, 2003. INDIN 2003. Proceedings. IEE…
44 | M Agarwal… | Enabling autonomic compositions in grid environments | (none listed)
44 | JA Redstone, MM Swift… | Using computers to diagnose computer problems | … of the 9th Workshop on Hot …
42 | F Berman, G Fox… | Grid computing | (none listed)
40 | JM Deegan… | High reliability memory subsystem using data error correcting code symbol sliced command repowering | US Patent App. 10/723,055
40 | T De Wolf… | Towards Autonomic Computing: agent-based modelling, dynamical systems analysis, and decentralised control | Industrial Informatics, 2003. INDIN …
38 | S Elnaffar, W Powley, D Benoit… | Today's DBMSs: How autonomic are they | Database and Expert …
36 | S Hariri, L Xue, H Chen, M Zhang… | Autonomia: an autonomic computing environment | IEEE International …
29 | DM Russell, PP Maglio, R Dordick… | Dealing with ghosts: Managing the user experience of autonomic computing | IBM Systems Journal
28 | G Lanfranchi, PD Peruta, A Perrone… | Toward a new landscape of systems management in an autonomic computing environment | IBM Systems …
27 | S Lightstone, B Schiefer, D Zilio… | Autonomic computing for relational databases: the ten-year vision | … , 2003. INDIN 2003. …
27 | H Tianfield | Multi-agent autonomic architecture and its application in e-medicine | Intelligent Agent Technology, 2003. IAT 2003.

Autonomic Computing Papers (2003)
[Chart: citation count (log scale, 1 to 10000) vs. paper rank (1 to 109).]

Autonomic Computing Papers (2004)
183 papers
[Chart: citation count (log scale, 1 to 10000) vs. paper rank (1 to 101). Labeled papers:]
• McKinley, Composing Adaptive Software
• Cohen, Correlating instrumentation data to system states: A building block for automated diagnosis and control
• Candea, Microreboot - A technique for cheap recovery
• Walsh, Utility functions in autonomic systems
• White, An architectural approach to autonomic computing
• Kephart, An artificial intelligence perspective on autonomic computing policies
• Tesauro, A multi-agent systems approach to autonomic computing
• Barrett, Field studies of computer system administrators: analysis of system management tools and practice
• Müller-Schloer, Organic Computing
• Parashar, A component based programming framework for autonomic applications
• Littman, Reinforcement learning for autonomic network
• Hellerstein, Challenges in control engineering of computing systems
• Ranganathan, Autonomic pervasive computing based on planning
• Brown, Benchmarking autonomic capabilities: promises and pitfalls
• Kiciman, Discovering correctness constraints for self-management of system configuration

ICAC’s impact
Autonomic Computing Papers (2004)
Total: 183 papers; ICAC04: 39 papers
[Chart: same plot and labeled papers as the previous slide.]

Wordle’s View of AC circa 2003
http://www.wordle.net/show/wrdl/3752036/Autonomic_Computing_paper_themes_2003

Wordle’s View of AC circa 2004
http://www.wordle.net/show/wrdl/3752222/utonomic_Computing_paper_themes_2004

Analyzing AC trends over the first decade: a taxonomy
§ Use original AC vision paper as basis for a taxonomy of papers
§ Architecture: Autonomic elements interact to produce system-level self-configuration, self-healing, self-optimization, and self-protection
§ Engineering challenges
– Element lifecycle; software engineering
– Relationships: services, standards, ontology, negotiation
– System: policy, human interaction, self-*
§ Science challenges
– Machine learning, optimization & control
– Understanding and governing emergent system behavior

AC Paper Trends 2001-2010: Vision, Architecture, etc.
§ Vision and architecture papers have settled to ~10%
– Appropriate
§ Human interaction study, never prevalent, became extinct in 2006
– BAD!
§ Study of system properties as a whole is rising steadily.
– GOOD!
[Charts: fraction of AC papers per year, 2001 to 2009-2010, for Vision, Architecture, Human, System.]

Vision
§ Autonomic Computing
– Horn, Ganek, Kephart&Chess; Parashar&Hariri; Sterritt
§ Recovery-oriented computing
– Don’t try to ensure 99.9999% up time for each component
– Accept that faults are always going to happen; cope with them at system level
– Micro-rebooting: minimize downtime by designing systems to be quickly rebootable at multiple levels
– If it’s fast enough, occasional mistaken reboots are ok
– Patterson, Fox et al., UC Berkeley
§ Organic and bio-inspired computing
– Use insights from biological systems to understand and exploit collective behavior
– KIT, BADS workshop; SASO; Richard Anthony
No work on applying autonomic nervous system principles to autonomic computing!?!

A tale of two analogies
§ Computer Viruses
– Viruses replicate themselves by co-opting their host’s resources
– Analogies work on several levels
• Macroscopic: epidemiology, evolutionary trends
• Microscopic: immune system
– Analogies help us
• Understand the problem (science)
• Ameliorate the problem (engineering)
§ Autonomic Computing
– Large-scale computing systems are becoming too complex for humans to manage. We need self-managing computing systems:
• Self-configuring, Self-healing, Self-optimizing, Self-protecting
– Autonomic nervous system automatically dilates pupils, increases respiratory rate, heart beat, etc.
– Analogy to autonomic nervous system helps us describe the effect we want to achieve

Computer Viruses: Macroscopic Analogy
§ Epidemiology
– Individual = computer
– Social network is important: can curtail spread relative to homogeneous mixing
§ Evolutionary trends
– Several great ages of computer viruses
• File infectors
• Boot infectors
• Macro viruses
• Worms
– Heavily influenced by environment
– Co-evolution with host (e.g. Microsoft Windows)
– Overly virulent viruses are unsuccessful
[Chart: File, Boot and Macro Virus Prevalence. Incidents per 1000 machines (0 to 1.6) vs. year (1988 to 1996), with curves for File, Boot, Macro.]

Computer Viruses: Immune System
§ Recognize pathogen
– Unknown: “Innate” immune system combines “Know thyself” with “Know thine enemy”
– Known: Vertebrate immune system specifically detects tell-tale portions
§ Eliminate it
– Biology: Killer T cells destroy infected host cell to save host individual
– Computers: Can often surgically remove virus from host cell
§ Learn (if previously unknown)
• Self/non-Self as proxy for Benign/Harmful
• Fight self-replication with self-replication
– Biology: Each individual does their own learning; vaccination helps
– Computers: Learning can be shared
[Figure: Immune System for Cyberspace. Labels: Virus Analyzer; Firewall; Petri dish; Analyze behavior, structure; Extract signature; Derive prescription; Prescription; Clients; Administrator; private network; IBM; Widgets Inc.; Gewgaws Ltd.; Joe User; Jane Q. Public. Steps numbered 1 to 9.]

How can biological analogies be useful?
§ Marketing: Describe the problem you’re trying to solve; inspire others to solve it
§ Science: Gain insight into the problem
– Borrow mathematical techniques developed for related problems
– Sometimes you end up contributing as much as you borrow (e.g. directed-graph epidemiology)
§ Engineering: Derive techniques for solving the problem
– Knowing that Nature has solved a related problem gives you hope
– Even better, you may be able to adapt Nature’s solutions to your problem
– Even wrong theories about Nature’s workings can be valuable!
Be open to what Nature has to teach you, but be judicious about what ideas you borrow!
[Figure: Immune System for Cyberspace, as on the previous slide.]
Kephart et al., Fighting Computer Viruses, Scientific American Nov 1997.

AC Paper Trends 2001-2010: System architecture, policy, self-optimization
[Chart: fraction of AC papers per year (0 to 0.2), 2001 to 2009-2010, for Self-Optimization, Policy, System.]

How to represent high-level policies?
Kephart and Walsh, Policy04
§ Utility functions map any possible state of a system to a scalar value
§ They can be obtained from
– Service Level Agreement
– preference elicitation
– simple templates
§ They are a very useful representation for high-level objectives
– Value can be transformed and propagated among agents to guide system behavior
[Figure: from Current State S, actions a1, a2, a3 lead to Possible States σ1, σ2, σ3; example utility plots U(RT) and U(RT, RPO).]

How to manage with high-level policies?
§ Elicit utility function U(S) expressed in terms of service attributes S
§ Model how each attribute Si depends on controls C and observables O
– Models expressed as S(C; O)
– E.g., RT(routing weights, request rate)
– Models from experiments, learning, theory
§ Transform from service utility U to resource utility U’ by substitution
– U(S) = U(S(C; O)) = U’(C; O)
§ Optimize resource utility. As observable O changes, set C to values that maximize U’(C; O)
– C*(O) = argmax_C U’(C; O)
– U’*(O) = U’(C*(O); O)
[Figure: service utility U plotted over Response Time and Recovery Point Objective is transformed into resource utility U’(cpu, b; λ), shown for λ=0.01, plotted over cpu and backup rate b.]

Unity Data Center Prototype: Experimental setup
[Figure: a Resource Arbiter, labeled “Maximize Total SLA Revenue”, sits above three App Managers, each reporting U(#srv). Two manage Trade3 applications (WebSphere 5.1, DB2) with inputs Demand (HTTP req/sec) and utility U(RT); the third manages a Batch application. Other labels: “5 sec”; a pool of Servers.]
Chess, Segal, Whalley and White, Unity: Experiences with a Prototype Autonomic Computing System, ICAC 2004

How App Mgr computes its external resource utility
§ Elicit: U(RT), the service-level utility
§ Model: U(RT(C; srv, λ)), where C = my controls, srv = Arbiter’s controls, λ = observable
§ Transform: U’(C; srv, λ) = U(RT(C; srv, λ)), the internal resource-level utility
§ Optimize:
– C*(srv, λ) = argmax_C U’(C; srv, λ), the optimal internal control settings
– U’*(srv, λ) = U’(C*(srv, λ); srv, λ), the external resource-level utility
§ Alternative to generating full curve: utility elicitation
– Patrascu, Boutilier et al. New Approaches to Optimization and Utility Elicitation in Autonomic Computing, AAAI 2005
[Figure: the App Manager (WebSphere 5.1, Trade3, DB2; inputs λ and U(RT)) sends U’(srv) to the Resource Arbiter, plotted as utility (up to Max Utility) vs. number of servers.]
Chess, Segal, Whalley and White, Unity: Experiences with a Prototype Autonomic Computing System, ICAC 2004

How the Arbiter determines optimal resource allocation
§ Decision problem: Allocate resources
– srv* = argmax_srv Σ_i U’_i(srv_i)
– Effectively maximizes Σ_i U_i(S_i)
[Figure: the Resource Arbiter receives U’1(srv1) and U’2(R2) from two App Managers, each plotted as utility (up to Max Utility) vs. number of servers; each App Manager runs WebSphere 5.1 and DB2 on a set of Servers, with Trade3 and U(RT) as inputs.]

Policy and Systems: Status and Future
§ We’ve made a good start on developing the utility-optimization design pattern
– Theoretically well-grounded
– Proven practical in several scenarios
§ But we need to push this work much further
§ Establish that utility works on a grand scale in AC systems
– More than just a few agents and attributes
– An economy, perhaps?
§ Utility elicitation from humans
Lubin, Kephart, Das and Parkes. Expressive Power-Based Resource Allocation for Data Centers. IJCAI 2009. (Exploring market-based resource allocation for data centers.)
§ Need planning technologies to support goal policies
– More than just an engine
– Tools for constructing planning domain descriptions

AC Paper Trends 2001-2010: Self-*, Benchmarks
§ David Patterson warned us that we needed benchmarks for self-{C,H,P} in order to drive work in the field
§ It appears that he was right
§ We need to revive the benchmark work
§ We need more work on self-{C,H,P}
[Chart: fraction of AC papers per year (0 to 0.16), 2001 to 2009-2010, for Self-Config, Self-Healing, Self-Optimization, Self-Protection, Benchmarks.]

Benchmarks
§ David Patterson noted that
– Benchmarks drive innovation, but practically all are performance-related
– Innovations pertaining to self-{C, H, P} require appropriate metrics
§ Brown et al. developed benchmarks for configuration and healing
– Brown & Keller. A model of configuration complexity and its application to a change management system. IM 2005.
– Brown & Hellerstein. Benchmarking Autonomic Capabilities: Promises and Pitfalls. ICAC04
§ McCann et al. recommended metrics for adaptivity, robustness, autonomy, sensitivity, stabilization; suggested adapting existing benchmarks
– McCann & Huebscher. Evaluation issues in autonomic computing. GCC 2004
§ Other papers include
– Consens et al. Goals and benchmarks for autonomic configuration recommenders. 2005
– K. Kanoun. Dependability benchmarking for computer systems. 2008.
After a promising start, work on autonomic computing benchmarks appears to have (mostly) stagnated.
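
Returning to the utility-optimization pattern from the policy slides (elicit, model, transform, optimize, then let the arbiter allocate), the steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Unity implementation: the SLA-shaped utility, the M/M/1-style response-time model, the thread-count control, and every numeric constant are hypothetical stand-ins, and the arbiter uses brute-force enumeration purely for clarity.

```python
from itertools import product
from math import inf

# Hypothetical internal control C: worker threads per server.
THREADS = range(1, 9)

def service_utility(rt):
    """Elicit: hypothetical U(RT), 1.0 up to 0.1 s, falling linearly to 0.0 at 1.0 s."""
    return max(0.0, min(1.0, (1.0 - rt) / 0.9))

def response_time(threads, srv, lam):
    """Model: hypothetical RT(C; srv, lam), an M/M/1-style mean response time."""
    rate_per_server = 20.0 * threads / (1.0 + 0.15 * threads ** 2)
    spare = srv * rate_per_server - lam
    return 1.0 / spare if spare > 0 else inf  # overloaded: unbounded delay

def resource_utility(threads, srv, lam):
    """Transform: U'(C; srv, lam) = U(RT(C; srv, lam))."""
    return service_utility(response_time(threads, srv, lam))

def best_control(srv, lam):
    """Optimize: C*(srv, lam) = argmax_C U'(C; srv, lam)."""
    return max(THREADS, key=lambda c: resource_utility(c, srv, lam))

def external_utility(srv, lam):
    """U'*(srv, lam) = U'(C*(srv, lam); srv, lam), the curve sent to the arbiter."""
    return resource_utility(best_control(srv, lam), srv, lam)

def arbitrate(total_servers, demands):
    """Arbiter: srv* = argmax over allocations of sum_i U'*_i(srv_i)."""
    best_alloc, best_value = None, -1.0
    for alloc in product(range(total_servers + 1), repeat=len(demands)):
        if sum(alloc) != total_servers:
            continue  # only consider allocations that use every server
        value = sum(external_utility(s, lam) for s, lam in zip(alloc, demands))
        if value > best_value:
            best_alloc, best_value = alloc, value
    return best_alloc, best_value

# Two applications with demands of 100 and 30 requests/sec sharing 7 servers.
allocation, value = arbitrate(7, [100.0, 30.0])
print(allocation, value)  # (5, 2) 2.0
```

Each application manager only exposes its curve `external_utility(srv, lam)`; the arbiter never sees the internal control, which mirrors the separation between internal and external resource-level utility on the slides. Exhaustive search is only workable for a handful of applications and servers.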

AC Paper Trends 2001 – 2010: Relationships: WebServices/Grid
[Chart: fraction of AC papers per year (0 to 0.2), 2001 to 2009-2010, for Cloud; Web Services or Grid.]

Relationships: WebServices/Grid
§ Agent communication standards likely to derive from services
§ Foresee convergence of autonomic computing, web services, grid interfaces
Article resulted from brainstorming session at Agents for Autonomic Computing workshop, ICAC 2008
Brazier, Kephart, Parunak, and Huhns, Internet Computing, June 2009

AC Paper Trends 2001-2010: AI Technologies
§ Relatively small but sustained effort on AI technologies for autonomic systems
[Chart: fraction of AC papers per year (0 to 0.08), 2001 to 2009-2010, for Machine Learning, Ontology, Control.]

Machine Learning
§ Good progress on learning models and policies
– I. Cohen et al. Correlating …. OSDI04.
– G. Jiang et al. Discovering likely invariants of distributed transaction systems for autonomic system management. ICAC06
– G. Tesauro et al. A hybrid …. ICAC06
§ We still need to tackle multi-agent learning
– Several interacting learners
– What are good learning algorithms for cooperative, competitive systems?
• Stability and sensitivity characteristics
• What is sensitivity to perturbations?
– Opportunities for layered learning
[Figures from: I. Cohen et al. Correlating instrumentation data to system states: A building block for automated diagnosis and control. OSDI04; G. Tesauro et al. A hybrid reinforcement learning approach to autonomic resource allocation. ICAC06]

Feedback control
§ Good progress on applying feedback control to individual autonomic elements
– Middleware including databases, application servers: Book and multiple papers by J. Hellerstein et al.
§ Good progress on applying feedback control to clusters of compute resources, power and performance
– Kusic et al.
§ We still need to understand and control the behavior of multiple interacting feedback loops
– Hierarchical and distributed
– Some good early thoughts in P. Ranganathan. No Power Struggles: Coordinated Multi-Level Power Management for the Data Center. ASPLOS08
§ Generally, we still need to understand emergent behavior much better
[Figure from: D. Kusic et al. Power and Performance Management of Virtualized Computing Environments via Lookahead Control. ICAC07]

Unanticipated trends, and their impact on AC
§ Data centers and energy management
– The physical infrastructure is complex, and needs to be autonomic, too!
– New attributes: Energy and temperature
§ Cloud Computing
– Some vendors (Google, Amazon, Facebook) can get away with highly standardized and homogeneous environments
– Outsourcing to the cloud means that fewer companies manage IT themselves
– Perhaps it places a greater burden on cloud providers to implement AC
• Lower costs
• Places premium on easy configurability
• Outages are more embarrassing and costly

Conclusions
§ Autonomic Computing is alive and well
– Thousands of papers, 129 of them with at least 50 citations
– Hundreds of conferences and workshops that touch on AC
§ We have had a busy and fruitful first decade
– Good balance of vision, architecture, new techniques, apps
– We haven’t exploited the autonomic nervous system analogy – but that’s OK
– Not much new theory, or system-level prototypes that address multiple facets
§ Several serious engineering and science challenges remain
– We need more work at the system level
• Multi-agent learning, interacting feedback loops
• Understanding/harnessing emergent behavior
• Economic models should be pursued seriously
• We need to build and experiment with prototypes and testbeds
– We need to revive our development of benchmarks for Self-{C,H,P}
– We need more focus on human interaction with autonomic systems; elicitation

Backup

ICAC 2004-2011
Year and Location:
2004 New York, NY
2005 Seattle, WA
2006 Dublin, Ireland
2007 Jacksonville, FL
2008 Chicago, IL
2009 Barcelona
2010 Reston, VA
2011 Karlsruhe, Germany
General Chairs and Program Chairs (names in slide order):
Jeff Kephart (IBM Research), Rajarshi Das (IBM), Manish Parashar (Rutgers), Vaidy Sunderam (Emory)
Jeff Kephart (IBM Research), Karsten Schwan (Ga Tech), Manish Parashar (Rutgers), Yi-Min Wang (Microsoft Research)
Karsten Schwan (Ga Tech), Yi-Min Wang (Microsoft Research)
Mazin Yousif (Intel)
Mazin Yousif (Intel), Omer Rana (Cardiff U.), Jose Fortes (U. Florida)
Omer Rana (Cardiff U.), Kumar Goswami (HP Labs), Jose Fortes (U. Florida)
John Strassner (Motorola), Kumar Goswami (HP Labs), Simon Dobson (UCD Dublin)
John Strassner (Motorola), Manish Parashar (Rutgers), Simon Dobson (UCD Dublin)
Onn Shehory (IBM Research), Manish Parashar (Rutgers), Renato Figueiredo (U. Florida), Emre Kiciman (Microsoft Research)
2011: Hartmut Schmeck (Karlsruhe, GE), Joseph Hellerstein (Google), Tarek Abdelzaher (UIUC)
ICAC 2010 Overview | Jeff Kephart & Canturk Isci 13-Jul-11

ICAC Steering Committee (2011)
§ Jeffrey Kephart, IBM Research (Co-chair)
§ Salim Hariri, University of Arizona (Co-chair)
§ Manish Parashar, Rutgers University
§ Karsten Schwan, Georgia Tech
§ Emre Kiciman, Microsoft Research
§ Renato Figueiredo, University of Florida
§ John Wilkes, Google

Autonomic computing paper impact (from Harzing’s Publish or Perish)
Query date: 6/12/2011
Papers: 998; Citations: 29999; Years: 11; Cites/year: 2727.18
Cites/paper: 30.06; Cites/author: N/A; Papers/author: N/A; Authors/paper: 2.78
h-index: 75; g-index: 140; hc-index: 51; hI-index: 25.45; hI,norm: 44
AWCR: 4494.42; AW-index: 67.04; AWCRpA: 1881.16
e-index: 101.85; hm-index: 50.95
Hirsch a=5.33, m=6.82
Contemporary ac=6.91
Cites/paper 30.06/11.0/2 (mean/median/mode)
Authors/paper 2.78/3.0/3 (mean/median/mode)
192 paper(s) with 1 author(s)
244 paper(s) with 2 author(s)
248 paper(s) with 3 author(s)
227 paper(s) with 4 author(s)
76 paper(s) with 5 author(s)
11 paper(s) with 6 author(s)
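
Several of the paper-impact figures reported above are simple functions of the raw totals, so they can be cross-checked. The following is a minimal Python sketch, not part of the original slides. It assumes the commonly used definitions (Hirsch's a = citations / h², m = h / years, and the usual h-index and g-index rules). The function names and the toy citation list are hypothetical, introduced only for illustration.

```python
# Totals as reported on the slide (Publish or Perish query of 6/12/2011).
PAPERS, CITATIONS, YEARS, H_INDEX = 998, 29999, 11, 75
# Papers grouped by number of authors, as reported on the slide.
PAPERS_BY_AUTHORS = {1: 192, 2: 244, 3: 248, 4: 227, 5: 76, 6: 11}


def derived_metrics():
    """Recompute the ratio-style metrics from the raw totals."""
    total_papers = sum(PAPERS_BY_AUTHORS.values())
    total_authorships = sum(n * p for n, p in PAPERS_BY_AUTHORS.items())
    return {
        "cites_per_paper": round(CITATIONS / PAPERS, 2),
        "cites_per_year": round(CITATIONS / YEARS, 2),
        "authors_per_paper": round(total_authorships / total_papers, 2),
        "hirsch_a": round(CITATIONS / H_INDEX ** 2, 2),  # a = N_cites / h^2
        "hirsch_m": round(H_INDEX / YEARS, 2),           # m = h / years
    }


def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites < rank:
            break
        h = rank
    return h


def g_index(citations):
    """Largest g such that the top g papers together have >= g^2 citations."""
    total, g = 0, 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


if __name__ == "__main__":
    print(derived_metrics())
    toy = [10, 8, 5, 4, 3]  # hypothetical per-paper citation counts
    print(h_index(toy), g_index(toy))
```

Under these assumptions the recomputed values agree with the slide: 30.06 cites/paper, 2727.18 cites/year, 2.78 authors/paper, a = 5.33, and m = 6.82. The per-author-count rows also sum to the reported 998 papers.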