ECEN 4213 Computer Based System Design
Microcontroller Architecture
Dhinesh Sasidaran

PIC16C57

GENERAL DESCRIPTION

The PIC16C57 is an EPROM/ROM-based 8-bit CMOS microcontroller. It contains a Reduced Instruction Set Computer (RISC) based Central Processing Unit (CPU) with only 33 single-word instructions. All instructions are fetched and executed within a single cycle, except for program branches, which take 2 cycles. The maximum operating frequency is 40 MHz.

ARCHITECTURAL OVERVIEW

The PIC16C57 microcontroller uses a 2-stage pipelined Harvard architecture in which program and data are accessed over independent buses. This allows an instruction to be fetched in one cycle: while program memory is being accessed, data memory sits on its own bus and can be read and written. This improves bandwidth over the traditional von Neumann architecture, where program and data are fetched over the same bus. A von Neumann machine needs several accesses across that one bus for each instruction: the instruction is fetched, then the data is fetched, operated on and possibly written back. This makes the bus extremely congested. Because the Harvard architecture keeps the two buses separate, their widths can differ: instruction opcodes are 12 bits wide, while data words are 8 bits wide. With the addition of the pipeline stages, the fetch and execution cycles are allowed to overlap.

The microcontroller contains an 8-bit Arithmetic Logic Unit (ALU) and working register. The device has an accumulator-based architecture, in which one of the operands is implicitly in the accumulator (also called the working register, W). The ALU performs arithmetic and boolean functions between the data in the W register and any register in the register file. The W register is an 8-bit register used for ALU operations and is not an addressable register. Figure 1 shows the block diagram of the PIC16C57. The clock for the system is obtained from an external component called an “oscillator”.
The clock enters the microcontroller and is divided into four non-overlapping clocks that together make up one instruction cycle, during which one instruction is executed. With the pipeline stages, each instruction cycle begins by fetching the instruction that is next in line; that instruction is then decoded and executed during the following instruction cycle. Figure 2 shows the Clock/Instruction cycle for the microcontroller.

Microcontroller Architecture, August 29, 2002

FIGURE 1. PIC16C57 architecture block diagram.

FIGURE 2. Clock/Instruction cycle.

From Figure 2, the instruction cycle consists of the four clocks Q1, Q2, Q3 and Q4. Through pipelining, each instruction is effectively executed in one cycle. When an instruction modifies the Program Counter (PC) and makes it point to some other address, two cycles are needed to execute that instruction. This is because the instruction already fetched from the next sequential address is no longer the right one: it must be discarded, and a new instruction must be fetched from the correct address. The Instruction Register (IR) is written during the Q1 clock, while decoding and execution take place during the Q2, Q3 and Q4 clocks.

Pipeline Flow

In a normal unpipelined architecture, each instruction takes a certain number of clock periods to finish before the next instruction is fetched and executed. Any one component of the processor is used during only one of those clock periods, which means that it sits idle during the others. Pipelining is introduced to make use of this idle time. Pipelining is an implementation technique in which multiple instructions are overlapped in execution. It is like an assembly line where different stations complete different parts of different instructions in parallel. The instructions enter at one end, progress through the pipeline stages and exit at the other end.
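The overlap of fetch and execute, and the extra cycle taken by an instruction that changes the PC, can be sketched in a few lines of Python. This is only an illustration of the timing described above, not a model of the device's control logic: the function name `schedule`, the instruction labels and the `changes_pc` flag are all assumptions made for the example.

```python
def schedule(program):
    """Return (name, fetch_cycle, execute_cycle) for each executed instruction.

    `program` is a list of (name, changes_pc) tuples in execution order.
    Every instruction is fetched in one instruction cycle and executed in
    the next one (2-stage pipeline).
    """
    timeline = []
    fetch = 0
    for name, changes_pc in program:
        execute = fetch + 1
        timeline.append((name, fetch, execute))
        if changes_pc:
            # The instruction fetched while this one executes comes from the
            # wrong address and is discarded, so the next useful fetch
            # happens one cycle later.
            fetch = execute + 1
        else:
            # Normal case: the next fetch overlaps this execute.
            fetch = execute
    return timeline


def total_cycles(program):
    """Number of instruction cycles until the last instruction has executed."""
    return schedule(program)[-1][2] + 1


if __name__ == "__main__":
    # Hypothetical four-instruction sequence; only the third changes the PC.
    demo = [("I0", False), ("I1", False), ("BRANCH", True), ("TARGET", False)]
    for name, fetch, execute in schedule(demo):
        print(f"{name}: fetched in cycle {fetch}, executed in cycle {execute}")
    print("total cycles:", total_cycles(demo))
```

In this sketch the branch itself still executes one cycle after it is fetched; the penalty shows up as a one-cycle gap before the instruction at the new address can execute.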
Pipelining is implemented by adding a register between each stage of the instruction cycle. For the PIC16C57, there are 2 pipeline stages. Figure 3 illustrates the principle of pipelining. It shows the instruction cycle (datapath) as it progresses through the pipeline.

FIGURE 3. Instruction pipeline flow

In Figure 3, TCY0 reads in the instruction MOVLW 55h. During TCY1, MOVLW 55h is executed while the next instruction, MOVWF PORT B, is read. TCY2 executes MOVWF PORT B and reads in CALL SUB_1. TCY3 executes the subprogram call CALL SUB_1 and reads in the instruction BSF PORT A, BIT3. Since the call has modified the PC to point to the address of the subprogram, the BSF PORT A, BIT3 instruction, which has already been fetched, must be flushed from the pipeline, and the new instruction must be fetched and then executed.

The performance of a pipelined architecture can be summed up in the following equation:

\[ \text{throughput} = \frac{\text{instructions}}{\text{time}} \]

Figure 4 compares an unpipelined processor with a pipelined one:

Unpipelined
clock period       1            2                       3
instruction i      Fetch(i)     Decode + Execute(i)
instruction i+1                                         Fetch(i+1)

Pipelined
clock period       1            2                       3
instruction i      Fetch(i)     Decode + Execute(i)
instruction i+1                 Fetch(i+1)              Decode + Execute(i+1)

FIGURE 4. Unpipelined vs Pipelined processor

One instruction per 2 clock periods has been reduced to one instruction per clock period. With T denoting one clock period, throughput has therefore increased from

\[ \text{CPI (clock cycles per instruction)} = 2, \qquad \text{throughput (unpipelined)} = \frac{1}{2T} \]

to

\[ \text{CPI (clock cycles per instruction)} = 1, \qquad \text{throughput (pipelined)} = \frac{1}{T} \]

Computer Operation Codes (Opcode)

The first step in the design of a microcontroller is to determine the types of operations it has to perform.
The operation code is part of the instruction set and essentially encodes all the functions of the microcontroller, providing a unique code for each necessary operation. Figure 5 shows the instruction set (opcodes + operands) and the operations that the PIC16C57 can perform.

FIGURE 5. Instruction set (opcode+operands) for the PIC16C57

Central Processing Unit (CPU)

The CPU is the brain of the microcontroller. It is responsible for locating and fetching the correct instruction, decoding it and executing it. This unit connects all parts of the microcontroller through an 8-bit data bus and a 5-bit address bus, and it determines, via the controller, how data is transferred from one location to another within the microcontroller. The main components of the CPU are the Program Counter (PC), the Instruction Register (IR), the Register File and the Arithmetic Logic Unit (ALU).

Program Counter (PC)

The program counter is a register that holds the address of the next instruction to fetch from program memory. It is incremented by one at every instruction cycle, unless an instruction changes the PC. The processor fetches the instruction from the memory location pointed to by the PC, places it in the Instruction Register (IR) and then increments the PC. The PIC16C57 has an 11-bit PC used to access the 2048 x 12 program memory. Figure 6 shows a simple PC implementation.

FIGURE 6. Program counter example (diagram labels: PC, ‘1’, clock, to memory)

Instruction Register (IR)

The currently executing instruction is stored temporarily in the instruction register. The processor interprets the contents of the IR via the instruction decoder and determines the type of operation to be executed according to the instruction set. Figure 7 shows the contents of the IR when a byte-oriented instruction is loaded.
FIGURE 7. Instruction Register. (Diagram labels: clock; OPCODE, d and source/destination register fields; 5-bit register field; binary decoders; 8-bit paths from ALU, to register and to W.)

ALU

The Arithmetic Logic Unit (ALU) is the ‘core’ of any processor. It performs the calculations (arithmetic, boolean and shift) on the operands. The ALU is basically a number-crunching mechanism that computes results based on the control signals derived from the opcode part of the instruction. The opcode is used to select between the arithmetic, boolean and shift operations. Figure 8 shows the block diagram of the ALU. Depending on the type of operation performed by the ALU, the status register will be affected.

FIGURE 8. Block diagram of ALU. (Diagram labels: to and from W register, from register, to status register.)

Register File

The register file is the highest level of the memory hierarchy and makes up the data memory. The data memory is divided into two parts: Special Function Registers (SPRs) and General Purpose Registers (GPRs). In the case of the PIC16C57, there are 8 SPRs and 24 GPRs, each 8 bits wide. SPRs are registers used by the CPU and peripheral functions to control the operation of the device. The Register File for the PIC16C57 is a single-port memory device implemented using Static Random Access Memory (SRAM) technology. Figure 9 shows the register file block diagram for the PIC16C57.

FIGURE 9. Register file block diagram (8 SPRs: INDF, TMR0, PCL, STATUS, FSR, PORT A, PORT B, PORT C; followed by 24 GPRs)

I/O Ports

A port is a group of pins on a microcontroller that can be accessed simultaneously: a program can write a desired combination of zeros and ones to the pins, or read their current state. Ports are the connection between the CPU and the outside world.
The ports are basically I/O registers which can be read and written under program control. There are 16 fully programmable I/O pins on the PIC16C57, which can be used interchangeably for both input and output. Figure 10 shows the equivalent circuit for a single I/O pin. The Output Driver Control Registers are loaded with the contents of the W register by executing the TRIS f instruction. A ‘1’ in a TRIS register bit puts the corresponding output driver in a high-impedance state (input mode). A ‘0’ enables the output buffer and puts the contents of the output data latch on the selected pin.

FIGURE 10. I/O pin

Memory organization

The microcontroller's memory is divided into program memory and data memory. The PIC16C57 has a 2K x 12 program memory space, which is accessed using a paging scheme. The program memory pages are selected using one or two STATUS register bits. Figure 11 shows the program memory map for the microcontroller. The data memory consists of 32 registers (8 SPRs and 24 GPRs).

FIGURE 11. Program memory map

Instruction set summary

Each instruction is a 12-bit word that specifies the instruction type and one or more operands which further specify the operation of the instruction. The instructions are grouped into byte-oriented, bit-oriented, and literal and control operations. The PIC16C57 adopts a 2-operand format, in which one operand is both a source and the destination of the result.

For byte-oriented instructions, ‘f’ represents a file register and ‘d’ represents a destination register. The file register designator specifies which one of the 32 file registers is to be used by the instruction. The destination designator specifies where the result of the operation is to be placed. If ‘d’ is ‘0’, the result is placed in the W register. If ‘d’ is ‘1’, the result is placed in the file register specified in the instruction.
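As a rough illustration of how the ‘f’ and ‘d’ designators sit inside a 12-bit byte-oriented instruction word, the fields can be separated with shifts and masks. The sketch below assumes the layout used by byte-oriented encodings such as 0001 11df ffff (6-bit opcode, 1-bit ‘d’, 5-bit ‘f’); the function names and the sample register address are hypothetical.

```python
def split_byte_oriented(word):
    """Split a 12-bit byte-oriented instruction word into (opcode, d, f).

    Assumed layout, most significant bit first: oooooo d fffff
    """
    if not 0 <= word < (1 << 12):
        raise ValueError("instruction words are 12 bits wide")
    opcode = (word >> 6) & 0x3F  # upper 6 bits select the operation
    d = (word >> 5) & 0x1        # destination designator
    f = word & 0x1F              # 5 bits address one of 32 file registers
    return opcode, d, f


def destination(d, f):
    """Describe where the result goes for a given 'd' and 'f'."""
    return "W" if d == 0 else f"file register {f}"


if __name__ == "__main__":
    # Hypothetical word: opcode 000111, d = 1, f = 01010 (register 10).
    opcode, d, f = split_byte_oriented(0b000111_1_01010)
    print(f"opcode={opcode:06b} d={d} f={f} -> result in {destination(d, f)}")
```

Five bits for ‘f’ are exactly enough to address the 32 file registers, which is why the register file address bus is 5 bits wide.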
Figure 12 shows the general format for a byte-oriented instruction.

FIGURE 12. Byte-oriented file register operations

For bit-oriented instructions, ‘b’ represents a bit field designator which selects the number of the bit affected by the operation, while ‘f’ represents the number of the file register in which the bit is located. Figure 13 shows the general format for a bit-oriented instruction.

FIGURE 13. Bit-oriented file register operations

For literal and control operations, ‘k’ represents an 8-bit or 9-bit constant or literal value. Figure 14 shows the general format for literal and control instructions.

FIGURE 14. Literal and control operations

Instruction Examples

In example 1 below, the contents of the W register are added to the contents of register ‘f’. If ‘d’ = ‘0’, the result is stored in the W register. If ‘d’ = ‘1’, the result is stored back in register ‘f’.

1) ADDWF f,d
   Operation: (W) + (f) --> (dest)
   Encoding:  0001 11df ffff
   Example:   ADDWF TEMP_REG, 0
   Before instruction: W = 0x17, TEMP_REG = 0xC2
   After instruction:  W = 0xD9, TEMP_REG = 0xC2

In example 2 below, the contents of the W register are AND’ed with the eight-bit literal ‘k’. The result is then placed in the W register.

2) ANDLW k
   Operation: (W) .AND. (k) --> (W)
   Encoding:  1110 kkkk kkkk
   Example:   ANDLW H'5F'
   Before instruction: W = 0xA3
   After instruction:  W = 0x03

In example 3 below, the contents of the W register are AND’ed with register ‘f’. If ‘d’ = ‘0’, the result is stored in the W register. If ‘d’ = ‘1’, the result is stored back in register ‘f’.

3) ANDWF f,d
   Operation: (W) .AND. (f) --> (dest)
   Encoding:  0001 01df ffff
   Example:   ANDWF TEMP_REG, 1
   Before instruction: W = 0x17, TEMP_REG = 0xC2
   After instruction:  W = 0x17, TEMP_REG = 0x02

In example 4 below, the bit designated by ‘b’ is set in register ‘f’.

4) BSF f,b
   Operation: 1 --> (f<b>)
   Encoding:  0101 bbbf ffff
   Example:   BSF FLAG_REG, 7
   Before instruction: FLAG_REG = 0x0A
   After instruction:  FLAG_REG = 0x8A

In example 5 below, TRIS register ‘f’ (f = 5, 6 or 7) is loaded with the contents of the W register.

5) TRIS f
   Operation: (W) --> TRIS register f
   Encoding:  0000 0000 0fff
   Example:   TRIS PORTB
   Before instruction: W = 0xA5
   After instruction:  TRISB = 0xA5
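The five examples can be checked with a very small Python model that decodes the 12-bit encodings listed above and applies each operation to W and the file registers. This is a sketch under stated assumptions, not a full simulator: the class and helper names are invented for illustration, the register addresses chosen for TEMP_REG and FLAG_REG are hypothetical, PORTB is taken to be file register 6, the initial TRIS values are placeholders, and STATUS flags are not modelled.

```python
class PicSketch:
    """Minimal model: W register, 32 file registers, three TRIS latches."""

    def __init__(self):
        self.w = 0
        self.regs = [0] * 32
        self.tris = {5: 0, 6: 0, 7: 0}  # placeholder initial values

    def _store(self, d, f, value):
        # d = 0 -> result goes to W; d = 1 -> result goes back to register f
        if d == 0:
            self.w = value
        else:
            self.regs[f] = value

    def step(self, word):
        """Execute one 12-bit instruction word (five instructions only)."""
        top6, top4 = word >> 6, word >> 8
        d, f = (word >> 5) & 1, word & 0x1F
        if word in (5, 6, 7):                # TRIS f   0000 0000 0fff
            self.tris[word] = self.w
        elif top6 == 0b000111:               # ADDWF    0001 11df ffff
            self._store(d, f, (self.w + self.regs[f]) & 0xFF)
        elif top6 == 0b000101:               # ANDWF    0001 01df ffff
            self._store(d, f, self.w & self.regs[f])
        elif top4 == 0b0101:                 # BSF      0101 bbbf ffff
            b = (word >> 5) & 0x7
            self.regs[f] |= 1 << b
        elif top4 == 0b1110:                 # ANDLW    1110 kkkk kkkk
            self.w &= word & 0xFF
        else:
            raise NotImplementedError(f"word {word:012b} not modelled")


# Hypothetical encoders that build instruction words from operands.
def addwf(f, d): return (0b000111 << 6) | (d << 5) | f
def andwf(f, d): return (0b000101 << 6) | (d << 5) | f
def bsf(f, b):   return (0b0101 << 8) | (b << 5) | f
def andlw(k):    return (0b1110 << 8) | k
def tris(f):     return f


TEMP_REG, FLAG_REG, PORTB = 0x08, 0x09, 6  # assumed addresses

if __name__ == "__main__":
    pic = PicSketch()
    pic.w, pic.regs[TEMP_REG] = 0x17, 0xC2
    pic.step(addwf(TEMP_REG, 0))
    print(f"ADDWF: W = 0x{pic.w:02X}, TEMP_REG = 0x{pic.regs[TEMP_REG]:02X}")
```

Running the remaining examples the same way (set the "before" values, call `step` with the encoded word, read back W, the file register or `tris[PORTB]`) gives a quick consistency check on the before/after values listed in the examples.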