Targeting your applications - what little endian and big endian IBM XL C/C++ compiler differences mean to you

Nicole Negherbon ([email protected]), Compiler Validation Specialist, IBM
Rafik Zurob ([email protected]), Compiler Software Developer, IBM
Nemanja Ivanovic ([email protected]), Compiler Validation Specialist, IBM

12 December 2014

The IBM® POWER8™ platform supports operating systems that use big endian or little endian byte ordering. Migrating programs written for a big endian operating system to a little endian operating system may require code changes to maintain program behavior or results. Differences need to be considered with regard to vectors, storage association between items of different sizes, long doubles, complex numbers, and serialization. The application binary interface (ABI) implemented in IBM XL C/C++ for little endian Linux on Power Systems is different from the ABI implemented in the big endian distributions. New options and built-in functions have been added to help with porting. This article describes these differences, new options, and built-in functions, and makes suggestions about code changes needed to port code to IBM XL C/C++ on POWER8.

Introduction

To migrate from a big endian compiler to a little endian compiler, you might have to change some code to maintain program behavior or results. Vectors, storage association between items of different sizes, long doubles, complex numbers, and serialization are aspects of your code that should be considered when porting code from big endian to little endian. The little endian Linux® distributions on IBM Power Systems™ use a different ABI than big endian distributions. Programs that have dependencies on the old ABI will need to be updated. New built-in functions make porting easier with respect to byte ordering of vectors.
This article describes the possible issues encountered when porting C/C++ code written for big endian IBM XL C/C++ on Power Systems to little endian IBM XL C/C++ on Power Systems. It makes some suggestions about coding changes and describes compiler features and options that help with porting code.

Comparing big endian byte order and little endian byte order

Endianness determines how data is interpreted in memory. The endianness of your platform is determined by the architecture of your processor. The two most common types of endianness used today are big endian and little endian. On big endian platforms (often referred to as simply big endian in this article), bytes in memory are ordered with the most significant byte first (or on the "left"). On little endian platforms (often referred to as simply little endian in this article), bytes in memory and vector registers are ordered with the least significant byte first (or on the "left").

For example, Figure 1 depicts how the hexadecimal value 0001020304050607 (interpreted as an 8-byte integer) is stored in memory on big endian and little endian platforms. In Figure 1, a represents the memory address at that location.

Figure 1: Representations of big endian and little endian byte ordering in memory

Vectors

The IBM POWER® processor architecture supports 16-byte vectors containing sixteen 1-byte elements, eight 2-byte elements, four 4-byte elements, or two 8-byte elements. The processor has 128-bit vector registers and instructions to load vectors into registers, operate on vectors in registers, and store vector registers to memory. The IBM XL compilers provide built-in functions and language support to work with vectors. There are differences between big endian and little endian with respect to vector element order.
New compiler options and built-in functions have been introduced to help with these differences. These differences and new features are explained in the following sections.

Vector element order and vector element byte order

There are two ways to lay out the elements of a vector in a vector register. You can load the elements from lowest to highest, so that element 0 is the leftmost element in the vector register. Alternatively, you can load the elements from highest to lowest, so that element 0 is the rightmost element in the vector register. The former placement is called big endian vector element order, while the latter is called little endian vector element order.

On big endian, big endian vector element order is always used. On little endian, you have the choice between using big endian vector element order and little endian vector element order.

Regardless of the vector element order, vector elements on big endian use big endian byte order in memory. Vector elements on little endian use little endian byte order in memory by default.

To demonstrate the differences of big endian and little endian vector element ordering, consider Figure 2 and Figure 3.

Figure 2 depicts how the hexadecimal value 000102030405060708090A0B0C0D0E0F (interpreted as a 16-byte vector) is represented in a vector register in big endian vector element order. The b127 and b0 markings on the figure denote bit 127 and bit 0 of the register respectively. The figure shows representations for vectors filled with elements of 1, 2, 4, and 8 bytes as you move down from the top.

Figure 2: Representations of big endian vector element order in vector registers

Figure 3 depicts how the hexadecimal value 000102030405060708090A0B0C0D0E0F (interpreted as a 16-byte vector) is represented in a vector register in little endian vector element order.
The b127 and b0 markings on the figure denote bit 127 and bit 0 of the register respectively. The figure shows representations for vectors filled with elements of 1, 2, 4, and 8 bytes as you move down from the top.

Figure 3: Representations of little endian vector element order in vector registers

-qaltivec option

The -qaltivec option can be used to tell the little endian compiler how to order vector elements in the vector registers.

With -qaltivec=le, the compiler loads vectors in little endian element order and assumes vectors are loaded with little endian element order for vector stores. The compiler inserts vector permute operations, if necessary, to ensure that the load and store built-in functions use little endian element order. For vector built-in functions that reference specific elements, the compiler assumes that vectors were loaded with little endian element order. On little endian, -qaltivec=le is the default.

With -qaltivec=be, the compiler loads vectors in big endian element order and assumes vectors are loaded with big endian element order for vector stores. The compiler inserts vector permute operations, if necessary, to ensure that the load and store built-in functions use big endian element order. For vector built-in functions that reference specific elements, the compiler assumes that vectors were loaded with big endian element order.
To demonstrate, consider the following program:

    #include<stdio.h>

    union Example
    {
      vector signed int vint;
      int sint[4];
    };

    int main()
    {
      union Example example;
      example.sint[0] = 0x0102;
      example.sint[1] = 0x0304;
      example.sint[2] = 0x0506;
      example.sint[3] = 0x0708;

      printf("First vector element: %04x\n", vec_extract(example.vint,0));
      printf("Second vector element: %04x\n", vec_extract(example.vint,1));
      printf("Third vector element: %04x\n", vec_extract(example.vint,2));
      printf("Fourth vector element: %04x\n", vec_extract(example.vint,3));
      return 0;
    }

When this program is compiled on a big endian platform, or when it is compiled on a little endian platform with -qaltivec=le, it produces the following output:

    First vector element: 0102
    Second vector element: 0304
    Third vector element: 0506
    Fourth vector element: 0708

However, if the program is compiled on a little endian platform with -qaltivec=be, it produces the following output:

    First vector element: 0708
    Second vector element: 0506
    Third vector element: 0304
    Fourth vector element: 0102

The vector was loaded backwards into the vector register, while the order of elements in the array sint was unchanged.

A more portable way of working with vectors is to load and store using the vec_xl, vec_xl_be, vec_xst, and vec_xst_be built-in functions described in the next section.
Consider the following program:

    #include<stdio.h>

    int main()
    {
      vector signed int v1;
      vector signed int v2;
      vector signed int v3;
      vector signed int v4;
      int a[4];

      v1 = vec_xl(0, (int[]){1,2,3,4});
      vec_xst(v1, 0, a);
      printf("v1 via a: %d %d %d %d\n", a[0],a[1],a[2],a[3]);

      v2 = vec_neg(v1);
      vec_xst(v2, 0, a);
      printf("v2 via a: %d %d %d %d\n", a[0],a[1],a[2],a[3]);

      //Merge high and low depend on vector element order
      v3 = vec_mergeh(v1, v2);
      vec_xst(v3, 0, a);
      printf("v3 via a: %d %d %d %d\n", a[0],a[1],a[2],a[3]);

      v4 = vec_mergel(v1, v2);
      vec_xst(v4, 0, a);
      printf("v4 via a: %d %d %d %d\n", a[0],a[1],a[2],a[3]);

      return 0;
    }

The program produces the same output on big endian platforms and little endian platforms. The output is:

    v1 via a: 1 2 3 4
    v2 via a: -1 -2 -3 -4
    v3 via a: 1 -1 2 -2
    v4 via a: 3 -3 4 -4

The vector built-in functions vec_xl, vec_xst, vec_mergeh, and vec_mergel take the vector element order into account. In other words, when the program is compiled on a little endian platform with -qaltivec=le:

• vec_xl uses a Vector Scalar eXtension (VSX) load instruction, which always loads in big endian element order. A vector permute instruction is then used to reverse the vector in the register to use little endian element order.
• vec_xst assumes that the vector in the register is using little endian vector element order, so it uses a vector permute instruction to reverse the vector element order to big endian vector element order. It then uses a VSX store instruction, which always stores in big endian element order, to store the vector back to memory.
• vec_mergeh knows that vector elements start from the right.
  The vector registers containing v1 and v2 look as follows:

      v1   4  3  2  1
      v2  -4 -3 -2 -1

  Because vec_mergeh starts counting from the right, it correctly uses 1 and 2 for elements 0 and 2 of the result of vec_mergeh(v1, v2).
• vec_mergel similarly knows that the vector elements start from the right. It therefore correctly uses -3 and -4 for elements 1 and 3 of the result of vec_mergel(v1, v2).

When the program is compiled on a big endian platform, or on a little endian platform with -qaltivec=be:

• vec_xl uses a VSX load instruction, which always loads in big endian element order. No vector permute is necessary.
• vec_xst assumes that the vector in the register is using big endian vector element order. So, it directly uses a VSX store instruction, which always stores in big endian element order, to store the vector back to memory.
• vec_mergeh knows that vector elements start from the left. The vector registers containing v1 and v2 look as follows:

      v1   1  2  3  4
      v2  -1 -2 -3 -4

  Because vec_mergeh starts counting from the left, it correctly uses 1 and 2 for elements 0 and 2 of the result of vec_mergeh(v1, v2).
• vec_mergel similarly knows that the vector elements start from the left. It therefore correctly uses -3 and -4 for elements 1 and 3 of the result of vec_mergel(v1, v2).

For programs that do not use unions, the -qaltivec=be option can be useful in porting code from big endian to little endian.

The POWER8 cryptography built-in functions require their input vectors to be in big endian vector element order. You can do this either by using -qaltivec=be, or by using the vec_xl_be and vec_xst_be functions to load and store. These vector load and store functions are outlined in the following section.
New vector load and store built-in functions

New vector load and store built-in functions that use VSX instructions have been added. Note that XL C/C++ for little endian Linux on Power Systems is a 64-bit compiler product, so only the prototypes for 64-bit mode apply. These built-in functions are further described in the XL C/C++ Compiler Reference (XL C/C++ for Linux documentation library).

vec_xl(offset, address)

This function loads a 16-byte vector from the memory address calculated by adding the displacement offset to the address specified by address, using the element order appropriate for the platform and the -qaltivec option.

Prototypes (64-bit mode):

    vector signed char vec_xl(long, signed char *);
    vector unsigned char vec_xl(long, unsigned char *);
    vector signed short vec_xl(long, signed short *);
    vector unsigned short vec_xl(long, unsigned short *);
    vector signed int vec_xl(long, signed int *);
    vector unsigned int vec_xl(long, unsigned int *);
    vector signed long long vec_xl(long, signed long long *);
    vector unsigned long long vec_xl(long, unsigned long long *);
    vector float vec_xl(long, float *);
    vector double vec_xl(long, double *);

vec_xl_be(offset, address)

This function loads a 16-byte vector from the memory address calculated by adding the displacement offset to the address specified by address, in big endian element order regardless of the platform or the -qaltivec option.
Prototypes (64-bit mode):

    vector signed char vec_xl_be(long, signed char *);
    vector unsigned char vec_xl_be(long, unsigned char *);
    vector signed short vec_xl_be(long, signed short *);
    vector unsigned short vec_xl_be(long, unsigned short *);
    vector signed int vec_xl_be(long, signed int *);
    vector unsigned int vec_xl_be(long, unsigned int *);
    vector signed long long vec_xl_be(long, signed long long *);
    vector unsigned long long vec_xl_be(long, unsigned long long *);
    vector float vec_xl_be(long, float *);
    vector double vec_xl_be(long, double *);

vec_xst(vect, offset, address)

This function stores the elements of the 16-byte vector specified by vect to memory, using the element order appropriate for the platform and the -qaltivec option. The target address is calculated by adding the displacement specified by offset to the memory address specified by address.

Prototypes (64-bit mode):

    void vec_xst(vector signed char, long, signed char *);
    void vec_xst(vector unsigned char, long, unsigned char *);
    void vec_xst(vector signed short, long, signed short *);
    void vec_xst(vector unsigned short, long, unsigned short *);
    void vec_xst(vector signed int, long, signed int *);
    void vec_xst(vector unsigned int, long, unsigned int *);
    void vec_xst(vector signed long long, long, signed long long *);
    void vec_xst(vector unsigned long long, long, unsigned long long *);
    void vec_xst(vector float, long, float *);
    void vec_xst(vector double, long, double *);

vec_xst_be(vect, offset, address)

This function stores the elements of the 16-byte vector specified by vect to memory in big endian element order, regardless of the platform or the -qaltivec option. The target address is calculated by adding the displacement specified by offset to the memory address specified by address.
Prototypes (64-bit mode):

    void vec_xst_be(vector signed char, long, signed char *);
    void vec_xst_be(vector unsigned char, long, unsigned char *);
    void vec_xst_be(vector signed short, long, signed short *);
    void vec_xst_be(vector unsigned short, long, unsigned short *);
    void vec_xst_be(vector signed int, long, signed int *);
    void vec_xst_be(vector unsigned int, long, unsigned int *);
    void vec_xst_be(vector signed long long, long, signed long long *);
    void vec_xst_be(vector unsigned long long, long, unsigned long long *);
    void vec_xst_be(vector float, long, float *);
    void vec_xst_be(vector double, long, double *);

Vector literals and binary-coded decimals (BCDs)

The binary-coded decimal (BCD) built-in functions operate on signed BCD values loaded in vector registers. Each BCD is made up of a series of 4-bit nibbles, each containing a value between 0 and 9, with the last nibble containing the sign. For example, the value 10 is represented (in 4-bit nibbles) as:

    0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x1 0x0 0xC

Each hexadecimal value above represents a 4-bit nibble. The 0x1 nibble is for the 1 in 10, and the 0x0 nibble after it is for the 0 in 10. The 0xC nibble is a special value that represents a plus sign.

Because there is no BCD data type and the BCD numbers need to be loaded into vector registers, the BCD built-in functions use vector unsigned char arguments and results. This makes the BCD built-in functions dependent on vector element order even though BCD numbers are not themselves vectors. In other words, vectors containing BCD numbers need to be loaded in big endian vector element order even when compiling with -qaltivec=le.
This is a problem when vectors are statically initialized (for example, using vector literals): a braced static vector initializer expression with the byte elements listed in big endian order populates the vector register in little endian element order on little endian platforms. To work around this, you can reverse the order of the elements in the vector initializer. For example, consider the following program:

    #include <stdio.h>

    #if __LITTLE_ENDIAN__
    #define BCD_INIT(b0, b1, b2, b3, b4, b5, b6, b7, \
                     b8, b9, ba, bb, bc, bd, be, bf) \
      { bf, be, bd, bc, bb, ba, b9, b8, b7, b6, b5, b4, b3, b2, b1, b0 }
    #else
    #define BCD_INIT(b0, b1, b2, b3, b4, b5, b6, b7, \
                     b8, b9, ba, bb, bc, bd, be, bf) \
      { b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf }
    #endif

    vector unsigned char v1001 = BCD_INIT(0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
                                          0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
                                          0x00, 0x01, 0x00, 0x1C);
    vector unsigned char v9009 = BCD_INIT(0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
                                          0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
                                          0x00, 0x09, 0x00, 0x9C);

    static void print_bytes(vector unsigned char v)
    {
      long i;
      unsigned char b[16];

      vec_xst_be(v, 0, b);
      printf("%.02hhx", b[0]);
      for (i = 1; i < 16; ++i)
      {
        printf(", %.02hhx", b[i]);
      }
      printf("\n");
    }

    int main(void)
    {
      vector unsigned char result;

      printf("Adding statically initialized vectors\n");
      printf("op1 is ");
      print_bytes(v1001);
      printf("op2 is ");
      print_bytes(v9009);
      result = __bcdadd(v1001, v9009, 0);
      printf("result is ");
      print_bytes(result);
      return 0;
    }

This program prints the following data on both big endian and little endian platforms.
    Adding statically initialized vectors
    op1 is 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 01, 00, 1c
    op2 is 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 09, 00, 9c
    result is 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 10, 01, 0c

The __LITTLE_ENDIAN__ macro is predefined on little endian platforms only. We create a BCD_INIT macro to reverse the bytes in the initializer. Note that the print_bytes function uses the vec_xst_be built-in function because the vectors are in big endian vector element order.

Other workarounds are to initialize at run time using the vec_xl_be built-in function, or to reverse the vector element order of the statically initialized vector at run time using the vec_reve built-in function.

Application binary interface (ABI)

XL C/C++ for little endian Linux on Power Systems uses the new IBM Power Architecture® 64-bit ELF V2 ABI specification. This new ABI improves several areas, including function calls. This means, however, that assembly files targeting the old ABI have to be ported to the new ABI. Fortran, C, and C++ programs that follow their respective language standards do not need any porting to the new ABI. Programs containing non-standard extensions need to be reviewed for ABI sensitivity.

Storage association between different sized items

When porting a program from big endian to little endian, storage association between items of different sizes needs to be considered. In C/C++, this concerns unions and pointer casting. The following sections outline those items in greater detail. Note that hexadecimal values are always printed in big endian order.

Unions

When using unions with items of different sizes, the result can be different on little endian than it is on big endian.
To demonstrate this, consider the following program:

    #include<stdio.h>

    union Example
    {
      char i;
      short j;
      int k;
    };

    int main()
    {
      union Example example;
      example.k = 0x01020304;
      printf("example.i: %02x\n", example.i);
      printf("example.j: %04x\n", example.j);
      printf("example.k: %08x\n", example.k);
      return 0;
    }

In the above program, i, j, and k share the same storage but have different sizes. On big endian, the most significant byte is in the lowest memory address (or on the left), so i, j, and k contain the following data:

    example.i: 01
    example.j: 0102
    example.k: 01020304

On little endian platforms, the least significant byte is on the left, so i, j, and k contain the following data:

    example.i: 04
    example.j: 0304
    example.k: 01020304

Pointer casting

The issues with pointer casting are effectively the same as those introduced by unions. Consider the following program:

    #include<stdio.h>

    int main()
    {
      int i = 0x01020304;
      printf("example.i: %02x\n", *(char*)&i);
      printf("example.j: %04x\n", *(short*)&i);
      printf("example.k: %08x\n", *(int*)&i);
      return 0;
    }

Note that this code violates the aliasing rules of the language (for example, an int object cannot be accessed through a pointer to short); it is shown only to demonstrate the issue.

In the above program, pointer casting is used to access the integer i as a char, a short, and an int. On big endian, the most significant byte of i is in the lowest memory address. The output of the program on big endian is:

    example.i: 01
    example.j: 0102
    example.k: 01020304

On little endian, the least significant byte of i is in the lowest memory address.
The output of the program on little endian is:

    example.i: 04
    example.j: 0304
    example.k: 01020304

Long double and complex types

XL C/C++'s long double types are composed of two parts of type double with different magnitudes that do not overlap (except when the number is zero or close to zero). The high-order double (the one that comes first in storage) must have the larger magnitude, even on little endian. The value of a long double number is the sum of its two real parts.

Complex types are made of a real and an imaginary part where the real part is always before the imaginary part. In C/C++, you can access the real and imaginary parts using the __real__ and __imag__ unary operators or the creal and cimag function sets.

Consider the following program:

    #include<stdio.h>
    #include<complex.h>

    union Example
    {
      float f[2];
      float complex c;
    };

    int main()
    {
      union Example example;
      example.c = 1.0f + 0.0f*I;
      printf("First element of float: %.4f\n", example.f[0]);
      printf("Second element of float: %.4f\n", example.f[1]);
      printf("Real part of complex: %.4f\n", __real__(example.c));
      printf("Imaginary part of complex: %.4fi\n", __imag__(example.c));
      return 0;
    }

On both big endian and little endian, the first element of f is 1.0000 and the second element of f is 0.0000. The real part of the complex number c is 1.0000 and the imaginary part is 0.0000i.

    First element of float: 1.0000
    Second element of float: 0.0000
    Real part of complex: 1.0000
    Imaginary part of complex: 0.0000i

The byte order of the elements of a floating-point and a complex number is different between big endian and little endian, but the element order is the same.

Serialization

Binary data files depend on the byte order of the data.
If you have a binary data file generated on a big endian platform, you will need to convert the byte order of the data when reading the file on a little endian platform. You can use the __load2r, __load4r, __load8r, __store2r, __store4r, and __store8r built-in functions to convert the byte order.

Consider, for example, the following program which, when compiled and run on a big endian platform, generates a binary file bigendian.data:

    $ cat writefile.c
    #include <stdlib.h>
    #include <stdio.h>
    #include "data.h"

    int main()
    {
      FILE *fp;
      data_t data = { {1,2,3,4,5,6,7,8}, {1,2,3,4}, {-1,-2}, 3.0, 4.0, "abcdefgh" };
      size_t res;

      fp = fopen("bigendian.data", "w");
      if (fp == NULL)
      {
        perror("fopen");
        exit(1);
      }
      res = fwrite(&data, sizeof(data), 1, fp);
      if (res != 1)
      {
        perror("fwrite");
        exit(1);
      }
      fclose(fp);
      printf("Wrote:\n");
      print_data(data);
      return 0;
    }

    $ cat data.h
    #ifndef DATA_H
    #define DATA_H
    #include <stdint.h>

    typedef struct
    {
      int64_t ll[8];
      int32_t i[4];
      int16_t s[2];
      float f;
      double d;
      char c[10];
    } data_t;

    static void print_data(data_t data)
    {
      long i;
      printf("data: ll={ ");
      for (i = 0; i < 8; ++i)
      {
        printf("0x%016lx ", data.ll[i]);
      }
      printf("}\n");
      printf(" i={ ");
      for (i = 0; i < 4; ++i)
      {
        printf("0x%08x ", data.i[i]);
      }
      printf("}\n");
      printf(" s={ ");
      for (i = 0; i < 2; ++i)
      {
        printf("0x%04hx ", data.s[i]);
      }
      printf("}\n");
      printf(" f=%f\n", data.f);
      printf(" d=%f\n", data.d);
      printf(" c=");
      i = 0;
      while(i < 10 && data.c[i] != '\0')
      {
        printf("%c", data.c[i]);
        ++i;
      }
      printf("\n");
    }
    #endif

fwrite writes data as a stream of bytes. As a result, ll, i, s, f, and d will be in big endian byte order. To read the data file on a little endian platform, the read routine can't just use fread. It will need to convert the data too. For example:

    $ cat readfile.c
    #include <stdlib.h>
    #include <stdio.h>
    #include <stdint.h>
    #include "data.h"

    static void print_data(data_t data);

    int main()
    {
      FILE *fp;
      data_t data;
      size_t res;

      fp = fopen("bigendian.data", "r");
      if (fp == NULL)
      {
        perror("fopen");
        exit(1);
      }

      /* This fread call will read all non-character data in the wrong byte order */
      res = fread(&data, sizeof(data), 1, fp);
      if (res != 1)
      {
        perror("fread");
        exit(1);
      }
      printf("Read:\n");
      print_data(data);

      /* Convert the byte order */
      {
        union { uint64_t u64; uint32_t u32; double d; float f; } tmp;
        long i;

        /* convert the long long data */
        for (i = 0; i < 8; ++i)
        {
          data.ll[i] = __load8r((uint64_t *) &data.ll[i]);
        }

        /* convert the integer data */
        for (i = 0; i < 4; ++i)
        {
          data.i[i] = __load4r((uint32_t *) &data.i[i]);
        }

        /* convert the short data */
        for (i = 0; i < 2; ++i)
        {
          data.s[i] = __load2r((uint16_t *) &data.s[i]);
        }

        /* convert the float data */
        tmp.f = data.f;
        __store4r(tmp.u32, (uint32_t *) &data.f);

        /* convert the double data */
        tmp.d = data.d;
        __store8r(tmp.u64, (uint64_t *) &data.d);
      }

      printf("After conversion:\n");
      print_data(data);
      return 0;
    }

When readfile.c is compiled and run on a little endian platform, the following output is generated.
    Read:
    data: ll={ 0x0100000000000000 0x0200000000000000 0x0300000000000000 0x0400000000000000 0x0500000000000000 0x0600000000000000 0x0700000000000000 0x0800000000000000 }
     i={ 0x01000000 0x02000000 0x03000000 0x04000000 }
     s={ 0xffff 0xfeff }
     f=0.000000
     d=0.000000
     c=abcdefgh
    After conversion:
    data: ll={ 0x0000000000000001 0x0000000000000002 0x0000000000000003 0x0000000000000004 0x0000000000000005 0x0000000000000006 0x0000000000000007 0x0000000000000008 }
     i={ 0x00000001 0x00000002 0x00000003 0x00000004 }
     s={ 0xffff 0xfffe }
     f=3.000000
     d=4.000000
     c=abcdefgh

Similar considerations are needed when porting algorithms that depend on the byte order of the data. For example, a function that reads a text file containing a hexadecimal string of arbitrary length and converts it to an array of uint64_t can cast the character string to a uint64_t *. While this violates the ANSI aliasing rules, it works on a big endian platform because the byte orders of a char[8] array and uint64_t match. This is not the case on little endian platforms.

Loading and storing in reverse byte order

XL C/C++ provides the following built-in functions to help with converting byte order:

unsigned short __load2r(unsigned short* address)
    Loads an unsigned short from address with reversed byte order.

unsigned int __load4r(unsigned int* address)
    Loads an unsigned integer from address with reversed byte order.

unsigned long long __load8r(unsigned long long* address)
    Loads an unsigned long long from address with reversed byte order.

void __store2r(unsigned short source, unsigned short* address)
    Stores an unsigned short source to address with reversed byte order.

void __store4r(unsigned int source, unsigned int* address)
    Stores an unsigned integer source to address with reversed byte order.
void __store8r(unsigned long long source, unsigned long long* address)
    Stores an unsigned long long source to address with reversed byte order.

vec_revb(address)
    Returns a vector of the same type as address in which each element contains the bytes of the corresponding element of address in reversed byte order.

Prototypes:

    vector signed char vec_revb(vector signed char);
    vector unsigned char vec_revb(vector unsigned char);
    vector signed short vec_revb(vector signed short);
    vector unsigned short vec_revb(vector unsigned short);
    vector signed int vec_revb(vector signed int);
    vector unsigned int vec_revb(vector unsigned int);
    vector signed long long vec_revb(vector signed long long);
    vector unsigned long long vec_revb(vector unsigned long long);
    vector float vec_revb(vector float);
    vector double vec_revb(vector double);

vec_reve(address)
    Returns a vector of the same type as address containing the elements of address in reversed element order.

Prototypes:

    vector signed char vec_reve(vector signed char);
    vector unsigned char vec_reve(vector unsigned char);
    vector signed short vec_reve(vector signed short);
    vector unsigned short vec_reve(vector unsigned short);
    vector signed int vec_reve(vector signed int);
    vector unsigned int vec_reve(vector unsigned int);
    vector signed long long vec_reve(vector signed long long);
    vector unsigned long long vec_reve(vector unsigned long long);
    vector float vec_reve(vector float);
    vector double vec_reve(vector double);

Resources

• Visit the XL C/C++ for Linux product page for more information.
• Visit the Power Architecture: 64-Bit ELF V2 ABI Specification - OpenPOWER ABI for Linux Supplement for more information.
• Get the free trial download for XL C/C++ for Linux.
• Get connected. Join the Rational C/C++ Cafe community.
About the authors

Nicole Negherbon
Nicole Negherbon is a software developer with 3 years of experience. Nicole's primary area of expertise is in Fortran compiler validation on IBM AIX and Linux on Power. Nicole is a system administrator for IBM POWER8 systems running IBM AIX and various Linux on Power distributions.

Rafik Zurob
Rafik Zurob is an advisory software developer with 13 years of experience. Rafik is the team lead for the Fortran compiler frontend and runtime environment on IBM AIX and Linux on Power. He is a technical lead for migrating the IBM C/C++ and Fortran compiler products to the little endian Linux on the IBM Power platform.

Nemanja Ivanovic
Nemanja Ivanovic is a staff software developer with 5 years of experience in C/C++ compiler validation on IBM Power Systems and Linux on System z. Nemanja has been with the IBM Canada lab for the past 4 years and his primary areas of expertise are IBM XL C/C++ compilers for AIX, Linux on Power, Linux on OpenPower, and Linux on System z.

© Copyright IBM Corporation 2014 (www.ibm.com/legal/copytrade.shtml)
Trademarks (www.ibm.com/developerworks/ibm/trademarks/)