Here’s a problem you can encounter if you’re working with memory-intensive calculations using integer types in C and you want your code to be portable (i.e., you want it to function the same way on multiple platforms). The ISO/IEC 9899:1990 standard specified that C should have four integer types, each in signed and unsigned form: char (yes, char is an integer type), short, int, and long. In 1999, _Bool was added as an unsigned integer type that holds only the values 0 and 1 (it still occupies at least one byte), and long long was added as well. The standards don’t specify the exact size of these integer types (how many bytes they should use). They only set minimums: short and int must be at least 16 bits, long must be at least 32 bits and no narrower than int, and long long must be at least 64 bits and no narrower than long.

The problem arising from this vagueness can be demonstrated in the following little C program:

/* int-type-memsize.c                                                        */
/* Jason B. Hill (jason@jasonbhill.com)                                      */
/*                                                                           */
/* Returns the memory size of integer data types in C                        */
 
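#include <limits.h>
#include <stdio.h>
 
int main(void) {
    /* __WORDSIZE is glibc-specific; pointer width is a portable stand-in */
    unsigned long bits = (unsigned long) (sizeof(void *) * CHAR_BIT);
 
    printf("--------------------------------------------------------------\n");
    printf("C Integer data types on this %lu-bit machine\n", bits);
    printf("--------------------------------------------------------------\n");
 
    /* sizeof yields a size_t, so cast it to match the %lu conversion */
    printf("short                    %lu bytes\n", (unsigned long) sizeof(short));
    printf("int                      %lu bytes\n", (unsigned long) sizeof(int));
    printf("long                     %lu bytes\n", (unsigned long) sizeof(long));
    printf("long long                %lu bytes\n", (unsigned long) sizeof(long long));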
 
    return 0;
} /* main */

Compiling this with GCC on my 64-bit Xubuntu desktop gives the following result:

--------------------------------------------------------------
C Integer data types on this 64-bit machine
--------------------------------------------------------------
short                    2 bytes
int                      4 bytes
long                     8 bytes
long long                8 bytes

The same code compiled with Visual C++ and executed on a 64-bit Windows machine gives this result:

--------------------------------------------------------------
C Integer data types on this 64-bit machine
--------------------------------------------------------------
short                    2 bytes
int                      4 bytes
long                     4 bytes
long long                8 bytes

And here’s what happens with GCC on a 32-bit Red Hat machine:

--------------------------------------------------------------
C Integer data types on this 32-bit machine
--------------------------------------------------------------
short                    2 bytes
int                      4 bytes
long                     4 bytes
long long                8 bytes

Obviously, there is something more complicated going on here than the difference between 32-bit and 64-bit architectures. This becomes even more complicated when the code is run on older 64-bit UNICOS based systems like the CRAY T3E, where short, int, long, and long long were all 8 bytes. The differences in these systems come from the data model each one uses. Microsoft’s Visual C++ compiler uses LLP64 (only long long and pointers are 64 bits, while int and long stay at 32), most UNIX systems use LP64 (long and pointers are 64 bits, int is 32), and UNICOS was SILP64 (short, int, long, and pointers are all 64 bits). You can find out more about what this means at The Open Group’s page and Wikipedia’s entry on 64-bit architectures.

This makes portable code with the standard C integer types a pain for two reasons. Firstly, the range of values capable of being stored in any signed or unsigned integer type varies by machine. Secondly, the memory requirements of data structures using various integer types also vary, sometimes drastically. If you want better control over integer data types in C, allowing you to avoid these issues, you need to use the inttypes.h standard header, added in C99. It includes stdint.h, which is where the fixed-width types themselves are defined. We can rewrite our program above using inttypes.h as follows.

/* int-types-portable-memsize.c                                              */
/* Jason B. Hill (jason@jasonbhill.com)                                      */
/*                                                                           */
/* Returns the memory size of portable integer data types in C               */
 
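#include <inttypes.h>
#include <limits.h>
#include <stdio.h>
 
int main(void) {
    /* __WORDSIZE is glibc-specific; pointer width is a portable stand-in */
    unsigned long bits = (unsigned long) (sizeof(void *) * CHAR_BIT);
 
    printf("--------------------------------------------------------------\n");
    printf("C Integer data types on this %lu-bit machine\n", bits);
    printf("--------------------------------------------------------------\n");
 
    /* sizeof yields a size_t, so cast it to match the %lu conversion */
    printf("int8_t                    %lu bytes\n", (unsigned long) sizeof(int8_t));
    printf("int16_t                   %lu bytes\n", (unsigned long) sizeof(int16_t));
    printf("int32_t                   %lu bytes\n", (unsigned long) sizeof(int32_t));
    printf("int64_t                   %lu bytes\n", (unsigned long) sizeof(int64_t));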
 
    return 0;
} /* main */

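On any machine whose C compiler provides these exact-width types (the standard makes them optional, but virtually every mainstream platform has them), we should now get consistent sizes like this: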

--------------------------------------------------------------
C Integer data types on this 32-bit machine
--------------------------------------------------------------
int8_t                    1 bytes
int16_t                   2 bytes
int32_t                   4 bytes
int64_t                   8 bytes

Of course, we’ve masked something a bit here. The above code uses sizeof(int32_t) inside a printf call, and the size_t value that sizeof yields is printed as an unsigned long because we’ve asked printf for “%lu”. What if we simply want to print a variable saved as type int32_t? On some systems this may correspond to an int and on others to a long. Likewise, int64_t may be a long on one system and a long long on another, so no single conversion specifier is correct everywhere. Thus, inttypes.h provides the appropriate printf and scanf macros for dealing with each of the data types it defines. Each macro expands to a string literal that is concatenated after the “%”, as in printf("%" PRId32 "\n", x). Here’s a table to summarize the type and macro definitions.

| Type | Description | [min,max] value range | printf | scanf |
|---|---|---|---|---|
| int8_t | 8-bit signed integer | \(\left[-2^7,2^7-1\right]\) | PRId8 | SCNd8 |
| uint8_t | 8-bit unsigned integer | \(\left[0,2^8-1\right]\) | PRIu8 | SCNu8 |
| int16_t | 16-bit signed integer | \(\left[-2^{15},2^{15}-1\right]\) | PRId16 | SCNd16 |
| uint16_t | 16-bit unsigned integer | \(\left[0,2^{16}-1\right]\) | PRIu16 | SCNu16 |
| int32_t | 32-bit signed integer | \(\left[-2^{31},2^{31}-1\right]\) | PRId32 | SCNd32 |
| uint32_t | 32-bit unsigned integer | \(\left[0,2^{32}-1\right]\) | PRIu32 | SCNu32 |
| int64_t | 64-bit signed integer | \(\left[-2^{63},2^{63}-1\right]\) | PRId64 | SCNd64 |
| uint64_t | 64-bit unsigned integer | \(\left[0,2^{64}-1\right]\) | PRIu64 | SCNu64 |