The controversy created by the one line of code I published pushed me to write a short pamphlet on the subject. Please do not take it too seriously.
The beauty of programming in C
Yes, portability is an issue that you must always have in mind. For the one line of code that started this discussion you may wonder whether portability is really an issue. As this code accesses a hardware register of an Atari machine, it will obviously not work on any other machine. Yet even here portability may be an issue if you want to be able to accommodate several C compilers on the Atari.
It is interesting to note that everybody has some idea of how things work on a particular platform (as of course I do). For example, on the Atari, which compilers treat int as short and which treat it as long? Personally I was under the assumption that on a 68K platform (considered a 32-bit processor?) any serious compiler would treat int as long. I suspect that this kind of assumption about platform-dependent features is dictated by the programmer’s background. For example, I have done most of my C/C++ development on Sun UNIX and ported the code to many platforms and architectures (Sun, HP, IBM, Digital …). On all these UNIX platforms an int was always 4 bytes long. In this kind of predictable, cosy environment programmers tend to be lazy and to forget about portability issues. Therefore int has been used and abused all over the place (actually there are some good reasons for that, explained later), to the point that people were even casting pointers to int and vice versa! (As an excuse, you have to know that before ANSI C void* did not exist.) Life (especially in California) was nice and cool until some companies came up with new processor architectures! Among these, some famous examples:
1) Some RISC architectures that used optimized pointers to access internal registers. These pointers could be one byte, or 10 bits, or whatever! Obviously it is not a good idea to cast an int to this kind of pointer!!!
2) Some 64-bit architectures, like the Digital Alpha processor. The Alpha C compiler would consider an int as 4 bytes, a long as 8 bytes and a pointer as 8 bytes. Porting to Alpha therefore put a harsh spotlight on badly written code. This was such a major problem for so many customers that Digital had to create special teams to help…
But all this is history… and I am glad to see that now people are concerned with portability upfront.
To get back to the Atari … one question is whether an int is 2 bytes or 4 bytes. The obvious answer is that it is compiler dependent and therefore not predictable! For example, I checked an old version (v3.04) of the Lattice C compiler, and at the time you had no choice: int was 4 bytes. I also checked the latest version 6.x of the Lattice C compiler: it also uses 32 bits for int by default, but as mentioned by Ijor you can change this by defining _SHORTINT, which gives you a 2-byte int. I have also checked the Pure C compiler, and to my surprise the default seems to be 16 bits; apparently it is the same (16 bits) for Borland Turbo C.
Now let’s first review some basic definitions:
Fundamental types in C are divided into three categories: integral, floating, and void. Integral types are capable of handling whole numbers. Floating types are capable of specifying values that may have fractional parts.
- char: Type char is an integral type that usually contains members of the execution character set (usually ASCII). The C compiler treats variables of type char, signed char, and unsigned char as having different types. In expressions, variables of type char are promoted to int; whether plain char behaves like signed char or like unsigned char depends on the compiler.
- short: Type short int (or simply short) is an integral type that is larger than or equal to the size of type char, and shorter than or equal to the size of type int. Variables of type short can be declared as signed short or unsigned short. Signed short is a synonym for short.
- int: Type int is an integral type that is larger than or equal to the size of type short int, and shorter than or equal to the size of type long. Variables of type int can be declared as signed int or unsigned int. Signed int is a synonym for int.
- long: Type long (or long int) is an integral type that is larger than or equal to the size of type int. Variables of type long can be declared as signed long or unsigned long. Signed long is a synonym for long.
Why is int used so much?
The int and unsigned int type specifiers are widely used in C programs because they allow a particular machine to handle integer values in the most efficient way for that machine. Using them was therefore highly recommended at the time. However, since the sizes of the int and unsigned int types vary, programs that depend on a specific int size may not be portable to other machines.
So what is the size of int on my machine?
The size of a signed or unsigned int item is supposed to be the standard size of an integer on a particular machine. For example, in 16-bit operating systems, the int type is usually 16 bits, or 2 bytes. In 32-bit operating systems, the int type is usually 32 bits, or 4 bytes. Thus, the int type is usually equivalent to either the short int or the long int type, and the unsigned int type to either the unsigned short or the unsigned long type, depending on the target environment. The int types all represent signed values unless specified otherwise.
This is all nice, yet it does not give you any information that can be used directly in a program. To make programs more portable, you can use expressions with the sizeof operator instead of hard-coded data sizes. Another alternative is to look at the maximum values defined in limits.h.
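As a rough sketch of what "use sizeof and limits.h" can look like in practice, the helpers below compute the widths instead of assuming them. The names BITS_OF, short_bits, int_bits, long_bits and int_holds_24_bits are made up for this illustration; only sizeof, CHAR_BIT and INT_MAX come from the language and its standard headers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Storage width of a type in bits (padding bits, if any, included).
   BITS_OF is a hypothetical helper macro, not a standard one. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

size_t short_bits(void) { return BITS_OF(short); }
size_t int_bits(void)   { return BITS_OF(int); }
size_t long_bits(void)  { return BITS_OF(long); }

/* Nonzero when int can hold every value from 0 to 0xFFFFFF (24 bits).
   The decision is taken by the preprocessor from limits.h. */
int int_holds_24_bits(void)
{
#if INT_MAX >= 0xFFFFFF
    return 1;
#else
    return 0;
#endif
}
```

Calling int_bits() from a small test program on each compiler (Lattice C with and without _SHORTINT, Pure C, and so on) is one way to find out what you are really dealing with, assuming the compiler ships a usable limits.h.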
Now what about the bitwise shift operators (<< and >>):
Both operands of the shift operators must be of integral types. Integral promotions are performed on each operand, and the type of the result is the type of the promoted left operand.
Integral conversions are performed between integral types. Variables of an integral type can be converted to another, wider integral type (that is, a type that can represent a larger set of values). This widening type of conversion is called "integral promotion". In C the integral types are char, int, and long (and the short, signed, and unsigned versions of these types). C promotions are "value-preserving": the value after the promotion is guaranteed to be the same as the value before the promotion. (Note that a conversion can also result in an integral demotion, not presented here.)
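A small sketch may make "value-preserving" more concrete. The function names are invented for this example, and the value 44 mentioned in the comment assumes the usual 8-bit char.

```c
#include <assert.h>
#include <stddef.h>

/* Both operands are promoted to int before the addition, so
   200 + 100 yields 300 instead of wrapping around at 256. */
int add_promoted(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Converting the sum back to unsigned char is a demotion: only the
   low bits survive (300 becomes 44 when char has 8 bits). */
unsigned char add_demoted(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Size of the type that the expression a + b really has.
   sizeof does not evaluate its operand. */
size_t promoted_sum_size(void)
{
    unsigned char a = 1, b = 2;
    return sizeof(a + b);
}
```

On any compiler promoted_sum_size() should return sizeof(int), which is a cheap way to see the promotion at work without reading the generated code.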
It is not always easy to interpret the information from the original ANSI C definition presented in the K&R book (of course the second, ANSI edition). Thanks to the ANSI C/C++ working groups a lot of things have been clarified in C++, and the resulting clarifications have been applied back to ANSI C compilers (of course this only applies to relatively recent C compilers). You may wonder: surely this does not apply to my old C compiler, developed for the Atari many years before C++ was specified? Well, the good news is that in most cases the decisions and clarifications were based on the most widely accepted interpretations, which were therefore widely implemented. That means that most C compilers were doing the right thing even if they did not know it!!!
If we take the ANSI version of K&R, we see in section A7.8 that the left operand of a shift operator is subject to an integral promotion. But what kind of integral promotion: to a short, an int, or a long??? If we now look at A6.1 and interpret it stricto sensu, the integral promotion is limited to int (or to unsigned int when int cannot represent all the values of the original type). That strict reading is the correct one: the left operand is never promoted to long just because the shift count is large. The one line of code I published therefore works only with compilers where int is at least 32 bits; with a 16-bit int the byte has to be cast to long before it is shifted by 16.
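As far as I can tell from the standard, the promotion applied to the left operand of a shift stops at int (or unsigned int). The sketch below, with function names invented for the illustration, shows the explicit widening that makes a shift by 16 safe even with a 16-bit int, and a way to observe the type of the uncast expression.

```c
#include <assert.h>
#include <stddef.h>

/* Widen first: the shift is then carried out in unsigned long,
   which ANSI C guarantees to be at least 32 bits wide. */
unsigned long byte_to_bits_16_23(unsigned char b)
{
    return (unsigned long)b << 16;
}

/* Without a cast the left operand is only promoted to int, so the
   expression has the size of int whatever the shift count.
   sizeof does not evaluate its operand: no shift happens here. */
size_t uncast_shift_size(void)
{
    unsigned char b = 1;
    return sizeof(b << 16);
}
```

With a compiler that uses a 16-bit int, uncast_shift_size() would return 2, which is exactly why the cast in byte_to_bits_16_23 matters there.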
So you may feel more secure using the code from Lautreamont:
Code:
volatile unsigned char* ptr = DMA_HIGH;
p = *ptr, p <<= 8;
p |= *(ptr + 2), p <<= 8;
p |= *(ptr + 4);
which is somewhat equivalent to the code I presented before (which is admittedly ugly):
Code:
volatile unsigned char* ptr = DMA_HIGH;
register long p;
p = *ptr;
p <<= 8;
p |= *(ptr + 2);   /* standard C: a cast is not an lvalue, so OR the byte in */
p <<= 8;
p |= *(ptr + 4);
Is this code 100% safe and portable? In practice yes, and maybe more portable than:
Code:
p = (*ptr << 16) + (*(ptr+2) << 8) + *(ptr+4);
However if you remember that:
char <= short <= int <= long
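Taken alone, this ordering implies that in theory a long could be as small as 8 bits and would not handle a 24-bit quantity!!! ANSI C closes that gap: limits.h must guarantee at least 16 bits for int and at least 32 bits for long. In practice I have never seen a one-byte long, but a pre-ANSI or non-conforming compiler for a 4- or 8-bit controller could quite possibly have a 16-bit long.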
Of course the solution is easy: just use the sizeof operator and make sure that your int / long (use whichever you like) is at least 24 bits … If this is not the case, then you will have to develop your own library to perform 24-bit additions.
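One possible way to put this check into the code itself is sketched below. It is only an illustration: read_addr24 is a hypothetical helper, and the layout it assumes (high byte at the base address, middle and low bytes 2 and 4 bytes further on) is simply the one used by the code above.

```c
#include <assert.h>
#include <limits.h>

/* Refuse to compile if unsigned long cannot hold a 24-bit quantity. */
#if ULONG_MAX < 0xFFFFFFUL
#error "unsigned long is narrower than 24 bits on this compiler"
#endif

/* Hypothetical helper: assemble a 24-bit value from three bytes
   located at base, base + 2 and base + 4. Each byte is widened to
   unsigned long before it is shifted, so the size of int does not
   matter. */
unsigned long read_addr24(const volatile unsigned char *base)
{
    unsigned long p;

    p  = (unsigned long)base[0] << 16;
    p |= (unsigned long)base[2] << 8;
    p |= (unsigned long)base[4];
    return p;
}
```

On the Atari one would call it as read_addr24(DMA_HIGH), assuming DMA_HIGH is defined as in the snippets above. The volatile qualifier keeps the compiler from caching or reordering the three register reads.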
Who said C is not fun... Can you believe that when programming in Java not only do you not have pointers, but you don’t even have to take care of memory management…
As we say in France: ils sont fous ces Romains (these Romans are crazy)... (from the Asterix comics)