Why did C use the -> operator instead of reusing the . operator?
In the C programming language, the syntax to access a member of a structure is `structure.member`. However, a member of a structure referenced by a pointer is written as `pointer->member`.
There's really no need for two different operators. The compiler knows the type of the left-hand value; if it is a structure, the first meaning is evident. If it is a pointer, the second meaning is evident. Furthermore, `.` is far easier to type than `->`. Not only does `->` have more characters to type, on many keyboards one character is unshifted and the other is shifted, requiring some finger acrobatics. Indeed, many languages based on C allow or use `.` in place of `->`.
Why did C use two operators when one would have sufficed?
(My guess would be because C evolved from the typeless B language.)
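For readers less familiar with the two spellings, here is a minimal sketch of the access forms discussed in the question. The `struct point` type and the function names are made up for illustration and are not from the original post.

```c
/* Hypothetical structure used only for this illustration. */
struct point {
    int x;
    int y;
};

/* Direct access: the left operand of . is the structure itself. */
int sum_by_value(struct point s)
{
    return s.x + s.y;
}

/* Indirect access: the left operand of -> is a pointer to the structure. */
int sum_by_pointer(const struct point *p)
{
    return p->x + p->y;
}

/* p->x is shorthand for (*p).x. The parentheses are required because
   the postfix . operator binds more tightly than unary *. */
int sum_by_deref(const struct point *p)
{
    return (*p).x + (*p).y;
}
```

All three functions compute the same result for the same point; only the spelling of the member access differs.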
Tags: c
asked 10 hours ago by Dr Sheldon
You can still write `(*structure).member`, if you like that more. (I don't; very probably K&R didn't, either. It is a bit awkward to handle because of C operator precedence, and that might answer your question.) – tofro, 9 hours ago (score: 3)
@tofro: True, such a form always was possible, and avoids introducing another operator. However, it is far worse in terms of finger acrobatics and (as BrianH points out) readability. – Dr Sheldon, 6 hours ago
Just out of curiosity: In your hypothetical language where `a.b` could be interpreted as `(*a).b` if a is a pointer-to-struct, would it also automatically be interpreted as `(**a).b` if a is a pointer-to-pointer-to-struct? Just to point out a possible consequence… – wrtlprnft, 6 hours ago (score: 4)
@wrtlprnft: Interesting thought. I can see arguments both for and against such behavior, so I'm not sure there is a clear answer. – Dr Sheldon, 6 hours ago
People complain a lot about pointers being confusing. Imagine adding to that confusion by not knowing if a variable was a pointer or not when reading code. – JPhi1618, 6 hours ago (score: 3)
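The pointer-to-pointer case raised in the comments can be made concrete with a short sketch of how C as it actually exists handles it. The `struct node` type and function names are hypothetical. In real C, each level of indirection has to be written out, so there is nothing for the compiler to guess.

```c
/* Hypothetical structure used only for this illustration. */
struct node {
    int b;
};

/* One level of indirection: -> dereferences exactly once. */
int read_via_pointer(struct node *a)
{
    return a->b;
}

/* Two levels: the first dereference must be explicit.
   Writing a->b here would be rejected, because a is not a
   pointer to a structure. */
int read_via_arrow(struct node **a)
{
    return (*a)->b;
}

/* The same access spelled with . only. */
int read_via_dot(struct node **a)
{
    return (**a).b;
}
```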
4 Answers
In the embryonic form of C described in the 1974 C Reference Manual, there was no requirement that the left operand of `.` actually be a structure, nor that the left operand of `->` actually be a pointer. The `->` operator meant "interpret the value of the left operand as a pointer, add the offset associated with the indicated structure member name, and dereference the resulting pointer as an object of the appropriate type." The `.` operator effectively took the address of the left operand and then applied `->`.
Thus, given:
    struct q { int x, y; };
    int a[2];
the expressions `a[0].y` and `a[0]->y` would be interpreted in a fashion equivalent to `((struct q*)&a[0])->y` and `((struct q*)a[0])->y`, respectively.
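The behavior described above can be approximated in present-day C. This is only a sketch under stated assumptions, not a reproduction of the 1974 compiler: the helper names `early_dot_y` and `early_arrow_y` are invented here, `uintptr_t` stands in for "an integer holding an address", and the code assumes `struct q` has no padding between `x` and `y` (true on mainstream ABIs, but not guaranteed by the standard).

```c
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* uintptr_t */

struct q { int x, y; };

/* Approximates the early meaning of `lhs.y`: take the ADDRESS of the
   left operand, add the offset of member y, and read an int there,
   without caring what type the left operand really has. */
int early_dot_y(int *lhs_address)
{
    char *base = (char *)lhs_address;
    return *(int *)(base + offsetof(struct q, y));
}

/* Approximates the early meaning of `lhs->y`: treat the VALUE of the
   left operand as an address, add the offset of y, and read an int. */
int early_arrow_y(uintptr_t lhs_value)
{
    char *base = (char *)lhs_value;
    return *(int *)(base + offsetof(struct q, y));
}
```

With `int a[2] = {10, 20};`, calling `early_dot_y(&a[0])` reads the second element, because `y` sits one `int` past the start of the structure. The two helpers share an offset computation and differ only in whether the operand's address or its value is used as the base, which is the distinction the two operators encoded.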
If the compiler had examined the type of the left operand of the `.` operator, it could have used that to select between the two behaviors. It was probably easier, however, to have two operators whose behaviors didn't depend upon the left operand's type.
answered 7 hours ago by supercat
To the point, as is the last part about being 'easier'. C wasn't designed to be as comfortable a language as possible, but to be translated as linearly as possible. Resolving contextual information adds complexity and ambiguity, which is nothing one wants to have when the task is to write an OS as close to the machine as possible while having the luxury of structured programming support. – Raffzahn, 7 hours ago (score: 3)
@Raffzahn: Even in 1974 C, the contextual information had to be kept to process the `+` operator, so the cost of using the already existing information to disambiguate `.` and `->` would have been minimal if there was no need to use them with other types as well. – supercat, 51 mins ago
I think there are two factors that led to standardization of the distinct operator "->" for accessing data members using a pointer.
- You assume that the C compiler would recognize the type of the LHS as being a pointer. But programmers could, and often did, override the initial typing (variable declaration) by using a typecast.
- In order to make the code more readable and less prone to unintended side-effects, it is useful to distinguish operations using pointers.
A very common feature of idiomatic C code is that a structure passed to a function as a pointer is modified within the function. Thus, the result is returned implicitly, by the side-effect of the structure variable in the calling function having been modified by the callee. This sort of approach violates modern sensibilities about loosely coupled code, but it was a simple and efficient means of dealing with complexity in C code. I would say the programmer was greatly assisted in maintaining the readability of such code by having distinct operations that made it clear whether some (possibly shared) memory pointer was the thing whose target was being modified.
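The idiom described in the paragraph above looks roughly like the following sketch. The `struct counter` type and the function names are hypothetical and chosen only for illustration.

```c
/* Hypothetical structure used only for this illustration. */
struct counter {
    int hits;
    int misses;
};

/* The callee updates the caller's structure in place. The result is
   "returned" through the side effect, and the -> in the body signals
   that shared memory is being modified. */
void record_hit(struct counter *c)
{
    c->hits += 1;
}

/* By contrast, a by-value parameter is a private copy. The . in the
   body signals that only the local copy changes. */
int hits_after_local_update(struct counter c)
{
    c.hits += 1;
    return c.hits;
}
```

After `record_hit(&c)` the caller's `c.hits` has changed, while `hits_after_local_update(c)` leaves the caller's structure untouched.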
Could you give an example of your point #1? The type of the result of a typecast is well-defined... it's the type you are typecasting to. Should that be a pointer to a structure, the compiler has enough information to access its members. Whatever the type was prior to the typecast is irrelevant. – Dr Sheldon, 9 hours ago
@DrSheldon Yes, you are correct. The compiler can check operators used against the type of a typecast too, and interpret a single operator appropriately. But then you are removing one of the compiler checks against programming errors. I think #1 and #2 actually work in concert to push the programmer toward care with pointer references. If the compiler tries to be "too smart" it ends up misinterpreting what the error-prone programmer intended, and (perhaps more important) makes the code harder to read. – Brian H, 9 hours ago (score: 1)
Despite your assertion, there would in fact be situations where it would be ambiguous.
First off, early C compilers were very simple. This was in fact the main appeal of the language, as compilers for it were very easy to create and could run on very small systems, like early 16/32-bit microprocessors.
Adding a bunch of code to handle all the niche cases of type inference would have drastically increased the amount of code required to make a C compiler. In fact, I've argued as a (half) joke that K&R C had type inference, but it always inferred `int`. If you didn't tell C what type an object was, it assumed `int` (which could cause some really gnarly bugs, let me tell you...).
Secondly, since K&R C was weakly (barely) typed, the information in many cases flat out wasn't available. The destination of a pointer assignment can be an int, or vice versa, and K&R C has no problem with that. The compiler simply cannot infer a dereference. The coder is assumed to know what she's doing.
Also realize that in C, pointers and arrays are essentially syntactic sugar for each other. This means your `.` operator would now have to automagically work on arrays too. For instance, if `member` happened to be a char array, `structure.member` would now return the first character. And again, both chars and pointers are assignable to ints, so context doesn't help you.
That being said, you aren't the first to notice this issue. In fact, Ada was designed so that dereferencing a pointer object is always assumed when a dot is used. In those cases where you want the whole pointed-to object rather than one of its components, you have to use `.all`. The ambiguity (pointer vs. pointed-to object) is still there, but resolved by moving the extra syntax to the weirder case.
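The array and pointer relationship this answer leans on can be illustrated in standard C as follows. This is a sketch with a hypothetical `struct record`; it shows that an array member converts to a pointer to its first element in most expressions, while still remaining a distinct type (as `sizeof` reveals).

```c
#include <stddef.h>  /* size_t */

/* Hypothetical structure used only for this illustration. */
struct record {
    char member[8];
    int id;
};

/* r->member names the whole array, but in this expression it is
   converted to a pointer to its first element, so *r->member and
   r->member[0] denote the same character. */
char first_char(const struct record *r)
{
    return *r->member;
}

/* sizeof is one of the few contexts where the conversion does not
   happen: this yields the size of the array, not of a pointer. */
size_t member_bytes(const struct record *r)
{
    return sizeof r->member;
}
```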
+5 for the 'syntactic sugar' and another +5 for showing a solution with the Ada reference. – Raffzahn, 5 hours ago
Some of the first C code I saw was like this: `0x8040->output = 'A';`. Its purpose was accessing memory-mapped I/O locations. Needless to say, it took me a while to figure out what this code was supposed to do, and the hex constant there really threw me.
The original K&R C placed all field names (here `output`) into the same namespace. It was an error to have two fields of the same name in different structs at different offsets, but OK to have the same name at the same offset, the idea being that two different structs could share the same initial fields, giving a cheap way of doing "subclassing" by putting the varying data members at the end of the struct.
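The shared-initial-fields idea survives in modern C in a slightly different form. Since member names are now scoped per structure, the usual approach is to embed the common part as the first member. The sketch below uses hypothetical `shape`, `circle`, and `rect` types; it is the present-day descendant of the idiom, not the K&R mechanism itself.

```c
/* Hypothetical types used only for this illustration. */
struct shape {
    int kind;               /* the shared leading field */
};

struct circle {
    struct shape base;      /* must come first */
    int radius;
};

struct rect {
    struct shape base;      /* must come first */
    int w, h;
};

/* A pointer to a structure, suitably converted, points to its first
   member, so any of the "derived" structures can be inspected through
   a pointer to struct shape. */
int shape_kind(const void *any_shape)
{
    return ((const struct shape *)any_shape)->kind;
}
```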
A struct could also be anonymous, i.e., have no tag name. Nonetheless, its members could still be used in `.` or `->` expressions.
The C Programming Language (K&R C), Appendix A, pp. 197, 209:
> [8.5] ... Two structures may share a common initial sequence of members; that is, the same member may appear in two different structures if it has the same type in both and if all previous members are the same in both. (Actually, the compiler checks only that a name in two different structures has the same type and offset in both, but if the preceding members differ the construction is nonportable.)
> ...
> [14.1] ... §7.1 says that in a direct or indirect structure reference (with . or ->) the name on the right must be a member of the structure named or pointed to by the expression on the left. To allow an escape from the typing rules, this restriction is not firmly enforced by the compiler. In fact, any lvalue is allowed before ., and the lvalue is then assumed to have the form of the structure of which the name on the right is a member. Also, the expression before a -> is required only to be a pointer or an integer. If an integer, it is taken to be the absolute address, in machine storage units, of the appropriate structure.
Since the K&R language and compiler didn't care what the type of the left-hand side of `.` and `->` was, the only way to tell the difference was by having two operators.
The ANSI C line of standards simply followed suit in syntax, even as these old rules were abandoned.
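For comparison, here is roughly how the `0x8040->output = 'A';` style of access has to be spelled in standard C today, where the integer must be converted to a pointer explicitly. This is a sketch only: the `struct device` layout and the function name are invented, and the integer-to-pointer conversion is implementation-defined.

```c
#include <stdint.h>  /* uintptr_t */

/* Hypothetical register layout used only for this illustration. */
struct device {
    char status;
    char output;
};

/* Modern spelling of `addr->output = value`: the integer address is
   cast to a pointer first. volatile keeps the compiler from
   optimizing away a store to what may be a hardware register. */
void write_output(uintptr_t addr, char value)
{
    ((volatile struct device *)addr)->output = value;
}
```

On real hardware `addr` would be a fixed location such as the 0x8040 in the answer. On a hosted system the function can be exercised by passing the address of an ordinary `struct device` object converted to `uintptr_t`.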
I think there are two factors that led to standardization of the distinct operator "->" for accessing data members using a pointer.
1. You assume that the C compiler would recognize the type of the LHS as being a pointer. But programmers could, and often did, override the initial typing (variable declaration) by using a typecast.
2. In order to make the code more readable and less prone to unintended side-effects, it is useful to distinguish operations using pointers.
A very common feature of idiomatic C code is that a structure passed to a function as a pointer is modified within the function. Thus, the result is returned implicitly, by the side-effect of the structure variable in the calling function having been modified by the callee. This sort of approach violates modern sensibilities about loosely coupled code, but it was a simple and efficient means of dealing with complexity in C code. I would say the programmer was greatly assisted in maintaining the readability of such code by having distinct operations that made it clear whether some (possibly shared) memory pointer was the thing whose target was being modified.
edited 9 hours ago
answered 10 hours ago
Brian H
Could you give an example of your point #1? The type of the result of a typecast is well-defined... it's the type you are typecasting to. Should that be a pointer to a structure, the compiler has enough information to access its members. Whatever the type was prior to the typecast is irrelevant.
– Dr Sheldon
9 hours ago
1
@DrSheldon Yes, you are correct. The compiler can check operators used against the type of a typecast too, and interpret a single operator appropriately. But then you are removing one of the compiler checks against programming errors. I think #1 and #2 actually work in concert to push the programmer toward care with pointer references. If the compiler tries to be "too smart" it ends up misinterpreting what the error-prone programmer intended, and (perhaps more important) makes the code harder to read.
– Brian H
9 hours ago
Despite your assertion, there would in fact be situations where it would be ambiguous.
First off, early C compilers were very simple. This was in fact the main appeal of the language, as compilers for it were very easy to create and could run on very small systems, like early 16/32-bit microprocessors.
Adding a bunch of code for hitting all the niche cases of type inference would have drastically added to the amount of code required to make a C compiler. In fact, I've argued as a (half) joke that K&R C had type inference, but it always inferred int. If you didn't tell C what type an object was, it assumed int (which could cause some really gnarly bugs, let me tell you...).
Secondly, since K&R C was weakly (barely) typed, the information in many cases flat out wasn't available. The destination type of a pointer assignment can be an int, or vice versa, and K&R C has no problem with that. The compiler simply cannot infer a dereference. The coder is assumed to know what she's doing.
Also realize that in C pointers and arrays are essentially syntactic sugar for each other. This means now your . operator would have to automagically work on arrays too. For instance, if member happened to be a char array, now structure.member would return the first character. And again, both chars and pointers are assignable to ints, so context doesn't help you.
This being said, you aren't the first to notice this issue. In fact, Ada was designed so that dereferencing a pointer object is always assumed when a dot is used to select a member. In those cases where you want the whole pointed-to object rather than one of its members, you have to use .all. The ambiguity (pointer vs. pointed-to object) is still there, but resolved by moving the extra syntax to the weirder case.
edited 5 hours ago
answered 5 hours ago
T.E.D.
+5 for the 'syntactic sugar' and another +5 for showing a solution with the Ada reference.
– Raffzahn
5 hours ago
Some of the first C code I saw was like this: 0x8040->output = 'A'; Its purpose was accessing memory-mapped I/O locations. Needless to say, it took me a while to figure out what this code was supposed to do, and the hex constant there really threw me.
The original K&R C placed all field names (here output) into the same namespace. It was an error to have two fields of the same name in different structs at different offsets, but OK to have the same name at the same offset. The idea was that two different structs could share the same initial fields, giving a cheap way of doing "subclassing" by putting the varying data members at the end of the struct.
A struct could also be anonymous, i.e. have no tag name. Nonetheless, the members could still be used in . or -> expressions.
The C Programming Language (K&R C), Appendix A, pp. 197, 209
[8.5] ... Two structures may share a common initial sequence of members; that is, the same member may appear in two different structures if it has the same type in both and if all previous members are the same in both. (Actually, the compiler checks only that a name in two different structures has the same type and offset in both, but if the preceding members differ the construction is nonportable.)
...
[14.1] ... §7.1 says that in a direct or indirect structure reference (with . or ->) the name on the right must be a member of the structure named or pointed to by the expression on the left. To allow an escape from the typing rules, this restriction is not firmly enforced by the compiler. In fact, any lvalue is allowed before ., and the lvalue is then assumed to have the form of the structure of which the name on the right is a member. Also, the expression before a -> is required only to be a pointer or an integer. If an integer, it is taken to be the absolute address, in machine storage units, of the appropriate structure.
Since the K&R language and compiler didn't care what the type of the left-hand side of . and -> was, the only way to tell the two kinds of access apart was to have two operators.
The ANSI C line of standards simply followed suit in syntax, even as these old rules were abandoned.
answered 1 hour ago
Erik Eidt
3
You can still write (*structure).member, if you like that more. (I don't, and very probably K&R didn't either. It is a bit awkward to handle because of C operator precedence, and that might answer your question.)
– tofro
9 hours ago
@tofro: True, such a form always was possible, and avoids introducing another operator. However, it is far worse in terms of finger acrobatics and (as BrianH points out) readability.
– Dr Sheldon
6 hours ago
4
Just out of curiosity: In your hypothetical language where a.b could be interpreted as (*a).b if a is a pointer-to-struct, would it also automatically be interpreted as (**a).b if a is a pointer-to-pointer-to-struct? Just to point out a possible consequence…
– wrtlprnft
6 hours ago
@wrtlprnft: Interesting thought. I can see arguments both for and against such behavior, so I'm not sure there is a clear answer.
– Dr Sheldon
6 hours ago
3
People complain a lot about pointers being confusing. Imagine adding to that confusion by not knowing if a variable was a pointer or not when reading code.
– JPhi1618
6 hours ago