C Language Without The . Operator
struct is one of the most fundamental concepts in the C programming language. There are two operators for accessing the fields of a struct: the . operator and the -> operator.
Let's explore an idea: removing the . operator from the C programming language.
Removing the . operator
People have long explained the -> operator in terms of the . operator, since students usually learn the . operator first.
For example, let's define a struct first:
struct person {
    char *name;
    int age;
};
struct person harry = { "Harry", 20 };
struct person *leader = &harry;
Many people would tell newcomers that
leader->name
means
(*leader).name
That works, but we can also go the other way and use the -> operator to explain the . operator. We can say that
harry.name
means
(&harry)->name
And in fact, this is what compilers and computers have been doing all along.
In the world of computers, harry.name doesn't exist; (&harry)->name is closer to what the machine actually does.
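To make the two explanations concrete, here is a minimal sketch that spells each access both ways. The function names are made up for this illustration, and name is declared const char * (unlike the definition above) only because it points at a string literal.

```c
#include <assert.h>
#include <stddef.h>

struct person {
    const char *name;   /* const because it points at a string literal */
    int age;
};

struct person harry = { "Harry", 20 };

/* harry.name, written with the . operator */
const char *name_by_dot(void)
{
    return harry.name;
}

/* the same access, written with the -> operator */
const char *name_by_arrow(void)
{
    return (&harry)->name;
}

/* leader->name, written with the -> operator */
const char *leader_name_by_arrow(struct person *leader)
{
    assert(leader != NULL);
    return leader->name;
}

/* the same access, written with * and . */
const char *leader_name_by_dot(struct person *leader)
{
    assert(leader != NULL);
    return (*leader).name;
}
```

Each pair of functions should return the same pointer for the same object, whichever operator is used.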
Let's go further
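What if we also eliminate the -> operator?
The -> operator is just syntactic sugar for the * operator combined with the + operator, where + here means adding a byte offset to an address rather than C's type-scaled pointer arithmetic.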
leader->name
means
*(leader + OFFSET_OF_name)
And
harry.name
means
*(&harry + OFFSET_OF_name)
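In real C, the OFFSET_OF_name notation can be approximated with the standard offsetof macro from <stddef.h>, as long as the addition is done on a char * so that it counts bytes. The sketch below is one way to write it; the helper names are hypothetical, and name is declared const char * here only because it points at a string literal.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

struct person {
    const char *name;
    int age;
};

/* leader->name spelled as *(leader + OFFSET_OF_name), with + counting bytes */
const char *name_via_offset(struct person *leader)
{
    assert(leader != NULL);
    char *base = (char *)leader;   /* the struct's address, viewed as bytes */
    const char **field = (const char **)(base + offsetof(struct person, name));
    return *field;
}

/* leader->age spelled as *(leader + OFFSET_OF_age) */
int age_via_offset(struct person *leader)
{
    assert(leader != NULL);
    char *base = (char *)leader;
    int *field = (int *)(base + offsetof(struct person, age));
    return *field;
}
```

Passing &harry to these helpers corresponds to the harry.name form: the only difference is where the base address comes from.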
Let's see some more complex examples.
Example 1
a.b.c
means
(& (&a)->b)->c
which means
*(&*(&a + OFFSET_OF_b) + OFFSET_OF_c)
Each adjacent & and * pair cancels out, so it becomes
*(&a + OFFSET_OF_b + OFFSET_OF_c)
Cool!
So when you write &a.b.c, it is actually
&*(&a + OFFSET_OF_b + OFFSET_OF_c)
which is
&a + OFFSET_OF_b + OFFSET_OF_c
There is no * operator anymore, just + operations.
Did you know that &a.b.c is a cheaper operation than a.b.c? It only computes an address, while a.b.c also has to load a value from that address.
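The claim that &a.b.c is nothing but additions can be checked with offsetof. This is a small sketch under assumed types: struct outer and struct inner, along with their pad fields, are invented for the illustration, and only the field names b and c follow the example.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

struct inner { int pad; int c; };
struct outer { long pad; struct inner b; };

/* &a.b.c rewritten as &a + OFFSET_OF_b + OFFSET_OF_c, counted in bytes.
   No memory is read: the result is computed from the address alone. */
int *address_of_c(struct outer *a)
{
    assert(a != NULL);
    char *base = (char *)a;
    return (int *)(base + offsetof(struct outer, b) + offsetof(struct inner, c));
}
```

For any struct outer object a, address_of_c(&a) should compare equal to &a.b.c.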
Example 2
&a->b.c
means
&*(&*(a + OFFSET_OF_b) + OFFSET_OF_c)
which becomes
a + OFFSET_OF_b + OFFSET_OF_c
Example 3
&a->b->c
means
&*(*(a + OFFSET_OF_b) + OFFSET_OF_c)
which means
*(a + OFFSET_OF_b) + OFFSET_OF_c
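Examples 2 and 3 differ only in whether b is embedded in the struct or reached through a pointer. Here is a hedged sketch of both; the type names and pad fields are assumptions made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

struct leaf { int pad; int c; };
struct embeds_b    { int pad; struct leaf b;  };   /* Example 2: b is a nested struct */
struct points_to_b { int pad; struct leaf *b; };   /* Example 3: b is a pointer */

/* &a->b.c: a + OFFSET_OF_b + OFFSET_OF_c, no load at all */
int *example2(struct embeds_b *a)
{
    assert(a != NULL);
    return (int *)((char *)a + offsetof(struct embeds_b, b)
                             + offsetof(struct leaf, c));
}

/* &a->b->c: *(a + OFFSET_OF_b) + OFFSET_OF_c, exactly one load */
int *example3(struct points_to_b *a)
{
    assert(a != NULL);
    char *b_slot = (char *)a + offsetof(struct points_to_b, b);
    char *b = (char *)*(struct leaf **)b_slot;   /* the single remaining * */
    return (int *)(b + offsetof(struct leaf, c));
}
```

The one * left in Example 3 is the read of the pointer stored in b; everything else is addition.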
Example 4
Let's do a more complex exercise:
&a->b.c.d->e.f
means
&*(&*(*(&*(&*(a + OFFSET_OF_b) + OFFSET_OF_c) + OFFSET_OF_d) + OFFSET_OF_e) + OFFSET_OF_f)
which means
*(a + OFFSET_OF_b + OFFSET_OF_c + OFFSET_OF_d) + OFFSET_OF_e + OFFSET_OF_f
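Example 4 can be written out the same way. The sketch assumes a chain of types in which b, c and e are nested structs and d is a pointer; the struct names and pad fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

struct E { int pad; int f; };
struct D { int pad; struct E e; };
struct C { int pad; struct D *d; };   /* the only pointer in the chain */
struct B { int pad; struct C c; };
struct A { int pad; struct B b; };

/* &a->b.c.d->e.f as
   *(a + OFFSET_OF_b + OFFSET_OF_c + OFFSET_OF_d) + OFFSET_OF_e + OFFSET_OF_f */
int *address_of_f(struct A *a)
{
    assert(a != NULL);
    char *d_slot = (char *)a + offsetof(struct A, b)
                             + offsetof(struct B, c)
                             + offsetof(struct C, d);
    char *d = (char *)*(struct D **)d_slot;   /* one load, for the -> before e */
    return (int *)(d + offsetof(struct D, e) + offsetof(struct E, f));
}
```

Only the -> in the middle of the expression costs a load; the . steps on either side fold into constant offsets.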
Let's go even further
When a local variable count is on the stack,
count
can also be represented as
*&count
which can also be
*(FRAME_POINTER + OFFSET_OF_count)
A struct object on the stack follows the same rule:
harry.name
becomes
*(&harry + OFFSET_OF_name)
then becomes
*(FRAME_POINTER + OFFSET_OF_harry + OFFSET_OF_name)
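Portable C has no way to name the frame pointer, so the following is only a model: it pretends the locals of a function are fields of a hypothetical struct frame and treats the address of that struct as FRAME_POINTER. A real compiler chooses its own layout and may keep variables in registers instead; the struct, its field order, and the helper names are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

struct person {
    const char *name;
    int age;
};

/* Hypothetical stack frame holding two locals: count and a struct person
   object (called harry here). */
struct frame {
    int count;
    struct person harry;
};

/* count as *(FRAME_POINTER + OFFSET_OF_count) */
int count_via_frame(char *frame_pointer)
{
    assert(frame_pointer != NULL);
    return *(int *)(frame_pointer + offsetof(struct frame, count));
}

/* harry.name as *(FRAME_POINTER + OFFSET_OF_harry + OFFSET_OF_name) */
const char *harry_name_via_frame(char *frame_pointer)
{
    assert(frame_pointer != NULL);
    return *(const char **)(frame_pointer + offsetof(struct frame, harry)
                                          + offsetof(struct person, name));
}
```

In this model both the plain variable and the struct field are reached with the same recipe: a base address, some constant offsets, and one dereference.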
Global variables also fit this rule.
my_global_variable
can also be represented as
*&my_global_variable
which is also
*(GLOBAL_VARIABLE_BASE_ADDRESS + OFFSET_OF_my_global_variable)
All variables and struct fields can be located by adding integers.