C Language Without The . Operator

The struct is one of the most fundamental constructs in the C programming language. Two operators access the fields of a struct: . and ->.

Let's consider an idea: removing the . operator from the C programming language.

Removing the . operator

For a long time, people have tended to explain the -> operator in terms of the . operator, since students usually learn the . operator first.

For example, let's define a struct first:

struct person {
    char *name;
    int age;
};

struct person harry = { "Harry", 20 };

struct person *leader = &harry;

Many people would tell newcomers that

leader->name

means

(*leader).name

That is fine, but we can also use the -> operator to explain the . operator. We can say that

harry.name

means

(&harry)->name

And in fact, this is what compilers and computers have been doing all along.

At the machine level, harry.name doesn't exist; (&harry)->name is closer to what the machine actually does.
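The equivalences above can be checked directly in C. The following is a minimal sketch, not a definitive demonstration; the helper name same_field is hypothetical and only serves the illustration.

```c
#include <assert.h>
#include <string.h>

struct person {
    char *name;
    int age;
};

/* Returns 1 when the different spellings reach the same field. */
int same_field(void)
{
    struct person harry = { "Harry", 20 };
    struct person *leader = &harry;

    /* leader->name is (*leader).name */
    int arrow_as_dot = (&leader->name == &(*leader).name);
    /* harry.name is (&harry)->name */
    int dot_as_arrow = (&harry.name == &(&harry)->name);

    return arrow_as_dot && dot_as_arrow
        && strcmp((&harry)->name, "Harry") == 0;
}
```

Comparing addresses rather than values shows that each pair of expressions names the very same object, not merely two objects that happen to hold equal contents.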

Let's go further

What if we also eliminate the -> operator?

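The -> operator is just syntactic sugar for the * operator combined with the + operator. (Here + adds a byte offset to an address and * loads a value of the field's type. This is pseudocode, not C pointer arithmetic, which would scale the offset by the size of the pointed-to type.)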

leader->name

means

*(leader + OFFSET_OF_name)

And

harry.name

means

*(&harry + OFFSET_OF_name)
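In real C, the pseudocode above can be approximated with the standard offsetof macro and a cast to char *, so that + counts bytes. This is a sketch under that assumption; the function names age_via_offset and name_via_offset are hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <string.h>

struct person {
    char *name;
    int age;
};

/* leader->age spelled as *(leader + OFFSET_OF_age), counted in bytes. */
int age_via_offset(struct person *leader)
{
    char *base = (char *)leader;   /* byte-granular address */
    return *(int *)(base + offsetof(struct person, age));
}

/* harry.name spelled as *(&harry + OFFSET_OF_name). */
char *name_via_offset(void)
{
    static struct person harry = { "Harry", 20 };
    char *base = (char *)&harry;
    return *(char **)(base + offsetof(struct person, name));
}
```

The casts carry the type information that the . and -> operators normally supply for free: the cast to char * makes the addition byte-wise, and the cast back selects how many bytes the final * loads.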

Let's see some more complex examples.

Example 1

a.b.c

means

(& (&a)->b)->c

which means

*(&*(&a + OFFSET_OF_b) + OFFSET_OF_c)

Each adjacent &* pair cancels out, so it becomes

*(&a + OFFSET_OF_b + OFFSET_OF_c)

Cool!

So when you write &a.b.c, it is actually

&*(&a + OFFSET_OF_b + OFFSET_OF_c)

which is

&a + OFFSET_OF_b + OFFSET_OF_c

There is no * operator anymore; it's just + operations.

Did you know that &a.b.c is a cheaper operation than a.b.c? It only computes an address, while a.b.c must also load a value from that address.
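Example 1 might be sketched in C as follows. The struct layouts and the object a are hypothetical, chosen only so that both offsets are nonzero.

```c
#include <assert.h>
#include <stddef.h>

struct inner  { int pad; int c; };
struct middle { long pad; struct inner b; };

/* Hypothetical object for illustration. */
static struct middle a = { 0, { 0, 42 } };

/* &a.b.c as &a + OFFSET_OF_b + OFFSET_OF_c: additions only, no load. */
int *address_of_c(void)
{
    return (int *)((char *)&a
                   + offsetof(struct middle, b)
                   + offsetof(struct inner, c));
}

/* a.b.c adds one dereference on top of the same address. */
int value_of_c(void)
{
    return *address_of_c();
}
```

Both offsets are compile-time constants, so a compiler is free to fold the two additions into a single constant offset from &a.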

Example 2

&a->b.c

means

&*(&*(a + OFFSET_OF_b) + OFFSET_OF_c)

which becomes

a + OFFSET_OF_b + OFFSET_OF_c

Example 3

&a->b->c

means

&*(*(a + OFFSET_OF_b) + OFFSET_OF_c)

which means

*(a + OFFSET_OF_b) + OFFSET_OF_c
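The difference between Examples 2 and 3 is whether b is embedded in the struct or reached through a pointer. A possible sketch, with hypothetical type and function names, makes the extra load in Example 3 visible:

```c
#include <assert.h>
#include <stddef.h>

struct leaf { int pad; int c; };
struct by_value   { int pad; struct leaf b;  };  /* Example 2: b is embedded */
struct by_pointer { int pad; struct leaf *b; };  /* Example 3: b is a pointer */

/* &a->b.c: a + OFFSET_OF_b + OFFSET_OF_c, no memory load. */
int *example2(struct by_value *a)
{
    return (int *)((char *)a
                   + offsetof(struct by_value, b)
                   + offsetof(struct leaf, c));
}

/* &a->b->c: *(a + OFFSET_OF_b) + OFFSET_OF_c, exactly one load. */
int *example3(struct by_pointer *a)
{
    struct leaf *b =
        *(struct leaf **)((char *)a + offsetof(struct by_pointer, b));
    return (int *)((char *)b + offsetof(struct leaf, c));
}
```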

Example 4

Let's do a more complex exercise:

&a->b.c.d->e.f

means

&*(&*(*(&*(&*(a + OFFSET_OF_b) + OFFSET_OF_c) + OFFSET_OF_d) + OFFSET_OF_e) + OFFSET_OF_f)

which means

*(a + OFFSET_OF_b + OFFSET_OF_c + OFFSET_OF_d) + OFFSET_OF_e + OFFSET_OF_f
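As a check on Example 4, here is a sketch with hypothetical struct definitions (A, B, C, T, E) that give the expression a valid type. It spells the simplified form with byte additions and a single load.

```c
#include <assert.h>
#include <stddef.h>

struct E { int pad; int f; };
struct T { int pad; struct E e; };     /* what d points to */
struct C { int pad; struct T *d; };
struct B { int pad; struct C c; };
struct A { int pad; struct B b; };

/* &a->b.c.d->e.f using only +, one load, and casts. */
int *address_of_f(struct A *a)
{
    /* a + OFFSET_OF_b + OFFSET_OF_c + OFFSET_OF_d: where the pointer d is stored */
    char *d_slot = (char *)a
                 + offsetof(struct A, b)
                 + offsetof(struct B, c)
                 + offsetof(struct C, d);

    /* The single *: load the pointer stored in d. */
    struct T *d = *(struct T **)d_slot;

    /* ... + OFFSET_OF_e + OFFSET_OF_f */
    return (int *)((char *)d
                   + offsetof(struct T, e)
                   + offsetof(struct E, f));
}
```

Only the -> that follows d costs a memory load; every . and the leading & contribute nothing but constant additions.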

Let's go even further

When a local variable count lives on the stack,

count

can also be represented as

*&count

which can also be

*(FRAME_POINTER + OFFSET_OF_count)

A struct object on the stack follows the same rule. If harry is a local variable,

harry.name

becomes

*(&harry + OFFSET_OF_name)

then becomes

*(FRAME_POINTER + OFFSET_OF_harry + OFFSET_OF_name)
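Standard C gives no portable access to the real frame pointer, and the compiler chooses where each local lives. The sketch below therefore models a stack frame as an ordinary struct; struct frame and both function names are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct person { char *name; int age; };

/* Hypothetical model: the locals of one function laid out as a struct. */
struct frame {
    int count;
    struct person harry;
};

/* count as *(FRAME_POINTER + OFFSET_OF_count). */
int count_from_frame(char *frame_pointer)
{
    return *(int *)(frame_pointer + offsetof(struct frame, count));
}

/* harry.name as *(FRAME_POINTER + OFFSET_OF_harry + OFFSET_OF_name). */
char *name_from_frame(char *frame_pointer)
{
    return *(char **)(frame_pointer
                      + offsetof(struct frame, harry)
                      + offsetof(struct person, name));
}
```

In this model a local variable is just one more field, so the rule for struct fields covers it without any change.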

Global variables fit the same rule.

my_global_variable

can also be represented as

*&my_global_variable

which is also

*(GLOBAL_VARIABLE_BASE_ADDRESS + OFFSET_OF_my_global_variable)

The address of every variable and struct field can be computed by adding integers.