Initialization in Programming Languages

\[...\]
This has led to innumerable errors, vulnerabilities, and system crashes.” ~ Tony Hoare

The core problem addressed here is Null Safety and how to properly initialize objects to ensure type invariants (specifically non-nullness) hold true.

Simple Non-Null Types

The goal is to prevent null-dereferencing errors statically. Most variables are meant to hold non-null values once they are initialized, so non-nullness is worth expressing in the type system.

Non-Null Types vs. Possibly-Null Types

Non-null type T!: Consists of references to T-objects. It cannot hold null.
Possibly-null type T?: Consists of references to T-objects plus null. This corresponds to the ordinary type T in languages like Java, and it is the default meaning of a reference type in most languages.

Type Safety and Invariant

Invariant: If the static type of an expression e is a non-null type, then e’s value at runtime must be different from null.
Enforcement: We require non-null types for the receiver of each field access, array access, and method call, and for the expression of each throw statement. A null receiver plays the same role as the “message not understood” error discussed in Typing and Subtyping.

Subtyping and Casts

The values of T! are a proper subset of T?.

Subtyping:
- S! <: T! (if S extends T)
- S? <: T? (if S extends T)
- T! <: T? (non-null is a subtype of possibly-null)
Downcasts: Casting from T? (possibly-null) to T! (non-null) requires a runtime check, because T? is not a subtype of T!.
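
Java has no T! and T? types, so the two directions can only be imitated with ordinary references. The sketch below is an illustration under that assumption: the class name `NullCasts` and both method names are hypothetical, and `Objects.requireNonNull` stands in for the runtime check that a cast from T? to T! would perform.

```java
import java.util.Objects;

class NullCasts {
    // Models the downcast T? -> T!: needs a runtime check and may fail.
    static <T> T toNonNull(T possiblyNull) {
        return Objects.requireNonNull(possiblyNull, "cast to non-null type failed");
    }

    // Models the upcast T! -> T?: always allowed, no check required.
    static <T> T toPossiblyNull(T nonNull) {
        return nonNull;
    }
}
```

The asymmetry is the point: the upcast is free, while the downcast can throw at run time.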

graph TD
    A[object?]
    B[T?]
    C[S?]
    D(object!)
    E(T!)
    F(S!)
    B --> A
    C --> B
    D -.-> A
    E -.-> B
    E --> D
    F -.-> C
    F --> E
    style D fill:#f0f8ff,color:#007bff,stroke:#007bff,stroke-width:2px
    style E fill:#f0f8ff,color:#007bff,stroke:#007bff,stroke-width:2px
    style F fill:#f0f8ff,color:#007bff,stroke:#007bff,stroke-width:2px

Type Rules

The following expressions must have a non-null type:

Receiver of a field access
Receiver of an array access
Receiver of a method call
Expression of a throw statement

Without this rule, a null value in any of these positions raises a null-pointer exception at run time. With it, the same mistake becomes a compile-time error, which makes these operations statically safe.

Safe Handling Null

Control Flow Analysis (Dataflow Analysis): How do we safely use a value of a possibly-null type? We check it against null and let the compiler track the result of that check.

Definite Assignment: Java/C# use dataflow analysis to ensure local variables are assigned before use.
Null Checks: If we check if (n != null), the compiler can treat n as non-null inside the block.
Limitations: Dataflow analysis works well for local variables but not for heap locations (fields), because of aliasing and side effects: a method call such as foo(this) might set a field to null behind the scenes. Concurrency makes this worse, since another thread can write the field between the check and the use. I personally first saw this kind of flow-based null checking in TypeScript during my internship at Cubbit.
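
The limitation for fields can be reproduced in plain Java. The following is a minimal sketch: the class `Label` and its methods are made up for illustration. A null check on a field is invalidated by a side effect, while the same check on a local copy stays valid.

```java
class Label {
    String text; // possibly-null field

    void reset() {
        text = null; // side effect on the heap
    }

    // The check on the field does not survive the call to reset().
    int lengthCheckingField() {
        if (text != null) {
            reset();
            return text.length(); // throws NullPointerException
        }
        return -1;
    }

    // Copying the field into a local first makes the check reliable:
    // no method call can change a local variable of this method.
    int lengthCheckingLocal() {
        String t = text;
        if (t != null) {
            reset();
            return t.length();
        }
        return -1;
    }
}
```

This is why checkers typically narrow the type of a local after a null check but stay conservative for fields.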

Object Initialization

The main challenge is: How do we construct an object of a non-null type?

When we create an object (new T()), fields start as null (or default values). We must ensure all non-null fields are assigned non-null values before the constructor terminates.

The purpose of initialization is to establish an invariant, and we want to know from which point on code may rely on it.

The Escaping Problem

A naive definite assignment check for fields isn’t enough. If the this reference “escapes” the constructor (is passed to another method, stored in a global field, etc.) before the object is fully initialized, other code might see null fields that are supposed to be T!.

Escaping Scenarios:

Method Calls: Calling a dynamically bound method on this inside the constructor. A subclass might override this method and access a field that hasn’t been initialized yet.
Callbacks: Registering this as a listener (e.g., Observer pattern) inside the constructor. The Subject might call back immediately.
Concurrent Access: Publishing this to a static field, where another thread picks it up before the object has been fully initialized.
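
The first scenario is easy to reproduce in Java. In this sketch (the class and field names are illustrative only), the superclass constructor calls an overridden method, which then observes a subclass field before its initializer has run.

```java
class Base {
    Base() {
        describe(); // dynamically bound call: `this` escapes to subclass code
    }

    void describe() { }
}

class Derived extends Base {
    String name = "derived"; // assigned only after Base() has returned
    boolean sawNullName;     // no initializer, so the value set below survives

    @Override
    void describe() {
        // Runs during Base(), while name still holds its default value.
        sawNullName = (name == null);
    }
}
```

After `new Derived()` the field `name` is set as expected, yet `sawNullName` records that the overriding method saw it as null.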

Initialization Phases

To solve this, we track the state of the object during construction using a type system.

Free (Under Construction): The object is being created. Fields may be null.
Committed (Initialized): The object is fully initialized. All non-null invariants hold.

Construction Types

For every class T, we distinguish three states for references:

T! / T? (Committed): Standard types. Construction is complete, so reading a field declared non-null is guaranteed to yield a non-null value. For free and unclassified references there is no such guarantee.
free T! / free T? (Free): References to objects under construction (like this inside a constructor).
unc T! / unc T? (Unclassified): We don’t know if it’s free or committed (common supertype).
There are no casts from unclassified to free or committed types.

Initialization Requirements

Requirement 1: Local Initialization: An object is locally initialized if its non-null fields have non-null values. Committed types must be locally initialized.
Requirement 2: Transitive Initialization: If an object is committed, everything it reaches (references) must also be committed (transitively initialized).

Handling Cyclic Structures

Cyclic structures (e.g., a Node pointing to another Node) pose a problem. We cannot have both objects fully “committed” before they reference each other.

Solution: We allow free references to be assigned to fields of an object under construction.
The constructor parameters can be declared free to allow passing partially initialized objects (like this).
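- The objects of the cycle become committed together: none of them counts as committed until the outermost constructor call has finished, which is what keeps Requirement 2 (transitive initialization) satisfied.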


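class List {
    List! next; // non-null field; the list is cyclic

    // Entry point: builds a cycle of n nodes starting with this one.
    List(int n) {
        if (n == 1) {
            next = this;              // a single node pointing to itself
        } else {
            next = new List(this, n); // pass the free reference along
        }
    }

    // Builds the remaining nodes; last is the still-free first node.
    List(free List! last, int n) {
        if (n == 2) {
            next = last;              // close the cycle
        } else {
            next = new List(last, n - 1);
        }
    }
}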
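
For experimenting, the same construction can be written in plain Java without the `free` and `!` annotations. This sketch renames the class to `CyclicList` to avoid clashing with `java.util.List`, and adds two things the notes do not have: a guard for n < 1 and a hypothetical helper `cycleLength()` used to check the result.

```java
class CyclicList {
    CyclicList next; // intended to be non-null once construction is complete

    CyclicList(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        if (n == 1) {
            next = this;
        } else {
            next = new CyclicList(this, n); // `this` is still under construction here
        }
    }

    private CyclicList(CyclicList last, int n) {
        if (n == 2) {
            next = last; // close the cycle
        } else {
            next = new CyclicList(last, n - 1);
        }
    }

    // Counts the nodes by walking the cycle back to the starting node.
    int cycleLength() {
        int count = 1;
        for (CyclicList cur = next; cur != this; cur = cur.next) {
            count++;
        }
        return count;
    }
}
```

Java accepts this without complaint, which is exactly the gap the construction types close: nothing stops the private constructor from reading `last.next` while it is still null.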

Type Rules for Initialization

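Field Write: e1.f = e2. If e1 is free (under construction), any reference may be stored in its fields, including free ones; this is what makes cyclic structures possible. If e1 is committed or unclassified, e2 must be committed, because storing a free or unclassified reference would break transitive initialization.
Field Read: e.f. If e is committed, the read has the declared type of f. If e is free or unclassified, the field may not have been assigned yet, so the result has to be treated as possibly-null and unclassified, whatever the declared type of f.
Method Calls: Methods must declare whether they accept free receivers. You cannot call a standard method on a free object unless that method is marked free (meaning it knows how to handle partially initialized objects).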

Subtyping for Initialization

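T! <: unc T! and free T! <: unc T!
T? <: unc T? and free T? <: unc T?
Unclassified is the common supertype; committed and free types are not subtypes of each other.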

Lazy Initialization

Lazy initialization delays the initialization of a field until it is first accessed. The main advantage is a shorter startup time, since work that is never needed is never done.

Since the field starts as null, it must be declared T? (possibly-null) internally.
The getter method checks for null, initializes if necessary, and returns the value as T! (non-null).
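
A minimal Java sketch of this getter pattern follows. The class `LazyHolder`, its field, and the counter are hypothetical names chosen for the example, and the code is single-threaded on purpose: it does not address concurrent access.

```java
import java.util.HashMap;
import java.util.Map;

class LazyHolder {
    private Map<String, String> table; // possibly-null internally (T?)
    int initCount = 0;                 // only here to make the laziness observable

    // Callers can treat the result as non-null (T!).
    Map<String, String> getTable() {
        Map<String, String> t = table; // check a local copy, not the field
        if (t == null) {
            t = new HashMap<>();
            table = t;
            initCount++;
        }
        return t;
    }
}
```

The null check happens in one place, so clients never see the possibly-null field directly.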

Non-Null Arrays

Arrays are difficult because they don’t have constructors in the traditional sense; they are initialized to default values (null).

Problem: String![] s = new String![5] creates an array of nulls, violating the type String!.
Solutions:
- Array initializers: s = { "a", "b" }.
- Runtime checks/assertion methods (e.g., Spec# NonNullType.AssertInitialized(s)). A static checker generally cannot verify that a loop fills every slot, so the array is only treated as fully initialized once the runtime assertion has succeeded.

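Because an array type involves two references (the array itself and its elements), nullness can be chosen independently for each, which gives four combinations:

Person! [ ] ! a;   // non-null array of non-null elements
Person? [ ] ! b;   // non-null array of possibly-null elements
Person! [ ] ? c;   // possibly-null array of non-null elements
Person? [ ] ? d;   // possibly-null array of possibly-null elements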
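
The array problem is directly visible in Java, where a freshly allocated reference array contains only nulls. The helper class `ArrayInit` below is an assumption made for this sketch; its `allNonNull` method plays the role of a runtime assertion and is not the Spec# API.

```java
import java.util.Arrays;

class ArrayInit {
    // A fresh array: every element starts out as null.
    static String[] fresh(int n) {
        return new String[n];
    }

    // A pre-filled array: every slot gets a non-null value before use.
    static String[] filled(int n, String value) {
        String[] a = new String[n];
        Arrays.fill(a, value);
        return a;
    }

    // Runtime check in the spirit of an "assert initialized" method.
    static boolean allNonNull(String[] a) {
        for (String s : a) {
            if (s == null) {
                return false;
            }
        }
        return true;
    }
}
```

An array initializer such as `{ "a", "b" }` passes the check immediately, while `fresh(5)` only passes after every slot has been assigned.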

Static Initializers

A static initializer is executed once, the first time one of the following occurs:

The class is instantiated.

A static field is accessed.

A static method is called.
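
The triggering can be observed in Java with a small experiment. Both classes below are hypothetical and exist only for this sketch: `InitLog` records when the static initializer of `Registry` runs.

```java
import java.util.ArrayList;
import java.util.List;

class InitLog {
    static final List<String> events = new ArrayList<>();
}

class Registry {
    static int entries;

    // Runs once, right before the first use of Registry.
    static {
        InitLog.events.add("Registry initialized");
        entries = 3;
    }

    static int count() {
        return entries;
    }
}
```

Before the first call to `Registry.count()` the log is empty; afterwards it contains one entry, no matter how often the class is used again.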

Initialization of Global Data

Global data (Singletons, Factories, Flyweights) must be initialized before access.

Design Goals

Effectiveness: Ensure initialization before first access.
Clarity: Clean semantics.
Laziness: Initialize only when needed to save startup time.

The approaches below follow section 6.3 of the lecture slides.

Global Vars and Init-Methods

This approach uses global variables to store references to global data, but relies on explicit calls to initialization methods to set them up. This is often the most basic way to handle globals in languages that support them.

Mechanism: Explicit init() calls that must be invoked, usually from a main function.
Pros: It is simple to implement.
Cons:
- Manual Ordering: The programmer must manually code the order of initialization to satisfy dependencies, which is error-prone.
- Encapsulation: Main methods often need to know internal module dependencies to call inits in the right order, breaking information hiding.
- No Laziness: It generally requires upfront initialization unless manually coded otherwise.

// Global variable declaration
global Factory theFactory;

void init( ) {
    theFactory = new Factory( );
}

class Factory {
    HashMap flyweights;
    Flyweight create( Data d ) { ... }
}

// Client usage
Flyweight f = theFactory.create( ... );
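
The manual-ordering problem can be sketched in Java, using static fields in place of the `global` declaration above. The class `Globals` and its two init methods are invented for this example; the second one silently depends on the first.

```java
import java.util.HashMap;
import java.util.Map;

class Globals {
    static Map<String, String> settings; // null until initSettings() runs
    static String banner;                // null until initBanner() runs

    static void initSettings() {
        settings = new HashMap<>();
        settings.put("app", "demo");
    }

    // Hidden dependency: must be called after initSettings().
    static void initBanner() {
        banner = "Welcome to " + settings.get("app");
    }
}
```

Calling `initBanner()` first fails with a NullPointerException, and nothing in the signatures tells the author of main which order is correct.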

Static Fields and Initializers (Java/C#)

Java and C# use static fields to store global data and static initializer blocks to initialize them. These blocks run automatically immediately before the class is first used (e.g., creation of an instance, static method call, or static field access).

Mechanism: static { ... } blocks executed by the runtime system.
Pros:
- Automatic & Lazy: Initialization happens just in time when the class is needed, reducing startup time.
- System-managed: The system handles the triggering of initialization.
Cons:
- Mutual Dependencies: If class A’s static initializer triggers class B, and class B triggers class A, the cycle can lead to crashes or NullPointerExceptions because initialization is considered “in progress” and won’t restart, leaving fields uninitialized.
- Side Effects: Static initializers can have arbitrary side effects (like modifying other static fields), making it hard to reason about the program state since execution order depends on which class is accessed first.

class Factory {
    static Factory theFactory;
    HashMap flyweights;

    // Static initializer block
    static {
        theFactory = new Factory( );
    }

    Flyweight create( Data d ) { ... }
}

// Initialization triggered here automatically
Factory o = Factory.theFactory;
Flyweight f = o.create( ... );
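
The mutual-dependency problem can be reproduced with two small Java classes. `Alpha` and `Beta` are hypothetical names for this sketch, and the outcome described assumes that `Alpha` is the class used first.

```java
class Alpha {
    // Evaluated first: triggers the initialization of Beta.
    static final boolean betaSawToken = Beta.sawAlphaToken;
    // Assigned only after Beta's initializer has already run.
    static final Object TOKEN = new Object();
}

class Beta {
    // Reads Alpha.TOKEN while Alpha is still "in progress",
    // so it observes the default value null.
    static final boolean sawAlphaToken = (Alpha.TOKEN != null);
}
```

When `Alpha` is touched first, `Beta` permanently records that it saw a null `TOKEN`, although the field is final and non-null afterwards. If `Beta` is touched first, the result is different, which illustrates the dependence on access order.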

Scala Objects

Scala provides direct language support for the Singleton pattern using the object keyword. This defines a class and a single instance of that class simultaneously.

Mechanism: object SingletonName { ... }.
Pros: Syntactic sugar and language-level support for singletons.
Cons: Under the hood, this often translates to Java static fields and initializers. Therefore, it inherits all the pros and cons of the static field approach, including issues with mutual dependencies and side effects.

object Factory {
    val flyweights: HashMap[ ... ]

    def create( d: Data ): Flyweight = {
        // ... implementation ...
    }
}

Eiffel Once Methods

Eiffel uses once methods (routines). The body of a once method is executed only the first time it is called. The result is cached and returned for all subsequent calls.

Mechanism: The once keyword applied to a feature (method).
Pros:
- Laziness: The initialization code runs only when the data is actually requested.
- Caching: Provides a consistent global access point.
Cons:
- Recursion/Mutual Dependencies: If a once method recursively calls itself (directly or via another object) during its first execution, it returns the current (partial) result rather than waiting or crashing. This often leads to meaningless values like 0 or null being used.
- Parameter Ignoring: Arguments are used only for the first call; subsequent calls ignore arguments, which can be confusing.

class FlyweightMgr
feature
    theFactory: Factory
    
    -- "once" ensures this runs only once
    once
        create Result
    end
end

-- Usage
o := manager.theFactory
f := o.createFlyweight( ... )
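
To make the once semantics concrete, here is a rough Java emulation. The class `Once` is an assumption made for this sketch and is not how Eiffel implements once routines; it is also not thread-safe. The flag is set before the body runs, so a recursive call during the first execution returns the current, still-default result, as described in the Cons above.

```java
import java.util.function.Supplier;

final class Once<T> {
    private final Supplier<T> body;
    private boolean started; // set before the body runs
    private T result;        // stays null until the body has returned

    Once(Supplier<T> body) {
        this.body = body;
    }

    T get() {
        if (!started) {
            started = true;
            result = body.get(); // executed on the first call only
        }
        return result;           // cached, or partial during recursion
    }
}
```

A body that calls `get()` on its own `Once` object receives null instead of looping forever or failing, which is the kind of meaningless value the recursion problem refers to.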
