Last update June 1, 2008

Transitive Const

Difference (last change) (Author, normal page display)

Changed: 167c167

Clearly just allowing for mutable state doesn't give any guarantee that the extension will be just of the types 1-4.

Clearly just allowing for mutable state doesn't give any guarantee that the extension will be just of the types 1-4, and I saw little benefit from no_state if pure functions have no access to it.

Changed: 171c171

I think that the no_state keyword might be a good idea as they are safe by themselves, and allow libraries to implement 1-4 easily.

I think that the no_state keyword might be a good idea as they are safe by themselves, and allow libraries to implement 1-4 easily using cast.

Changed: 175c175,178

If the burden to implement the no_state would be big or having a really read-only invariant allows for real optimizations then I might change again idea

If the burden to implement the no_state would be big or having a really read-only invariant allows for real optimizations then I might change again idea.

One can avoid the keyword, if its only use are 1-4, and there is a guarantee that via a cast one can modify a const or invariant object that initially allocated as mutable (no read-only memory or real stringent tests that the memory will never change).

An attempt to present various points in the const discussion (at the moment biased toward my point of view but hopefully soon improved --fawzi).

Table of contents of this page

Const

Why do we want a const?

Mutable Const

Uses of mutable const

1. suspended value (lazy evaluation and caching)

Notes

2. parallel lazy evaluation and synchronization (dataflow variable)

Notes

3. caching last requests (memoization)

Notes

4. performance/statistical info

Notes

Const and pure

How to introduce mutable state

Smart low level library implementor view

Language designer view

Comments

Fawzi

You?

Implementation Proposals

no_state

Your Method?

Const

is an object that cannot be modified by the caller, and if the caller is const then it should not modify the object.

Why do we want a const?

http://www.digitalmars.com/d/2.0/const-faq.html

Mutable Const

Does it make sense for const object to have mutable state?

The answer (maybe surprisingly) is yes, but only if it is invisible from the outside: the object must look like it is constant.

The problem is that the extensions must be performed in a very careful and controlled way, otherwise they destroy all the properties of being const.

Uses of mutable const

Here are listed the possible reasonable uses of mutable parts in a const object. I think that it very important to list them all to then discuss about the solutions.

If you think that something is missing, add it.

1. suspended value (lazy evaluation and caching)

This means that a variable is already connected with the way to compute it a=f(x), but the computation has not been performed, and will be performed on request.

Pure functions should not be able to know if a is calculated or suspended, just to get the value.

This extension can be implemented in a wholly safe way is thread safe, and needs no locking.

In pseudocode:

getA(){

  if (a==undefined_value){
    auto newV=f(x);
    assert(a==undefined_value || a==newV);
    a=newV;
  }
  return a;

}

Notes

If a function/thread/process receives a copy of a const structure or two threads access simultaneously the object, the only possible ill effect is a performance penalty.

Here it should be clear that f should be pure and x invariant.

2. parallel lazy evaluation and synchronization (dataflow variable)

This is an evolution of the previous schema. The value of can also in this case be undefined, but now when a functions tries to access it instead of calculating it it waits until someone else sets it.

In pseudocode:

getA(){

  if (a==undefined_value){
    // wait for a to be set
    // can use different mechanisms, here a lock that is created locked
    // the lock could also be created just on need (protected by another lock)
    a.lock.acquire();
    a.lock.release();
  }
  return a;

}

setA(newVal){

  unify(a,newVal);
  a.lock.release(); // (if a lock was created)

}

bool unify(a,b) { /+ recursively copies in the undefined elements of a the corresponding value of b

 + fails if anywhere a and b are set to different values
 + (can be made symmetric and also copy in b)
 +/

}

Notes

If a function/thread/process receives a copy of a const structure then the two copies have to synchronized, but the synchronization can be done in any way, and moment, it only affects the performance, there can be no non deterministic effects, no race conditions, either it blocks (and always blocks) or it works (and always works).

Locks are created only to wait for assignment, not to exclude more than one call, or more than one assigment.

A variable like before can have only one value, but (like before) can be assigned more than once to the same value.

3. caching last requests (memoization)

This is just a performance optimization if one knows that it is quite likely that the last values computed are likely to be requested again, and the computation is expensive. In pseudocode:

getF(x){

  {
    scope(exit){
      f.lock.release();
    }
    f.lock.acquire();
    if (x in f.cache)
      return f.cache[x];
  }
  auto v=f(x);
  {
    scope(exit){
     f.lock.release();
    }
    f.lock.acquire();
    f.cache.add(x,v); // this might discard some old values, locking could be done in add
  }
  return v;

}

Notes

As shown the cache has to be guarded with locks to avoid multi threading issues. Copying the structure or multiple threads access influences just the performance (if locking is done correctly).

4. performance/statistical info

One just wants to collect informations about performance, number of calls, distribution of the arguments,...

This operations should be performed so that the body of the function does not see the extra information, i.e. in a subscope of the function. This way the result of the function cannot depend on the mutable information.

An example in pseudocode:

f(x){

  stat_info{
    f.extraInfo.ncalls+=1;
  }
  //body of f
  x2=...;
  stat_info{
    f.extraInfo.x_2_sum+=x2;
  }
  //body of f

}

Notes

Depending of the kind of information one might even not care if in some rare cases the counters do not get incremented (when two threads update them together). Otherwise locking, or some other way that gives atomic updates (like transactional memory,...).

In the case of copying unless one implements a merging operation this extra information will be partial. compiler optimizations will clearly change this information, but this should be ok, and the documentation should make it clear that one *cannot* depend on this information.

Const and pure

Should a pure function be able to access the mutable part of a const object? The unanimous agreement in the NG seems to be NO, so that one can guarantee all the nice properties of the pure functions.

One wants to make pure functions as convenient and attractive as possible. Only by making the use of pure functions pervasive one can really reap their benefits. If pure functions are slower, less efficient, this is not going to happen.

It is also clear that for pure functions to have their nice properties one has to allow *only* the forms of access that I described, no other way of access should be accepted.

Extensions 1-3 (4) are safe enough even for invariant a pure functions!

Fortunately as using cast it is possible to convince the compiler that an impure function actually satisfies the constraints of pure functions, and can be optimized consequently. So even in its "safe" variant the mutable state together with cast allows libraries to implement 1-4. Actually it is probably better to introduce the safe version and keep the unsafe operations limited to the minimum (in this case to the already present cast).

How to introduce mutable state

So now we come to the real question: how should the mutable state be introduced? I think that there can be mainly two views about this (note that this is not how the discussion developed in the newsgroup, or exactly what was said, but it is how I think the thing should be looked at)

Smart low level library implementor view

I want to add modifiable attributes to const objects in the most efficient way, and to implement 1-4, and a way tat I know will work (no compiler putting everything in read-only memory). To achieve this the best way is to allow for some pieces of the objects to reamain mutable even when const is applied to the object . Then I will take care of the rest.

Language designer view

No way I want to keep the language pure, let's start with a real complete const, in which "as the need arise" I will punch the holes 1-4 with enough time and in a controlled way. One (less efficently) can work around the limitations, modifiable parts just makes the whole more difficult to understand and bring a small benefit.

Comments

Fawzi

I started out clearly in the camp of the language designer view, because I want pure functions, and some assurance that they are pure. Clearly just allowing for mutable state doesn't give any guarantee that the extension will be just of the types 1-4, and I saw little benefit from no_state if pure functions have no access to it.

But then I started thinking the in fact the library I use in D is not the standard library, but tango, that in the community there are many smart people that are able to implement 1-4 in the correct way, that D is also very much about efficiency (so just use a standard implementation of them given by the language, might be considered too inefficient in some cases) made me reconsider a little my position.

I think that the no_state keyword might be a good idea as they are safe by themselves, and allow libraries to implement 1-4 easily using cast.

the language tries to support the correct ways 1-4 as much as possible (few additions in the standard library would give most of the requested things)

This way some of the cache/locking/lazy evaluation could be implemented also by others in the low level libraries.

If the burden to implement the no_state would be big or having a really read-only invariant allows for real optimizations then I might change again idea.

One can avoid the keyword, if its only use are 1-4, and there is a guarantee that via a cast one can modify a const or invariant object that initially allocated as mutable (no read-only memory or real stringent tests that the memory will never change).

You?

Implementation Proposals

no_state

mark the constantly mutable variables with attribute "no_state" (Stevens proposal, and later Janice unpaintable), actually I see no useful use case for no_state const, but well...
add mixin templates to add static lazy variables that implement 1 (something like template StaticLazy?(varType,varName,undefinedTest,calcFun)), and also a pointer based dynamic version (that can hold a delegate). This allows to have 1 even using just an int if you know that a given value of the int can be used as undefined_value, of the implementation would have minimal overhead
for 2 I think that adding a standard implementation of the dataflow variable that uses a standard way to store undefined values and do coping (unification) is acceptable because some overhead in size is acceptable in this case. Also such a variable needs to be more complex and in case of copying to stay synchronized.
add mixin templates that add a new cached a function.
for 4 one could add the possibility to add a mutable object of which all the methods that are have just const arguments but modify the the object itself to be deemed ok to call, but such a thing might be difficult to enforce, so that one might even mark uses 4 as unsafe.

There are many details to fill out but I think that such a proposal would allow basically everything while making it easy for peoples to stay safe, and informed when they aren't (and in that case if the compiler breaks their code or the program do not work as expected it is clear whose fault it is).