Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm trying to see how the fence is applied.

I have this code (which blocks indefinitely):

static void Main()
{
    bool complete = false;
    var t = new Thread(() => {
        bool toggle = false;
        while(!complete) toggle = !toggle;
    });
    t.Start();
    Thread.Sleep(1000);
    complete = true;
    t.Join(); // Blocks indefinitely
}

Writing volatile bool _complete; solves the issue.

Acquire fence:

An acquire-fence prevents other reads/writes from being moved before the fence;

But if I illustrate it using an arrow (think of the arrowhead as pushing everything away),

the code can now look like this:

var t = new Thread(() => {
    bool toggle = false;
    while (!complete)
        ↓↓↓↓↓↓↓     // instructions can't go up before this fence.
    {
        toggle = !toggle;
    }
});

I don't understand how the illustrated drawing represents a solution to this issue.

I do know that while(!complete) now reads the real value, but how does the location of complete = true; relate to the fence?

1 Answer

Making complete volatile does two things:

  • It prevents the C# compiler or the jitter from making optimizations that would cache the value of complete.

  • It introduces a fence that tells the processor that caching optimizations which pre-fetch reads or delay writes must be suppressed around this access to ensure consistency.

Let's consider the first. The jitter is perfectly within its rights to see that the body of the loop:

    while(!complete) toggle = !toggle;

does not modify complete and therefore whatever value complete has at the beginning of the loop is the value that it is going to have forever. So the jitter is allowed to generate code as though you'd written

    if (!complete) while(true) toggle = !toggle;

or, more likely:

    bool local = complete;
    while(!local) toggle = !toggle;

Making complete volatile prevents both optimizations.
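
As a minimal sketch of that fix: C# does not allow volatile on a local variable, so the flag has to become a field first. The class name VolatileFlagDemo and the helper RunWorker are hypothetical names chosen for this illustration, and the short sleep and the timeout are arbitrary.

```csharp
using System;
using System.Threading;

static class VolatileFlagDemo
{
    // A field rather than a captured local, so it can be marked volatile.
    private static volatile bool _complete;

    // Hypothetical helper: returns true if the worker stops within the timeout.
    public static bool RunWorker(int timeoutMs = 5000)
    {
        _complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            // Every iteration performs a fresh volatile read of _complete,
            // so the jitter may not hoist the read out of the loop.
            while (!_complete) toggle = !toggle;
        })
        { IsBackground = true }; // do not keep the process alive if the demo misbehaves

        t.Start();
        Thread.Sleep(100);
        _complete = true;         // volatile write
        return t.Join(timeoutMs); // true if the worker observed the write and exited
    }

    static void Main()
    {
        Console.WriteLine(RunWorker() ? "worker finished" : "worker still running");
    }
}
```

With the volatile modifier in place the worker is expected to leave the loop shortly after the write, so Join should return instead of blocking.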

But what you are looking for is the second effect of volatile. Suppose your two threads are running on different processors. Each has its own processor cache, which is a copy of main memory. Let's suppose that both processors have made a copy of main memory in which complete is false. When one processor's cache sets complete to true, if complete is not volatile then the "toggling" processor is not required to notice that fact; it has its own cache in which complete is still false and it would be expensive to go back to main memory every time.

Marking complete as volatile eliminates this optimization. How it eliminates it is an implementation detail of the processor. Perhaps on every volatile write the write gets written to main memory and every other processor discards their cache. Or perhaps there is some other strategy. How the processors choose to make it happen is up to the manufacturer.

The point is that any time you make a field volatile and then read or write it, you are massively disrupting the ability of the compiler, the jitter and the processor to optimize your code. Try to not use volatile fields in the first place; use higher-level constructs, and don't share memory between threads.
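
One possible higher-level construct for this particular program is cooperative cancellation. The sketch below uses CancellationTokenSource and Task from the .NET base class library; the class name CancellationDemo and the helper RunWorker are hypothetical, and this is only one of several reasonable choices (a ManualResetEventSlim or a lock would also work).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CancellationDemo
{
    // Hypothetical helper: returns true if the worker stops within the timeout.
    public static bool RunWorker(int timeoutMs = 5000)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;

            Task worker = Task.Run(() =>
            {
                bool toggle = false;
                // The token takes care of cross-thread visibility for us.
                while (!token.IsCancellationRequested) toggle = !toggle;
            });

            Thread.Sleep(100);
            cts.Cancel();                  // request that the worker stop
            return worker.Wait(timeoutMs); // true if the task completed in time
        }
    }

    static void Main()
    {
        Console.WriteLine(RunWorker() ? "worker finished" : "worker still running");
    }
}
```

The shared flag disappears from your own code entirely; the library type is responsible for making the cancellation request visible to the worker.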

I'm trying to visualize the sentence: "An acquire-fence prevents other reads/writes from being moved before the fence..." What instruction should not be before that fence?

Thinking about instructions is probably counterproductive. Rather than thinking about a bunch of instructions just concentrate on the sequence of reads and writes. Everything else is irrelevant.

Suppose you have a block of memory, and part of it is copied to two caches. For performance reasons, you read and write mostly to the caches. Every now and then you re-synchronize the caches with main memory. What effect does this have on a sequence of reads and writes?

Suppose we want this to happen to a single integer variable:

  1. Processor Alpha writes 0 to main memory.
  2. Processor Bravo reads 0 from main memory.
  3. Processor Bravo writes 1 to main memory.
  4. Processor Alpha reads 1 from main memory.

Suppose what really happens is this:

  • Processor Alpha writes 0 to the cache, and synchronizes to main memory.
  • Processor Bravo synchronizes cache from main memory and reads 0.
  • Processor Bravo writes 1 to cache and synchronizes the cache to main memory.
  • Processor Alpha reads 0 -- a stale value -- from its cache.

How is what really happened in any way different from this?

  1. Processor Alpha writes 0 to main memory.
  2. Processor Bravo reads 0 from main memory.
  3. Processor Alpha reads 0 from main memory.
  4. Processor Bravo writes 1 to main memory.

It isn't different. Caching turns "write read write read" into "write read read write". It moves one of the reads backwards in time, and, in this case equivalently, moves one of the writes forwards in time.
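
The four "what really happens" steps can be replayed in a deliberately simplified, single-threaded toy model. Everything here (ToyProcessor, StaleReadDemo, the one-slot "main memory") is made up for illustration; real hardware keeps its caches coherent with protocols this model ignores, so treat it as a picture of the reordering argument, not of any actual processor.

```csharp
using System;

// Toy model only: one private cached value per processor, synchronized by hand.
sealed class ToyProcessor
{
    private readonly int[] _mainMemory; // shared one-slot "main memory"
    private int _cached;                // this processor's private copy

    public ToyProcessor(int[] mainMemory) { _mainMemory = mainMemory; }

    public void Write(int value) { _cached = value; }
    public int Read() { return _cached; }
    public void FlushToMemory() { _mainMemory[0] = _cached; }
    public void RefreshFromMemory() { _cached = _mainMemory[0]; }
}

static class StaleReadDemo
{
    public static (int BravoRead, int AlphaRead) Run()
    {
        var memory = new int[1];
        var alpha = new ToyProcessor(memory);
        var bravo = new ToyProcessor(memory);

        alpha.Write(0); alpha.FlushToMemory(); // Alpha writes 0 and synchronizes
        bravo.RefreshFromMemory();             // Bravo synchronizes from memory...
        int bravoRead = bravo.Read();          // ...and reads 0
        bravo.Write(1); bravo.FlushToMemory(); // Bravo writes 1 and synchronizes
        int alphaRead = alpha.Read();          // Alpha never refreshed: stale 0

        return (bravoRead, alphaRead);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "(0, 0)"
    }
}
```

Alpha's final read returns 0 even though main memory already holds 1, which is exactly the "write read read write" ordering described above.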

This example just involves two reads and two writes to one location, but you can imagine a scenario where there are many reads and many writes to many locations. The processor has wide latitude to move reads backwards in time and move writes forwards in time. The precise rules for which moves are legal and which are not differ from processor to processor.

A fence is a barrier that prevents reads from moving backwards or writes from moving forwards past it. So if we had:

  1. Processor Alpha writes 0 to main memory.
  2. Processor Bravo reads 0 from main memory.
  3. Processor Bravo writes 1 to main memory. FENCE HERE.
  4. Processor Alpha reads 1 from main memory.

No matter what caching strategy a processor uses, it is now not allowed to move read 4 to any point before the fence. Similarly it is not allowed to move write 3 ahead in time to any point after the fence. How a processor implements a fence is up to it.
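
In C# you can also ask for these fences at individual accesses instead of marking the whole field volatile. The sketch below uses Volatile.Read (acquire semantics) and Volatile.Write (release semantics) from System.Threading on an ordinary field; Thread.MemoryBarrier() would be the full-fence alternative. The class name ExplicitFenceDemo and the helper RunWorker are hypothetical, and the timings are arbitrary.

```csharp
using System;
using System.Threading;

static class ExplicitFenceDemo
{
    private static bool _complete; // ordinary, non-volatile field

    // Hypothetical helper: returns true if the worker stops within the timeout.
    public static bool RunWorker(int timeoutMs = 5000)
    {
        Volatile.Write(ref _complete, false);

        var t = new Thread(() =>
        {
            bool toggle = false;
            // Acquire read: performed afresh on every iteration, and later
            // reads and writes may not be moved before it.
            while (!Volatile.Read(ref _complete)) toggle = !toggle;
        })
        { IsBackground = true }; // do not keep the process alive if the demo misbehaves

        t.Start();
        Thread.Sleep(100);
        // Release write: earlier reads and writes may not be moved after it.
        Volatile.Write(ref _complete, true);
        return t.Join(timeoutMs);
    }

    static void Main()
    {
        Console.WriteLine(RunWorker() ? "worker finished" : "worker still running");
    }
}
```

This keeps the cost of the fence at the two places that need it, at the price of having to remember to use Volatile.Read and Volatile.Write at every shared access.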
