
Structured parallel communication


One of the PM design goals was to create a combined model for parallelisation and vectorisation. Basic PM parallel structures take the same form whether code is being distributed over a cluster or run as vector operations on a single core. This is not to say that the underlying hardware structure is invisible to the programmer; the mapping from parallel programming structures to hardware is both explicit and configurable, as will be described in a future post.

A previous post introduced the PM for statement and communicating operators. The for statement executes its body of statements concurrently for each member of an array or domain.  Communicating operators provide both communication and synchronisation by allowing a local variable to be viewed as an array over the complete ‘iteration’ domain.

In the absence of other control structures, communicating operators act as straightforward synchronisation points between invocations:

 for .. do   
      statements_a  
      .. @x ..  
      statements_b  
      .. @y ..   
      statements_c  
 endfor  

In terms of logical program flow, every invocation of the for statement body must complete statements_a and the communicating operator @x before any invocation can proceed to statements_b. Similarly, all invocations must complete statements_b and @y before any invocation can commence statements_c. The actual implementation may soften this hard conceptual lockstep, though only if this can be made transparent to the programmer. Once nested control structures come into the picture, things get a little more complicated. Here PM imposes the structured parallel communication rule: each invocation must execute the same operators in the same order on the same variables.
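The lockstep semantics can be pictured with an analogy in plain Python (not PM): each thread plays one invocation of the for-statement body, and each communicating operator is modelled as a barrier that every invocation must reach before any can continue. The invocation count and phase names here are hypothetical, purely for illustration.

```python
import threading

N = 4  # hypothetical number of invocations
at_x = threading.Barrier(N)   # stands in for @x
at_y = threading.Barrier(N)   # stands in for @y
log, lock = [], threading.Lock()

def invocation(i):
    with lock:
        log.append(("a", i))  # statements_a
    at_x.wait()               # @x: no invocation passes until all arrive
    with lock:
        log.append(("b", i))  # statements_b
    at_y.wait()               # @y
    with lock:
        log.append(("c", i))  # statements_c

threads = [threading.Thread(target=invocation, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

phases = [p for p, _ in log]
# Every invocation finished statements_a before any began statements_b,
# and likewise for statements_b versus statements_c.
assert phases.index("b") > max(i for i, p in enumerate(phases) if p == "a")
assert phases.index("c") > max(i for i, p in enumerate(phases) if p == "b")
```

Of course, a real implementation need not use literal barriers; as noted above, it only has to preserve this observable ordering.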


In practice, this rule is implemented through a set of more specific rules associated with different forms of control statement. For conditional statements, each branch must contain the same sequence of operators:
 if .. then  
      ..@x..  
      ..@y..  
 else  
      ..@x..  
      ..@y..  
 endif  
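The Python barrier analogy (again, not PM code) shows what goes wrong when branches disagree: if two invocations take different branches and the then-branch performs two collective operations while the else-branch performs only one, the mismatch surfaces as a stall at the second operator. The branch names and timeout are hypothetical.

```python
import threading

barrier = threading.Barrier(2)
broken = []

def then_branch():
    try:
        barrier.wait(timeout=1.0)  # @x
        barrier.wait(timeout=1.0)  # @y -- the other invocation never arrives
    except threading.BrokenBarrierError:
        broken.append("then")

def else_branch():
    try:
        barrier.wait(timeout=1.0)  # @x only: the operator sequences differ
    except threading.BrokenBarrierError:
        broken.append("else")

threads = [threading.Thread(target=f) for f in (then_branch, else_branch)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert broken == ["then"]  # the unmatched @y wait times out
```

By requiring identical operator sequences in every branch, PM rules out this class of mismatch statically rather than leaving it to surface at run time.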

This rule, of course, applies recursively:
 if .. then  
      if .. then  
           ..@x..  
      else  
           ..@x..  
      endif  
      ..@y..  
 else  
      ..@x..  
      ..@y..  
 endif  
With loops, the rule is relaxed slightly. Loops containing communicating operators can execute a different number of times in each invocation. When a loop in one invocation runs for longer than its counterparts in other invocations, any communicating operator that it executes will see, for each invocation in which the loop has already terminated, the values that variables held at loop exit (for a zero-iteration loop, the exit value of a variable equals its entry value). The end of any loop containing communicating operators is an implicit synchronisation point: execution only proceeds beyond the end of such a loop once all invocations have attained the loop exit conditions.
 for x in grid do  
      while abs(mean::x@{-1:1,-1:1}-x) > thresh do  
x=mean::x@{-1:1,-1:1}  
      endwhile  
 endfor  
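The relaxed loop rule can be sketched in plain Python (not PM) for a one-dimensional version of this example: each "invocation" owns one cell and iterates its own while loop, and an invocation whose loop has already exited keeps supplying its loop-exit value to the stencil reads of the others. The function name, threshold, and step cap are all hypothetical.

```python
def relax(values, thresh=1e-3, max_steps=10_000):
    values = list(values)
    active = [True] * len(values)  # which invocations are still looping
    for _ in range(max_steps):
        if not any(active):
            break                  # implicit synchronisation at loop end
        new = list(values)
        for i, still_looping in enumerate(active):
            if not still_looping:
                continue           # exited: keeps supplying its exit value
            lo, hi = max(0, i - 1), min(len(values), i + 2)
            m = sum(values[lo:hi]) / (hi - lo)  # mean over the {-1:1} stencil
            if abs(m - values[i]) > thresh:
                new[i] = m         # while-condition holds: loop body runs
            else:
                active[i] = False  # this invocation's while loop exits
        values = new
    return values

result = relax([0.0, 4.0, 0.0, 4.0])
assert max(result) - min(result) < 0.1  # cells settle near a common value
```

The `active` list plays the role the runtime plays in PM: it remembers which invocations have left the loop so that their frozen values can still feed the communicating operator.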
The most complex case of structured parallel communication concerns loops inside conditional statements. Here the rule is that each branch of the conditional statement must contain an iterative loop in the same position, relative to the overall sequence of communicating operators, and that each of these loops must contain the same sequence of operators. The loops do not have to be of the same type: one could be a while loop and the other a repeat .. until. It is also possible for the operator sequence in one loop to be an exact repetition of the sequence in a corresponding loop.

 if .. then  
      while .. do  
           .. @x ..  
           .. @y ..  
      endwhile  
 else  
      repeat  
           .. @x .. 
           .. @y ..  
           .. @x ..  
           .. @y ..  
      until ..  
 endif       
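One way to picture this matching requirement is as a check on the per-loop operator traces of the two branches. The helper below is purely illustrative (no such checker is part of PM): it accepts two loop bodies when one body's operator sequence is an exact whole-number repetition of the other's, as in the while/repeat example above.

```python
def loop_cycles_compatible(seq_a, seq_b):
    """True if one loop body's operator sequence is an exact
    repetition of the other's (or the two are identical)."""
    def is_repetition(longer, shorter):
        return (len(longer) % len(shorter) == 0
                and longer == shorter * (len(longer) // len(shorter)))
    if len(seq_a) >= len(seq_b):
        return is_repetition(seq_a, seq_b)
    return is_repetition(seq_b, seq_a)

# The example above: the while body does @x,@y; the repeat body @x,@y,@x,@y.
assert loop_cycles_compatible(["@x", "@y"], ["@x", "@y", "@x", "@y"])
# Same operators in a different order is not a repetition, so it is rejected.
assert not loop_cycles_compatible(["@x", "@y"], ["@y", "@x"])
```

Seen this way, the rule guarantees that whichever branch an invocation takes, its stream of communicating operators can be lined up, operator by operator, with every other invocation's stream.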
Structured parallel communication is designed to achieve a number of goals. As with other structuring regimes, it is hoped that it will assist the early detection of bugs. It also makes possible a range of efficient implementation approaches, ranging from vector operations to MPI collective communication (to be discussed in a future post).

Of course, there will be some parallel algorithms that do not fit into this structured model. These are facilitated through a more conventional message-passing model, modified to work in both vectorised and parallelised environments. Again, more about this at a later date.

