
Structured parallel communication


One of the PM design goals was to create a combined model for parallelisation and vectorisation. Basic PM parallel structures take the same form whether code is being distributed across a cluster or run as vector operations on a single core. This is not to say that the underlying hardware structure is invisible to the programmer; the mapping from parallel programming structures to hardware is both explicit and configurable, as will be described in a future post.

A previous post introduced the PM for statement and communicating operators. The for statement executes its body of statements concurrently for each member of an array or domain.  Communicating operators provide both communication and synchronisation by allowing a local variable to be viewed as an array over the complete ‘iteration’ domain.

In the absence of other control structures, communicating operators act as straightforward synchronisation points between invocations:

 for .. do   
      statements_a  
      .. @x ..  
      statements_b  
      .. @y ..   
      statements_c  
 endfor  
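
As a more concrete (hypothetical) sketch, borrowing the mean::x@{-1:1,-1:1} stencil notation from the relaxation example later in this post, and assuming conventional arithmetic and ! end-of-line comments:

 for x in grid do  
      x = x*x                    ! statements_a: purely local work  
      y:= mean::x@{-1:1,-1:1}    ! communicating operator: x viewed over the whole domain  
      x = 0.5*(x+y)              ! statements_b: can now use the neighbourhood mean  
 endfor  

No invocation can compute y until every invocation has finished its local update of x, so the communicating operator acts as the synchronisation point between the two phases.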

In terms of logical program flow, every invocation of the for statement body must complete statements_a and the communicating operator @x before any invocation can proceed to statements_b. Similarly, all invocations must complete statements_b and @y before any invocation can commence statements_c. The actual implementation may soften this hard conceptual lockstep, though only if this can be made transparent to the programmer. Once nested control structures come into the picture, things get a little more complicated. Here PM imposes the structured parallel communication rule: each invocation must execute the same operators in the same order on the same variables.


In practice, this rule is implemented through a set of more specific rules associated with different forms of control statement. For conditional statements, each branch must contain the same sequence of operators:
 if .. then  
      ..@x..  
      ..@y..  
 else  
      ..@x..  
      ..@y..  
 endif  
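
By contrast, a conditional whose branches disagree on their sequence of communicating operators, as in this sketch of illegal code, violates the rule:
 if .. then  
      ..@x..  
      ..@y..  
 else  
      ..@x..  
 endif  
An invocation taking the else branch would never execute @y, leaving invocations in the then branch waiting on a synchronisation that cannot complete.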

This rule, of course, applies recursively:
 if .. then  
      if .. then  
           ..@x..  
      else  
           ..@x..  
      endif  
      ..@y..  
 else  
      ..@x..  
      ..@y..  
 endif  
With loops, the rule is relaxed slightly. Loops containing communicating operators may execute a different number of times in each invocation. When a loop in one invocation runs for longer than its counterparts in other invocations, any communicating operator that it executes will obtain the loop-exit values of variables from those invocations in which the loop has already terminated (for a zero-iteration loop, the exit value of a variable is simply its entry value). The end of any loop containing communicating operators is an implicit synchronisation point: execution only proceeds beyond the end of such a loop once all invocations have reached their loop-exit conditions. In the example below, the while loop terminates after a different number of iterations at each grid point; points that have already converged supply their final values to neighbours that are still iterating.
 for x in grid do  
      while abs(mean::x@{-1:1,-1:1}-x) > thresh do  
           x=mean::x@{-1:1,-1:1}  
      endwhile  
 endfor  
The most complex case of structured parallel communication concerns loops inside conditional statements. Here the rule is that each branch of the conditional statement must contain an iterative loop in the same position, relative to the overall sequence of communicating operators, and that each of these loops must contain the same sequence of operators. The loops do not have to be of the same type: one could be a while loop and the other a repeat .. until. It is also possible for the operator sequence in one loop to be an exact repetition of the sequence in the corresponding loop, as in the second branch of the example below, where each pass through the repeat loop performs the @x, @y sequence twice.

 if .. then  
      while .. do  
           .. @x ..  
           .. @y ..  
      endwhile  
 else  
      repeat  
           .. @x .. 
           .. @y ..  
           .. @x ..  
           .. @y ..  
      until ..  
 endif       
Structured parallel communication is designed to achieve a number of goals. As with other structuring regimes, it is hoped that it will assist the early detection of bugs. It also makes possible a range of efficient implementation approaches, ranging from vector operations to MPI collective communication (to be discussed in a future post).

Of course, there will be some parallel algorithms that do not fit into this structured model. These are supported through a more conventional message-passing model, modified to work in both vectorised and parallelised environments. Again, more about this at a later date.

