Data in, data out

When a program runs over multiple processors or processing cores, it is important to be able to map where data are flowing. One way in which data flows can be translated into programming-language semantics is through argument passing to subroutines: in the absence of global variables, data transfer to and from a subroutine can be directly related to communication to and from another processor. In PM, the emphasis is on data parallelism through the for statement.

 for index in array do  
   index=process(index)  
 endfor  
    

So what are the rules governing data flow into and out of this statement? These are primarily based around variable locality.


 x:= 1  
 for index in array do  
     y:=func(index)  
     index=process(x,y)  
 endfor  

Here x is relatively global to the for statement and y is local to it (a variable's scope only extends to the end of the statement block in which it is defined). There are two ways to import data into the for statement and two ways to get it back out.

To import data you either:
  1. Refer to a relatively global variable from within the statement – such as accessing x in the above example.
  2. Specify the data in the iteration clause – array above.
To export data you either:
  1. Assign to an iteration variable – such as index above. This, of course, only works if the variable is iterating over a modifiable value – an array or slice.
  2. Use a let clause.
Including a let clause at the end of a for statement defines named constant values in the enclosing scope. These can be used to return values that are invariant across all invocations of the for statement body – essentially the results of reduction or aggregation operations.

 for index in array do  
  y:=func(index)  
 let  
  sum=sum::y  
  allvalues=@y  
 endfor  
 further_process(sum,allvalues)  
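As a rough analogy (in Python, not PM), the data flows in the examples above can be sketched sequentially. Here func and process are hypothetical stand-ins for the computations in the PM snippets; writing results back into the list models assignment to the iteration variable, while total and allvalues model the two let-clause results (a reduction and an aggregation).

```python
def func(i):
    # hypothetical per-element computation
    return i * 2

def process(x, y):
    # hypothetical combination step using imported x and local y
    return x + y

x = 1                  # imported: relatively global variable
array = [10, 20, 30]   # imported: data named in the iteration clause

ys = [func(i) for i in array]        # y is local to each iteration
array = [process(x, y) for y in ys]  # exported: assignment to iteration variable
total = sum(ys)                      # exported: let sum = sum :: y (reduction)
allvalues = ys                       # exported: let allvalues = @y (aggregation)
```

In PM the iterations may run on different processors, so each of these flows implies communication; the sketch only shows which values cross the statement boundary, not how.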

This is the complete set of data transfers allowed for the for statement. Other statements allow different communication patterns, such as parallel find, which will be discussed in a future post.
