asaptools.simplecomm module
A module for simple MPI communication.
The SimpleComm class is designed to provide a simplified MPI-based communication strategy using the mpi4py module.
To accomplish this task, the SimpleComm object provides a single communication pattern with a simple, lightweight API. The communication pattern is a common ‘manager’/‘worker’ pattern, with the 0th rank assumed to be the ‘manager’ rank. The SimpleComm API provides a way of sending data out from the ‘manager’ rank to the ‘worker’ ranks, and of collecting the data from the ‘worker’ ranks back on the ‘manager’ rank.
PARTITIONING:
Within the SimpleComm paradigm, the ‘manager’ rank is assumed to be responsible for partitioning (or distributing) the necessary work to the ‘worker’ ranks. The partition method provides this functionality. Using a partition function, the partition method takes data known on the ‘manager’ rank and gives each ‘worker’ rank a part of the data according to the algorithm of the partition function.
The partition method is synchronous, meaning that every rank (from the ‘manager’ rank to all of the ‘worker’ ranks) must be in sync when the method is called. This means that every rank must participate in the call, and every rank will wait until all of the data has been partitioned before continuing. Remember, whenever the ‘manager’ rank speaks, all of the ‘worker’ ranks listen! And they continue to listen until dismissed by the ‘manager’ rank.
Additionally, the ‘manager’ rank can be considered involved or uninvolved in the partition process. If the ‘manager’ rank is involved, then the ‘manager’ will take a part of the data for itself. If the ‘manager’ is uninvolved, then the data will be partitioned only across the ‘worker’ ranks.
Partitioning is a synchronous communication call that implements a static partitioning algorithm.
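The involved/uninvolved distinction can be illustrated without MPI. The following is a minimal pure-Python sketch, not the library's implementation: `equal_stride` and `static_partition` are hypothetical names, the call order `func(data, index, size)` is an assumption, and so is the choice that an uninvolved ‘manager’ ends up with None.

```python
def equal_stride(data, index, size):
    """Hypothetical partition function: every size-th item, starting at index."""
    return data[index::size]


def static_partition(data, num_ranks, involved=False, func=equal_stride):
    """Return {rank: part}, imitating a static partition sent out from rank 0.

    Assumption: an uninvolved manager (rank 0) receives None.
    """
    offset = 0 if involved else 1       # skip the manager when it is uninvolved
    num_parts = num_ranks - offset      # number of ranks that receive data
    parts = {} if involved else {0: None}
    for rank in range(offset, num_ranks):
        parts[rank] = func(data, rank - offset, num_parts)
    return parts


work = list(range(10))
uninvolved = static_partition(work, num_ranks=4)
involved = static_partition(work, num_ranks=4, involved=True)
print(uninvolved)
print(involved)
```

With four ranks, the uninvolved case splits the ten items three ways across ranks 1 to 3, while the involved case splits them four ways and rank 0 keeps a share.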
RATIONING:
An alternative approach to the partitioning communication method is the rationing communication method. In this approach, each ‘worker’ rank asks the ‘manager’ rank for a new piece of data whenever it is ready to work on one. The ‘manager’ rank receives the request and sends the next piece of data to the requesting ‘worker’ rank. It doesn’t matter in what order the ranks request data, and they do not all have to request data at the same time. However, it is critical to understand that if a ‘worker’ requests data when the ‘manager’ rank is not listening for the request, or the ‘manager’ expects a ‘worker’ to request work but the ‘worker’ never makes the request, the entire process will hang and wait forever!
Rationing is an asynchronous communication call that allows the ‘manager’ to implement a dynamic partitioning algorithm.
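To see why rationing balances load dynamically, the following pure-Python simulation may help. It uses no MPI and is only a sketch: `simulate_rationing` and the per-item costs are hypothetical, and the ‘manager’ is modelled as handing the next item to whichever ‘worker’ asks first.

```python
import heapq
from collections import deque


def simulate_rationing(items, worker_costs):
    """Simulate dynamic hand-out: the next item goes to the worker that asks first.

    worker_costs maps a (hypothetical) worker rank to the time it needs per item.
    Returns {rank: [items that rank processed]}.
    """
    pending = deque(items)
    assigned = {rank: [] for rank in worker_costs}
    # Each heap entry is (time at which the worker next asks for data, rank).
    requests = [(0, rank) for rank in sorted(worker_costs)]
    heapq.heapify(requests)
    while pending:
        ready_at, rank = heapq.heappop(requests)      # earliest request is served
        assigned[rank].append(pending.popleft())      # manager sends one item
        heapq.heappush(requests, (ready_at + worker_costs[rank], rank))
    return assigned


result = simulate_rationing(list(range(6)), {1: 1, 2: 3})
print(result)
```

In this run the faster worker (rank 1) ends up with four of the six items, which a static equal split would not achieve.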
COLLECTING:
Once each ‘worker’ has received its assigned part of the data, the ‘worker’ will perform some work pertaining to the data it received. The ‘worker’ may then (though not necessarily) return one or more results to the ‘manager’. The collect method provides this functionality.
The collect method is asynchronous, meaning that each ‘worker’ can send its data back to the ‘manager’ at any time and in any order. Since the ‘manager’ rank does not care where the data came from, the ‘manager’ rank simply receives the result from the ‘worker’ rank and processes it. Hence, all that matters is that for every collect call made by a ‘worker’ rank, a matching collect call must also be made by the ‘manager’ rank.
The collect method is a handshake method, meaning that while the ‘manager’ rank doesn’t care which ‘worker’ rank sends it data, the ‘manager’ rank does acknowledge the ‘worker’ rank and record the ‘worker’ rank’s identity.
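The matching-call rule can be sketched as a small helper. This is an illustration only: `gather_results` is a hypothetical function, and `comm` is assumed to be a parallel communicator offering `is_manager()`, `get_size()` and `collect()` as documented on this page, with every ‘worker’ calling collect exactly once.

```python
def gather_results(comm, local_result):
    """Send one result per worker; the manager receives one result per worker.

    Returns {source_rank: result} on the manager and None on a worker.
    Assumes comm offers is_manager(), get_size() and collect() as documented.
    """
    if comm.is_manager():
        results = {}
        # One collect call on the manager for every collect call made by a worker.
        for _ in range(comm.get_size() - 1):
            source, data = comm.collect()   # arrives in whatever order workers finish
            results[source] = data
        return results
    comm.collect(data=local_result)
    return None
```

Keying the results by the source rank returned from collect makes the arrival order irrelevant to the caller.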
REDUCING:
In general, it is assumed that each ‘worker’ rank works independently from the other ‘worker’ ranks. However, it may occasionally be necessary for the ‘worker’ ranks to know something about the work being done on (or the data given to) each of the other ranks. The only allowed communication of this type is provided by the allreduce method.
The allreduce method allows for reductions of the data distributed across all of the ranks to be made available to every rank. Reductions are operations such as ‘max’, ‘min’, ‘sum’, and ‘prod’, which compute and distribute to the ranks the ‘maximum’, ‘minimum’, ‘sum’, or ‘product’ of the data distributed across the ranks. Since the reduction computes a reduced quantity of data distributed across all ranks, the allreduce method is a synchronous method (i.e., all ranks must participate in the call, including the ‘manager’).
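The effect of a reduction can be shown with a short pure-Python sketch that involves no MPI. The names `REDUCERS` and `simulated_allreduce` are hypothetical, and the table of operators is an assumption that only mirrors the four operations named above.

```python
import operator
from functools import reduce

# Hypothetical name-to-function table; the library's own OPERATORS list may differ.
REDUCERS = {"sum": operator.add, "prod": operator.mul, "max": max, "min": min}


def simulated_allreduce(values_by_rank, op):
    """Reduce one value per rank and hand the same result back to every rank."""
    reduced = reduce(REDUCERS[op], values_by_rank.values())
    return {rank: reduced for rank in values_by_rank}


local_counts = {0: 4, 1: 7, 2: 2}
totals = simulated_allreduce(local_counts, "sum")
largest = simulated_allreduce(local_counts, "max")
print(totals, largest)
```

Every rank, including rank 0, contributes a value and receives the same reduced value, which is why all ranks have to take part in the call.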
DIVIDING:
It can occasionally be useful to subdivide the ‘worker’ ranks into different groups to perform different tasks in each group. When this is necessary, the ‘manager’ rank will assign itself and each ‘worker’ rank a color ID. Then, the ‘manager’ will assign each rank (including itself) to 2 new groups:
- Each rank with the same color ID will be assigned to the same group, and within this new color group, each rank will be given a new rank ID ranging from 0 (identifying the color group’s ‘manager’ rank) to the number of ‘worker’ ranks in the color group. This is called the monocolor grouping.
- Each rank with the same new rank ID across all color groups will be assigned to the same group. Hence, all ranks with rank ID 0 (but different color IDs) will be in the same group, all ranks with rank ID 1 (but different color IDs) will be in another group, etc. This is called the multicolor grouping. NOTE: This grouping will look like the monocolor grouping, except with the rank ID and the color ID swapped.
The divide method provides this functionality, and it returns 2 new SimpleComm objects, one for each of the 2 groupings described above. This means that within each group, the same partitioning, collecting, and reducing operations can be performed in the same way as described above for the global group.
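The two groupings can be worked out by hand with a small pure-Python sketch. `simulate_divide` is a hypothetical helper, not part of the module, and it rests on two assumptions: colors are numbered in sorted order of the group IDs, and ranks keep their relative order inside each color group.

```python
def simulate_divide(group_by_rank):
    """Return (monocolor, multicolor) groupings for {global_rank: group_id}.

    Assumptions: colors are numbered in sorted order of the group IDs, and
    ranks keep their relative order inside each color group.
    """
    colors = {g: c for c, g in enumerate(sorted(set(group_by_rank.values())))}
    monocolor = {}   # color ID -> global ranks; list position is the new rank ID
    for rank in sorted(group_by_rank):
        monocolor.setdefault(colors[group_by_rank[rank]], []).append(rank)
    multicolor = {}  # new rank ID -> global ranks, at most one per color
    for color in sorted(monocolor):
        for new_rank, rank in enumerate(monocolor[color]):
            multicolor.setdefault(new_rank, []).append(rank)
    return monocolor, multicolor


mono, multi = simulate_divide({0: "a", 1: "b", 2: "a", 3: "b", 4: "a"})
print(mono)
print(multi)
```

Here global ranks 0 and 1 become the ‘manager’ ranks of their color groups, so they share the multicolor group for new rank ID 0.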
Copyright 2020 University Corporation for Atmospheric Research
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
-
class asaptools.simplecomm.SimpleComm
Bases: object
Simple Communicator for serial operation.
-
_numpy – Reference to the NumPy module, if found
-
_color – The color associated with the communicator, if colored
-
_group – The group ID associated with the communicator’s color
-
allreduce(data, op)
Perform an MPI AllReduction operation.
The data is “reduced” across all ranks in the communicator, and the result is returned to all ranks in the communicator. (Reduce operations such as ‘sum’, ‘prod’, ‘min’, and ‘max’ are allowed.)
This call must be made by all ranks.
- Parameters
data – The data to be reduced
op (str) – A string identifier for a reduce operation (any string found in the OPERATORS list)
- Returns
The single value constituting the reduction of the input data. (The same value is returned on all ranks in this communicator.)
-
collect(data=None, tag=0)
Send data from a ‘worker’ rank to the ‘manager’ rank.
If the calling MPI process is the ‘manager’ rank, then it will receive and return the data sent from the ‘worker’. If the calling MPI process is a ‘worker’ rank, then it will send the data to the ‘manager’ rank.
For each call to this function on a given ‘worker’ rank, there must be a matching call to this function made on the ‘manager’ rank.
NOTE: This method cannot be used for communication between the ‘manager’ rank and itself. Attempting this will cause the code to hang.
- Keyword Arguments
data – The data to be collected asynchronously on the manager rank.
tag (int) – A user-defined integer tag to uniquely specify this communication message
- Returns
On the ‘manager’ rank, a tuple containing the source rank ID and the data collected. None on all other ranks.
- Raises
RuntimeError – If executed during a serial or 1-rank parallel run
-
divide(group)
Divide this communicator’s ranks into groups.
Creates and returns two (2) kinds of groups:
- groups with ranks of the same color ID but different rank IDs (called a “monocolor” group), and
- groups with ranks of the same rank ID but different color IDs (called a “multicolor” group).
- Parameters
group – A unique group ID, to which an integer color ID will be assigned, ranging from 0 to the number of group IDs supplied across all ranks
- Returns
A tuple containing (first) the “monocolor” SimpleComm for ranks with the same color ID (but different rank IDs) and (second) the “multicolor” SimpleComm for ranks with the same rank ID (but different color IDs)
- Return type
tuple
- Raises
RuntimeError – If executed during a serial or 1-rank parallel run
-
get_color()
Get the integer color ID of this MPI process in this communicator.
By default, a communicator’s color is None, but a communicator can be divided into color groups using the ‘divide’ method.
This call can be made independently from other ranks.
- Returns
The color of this MPI communicator
- Return type
int
-
get_group()
Get the group ID of this MPI communicator.
The group ID is the argument passed to the ‘divide’ method, and it represents a unique identifier for all ranks in the same color group. It can be any type of object (e.g., a string name).
This call can be made independently from other ranks.
- Returns
The group ID of this communicator
-
get_rank()
Get the integer rank ID of this MPI process in this communicator.
This call can be made independently from other ranks.
- Returns
The integer rank ID of this MPI process
- Return type
int
-
get_size()
Get the integer number of ranks in this communicator.
The size includes the ‘manager’ rank.
- Returns
The integer number of ranks in this communicator.
- Return type
int
-
is_manager()
Check if this MPI process is on the ‘manager’ rank (i.e., rank 0).
This call can be made independently from other ranks.
- Returns
True if this MPI process is on the ‘manager’ rank (i.e., rank 0). False otherwise.
- Return type
bool
-
partition(data=None, func=None, involved=False, tag=0)
Partition and send data from the ‘manager’ rank to ‘worker’ ranks.
By default, the data is partitioned using an “equal stride” across the data, with the stride equal to the number of ranks involved in the partitioning. If a partition function is supplied via the func argument, then the data is partitioned across the ‘worker’ ranks according to that function’s algorithm, giving each ‘worker’ rank a different part of the data.
If the involved argument is True, then a part of the data (as determined by the given partition function, if supplied) will be returned on the ‘manager’ rank. Otherwise (the involved argument is False), the data will be partitioned only across the ‘worker’ ranks.
This call must be made by all ranks.
- Keyword Arguments
data – The data to be partitioned across the ranks in the communicator.
func – A PartitionFunction object/function that returns a part of the data given the index and assumed size of the partition.
involved (bool) – True if a part of the data should be given to the ‘manager’ rank in addition to the ‘worker’ ranks. False otherwise.
tag (int) – A user-defined integer tag to uniquely specify this communication message.
- Returns
A (possibly partitioned) subset (i.e., part) of the data. Depending on the PartitionFunction used (or if it is used at all), this method may return a different part on each rank.
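Because partition and allreduce are both synchronous, they combine naturally in code that every rank executes. The sketch below is illustrative only: `count_items_everywhere` is a hypothetical helper, and `comm` is assumed to be an object returned by `create_comm()` that offers `partition()` and `allreduce()` with the signatures documented on this page.

```python
def count_items_everywhere(comm, items=None):
    """Partition `items` from the manager, then sum the part sizes on all ranks.

    Every rank must call this function, because partition and allreduce are
    both synchronous. `items` only needs to be meaningful on the manager.
    """
    part = comm.partition(data=items, involved=True)   # the manager keeps a share
    return comm.allreduce(len(part), op="sum")          # same total on every rank
```

If any rank skips the call, the remaining ranks wait inside partition or allreduce, so the function should not be placed behind a rank-dependent condition.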
-
ration(data=None, tag=0)
Send a single piece of data from the ‘manager’ rank to a ‘worker’ rank.
If this method is called on a ‘worker’ rank, the worker will send a “request” for data to the ‘manager’ rank. When the ‘manager’ receives this request, the ‘manager’ rank sends a single piece of data back to the requesting ‘worker’ rank.
For each call to this function on a given ‘worker’ rank, there must be a matching call to this function made on the ‘manager’ rank.
NOTE: This method cannot be used for communication between the ‘manager’ rank and itself. Attempting this will cause the code to hang.
- Keyword Arguments
data – The data to be asynchronously sent to the ‘worker’ rank
tag (int) – A user-defined integer tag to uniquely specify this communication message
- Returns
On the ‘worker’ rank, the data sent by the manager. On the ‘manager’ rank, None.
- Raises
RuntimeError – If executed during a serial or 1-rank parallel run
-
-
class asaptools.simplecomm.SimpleCommMPI
Bases: asaptools.simplecomm.SimpleComm
Simple Communicator using MPI.
-
PART_TAG – Partition Tag Identifier
-
RATN_TAG – Ration Tag Identifier
-
CLCT_TAG – Collect Tag Identifier
-
REQ_TAG – Request Identifier
-
MSG_TAG – Message Identifier
-
ACK_TAG – Acknowledgement Identifier
-
PYT_TAG – Python send/recv Identifier
-
NPY_TAG – NumPy send/recv Identifier
-
_mpi – A reference to the mpi4py.MPI module
-
_comm – A reference to the mpi4py.MPI communicator
-
ACK_TAG = 3
-
CLCT_TAG = 3
-
MSG_TAG = 2
-
NPY_TAG = 5
-
PART_TAG = 1
-
PYT_TAG = 4
-
RATN_TAG = 2
-
REQ_TAG = 1
-
allreduce(data, op)
Perform an MPI AllReduction operation.
The data is “reduced” across all ranks in the communicator, and the result is returned to all ranks in the communicator. (Reduce operations such as ‘sum’, ‘prod’, ‘min’, and ‘max’ are allowed.)
This call must be made by all ranks.
- Parameters
data – The data to be reduced
op (str) – A string identifier for a reduce operation (any string found in the OPERATORS list)
- Returns
The single value constituting the reduction of the input data. (The same value is returned on all ranks in this communicator.)
-
collect(data=None, tag=0)
Send data from a ‘worker’ rank to the ‘manager’ rank.
If the calling MPI process is the ‘manager’ rank, then it will receive and return the data sent from the ‘worker’. If the calling MPI process is a ‘worker’ rank, then it will send the data to the ‘manager’ rank.
For each call to this function on a given ‘worker’ rank, there must be a matching call to this function made on the ‘manager’ rank.
NOTE: This method cannot be used for communication between the ‘manager’ rank and itself. Attempting this will cause the code to hang.
- Keyword Arguments
data – The data to be collected asynchronously on the ‘manager’ rank.
tag (int) – A user-defined integer tag to uniquely specify this communication message
- Returns
On the ‘manager’ rank, a tuple containing the source rank ID and the data collected. None on all other ranks.
- Return type
tuple
- Raises
RuntimeError – If executed during a serial or 1-rank parallel run
-
divide(group)
Divide this communicator’s ranks into groups.
Creates and returns two (2) kinds of groups:
groups with ranks of the same color ID but different rank IDs (called a “monocolor” group), and
groups with ranks of the same rank ID but different color IDs (called a “multicolor” group).
- Parameters
group – A unique group ID, to which an integer color ID will be assigned, ranging from 0 to the number of group IDs supplied across all ranks
- Returns
A tuple containing (first) the “monocolor” SimpleComm for ranks with the same color ID (but different rank IDs) and (second) the “multicolor” SimpleComm for ranks with the same rank ID (but different color IDs)
- Return type
tuple
- Raises
RuntimeError – If executed during a serial or 1-rank parallel run
-
get_rank()
Get the integer rank ID of this MPI process in this communicator.
This call can be made independently from other ranks.
- Returns
The integer rank ID of this MPI process
- Return type
int
-
get_size()
Get the integer number of ranks in this communicator.
The size includes the ‘manager’ rank.
- Returns
The integer number of ranks in this communicator.
- Return type
int
-
partition(data=None, func=None, involved=False, tag=0)
Partition and send data from the ‘manager’ rank to ‘worker’ ranks.
By default, the data is partitioned using an “equal stride” across the data, with the stride equal to the number of ranks involved in the partitioning. If a partition function is supplied via the ‘func’ argument, then the data is partitioned across the ‘worker’ ranks according to that function’s algorithm, giving each ‘worker’ rank a different part of the data.
If the ‘involved’ argument is True, then a part of the data (as determined by the given partition function, if supplied) will be returned on the ‘manager’ rank. Otherwise (the ‘involved’ argument is False), the data will be partitioned only across the ‘worker’ ranks.
This call must be made by all ranks.
- Keyword Arguments
data – The data to be partitioned across the ranks in the communicator.
func – A PartitionFunction object/function that returns a part of the data given the index and assumed size of the partition.
involved (bool) – True if a part of the data should be given to the ‘manager’ rank in addition to the ‘worker’ ranks. False otherwise.
tag (int) – A user-defined integer tag to uniquely specify this communication message
- Returns
A (possibly partitioned) subset (i.e., part) of the data. Depending on the PartitionFunction used (or if it is used at all), this method may return a different part on each rank.
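The func argument is described as returning a part of the data given the index and assumed size of the partition. As a sketch of what such a function might look like, here is a hypothetical `contiguous_blocks` function that hands out contiguous slices instead of strided ones. The call order `func(data, index, size)` is an assumption and is not confirmed by the text above.

```python
def contiguous_blocks(data, index, size):
    """Hypothetical partition function: part `index` of `size` contiguous blocks."""
    base, extra = divmod(len(data), size)
    # The first `extra` parts each take one additional item.
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return data[start:stop]


parts = [contiguous_blocks(list(range(10)), i, 3) for i in range(3)]
print(parts)
```

Contiguous blocks keep neighbouring items on the same rank, which can matter when the items are ordered, for example time steps or file names.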
-
ration(data=None, tag=0)
Send a single piece of data from the ‘manager’ rank to a ‘worker’ rank.
If this method is called on a ‘worker’ rank, the worker will send a “request” for data to the ‘manager’ rank. When the ‘manager’ receives this request, the ‘manager’ rank sends a single piece of data back to the requesting ‘worker’ rank.
For each call to this function on a given ‘worker’ rank, there must be a matching call to this function made on the ‘manager’ rank.
NOTE: This method cannot be used for communication between the ‘manager’ rank and itself. Attempting this will cause the code to hang.
- Keyword Arguments
data – The data to be asynchronously sent to the ‘worker’ rank
tag (int) – A user-defined integer tag to uniquely specify this communication message
- Returns
On the ‘worker’ rank, the data sent by the manager. On the ‘manager’ rank, None.
- Raises
RuntimeError – If executed during a serial or 1-rank parallel run
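Since every ration call on a ‘worker’ needs a matching call on the ‘manager’, a ‘worker’ loop needs some way to learn that no more data is coming. One possible convention, sketched below, is a sentinel value. `STOP` and `ration_until_done` are hypothetical and not part of the module; `comm` is assumed to be a parallel communicator offering `is_manager()`, `get_size()` and `ration()` as documented.

```python
STOP = "__stop__"  # hypothetical sentinel value; not part of the library


def ration_until_done(comm, items=(), handle=lambda item: item):
    """Hand out `items` one at a time; workers loop until they receive STOP.

    Assumes comm offers is_manager(), get_size() and ration() as documented.
    Returns None on the manager and the list of handled items on a worker.
    """
    if comm.is_manager():
        for item in items:
            comm.ration(data=item)           # answers exactly one worker request
        for _ in range(comm.get_size() - 1):
            comm.ration(data=STOP)           # one sentinel per worker, so none hangs
        return None
    handled = []
    while True:
        item = comm.ration()                 # request the next piece of data
        if isinstance(item, str) and item == STOP:
            break
        handled.append(handle(item))
    return handled
```

The ‘manager’ makes one ration call per item plus one per ‘worker’, which equals the total number of requests the workers make, so neither side is left waiting.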
-
-
asaptools.simplecomm.create_comm(serial=False)
This is a factory function for creating SimpleComm objects.
Depending on the argument given, it returns an instance of a serial or parallel SimpleComm object.
- Keyword Arguments
serial (bool) – A boolean flag, with True indicating the desire for a serial SimpleComm instance and False indicating the desire for a parallel SimpleComm instance.
- Returns
An instance of a SimpleComm object, either serial (if serial == True) or parallel (if serial == False)
- Return type
- Raises
TypeError – if the serial argument is not a bool.
Examples
>>> sercomm = create_comm(serial=True)
>>> type(sercomm)
<class 'simplecomm.SimpleComm'>
>>> parcomm = create_comm()
>>> type(parcomm)
<class 'simplecomm.SimpleCommMPI'>