orte-restart(1) [debian man page]

OMPI-RESTART(1) 						     Open MPI							   OMPI-RESTART(1)

NAME
       ompi-restart, orte-restart - Restart a previously checkpointed parallel job using the Open PAL Checkpoint/Restart Service (CRS)

       NOTE: ompi-restart and orte-restart are exact synonyms for each other. Using either name results in identical behavior.

SYNOPSIS
       ompi-restart [ options ] <GLOBAL SNAPSHOT HANDLE>

   Options
       ompi-restart will attempt to restart a previously checkpointed parallel job from the global snapshot handle reference returned by ompi-checkpoint.

       <GLOBAL SNAPSHOT HANDLE>
              The global snapshot handle reference returned by ompi-checkpoint, used to restart the job. This must be the last argument to this command.

       -h | --help
              Display help for this command.

       -p | --preload
              Preload the checkpoint files on the remote systems before restarting the application. Disabled by default.

       --fork Fork off a new process, which is the restarted process. By default, the restarted process will replace ompi-restart.

       -s | --seq
              The sequence number of the checkpoint to restart from. By default, the most recent sequence number is used (specified by -1).

       -hostfile | --hostfile
              The hostfile from which to restart the application. Useful in unscheduled environments. (Same behavior as the --machinefile option.)

       -machinefile | --machinefile
              The machinefile from which to restart the application. Useful in unscheduled environments. (Same behavior as the --hostfile option.)

       -v | --verbose
              Enable verbose output for debugging.

       -gmca | --gmca <key> <value>
              Pass global MCA parameters that are applicable to all contexts. <key> is the parameter name; <value> is the parameter value.

       -mca | --mca <key> <value>
              Send arguments to various MCA modules.

DESCRIPTION
       ompi-restart can be invoked multiple times, provided the invocations do not overlap. This allows the user to restart a previously running parallel job.

SEE ALSO
       orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)

1.4.5                                                          Feb 10, 2012                                                 OMPI-RESTART(1)
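
The ompi-restart options above can be combined as in the following sketch. It is a dry run: the helper only prints the command line it would use, so the argument order can be checked before anything is restarted. The function name restart_cmd, the snapshot handle ompi_global_snapshot_1234.ckpt, and the hostfile path ./hosts are hypothetical placeholders, and the sketch assumes that -s and --hostfile each take a single argument.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print an ompi-restart
# command line without running it.
# Usage: restart_cmd <global-snapshot-handle> [sequence-number] [hostfile]
# Note: values containing whitespace are not quoted in the printed line.
restart_cmd() {
    handle=$1
    seq=${2:-}
    hostfile=${3:-}
    cmd="ompi-restart"

    # -s selects a checkpoint sequence number; when it is omitted,
    # ompi-restart uses the most recent one.
    if [ -n "$seq" ]; then
        cmd="$cmd -s $seq"
    fi

    # --hostfile names the hosts to restart on (unscheduled environments).
    if [ -n "$hostfile" ]; then
        cmd="$cmd --hostfile $hostfile"
    fi

    # The global snapshot handle must be the last argument.
    printf '%s %s\n' "$cmd" "$handle"
}

# Restart from the most recent checkpoint (placeholder handle).
restart_cmd ompi_global_snapshot_1234.ckpt

# Restart from sequence number 0 on the hosts listed in ./hosts.
restart_cmd ompi_global_snapshot_1234.ckpt 0 ./hosts
```

Once the printed line looks right, it can be run directly from the shell.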

OPAL-CHECKPOINT(1)						       1.4.5							OPAL-CHECKPOINT(1)

NAME
       opal-checkpoint - Checkpoint a running sequential process using the Open PAL Checkpoint/Restart Service (CRS).

       Note: The user should use this tool only if the application being checkpointed is an OPAL-only application. For an Open RTE or Open MPI program, the respective tools should be used instead.

SYNOPSIS
       opal-checkpoint [ options ] <PID>

   Options
       opal-checkpoint will attempt to notify a running process that it has been requested to checkpoint itself. A snapshot handle reference is presented to the user, which is used with opal-restart to restart the process.

       <PID>  Process ID of the running target process.

       -h | --help
              Display help for this command.

       --term After checkpointing the running process, terminate it.

       -v | --verbose
              Enable verbose output for debugging.

       -n | --name
              Request a specific name for the local snapshot reference.

       -w | --where
              Request that the local snapshot reference be placed in a specific location.

       -gmca | --gmca <key> <value>
              Pass global MCA parameters that are applicable to all contexts. <key> is the parameter name; <value> is the parameter value.

       -mca | --mca <key> <value>
              Send arguments to various MCA modules.

DESCRIPTION
       opal-checkpoint can be invoked multiple times, provided the invocations do not overlap. This allows the user to take involuntary checkpoints of a running sequential process.

       See opal_crs(7) for more information about the CRS framework and components. Note that the user does not need to specify the checkpointer to be used here, as that is determined completely by the running process being checkpointed.

SEE ALSO
       opal-restart(1), opal_crs(7)

Open MPI                                                       Feb 10, 2012                                              OPAL-CHECKPOINT(1)
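
As a sketch of how the opal-checkpoint options fit together, the helper below assembles a command line and prints it rather than running it. The function name checkpoint_cmd, the PID 4242, the snapshot name mysnap, and the directory /tmp/ckpt are hypothetical placeholders. The sketch assumes that -n and -w each take a single argument and follows the synopsis in placing <PID> last.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print an opal-checkpoint
# command line without running it.
# Usage: checkpoint_cmd <pid> [snapshot-name] [snapshot-directory]
checkpoint_cmd() {
    pid=$1
    name=${2:-}
    dir=${3:-}

    # Reject anything that is not a plain decimal process ID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "checkpoint_cmd: PID must be a number" >&2
            return 1
            ;;
    esac

    cmd="opal-checkpoint"

    # -n requests a specific name for the local snapshot reference.
    if [ -n "$name" ]; then
        cmd="$cmd -n $name"
    fi

    # -w requests a specific location for the local snapshot reference.
    if [ -n "$dir" ]; then
        cmd="$cmd -w $dir"
    fi

    # <PID> goes last, as in the synopsis.
    printf '%s %s\n' "$cmd" "$pid"
}

# Checkpoint process 4242 as "mysnap" under /tmp/ckpt (placeholders).
checkpoint_cmd 4242 mysnap /tmp/ckpt
```

The snapshot handle reference that opal-checkpoint reports is what opal-restart(1) later takes to restart the process.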