MATHEMATICA

Modes of Operation in Mathematica

Mathematica can be run locally on an office workstation, directly on a server or cluster from its head node, or across the network between an office-local client and a remote server (a cluster for instance). It can be run serially or in parallel; its licenses can be provided locally or via a network-resident license server; and it can be run in command-line or GUI mode. The details of installing and running Mathematica on a local office workstation are left to the user. Those modes of operation important to the use of CUNY's HPC resources are discussed here.

Selecting Between GUI and Command-Line Mode

Whether Mathematica runs in command-line mode or GUI mode is determined by the command used to start it. To use the Mathematica GUI, enter the following command at the user prompt:

$mathematica

To use Mathematica Command Line Interface (CLI), enter:

$math

More details on these and other Mathematica commands are available through the man command, as in:

$man mathematica
$man math
$man mcc

The lines above provide documentation on the GUI, CLI, and Mathematica C-compiler, respectively.
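As a minimal sketch of the choice described above, a login script could pick the front end automatically. The rule used here (an unset or empty DISPLAY variable means no GUI is available) is an assumption made for illustration, not a site policy, and the variable name 'frontend' is hypothetical.

```shell
#!/bin/sh
# Sketch: pick the Mathematica front end based on whether an X display is available.
# Assumption: an empty or unset DISPLAY means that no GUI can be shown.
if [ -n "${DISPLAY:-}" ]; then
    frontend=mathematica   # GUI front end
else
    frontend=math          # command-line interface
fi
echo "Selected front end: $frontend"
```

The script only reports its choice; replacing the final echo with "$frontend" would actually start the selected program.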

Using Mathematica on KARLE

Karle is a standalone, four-socket, 4 x 6 = 24 core head-like node and is a highly capable system. Karle's 24 Intel E740-based cores run at 2.4 GHz. Karle has a total of 96 Gbytes of memory, or 4 Gbytes per core. Users can run GUI applications on Karle, or they can use the CLI. Choosing between the two is described above under "Selecting Between GUI and Command-Line Mode".

Serial Job Example

If Mathematica was started in interactive mode using the GUI or CLI, users can enter Mathematica commands as they normally would:

$ module load mathematica
$ math
Mathematica 10.0 for Linux x86 (64-bit)
Copyright 1988-2014 Wolfram Research, Inc.

In[1]:= Print["Hello World!"]
Hello World!

In[2]:= Table[Random[],{i,1,10}]

Out[2]= {0.22979, 0.168789, 0.257107, 0.724029, 0.466558, 0.588178, 0.186516, 
 
>    0.957024, 0.950642, 0.938009}

In[3]:= Exit[]
$

Alternatively one may put these commands into a text file:

$ cat test.nb
Print["Hello World!"]
Table[Random[],{i,1,10}]
Exit[]

$

and run it using:

math < test.nb

The following output is produced:

Mathematica 10.0 for Linux x86 (64-bit)
Copyright 1988-2014 Wolfram Research, Inc.

In[1]:= Hello World!

In[2]:= 
Out[2]= {0.67778, 0.737257, 0.862751, 0.623122, 0.253662, 0.541513, 0.776872, 
 
>    0.424682, 0.934039, 0.190007}

In[3]:= 

Parallel Job Example

To run parallel computations in Mathematica on Karle, first start the required number of kernels (the CUNY HPC license allows up to 16 kernels) and then run the actual computation. Consider the following example:

$ cat parallel.nb 

LaunchKernels[8]

With[{base = 10^1000, r = 10^10}, WaitAll[Table[ParallelSubmit[
     While[! PrimeQ[p = RandomInteger[{base, base + r}]], Null]; 
     p], {$KernelCount}]] - base]
$
$
$ math < parallel.nb 
Mathematica 10.0 for Linux x86 (64-bit)
Copyright 1988-2014 Wolfram Research, Inc.

In[1]:= 
In[1]:= 
Out[1]= {KernelObject[1, local], KernelObject[2, local], 
 
>    KernelObject[3, local], KernelObject[4, local], KernelObject[5, local], 
 
>    KernelObject[6, local], KernelObject[7, local], KernelObject[8, local]}

In[2]:= 
In[2]:= 
Out[2]= {4474664203, 8096247063, 9746330049, 4733134789, 2879419863, 
 
>    377023287, 7848087693, 8139999951}

In[3]:= 
$

The statement

LaunchKernels[8]

starts 8 local kernels. The rest of the notebook runs a parallel evaluation on those 8 kernels.
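The kernel count passed to LaunchKernels can be derived rather than hard-coded. The sketch below caps the count at the 16-kernel license limit stated above and at the number of cores reported by the operating system. The file name 'launch_kernels.m' and the policy of using every available core are assumptions for illustration; on a shared node a smaller count may be more appropriate.

```shell
#!/bin/sh
# Sketch: choose a kernel count no larger than the license limit or the core count.
license_limit=16                        # from the text: up to 16 kernels are licensed
cores=$(getconf _NPROCESSORS_ONLN)      # online cores as reported by the system

if [ "$cores" -gt "$license_limit" ]; then
    kernels=$license_limit
else
    kernels=$cores
fi

# Write the matching Mathematica statement to a (hypothetical) file.
printf 'LaunchKernels[%d]\n' "$kernels" > launch_kernels.m
cat launch_kernels.m
```

On Karle, with 24 cores, this would select 16 kernels because the license limit is the smaller of the two numbers.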

Submitting Batch Jobs to the CUNY ANDY Cluster

Currently, there is no simple and secure method of submitting Mathematica jobs from a remote (user-local or desktop) CUNY installation of Mathematica to ANDY. This is being pursued. In the meantime, both serial and parallel Mathematica jobs can be submitted from ANDY's head node by constructing a standard batch job. To ease the process of debugging such work, we recommend that users test their Mathematica command sequences locally on smaller but similar cases before submitting their work to the cluster. The standard batch submission process is simple to set up and imposes no burden on ANDY's head node.

Serial Batch Jobs Run with 'qsub' Using a Mathematica Command (Text) File

In the following example, a batch job is created around a locally pre-tested Mathematica command sequence and is then submitted to ANDY's batch queueing system using the qsub command. The simple Mathematica command sequence shown here computes a matrix of integrals and prints out every element of that matrix. Any valid sequence of Mathematica commands provided in a notebook file, whether tested on an office Mathematica installation or on the cluster head node itself, could be used in this example.

When working remotely from an office or a classroom, a user would validate their command sequence on their local workstation (via a smaller local test run), modify it incrementally to make use of the additional resources available on ANDY, and then copy, paste, and save the Mathematica command sequence in a notebook file (file.nb) on ANDY. This last step would be done through a text editor like 'vi' or 'emacs' from a cluster terminal window. From a Windows desktop, the free, secure Windows-to-Linux terminal emulation package PuTTY could be used. From a Linux desktop, connecting with secure shell 'ssh' would be the right approach.

Below, a notebook file called "test_run.nb", which performs a serial (single worker-kernel) integral calculation (and might have been tested on the user's office Mathematica installation), has been saved on ANDY from a 'vi' session. Its contents are listed here:

$
$ cat test_run.nb

Print ["Beginning Integral Calculations"]; p=5;
Timing[matr = Table[Integrate[x^(j+i),{x,0,1}], {i,1,p-1}, {j,1,p-1}]//N];
For[i=1, i<p, i++, For[j=1, j<p, j++, Print[matr[[i]][[j]]]]];
Print ["Finished!"];
Quit[];

$
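The integrals in this command file have a closed form, since the integral of x^(i+j) over [0,1] equals 1/(i+j+1). That makes it possible to check the job's numbers without Mathematica. The following is a minimal sketch using awk; the file name 'expected_integrals.txt' is hypothetical, and the six-significant-digit format is an assumption chosen to resemble the printed job output.

```shell
#!/bin/sh
# Closed form used here: the integral of x^(i+j) over [0,1] equals 1/(i+j+1).
# Writes the 16 expected matrix entries, one per line, in row order (i outer, j inner).
awk 'BEGIN {
    p = 5
    for (i = 1; i < p; i++)
        for (j = 1; j < p; j++)
            printf "%.6g\n", 1 / (i + j + 1)
}' > expected_integrals.txt

cat expected_integrals.txt
```

The resulting list can be compared line by line with the values printed by the batch job.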

As a serial Mathematica job, this job executes on just one core of just one of ANDY's compute nodes. The simple batch script offered to 'qsub' to run this job (called serial_run.math here) is listed below. This script is written in the SLURM form, which became the workload manager on ANDY on 11-18-09. For details on SLURM, see the section on using the SLURM workload manager elsewhere in the CUNY HPC Wiki.

$
$ cat serial_run.math

#!/bin/bash
#SBATCH --partition production
#SBATCH --job-name mmat8_serial1 
#SBATCH --nodes=1
#SBATCH --ntasks=1

cd $SLURM_SUBMIT_DIR

math -run <test_run.nb > output

$

This script runs on a single processor (core) within a single ANDY compute node, invoking a single Mathematica kernel instance. The '--job-name mmat8_serial1' option names the job 'mmat8_serial1'. The '--partition production' option directs the job to ANDY's production partition, where the scheduler reads the script's resource request and places the job accordingly. The '--nodes=1' and '--ntasks=1' options together request one task (one processor core) on one node. The listing contains no explicit memory, placement, or environment-export directives, so the scheduler's defaults apply for those settings. Because this is a batch script with no connection to the terminal, the CLI version of the Mathematica command, 'math', is used.

Save this script in a file for your future use, for example in "serial_run.math". With few modifications, it can be used to run most serial Mathematica batch jobs on ANDY.
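One way to reuse the script for other serial jobs is to generate it from a small shell function. This is only a sketch: the function name 'make_serial_job', the job name 'integrals_test', and the choice to write the output to a file named after the job are assumptions for illustration, while the directives themselves are copied from the listing above.

```shell
#!/bin/sh
# Sketch: write a serial batch script for a given command file and job name.
# Usage: make_serial_job <command-file> <job-name>
make_serial_job() {
    nb=$1
    job=$2
    # Unquoted here-document: ${job} and ${nb} are filled in now, while
    # \$SLURM_SUBMIT_DIR stays literal so it is expanded when the job runs.
    cat > "${job}.math" <<EOF
#!/bin/bash
#SBATCH --partition production
#SBATCH --job-name ${job}
#SBATCH --nodes=1
#SBATCH --ntasks=1

cd \$SLURM_SUBMIT_DIR

math -run <${nb} > ${job}.out
EOF
}

make_serial_job test_run.nb integrals_test
cat integrals_test.math
```

The generated file is submitted in the same way as a hand-written script.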

To run this job script use the command:

 qsub serial_run.math 

As with any other batch job submitted using 'qsub', you can check the status of your job by running the command 'qstat' or 'qstat -f JID'. Upon completion, the output generated by the job will be written to the file 'output'.
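Because test_run.nb prints "Finished!" as its last message, the presence of that string in the output file is a simple sign that the command sequence ran to the end. The helper below is a hypothetical sketch of such a check; 'job_finished' and 'sample_output' are illustrative names, and the stand-in file is created only so that the example is self-contained.

```shell
#!/bin/sh
# Sketch: succeed only if the given file exists and contains the final marker.
job_finished() {
    [ -f "$1" ] && grep -q 'Finished!' "$1"
}

# Stand-in for a real job output file, used only for this demonstration.
printf '%s\n' 'In[1]:= Beginning Integral Calculations' 'In[4]:= Finished!' > sample_output

if job_finished sample_output; then
    echo "job output is complete"
else
    echo "job output is incomplete or missing"
fi
```

For a real job, the function would be called with the name of the job's own output file, 'output' in this example.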

Here is the output from this sample serial batch job:

Mathematica 8.0 for Linux x86 (64-bit)
Copyright 1988-2011 Wolfram Research, Inc.

In[1]:= Beginning Integral Calculations

In[2]:= 
In[3]:= 0.333333
0.25
0.2
0.166667
0.25
0.2
0.166667
0.142857
0.2
0.166667
0.142857
0.125
0.166667
0.142857
0.125
0.111111

In[4]:= Finished!

In[5]:= 

SMP-Parallel Batch Jobs Run with 'qsub' Using a Mathematica Command (Text) File

Mathematica provides some easy-to-use methods for performing parallel computations in the so-called SMP regime. This mode of operation allows users to use the cores that are available within one compute node. KARLE, as a standalone computational node, has 24 cores, and each of ANDY's nodes has 8 cores. Consider the following Mathematica notebook:


$
$ cat test_smp.nb
(* perform some computations in serial mode *)
Timing[Table[{i,Plus @@ (#[[2]] &) /@ FactorInteger[(10^i - 1)/9]}, {i, 60, 70}]]

(* initialize 4 MathKernels*)
Needs["SubKernels`LocalKernels`"]
(* object 'mykernels' contains information about 4 computational instances *)
mykernels = LaunchKernels[LocalMachine[4]];

(* Let every kernel report its existence *)
ParallelEvaluate[$MachineName, mykernels]

(* Perform the same computation as before but now using ParallelTable using those 4 kernels*)
Timing[ParallelTable[{i,Plus @@ (#[[2]] &) /@ FactorInteger[(10^i - 1)/9]}, {i, 60, 70}]]

Exit[]
$

This job first performs some computations using only one core. After that, a set of 4 computational kernels is created by Mathematica and the same computations are repeated in parallel. The SLURM script that submits this job to the queue is:

$
$ cat parallel_run.math

#!/bin/bash
#SBATCH --partition production
#SBATCH --job-name mmat_smp
#SBATCH --nodes=1
#SBATCH --ntasks=4

cd $SLURM_SUBMIT_DIR

math -run <test_smp.nb > output
$

There are two important things to note here: 1) "#SBATCH --nodes=1" -- the user must ask SLURM to place all allocated resources on a single physical compute node, because local kernels are started on the host where the job runs. 2) "#SBATCH --ntasks=4" -- the user requests 4 cores from SLURM. This is important because 4 computational kernels are created in the Mathematica notebook.

As in the serial case, the CLI version of the Mathematica command, 'math', is used here.

Like any other SLURM job, this SMP-parallel Mathematica job is submitted to the SLURM queue using the "qsub parallel_run.math" command.
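Since the number of kernels in the notebook has to agree with the number of cores requested from the scheduler, a quick consistency check before submission can catch a mismatch. The sketch below assumes that the cores are requested with a '--ntasks=N' directive on a single node and that the notebook launches kernels with 'LocalMachine[N]'. The helper names and the 'demo_' files are hypothetical stand-ins so that the example is self-contained.

```shell
#!/bin/sh
# Sketch: compare the kernel count in a notebook with the tasks requested in a job script.

# Extract N from a line such as: mykernels = LaunchKernels[LocalMachine[4]];
nb_kernels() {
    sed -n 's/.*LocalMachine\[\([0-9][0-9]*\)\].*/\1/p' "$1" | head -n 1
}

# Extract N from a directive such as: #SBATCH --ntasks=4
job_tasks() {
    sed -n 's/^#SBATCH --ntasks=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Stand-in files used only for this demonstration.
printf '%s\n' 'mykernels = LaunchKernels[LocalMachine[4]];' > demo_smp.nb
printf '%s\n' '#!/bin/bash' '#SBATCH --nodes=1' '#SBATCH --ntasks=4' > demo_run.math

if [ "$(nb_kernels demo_smp.nb)" = "$(job_tasks demo_run.math)" ]; then
    echo "kernel count matches requested cores"
else
    echo "mismatch between notebook kernels and requested cores" >&2
fi
```

Pointing the two helpers at the real notebook and job script performs the same comparison on an actual job.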

The result of this SMP-parallel batch job is stored in the file 'output':

Mathematica 8.0 for Linux x86 (64-bit)
Copyright 1988-2011 Wolfram Research, Inc.

In[1]:= 
Out[1]= {8.14851, {{60, 20}, {61, 7}, {62, 5}, {63, 14}, {64, 15}, {65, 7}, 
 
>     {66, 15}, {67, 3}, {68, 10}, {69, 6}, {70, 12}}}

In[2]:= 
In[2]:= 
In[2]:= 
In[2]:= 
In[3]:= 
In[4]:= 
Out[4]= {r1i0n8, r1i0n8, r1i0n8, r1i0n8}

In[5]:= 
In[5]:= 
Out[5]= {0.544033, {{60, 20}, {61, 7}, {62, 5}, {63, 14}, {64, 15}, {65, 7}, 
 
>     {66, 15}, {67, 3}, {68, 10}, {69, 6}, {70, 12}}}

In[6]:= 
In[6]:= 
In[6]:=

It can be seen in the output that the computations were first done in serial mode on one core. Line "Out[4]" is the output from "ParallelEvaluate[$MachineName, mykernels]". All 4 computational kernels report the same $MachineName because they were started on the same host. "Out[5]" is the original computation performed in parallel on 4 MathKernels.
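The two timings in this output should be compared with some care. Timing reports the CPU time used by the kernel that evaluates it, so for the ParallelTable call it does not include the work done in the 4 subkernels; AbsoluteTiming would report elapsed wall-clock time instead. The sketch below simply computes the ratio of the two reported values and flags it when it exceeds the kernel count, which is a hint that the ratio is not a true speedup. The variable names are illustrative.

```shell
#!/bin/sh
# Sketch: ratio of the Timing values reported in Out[1] and Out[5] of the sample output.
serial=8.14851      # seconds reported for the serial Table
parallel=0.544033   # seconds reported for ParallelTable (master kernel only)
kernels=4           # kernels launched in test_smp.nb

ratio=$(awk -v s="$serial" -v p="$parallel" 'BEGIN { printf "%.2f", s / p }')
echo "ratio of reported timings: $ratio"

# A ratio above the kernel count cannot be a real parallel speedup.
if awk -v r="$ratio" -v k="$kernels" 'BEGIN { exit !(r > k) }'; then
    echo "ratio exceeds the kernel count; the Timing values are not a wall-clock comparison"
fi
```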


Submitting Batch Jobs from Remote Locations to Clusters

A method for doing this is being developed and tested.

For more information on Mathematica:

  • Online documentation is available through the Help menu within the Mathematica notebook front end.
  • The Mathematica Book, 5th Edition (Wolfram Media, Inc., 2003) by Stephen Wolfram.
  • The Mathematica Book is available online.
  • Additional Mathematica documentation is available online.
  • Information on the Parallel Computing Toolkit is available online.
  • Getting Started with Mathematica (Wolfram Research, Inc., 2004).
  • The Wolfram web site http://www.wolfram.com