Quick Start Guide for ARCHIE-WeSt users
This guide briefly describes how to log in to ARCHIE-WeSt, transfer data, use modules, create a job script and submit a job. It also covers best practice for using ARCHIE-WeSt and some additional information that may be useful in practice.
Login to ARCHIE-WeSt
ARCHIE-WeSt has four login nodes called archie-w, archie-e, archie-s and archie-t. To log in you need an ARCHIE-WeSt account. Use your DS username and password, and specify a particular login node:
1) Terminal Access (ssh)
To log in to ARCHIE-WeSt via ssh (e.g. from Linux/Mac), use any of the following login nodes:
archie-w.hpc.strath.ac.uk (184.108.40.206)
archie-e.hpc.strath.ac.uk (220.127.116.11)
archie-s.hpc.strath.ac.uk (18.104.22.168)
archie-t.hpc.strath.ac.uk (22.214.171.124)
ssh -X username@archie-w.hpc.strath.ac.uk
-X is optional and will tunnel X windows back to your desktop.
You will see a table summarizing your project usage and disk usage. Note that if you are assigned to a project but your usage is 0.0, the project will not be listed.
From Windows, download PuTTY.
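If you log in from Linux/Mac often, a small shell helper can save typing. The sketch below is only an illustration: the function names archie_host and archie_ssh are made up for this example and are not provided by ARCHIE-WeSt. The hostnames are those of the four login nodes listed above.

```shell
# Map a login-node letter (w, e, s or t) to its full hostname.
archie_host() {
    case "$1" in
        w|e|s|t) printf 'archie-%s.hpc.strath.ac.uk\n' "$1" ;;
        *) echo "unknown login node: $1" >&2; return 1 ;;
    esac
}

# Open an ssh session on the chosen login node.
# Usage: archie_ssh USERNAME NODE [extra ssh options, e.g. -X]
archie_ssh() {
    if [ "$#" -lt 2 ]; then
        echo "usage: archie_ssh USERNAME NODE [ssh options]" >&2
        return 2
    fi
    _user="$1"
    _node="$2"
    shift 2
    _host="$(archie_host "$_node")" || return 1
    # Any remaining arguments are passed to ssh before the destination.
    ssh "$@" "${_user}@${_host}"
}
```

For example, archie_ssh your_ds_username w -X would open a session on archie-w with X windows tunnelled back to your desktop.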
2) Graphical Desktop Session
A graphical desktop session can be obtained using the ThinLinc remote desktop client (Windows/Linux/Mac).
Pressing F8 from within the desktop session will give you access to the ThinLinc client options.
You can “suspend” the session by simply closing the window via the “X” in the top left-hand corner. You can of course resume suspended sessions.
However, if you have no applications running, we recommend that you log out so as to release the license.
3) Visualization Servers
Use the ThinLinc remote desktop client to connect to archie-viz.hpc.strath.ac.uk. Follow the instructions above for “2) Graphical Desktop Session”, but replace archie-login.hpc.strath.ac.uk with archie-viz.hpc.strath.ac.uk.
For the best performance, prefix all GUI commands with vglrun to ensure that your applications use the graphics card installed in the server. For example, to run vmd type:
vglrun vmd
instead of simply vmd.
File Systems and Data Transfer
1) File Systems
- /home: backed-up
- /lustre: not backed-up; because of its high performance it should be used to run jobs
ARCHIE-WeSt operates using soft and hard quotas on the disk space:
|File System||Soft Quota (in GB)||Hard Quota (in GB)|
If the soft quota is exceeded, the user has 7 days to get back under it; otherwise the disk will be write-protected. The hard quota is a second limit that cannot be exceeded. In well-justified cases the /lustre disk allocation may be increased.
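The quota rule above can be expressed as a small check. This is a sketch only: the function names are invented for this example, the soft and hard limits must be supplied by you (use the values that apply to your account), and du reports directory size, which may differ slightly from the figure the quota system uses.

```shell
# Classify usage against soft and hard quotas (all values in whole GB).
#   ok         : at or below the soft quota
#   over-soft  : above the soft quota, 7 days to get back under it
#   hard-limit : the hard quota has been reached, no more data can be written
quota_state() {
    used="$1"; soft="$2"; hard="$3"
    if [ "$used" -ge "$hard" ]; then
        echo "hard-limit"
    elif [ "$used" -gt "$soft" ]; then
        echo "over-soft"
    else
        echo "ok"
    fi
}

# Size of a directory in whole GB, rounded up, as measured by du.
usage_gb() {
    du -sk "$1" | awk '{ printf "%d\n", ($1 + 1048575) / 1048576 }'
}
```

For example, quota_state "$(usage_gb "$HOME")" SOFT HARD, with SOFT and HARD replaced by your own limits in GB, reports which of the three situations applies to your home directory.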
2) Data Transfer
From/to Windows desktop
For data transfer between ARCHIE-WeSt and a Windows desktop, download WinSCP.
For large data transfers connect to dm1.hpc.strath.ac.uk rather than to a particular login node. The Data Mover 1 (dm1) network connection is 10Gb/s, while other parts of the ARCHIE-WeSt file system are connected at 1Gb/s.
From/to Linux/Mac desktop
From a Linux/Mac desktop, user cwb08102 would do:
scp -pr cwb08102@dm1.hpc.strath.ac.uk:/lustre/strath/phys/cwb08102/MY_DATA .
-p - preserves file attributes and timestamps
-r - transfers the entire directory
Note that cwb08102 stores her big data in the /lustre file system.
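The same scp options work in both directions. The sketch below wraps them in two helper functions that always go through dm1; the function names (dm1_spec, archie_pull, archie_push) are hypothetical and exist only in this example.

```shell
# Data mover host recommended for large transfers.
DM1_HOST="dm1.hpc.strath.ac.uk"

# Build the "user@host:path" argument that scp expects for a remote location.
dm1_spec() {
    printf '%s@%s:%s\n' "$1" "$DM1_HOST" "$2"
}

# Copy a remote file or directory from ARCHIE-WeSt to a local destination.
# Usage: archie_pull USERNAME REMOTE_PATH LOCAL_DEST
archie_pull() {
    scp -pr "$(dm1_spec "$1" "$2")" "$3"
}

# Copy a local file or directory to a remote directory on ARCHIE-WeSt.
# Usage: archie_push USERNAME LOCAL_PATH REMOTE_DEST
archie_push() {
    scp -pr "$2" "$(dm1_spec "$1" "$3")"
}
```

For example, archie_pull cwb08102 /lustre/strath/phys/cwb08102/MY_DATA . downloads the MY_DATA directory into the current local directory.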
Transferring Files to H and I Drives (Strathclyde users only)
To copy files to your I drive space on dm1, type:
(you will be prompted for your DS password)
I drive will be mounted at ~username/i_drive
To copy files to your H drive space on dm1, type:
H drive will be mounted at ~username/h_drive
You can then copy files to ~username/i_drive or ~username/h_drive . Once finished, type
There are a variety of software packages and libraries installed each of which require different environment variables and paths to be set up.
This is handled via “modules”. To view installed software, type: module avail
At any time a module can be loaded or unloaded and the environment will be automatically updated so as to be able to use the desired software and libraries.
|To list available modules type:||module avail|
|To list loaded modules, type:||module list|
|To load the Intel Compiler suite, for example, type:||module load compilers/intel/2012.0.032|
|To remove a module:||module rm compilers/intel/2012.0.032|
These commands can be added to your .bashrc file so that the modules are loaded automatically when you log in. Note that the order in which modules are loaded might be important in some cases.
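A .bashrc fragment along these lines would load your usual modules at login. It is a sketch under assumptions: the variable ARCHIE_MODULES and the function load_archie_modules are names chosen for this example, and the module shown is simply the Intel compiler suite from the table above. Modules are loaded in the order they are listed.

```shell
# Modules to load at login, separated by spaces, in the order required.
ARCHIE_MODULES="compilers/intel/2012.0.032"

load_archie_modules() {
    # Do nothing if the module command is not available in this shell.
    command -v module >/dev/null 2>&1 || return 0
    for m in $ARCHIE_MODULES; do
        module load "$m" || echo "could not load module $m" >&2
    done
}

load_archie_modules
```

Checking for the module command first keeps the fragment harmless in shells where modules are not set up.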
Create a Job Submission Script
To run a calculation on the ARCHIE-WeSt compute nodes you need to write a job submission script that tells the Sun Grid Engine (SGE) system what type of compute nodes you need (normal, SMP, GPU) and your project ID. To run a parallel process you also need to specify the parallel environment (mpi-verbose or smp-verbose). For efficient use of ARCHIE-WeSt you are also advised to use “back-filling”: it is particularly efficient for short parallel calculations and lets you use nodes reserved for a bigger parallel job.
All jobs should be submitted from the /lustre file system. This high-performance storage allows results to be written faster, so your calculations use less CPU time and you get the results sooner than with the /home file system.
1) Sample serial job script:
# Simple serial job submission script
# Specifies that all environment variables active within the qsub
# utility be exported to the context of the job.
#$ -V
# Execute the job from the current working directory. Standard output and
# standard error files will be written to this directory
#$ -cwd
# Submit to the queue called serial.q
#$ -q serial.q
# Merges standard error stream with standard output
#$ -j y
# Specifies the name of the file containing the standard output
#$ -o out.$JOB_ID
2) Sample parallel job-script:
# NAMD job-script
export PROCS_ON_EACH_NODE=12
# ************* SGE qsub options ****************
#Export env variables and keep current working directory
#$ -V -cwd
#Specify the project ID
#$ -P project.prj
#Select parallel environment and number of parallel queue slots (nodes)
#$ -pe mpi-verbose 10
#Combine STDOUT/STDERR
#$ -j y
#Specify output file
#$ -o out.$JOB_ID
#Request resource reservation (reserve slots on each scheduler run until enough have been gathered to run the job)
#$ -R y
# ************** END SGE qsub options ************
export NCORES=`expr $PROCS_ON_EACH_NODE \* $NSLOTS`
export OMPI_MCA_btl=openib,self
# Execute NAMD2 with configuration script with output to log file
charmrun +p$NCORES -v namd2 namd.inp > namd.out
Note: lines starting with # are comments, lines starting with #$ are SGE directives.
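To see how the parallel job-script arrives at the number of processes, the arithmetic can be tried on its own. In a real job SGE sets NSLOTS when the job starts; here it is hard-coded to 10 (the value requested with -pe mpi-verbose 10) purely for illustration.

```shell
PROCS_ON_EACH_NODE=12   # cores used on each node, as in the sample script
NSLOTS=10               # set by SGE at run time; fixed here for illustration only

# Same calculation as the expr line in the job-script, using shell arithmetic.
NCORES=$((PROCS_ON_EACH_NODE * NSLOTS))

echo "charmrun would start $NCORES processes"
# prints: charmrun would start 120 processes
```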
The runtime is given in the format hh:mm:ss. If a job exceeds its requested runtime, it is killed automatically.
Before submitting a job ensure you have loaded all modules required (see above).
For more sample job-scripts click here.
Basic Job Submission and Monitoring
1) Types of queues:
Note that the serial and parallel queues run on the same compute nodes (3312 in total). Some of them are allocated to the serial queue and the remainder to the parallel one. The division may change based on system load and user demand. The AU multiplier factor for the serial and parallel queues is 1.
For more details about ARCHIE-WeSt prices see http://www.archie-west.ac.uk/information/archie-fees.
2) Basic SGE Commands
Jobs on the ARCHIE-WeSt machine are submitted and controlled using Sun Grid Engine (SGE).
qstat
|– Lists your jobs (qw – waiting, r – running)|
qstat -u "*"
|– Lists all jobs in the queue(s) by all users|
qstat -g c
|– Provides summary overview of the system use|
|– Lists all queues|
qsub start-job.sh
|– Launches job using the script start-job.sh|
qstat -j JOBID
|– Gives fuller detail on a job|
qacct -j JOBID
|– Gives details on a completed job|
qdel JOBID
|– Deletes job from queue|
3) Submitting a job
All job commands and SGE directives should be placed in a script (e.g. start-job.sh) and launched by typing:
qsub start-job.sh
You will then see the message:
your job 9355 ("start-job.sh") has been submitted
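If you want to use the job number in later commands, it can be extracted from the confirmation message. This is a minimal sketch; job_id_from_qsub is a name invented for this example, and it assumes the message has the form shown above.

```shell
# Extract the numeric job ID from the confirmation line printed by qsub.
# Prints nothing if the line does not have the expected form.
job_id_from_qsub() {
    printf '%s\n' "$1" | sed -n 's/^[Yy]our job \([0-9][0-9]*\) .*/\1/p'
}

msg='your job 9355 ("start-job.sh") has been submitted'
job_id_from_qsub "$msg"
# prints: 9355
```

In practice you would capture the message at submission time, for example JOBID="$(job_id_from_qsub "$(qsub start-job.sh)")", and then pass $JOBID to qstat -j or qdel.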
4) Monitoring a job
Progress can be monitored via the qstat command.
job-ID prior name user state submit/start queue slots
9355 0.50894 start-job.sh cwb08102 r 05/31/2012 09:41:35 firstname.lastname@example.org 6
If the user does not have any running jobs qstat will not return any output.
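With many jobs in the queue, a per-state count is often easier to read than the full listing. The sketch below assumes the qstat column layout shown above, where the state (r, qw, ...) is the fifth column; count_jobs_by_state is a hypothetical helper name.

```shell
# Read qstat output on standard input and print the number of jobs per state.
# Only lines that begin with a numeric job-ID are counted, so header lines
# are skipped automatically.
count_jobs_by_state() {
    awk '$1 ~ /^[0-9]+$/ { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Typical use is qstat | count_jobs_by_state, which prints one line per state, for example "qw 3" and "r 2". If you have no jobs, nothing is printed.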
5) Deleting a job
To delete a job from the queue (the job can be in any state, i.e. running or waiting), type:
qdel JOBID
6) Duration of jobs
The maximum queuing time for one job is 61 days. The maximum wall-clock duration of one job is 14 days.
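A requested runtime in hh:mm:ss format can be checked against the 14-day wall-clock limit before you submit. The helper names below are invented for this sketch. It also assumes that the runtime is requested through the standard SGE h_rt resource (a directive of the form #$ -l h_rt=hh:mm:ss), which this guide does not show explicitly.

```shell
# Maximum wall-clock duration of one job: 14 days, in seconds.
MAX_RUNTIME_SECONDS=$((14 * 24 * 3600))

# Convert a runtime in hh:mm:ss format to seconds.
# Prints nothing if the argument does not have three colon-separated fields.
hms_to_seconds() {
    echo "$1" | awk -F: 'NF == 3 { print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeeds if the given hh:mm:ss runtime is within the 14-day limit.
fits_runtime_limit() {
    secs="$(hms_to_seconds "$1")"
    [ -n "$secs" ] && [ "$secs" -le "$MAX_RUNTIME_SECONDS" ]
}
```

For example, fits_runtime_limit 336:00:00 succeeds (336 hours is exactly 14 days), while fits_runtime_limit 400:00:00 fails.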
In all graphical presentations such as conference presentations, posters, lectures etc., the graphical logo of ARCHIE-WeSt should be used (click here to download the logo).
In papers, reports etc., include this statement in the Acknowledgement paragraph: “Results were obtained using the EPSRC funded ARCHIE-WeSt High Performance Computer (www.archie-west.ac.uk). EPSRC grant no. EP/K000586/1.”
Strathclyde users are obliged to update PURE and associate all papers, conference talks and posters, as well as completed PhD theses, with UOSHPC (available under “equipment”).
- Do not launch a production job without knowing:
A. How much data it will generate (disk quota limitation)
B. How much time it will take to complete (runtime limit 14 days)
- Do not submit jobs from the /home directory. All jobs should be submitted from /lustre
- For data transfer use dm1. This is particularly important for large data transfers
- There is no /lustre back-up, so copy your data to another, secure location (desktop computer, university storage)
- Keep important ARCHIE-WeSt files in /home because this drive is backed-up
- For post-processing data you can use the visualization servers (archie-viz.hpc.strath.ac.uk). Because the number of licenses is limited, please log out as soon as your work is finished to release the license.
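The rule about submitting only from /lustre can be enforced with a small wrapper around qsub. This is a sketch, not an official tool: on_lustre and safe_qsub are names made up for this example, and the check simply looks at the path of the current working directory.

```shell
# Succeeds if the given path is /lustre itself or lies below it.
on_lustre() {
    case "$1" in
        /lustre|/lustre/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Submit a job only when the current directory is on /lustre.
# Usage: safe_qsub start-job.sh
safe_qsub() {
    if on_lustre "$(pwd)"; then
        qsub "$@"
    else
        echo "Refusing to submit: $(pwd) is not on /lustre" >&2
        return 1
    fi
}
```

Using safe_qsub start-job.sh in place of qsub start-job.sh prevents accidental submissions from /home.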
Basic Linux presentation is available here.
Full ARCHIE-WeSt guide is available here.
HPC introductory presentation is available here.
More job-scripts examples are available here.