How to copy files to/from hydra and what disk(s) to use
How to copy files to/from hydra
- You can copy files to hydra from trusted hosts (SI or SAO/CfA IPs), and from hydra to hosts that allow external ssh/scp/sftp connections (see note below).
- For large transfers, we ask users to use rsync and to limit the bandwidth to 1 MB/s (3.5 GB/h) with the --bwlimit option.
- Remember that cp can also create a high I/O load on the NFS servers (both the NetApps and the other servers), so limit your concurrent I/Os and serialize them as much as possible.
- sftp for small files
- rsync --bwlimit=1000 for large transfers
- Limit, when possible, concurrent I/Os
- Serialize them, or limit the number of high I/O load jobs
- Stop your mv process(es) right away if the load on the head node exceeds 6 (check with, e.g., uptime)
Note: access to SAO/CfA hosts is limited to the border control hosts; for SAO/CfA users, tunneling via these border control hosts is explained on the CF's SSH Remote Access page, or the HEAD Systems Group's SSH FAQ page.
What disk(s) to use
There are currently a dozen distinct scratch file systems, besides your home directory, most (but not all) on NetApp:
|| Mounted on || Size || Notes
|| || 1.0 TB || Public scratch storage,
|| || 5.2 TB || to be consolidated in fewer partitions
|| || 4.0 TB ||
|| || 4.0 TB ||
|| || 9.0 TB ||
|| || 4.7 TB ||
|| || 7.2 TB ||
|| || 2.9 TB ||
|| (total) || 37.9 TB ||
|| /pool/sao (all SAO users) || 5.0 TB || SAO storage
|| /pool/sao_atmos (ATMOS group) || 20.0 TB ||
|| /pool/sao_rtdc (RTDC group) || 5.0 TB ||
|| (total) || 30.0 TB ||
|| || 2.000 TB || SI storage
|| || 2.000 TB ||
|| || 1.000 TB ||
|| (total) || 5.000 TB ||
|| || 1.134 TB || Short term storage,
|| || 7.163 TB || files older than 14 days are scrubbed
|| (total) || 8.297 TB ||
- Use the scratch space (/pool), not your home directory, for the large data storage needed by your computations.
- This scratch space is temporary storage, on a first come, first served basis.
- It is a shared resource; use it responsibly.
- None are backed up (although the ones on the NetApps have the NetApp snapshot feature).
- There are no scrubbers running on the /pool/cluster* file systems, so be considerate of others and regularly delete what you no longer use or need.
- /pool/cluster2 is now its own separate volume.
- We will consolidate /pool/cluster* into fewer partitions (disks).
- Disks fill up by running out of either (i) disk space, or (ii) inodes (which are used to keep track of filenames and directories).
- If you produce a lot of small files, you will run out of inodes before filling the disk space.
Avoid producing, or leaving, lots of small files. You should monitor both disk space use (i.e., df /pool/sao) and inode use (i.e., df -i /pool/sao); if the Use% (or IUse%) gets high, you need to reduce the number of files you are using by either consolidating your small files into a single .zip file, or reorganizing your disk space to use fewer but larger files.
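Concretely, the two checks and the consolidation step look like this ("." and the /tmp paths stand in for a /pool file system; the text above suggests a .zip archive, tar is used here as a stand-in):

```shell
# disk-space use: watch the Use% column
df -h .
# inode use: watch the IUse% column
df -i .
# many small files eat inodes; consolidating them into one
# archive file reclaims them
mkdir -p /tmp/many_small
for i in $(seq 1 100); do echo x > /tmp/many_small/f$i; done
tar czf /tmp/small_files.tar.gz -C /tmp many_small
rm -r /tmp/many_small    # frees the 100 inodes
```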
- You can use the command ~hpc/sbin/check-disks.pl +i to check disk use; the +i option adds the IUse% column after the Use% column.
- Pick a file system, create a subdirectory named after your username, and store your stuff under that subdirectory.
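For instance (using /pool/sao as the example file system; the POOL variable is a stand-in so the snippet can be tried against a throwaway directory):

```shell
# stand-in for a scratch file system such as /pool/sao
POOL=${POOL:-/tmp/pool-demo}
ME=${USER:-$(id -un)}
# claim a subdirectory named after your username and keep
# your files under it
mkdir -p "$POOL/$ME"
chmod 700 "$POOL/$ME"   # optional: keep it private
```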
- The NetApp filer was recently (02/2012) upgraded, and I/O performance has improved.
- Some users have invested in buying their own NetApp disks. Contact me (at hpc@cfa) if you want to do so.
Local Disk Space
- There is some local disk space on each compute node; the size of the local disk varies greatly from node to node.
- We discourage its use unless you run jobs that have heavy I/O needs. Remember that you don't know what node your job(s) will run on, nor how much local free disk space is available.
- If you do use the local disks, purge them when you are done, including anything left over by crashed, terminated, or killed jobs. We do not scrub these disks, nor check them for stale content.
If you have heavy I/O needs, please contact me (at hpc@cfa), so I can look at how to streamline your I/O use and, if needed, offload some of it onto these local disks.
FYI: High Performance File System
- We hope soon to have a high performance file system, once we have the InfiniBand fabric up and running.
- Once such a file system is available, there will be no local disk space left/available on the compute nodes.
- 30 Jan 2012