from:	Thomas Mac Cooper 
to:	"Thomas, Holly Sarah" 
cc:	"Zhao, Jun-Hui", Thomas Mac Cooper
date:	Thu, Oct 19, 2017 at 12:44 AM
subject:	Re: miriad computing requirements
Hi, hilodr1 is running on our newest virtualization host, so all I/O is network based. Since Miriad is disk based, a better solution for it might be something like a Smithsonian standard configuration such as the Optiplex 3050 MT. With 2 x 256GB SSDs it is only $1500 vs $8000 for the virtual setup in a 1U rackmount. That setup has 4 CPU cores at 3.6GHz.

Anyway, just saying maybe we could think about single-purpose, single-user machines once we have our storage needs covered this year. Users would have to copy data onto the SSDs and then copy results off the SSDs, but I doubt that would add much to the processing time over a 1Gbit network. We could even consider adding a local 10Gbit network if the copy times were excessive. Generally, here in Hawaii we get about 120-150GB/hour over the 1Gbit Ethernet.
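
As a rough check on the copy-time point above, here is a minimal Python sketch that turns the observed 120-150GB/hour rate into a per-copy time. The function name is hypothetical, and the 40 GB size is only an example taken from the dataset mentioned further down in this thread.

```python
# Rough copy-time estimate at the observed 1Gbit transfer rates.
# copy_time_minutes is an illustrative helper, not part of any existing tool.

def copy_time_minutes(size_gb: float, rate_gb_per_hour: float) -> float:
    """Return the minutes needed to copy size_gb at rate_gb_per_hour."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return 60.0 * size_gb / rate_gb_per_hour

SLOW_RATE = 120.0  # GB/hour, lower end of the observed range
FAST_RATE = 150.0  # GB/hour, upper end of the observed range

dataset_gb = 40.0  # example size (assumption for illustration)
best = copy_time_minutes(dataset_gb, FAST_RATE)
worst = copy_time_minutes(dataset_gb, SLOW_RATE)
print(f"{dataset_gb:.0f} GB: {best:.0f}-{worst:.0f} minutes each way")
```

For a 40 GB dataset this gives 16-20 minutes per copy. For comparison, the 1Gbit line rate is 125 MB/s, or 450 GB/hour, so the observed rates are roughly a quarter to a third of the theoretical maximum.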

Okay, talk to you soon          ...mac

On 10/18/2017 11:35 AM, Zhao, Jun-Hui wrote:

    Hi Holly,
          The fastest computer around CfA appears to be hilodr1, based on my processing of Amy Steele's 40 GByte dataset.
    hilodr1 is a factor of 1.15 faster than rtdc9. I checked both computers: hilodr1 simply has four processors, each
    with a single CPU core of QEMU Virtual CPU version (cpu64-rhel6) and a clock speed of 3.4 GHz, while
    rtdc9 has sixteen processors, each with 4 CPU cores of Intel(R) Xeon(R) CPU E5-2637 v3 @ 3.50GHz,
    and a faster clock speed of 3.5 GHz.

         The structure of rtdc9 appears to be more sophisticated and advanced, but for offline data reduction software that is
    way behind current computer technology, rtdc9 already seems to be more than is needed. For example, most of the programs (Miriad) use only one processor but do require
    a large amount of data exchange between the processor and a disk file. Thus, disks with a fast data exchange rate, such as SSDs,
    may be needed. The price of SSDs has been declining rapidly over the years.

         Also, disk space was a critical issue for me. Usually, pre-processing (not including calibration and imaging) requires disk space
    of 9 times more than the size of the raw SMA archived data, for intermediate data file swapping; e.g. for a 100 GB SMA online dataset,
    one needs about 1 TB to pre-process it into Tsys-corrected data, which is twice as large as the online data from integer-sampling.
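
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: it assumes "9 times more" means 9x the raw size in addition to the raw data itself (about 10x in total, which matches the 100 GB to 1 TB example), and the function names are hypothetical.

```python
# Disk space estimate for pre-processing, following the rule of thumb above.
# Assumption: total space = raw data + 9x raw for intermediate files.

EXTRA_FACTOR = 9   # intermediate files, as a multiple of the raw size
TSYS_FACTOR = 2    # Tsys-corrected output is about twice the raw size

def preprocessing_space_gb(raw_gb: float) -> float:
    """Total disk space (GB) needed while pre-processing raw_gb of data."""
    return raw_gb * (1 + EXTRA_FACTOR)

def tsys_corrected_size_gb(raw_gb: float) -> float:
    """Approximate size (GB) of the Tsys-corrected dataset."""
    return raw_gb * TSYS_FACTOR

raw = 100.0  # GB, the example size used above
print(f"raw {raw:.0f} GB -> about {preprocessing_space_gb(raw):.0f} GB "
      f"of working space, {tsys_corrected_size_gb(raw):.0f} GB Tsys-corrected")
```

For the 100 GB example this prints about 1000 GB of working space and a 200 GB Tsys-corrected dataset.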

         You may consult Mac Cooper for the details of his hilodr1 setup. It seems to me that he has put together a clever setup for
    data reduction computing at the site.

         Jun-Hui



    On Wed, Oct 18, 2017 at 3:47 PM, Thomas, Holly Sarah wrote:

        Hi Jun-Hui,
        I'm just putting together a short summary of recommendations on the
        RTDC website for which computer to use. As you can see I need some
        input on the Miriad section. Is there anything regarding computing
        resources that you would recommend?

        https://www.cfa.harvard.edu/rtdc/RGcomp/whatcomp/
        

        Thanks,
        Holly

        --     Holly Thomas
        Radio Telescope Data Center
        Harvard-Smithsonian Center for Astrophysics
        +1 (617) 496-0172   |
        holly.thomas@cfa.harvard.edu