
Some HPC material collected from the web

 Source: https://computing.llnl.gov

 

 Factors that determine a large-scale program's performance

   * Application-related factors:
       * Algorithms
       * Dataset size
       * Memory usage pattern
       * Use of I/O
       * Communication patterns
       * Task granularity
       * Load balancing
       * Amdahl's Law

   * Hardware factors:
       * Processor architecture
       * Memory hierarchy
       * I/O configuration
       * Network

   * Software factors:
       * OS
       * Compiler
       * Preprocessor
       * Communication protocols
       * Libraries
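Amdahl's Law, listed among the application factors, bounds the speedup that parallelism can deliver when part of the runtime stays serial: speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n the number of workers. A minimal sketch follows; the function name `amdahl_speedup` and the 95% parallel fraction are illustrative assumptions, not taken from the source.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the runtime is parallelizable."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed example: 95% of the runtime can be parallelized.
    for n in (1, 8, 64, 1024):
        print(f"{n:5d} workers -> speedup {amdahl_speedup(0.95, n):6.2f}x")
```

With p = 0.95 the speedup can never exceed 1 / (1 - 0.95) = 20x, however many workers are added, which is why the serial fraction matters more and more as core counts grow.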

 

Performance analysis tools:

  Timers, profilers, system statistics, memory tools
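Of the tools listed, a timer is the simplest to apply. The sketch below wraps a code region in a wall-clock timer using Python's `time.perf_counter`. The helper `timed` and the workload `sum_of_squares` are hypothetical names introduced only for illustration.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def sum_of_squares(n):
    # Stand-in workload; replace with the kernel being measured.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    timings = {}
    with timed("sum_of_squares", timings):
        sum_of_squares(1_000_000)
    for label, seconds in timings.items():
        print(f"{label}: {seconds * 1e3:.2f} ms")
```

A timer only says how long a region took. A profiler is the next step when the question is where inside the region the time went.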

 

Some background on hardware architecture:

Intel Xeon 5500/5600

  4-core / 6-core
  2.4 / 2.8 GHz
  Cache
    L1 data: 32 KB, private
    L1 instruction: 32 KB, private
    L2: 256 KB, private
    L3: 8 MB / 12 MB, shared
  CPU-memory bandwidth: 32 GB/s

 

Intel Xeon E5-2670

  8-core, 2.6 GHz
  Cache
    L1 data: 32 KB, private
    L1 instruction: 32 KB, private
    L2: 256 KB, private
    L3: 20 MB, shared
  CPU-memory bandwidth: 51.2 GB/s
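The CPU-memory bandwidth figures quoted for the two Intel parts match the usual peak formula: memory channels x transfers per second x 8 bytes per transfer. The sketch below checks this. The channel counts and DDR3 speeds are assumptions based on the commonly published configurations for these processor families; they are not stated in the notes.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit DDR channel moves 8 bytes per transfer


def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9


if __name__ == "__main__":
    # Assumed memory configurations (not given in the notes).
    configs = {
        "Xeon 5500/5600, 3 x DDR3-1333": (3, 1333.0),
        "Xeon E5-2670, 4 x DDR3-1600": (4, 1600.0),
    }
    for name, (channels, rate) in configs.items():
        print(f"{name}: {peak_bandwidth_gb_s(channels, rate):.1f} GB/s")
```

These are theoretical peaks per socket; sustained bandwidth measured by a benchmark is normally lower.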

 

AMD processors

  2.2 GHz
  Cache
    L1 data: 64 KB (2-way)
    L1 instruction: 64 KB (2-way)
    L2: 512 KB, private
    L3: 2 MB, shared

  Direct Connect Architecture
    CPU-memory bandwidth: 10.7 GB/s per socket (Socket F)
    Socket-to-socket link bandwidth: 8 GB/s (2-way)

 

  4x InfiniBand interconnect
    * SDR: 1.25 GB/s
    * DDR: 2.5 GB/s
    * QDR: 5 GB/s
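The three InfiniBand figures are the raw signaling rates of a 4x link (10, 20 and 40 Gbit/s) divided by 8 bits per byte. SDR, DDR and QDR links use 8b/10b encoding, so the usable data rate is 80% of the raw rate. The sketch below works through the arithmetic; the function name `link_rates_gb_s` is an assumption made for illustration.

```python
LANES = 4  # a 4x link aggregates four lanes
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b: 8 data bits per 10 line bits

# Per-lane signaling rate in Gbit/s.
LANE_RATE_GBIT = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}


def link_rates_gb_s(generation: str) -> tuple:
    """Return (raw, usable) one-direction bandwidth of a 4x link in GB/s."""
    raw = LANE_RATE_GBIT[generation] * LANES / 8  # bits -> bytes
    return raw, raw * ENCODING_EFFICIENCY


if __name__ == "__main__":
    for gen in ("SDR", "DDR", "QDR"):
        raw, usable = link_rates_gb_s(gen)
        print(f"4x {gen}: raw {raw:.2f} GB/s, usable {usable:.2f} GB/s")
```

Message rates seen by an application are lower again, because protocol headers and software overhead come on top of the line encoding.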

 

Some background on NUMA

  - Physical layout: each node has several (2-4) sockets, and each socket has several (4-8) CPU cores. Cores on the same socket share the L3 cache; socket-to-socket communication goes over the CPU-memory bus and is roughly 2x to 5x slower.

  - Design considerations: CPU affinity (numactl --cpunodebind), local memory policy, and other compiler/runtime options (mpirun --bind-to-socket -bynode).
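CPU affinity can also be set from inside a process. The sketch below uses Python's `os.sched_getaffinity` and `os.sched_setaffinity`, which are available on Linux only. It pins the process to a single CPU, whereas `numactl --cpunodebind` binds to all CPUs of a NUMA node, and it does not set a memory policy. The helper name `pin_to_cpu` is an assumption made for illustration.

```python
import os


def pin_to_cpu(cpu: int) -> set:
    """Restrict the calling process to one CPU and return the previous mask."""
    previous = os.sched_getaffinity(0)  # 0 means "this process"
    os.sched_setaffinity(0, {cpu})
    return previous


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)
    target = min(allowed)  # any CPU the process is already allowed to use
    old_mask = pin_to_cpu(target)
    print(f"pinned to CPU {target}; allowed now: {sorted(os.sched_getaffinity(0))}")
    os.sched_setaffinity(0, old_mask)  # restore the original mask
```

With the default first-touch policy on Linux, memory is typically allocated on the node of the CPU that first writes to it, so pinning a process before it allocates its data tends to keep that data local.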

 

Finally, and most importantly: choose a good algorithm.

 



