When it comes to drug discovery, research is everything. The ability to accelerate research can make or break a business and, of course, save lives. A few key decisions in HPC infrastructure design can make a dramatic difference in performance and maximize your cluster’s value.
We recommend you start by using commercial off-the-shelf (COTS) components and best-of-breed technologies from trusted partners. This lets you get more nodes for your money. It also avoids vendor lock-in and the associated risk of being trapped into higher prices without a corresponding improvement in performance. This, in turn, improves return on investment, increases flexibility, and delivers better performance per dollar.
Using composable disaggregated infrastructure (CDI) can also significantly benefit performance. In short, CDI refers to the use of software to pool hardware resources so they can be dynamically combined to meet shifting workload needs. As a result, you can run diverse projects on a single cluster while still optimizing for each unique workload.
You should also ensure your cluster is designed to easily scale as needed with your evolving workload. By leveraging CDI you can dynamically scale individual resource pools like GPUs, accelerators, or storage. You can seamlessly add resources to your cluster and pull them under management. You should also design for linear scalability so that cluster performance and storage reliability improve with each node that you add to your cluster.
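One way to check how close a cluster comes to linear scalability is to compare measured throughput against the ideal at each node count. The sketch below is only an illustration: the function name and the throughput figures are hypothetical, not measurements from any real cluster.

```python
def scaling_efficiency(base_throughput: float, throughput: float, nodes: int) -> float:
    """Fraction of ideal linear speedup achieved at a given node count.

    base_throughput is the single-node rate; a result of 1.0 means perfectly linear.
    """
    return (throughput / base_throughput) / nodes


# Hypothetical measurements: (node count, jobs completed per hour)
measurements = [(1, 100.0), (2, 196.0), (4, 380.0), (8, 720.0)]

base = measurements[0][1]
for nodes, jobs_per_hour in measurements:
    efficiency = scaling_efficiency(base, jobs_per_hour, nodes)
    print(f"{nodes} nodes: {efficiency:.0%} of linear")
```

An efficiency that falls steadily as nodes are added usually points to a shared bottleneck, such as storage or network bandwidth, rather than to the compute nodes themselves.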
Even without CDI, a cluster built on a modular design should let you scale the compute, storage, and networking components as needed. Read more about the benefits of modular HPC/AI design in this white paper about the Atlas AI Cluster.
Traditional NAS solutions can run up to 10Gb/s before they create bottlenecks, which leads to exceedingly long epoch times for drug discovery workloads. You can use fast local NVMe storage as a temporary fix, but this means data is constantly being copied between nodes. This leads to high network traffic, which can be another bottleneck.
Software-defined storage (SDS) is a great solution for data management. By pairing commodity servers with SDS, you can improve I/O bandwidth by up to 10x over traditional NAS. This approach also eliminates constant node-to-node copying by creating a single-namespace data lake that all compute nodes can read from or write to. You can also reduce the total cost of ownership by tiering to S3-compliant object storage on premises or in the public cloud.
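To put the bandwidth figures above in perspective, here is a rough back-of-the-envelope sketch of how long one full pass over a dataset takes at 10 Gb/s versus ten times that rate. The 50 TB dataset size is a hypothetical value chosen for illustration, and the calculation assumes an ideal link with no protocol overhead or contention, so real times would be longer.

```python
def transfer_hours(dataset_tb: float, link_gbps: float) -> float:
    """Ideal time in hours to move dataset_tb (decimal terabytes) over a link_gbps link."""
    bits = dataset_tb * 1e12 * 8          # terabytes -> bits
    seconds = bits / (link_gbps * 1e9)    # gigabits per second -> bits per second
    return seconds / 3600


DATASET_TB = 50.0  # hypothetical dataset size, for illustration only

for label, gbps in [("10 Gb/s", 10.0), ("10x that rate (100 Gb/s)", 100.0)]:
    print(f"{label}: {transfer_hours(DATASET_TB, gbps):.1f} hours per full pass")
```

Under these assumptions, a single pass drops from roughly 11 hours to about 1 hour, which is why storage bandwidth has such a direct effect on epoch times.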
Purpose-built clusters can really optimize drug discovery time-to-result. Consider CDI, as it enables your workforce to perform without hitting bottlenecks, and allows you to scale seamlessly, while maintaining high-level security.
Bare-metal performance (or its equivalent with CDI) is ideal. If you are thinking about deploying an AI solution to the cloud, though, you ideally want its resources in the same data center. If you have a hybrid model, the resources that incur the greatest costs should be on premises.
You can also improve efficiency through networking by using NVMe and robust (10 GbE) networking technology. We recommend NVIDIA/Mellanox 200Gb/s HDR InfiniBand. Your storage should also be highly parallel and support a large single namespace.
To learn more about HPC/AI for drug discovery, watch our on-demand webinar.
Silicon Mechanics, Inc. is one of the world’s largest private providers of high-performance computing (HPC), artificial intelligence (AI), and enterprise storage solutions. Since 2001, Silicon Mechanics’ clients have relied on its custom-tailored open-source systems and professional services expertise to overcome the world’s most complex computing challenges. With thousands of clients across the aerospace and defense, education/research, financial services, government, life sciences/healthcare, and oil and gas sectors, Silicon Mechanics solutions always come with “Expert Included” SM.