An Impractical Approach

Would you ever think that a smart car could beat a souped-up sports car on a quarter-mile drag strip? It could, if you modified the tiny smart car with an over-powered engine packed into its lightweight frame. This is a clever trick for getting maximum power over a short distance. But would you ever race one of these cars on an F1 track? Or tow a boat? Or take the kids to swim practice?

Although these mental images are entertaining, a super-powered smart car does not make a useful or effective value proposition for these activities. Think of the stress the engine would put on the brakes, chassis, and steering. Think of the maintenance, component upgrades, and labor required to operate such a car.

A Pragmatic Approach

Servers are designed and built in the same way: for specific workloads. They are not the sum of their individual components. Each piece of hardware must be optimized to work with the other hardware and firmware to tackle specific workloads effectively. A powerful component without the right support does not perform at its full potential.

If you take the engine of a race car and install it in the frame of a midsized sedan, there will be significant performance left on the table. This is exactly the case when dropping the 2nd Gen AMD EPYC processor (code-named Rome) into a server designed for the 1st Gen AMD EPYC processor (code-named Naples).

This raises questions about the release of AMD's 2nd generation EPYC CPU. How will you leverage this technology effectively? Does a drop-in upgrade make technical or business sense, especially compared with a Rome-optimized system?

Technical Sense

If you've ever waited in line to check out at a retail supercenter, you've experienced how the throughput of a system depends on its slowest part. This may induce anxiety when you think about replacing old CPUs with new, advanced ones.
By dropping a Rome processor into a Naples-based platform, you will experience lower performance, reduced capabilities, slower memory speeds, subpar networking, and limited platform scalability. Memory and input/output latency will slow your 64-core AMD 2nd generation EPYC CPU like a Boy Scout loaded down with a refrigerator's worth of food plus his family's entire collection of kitchenware.

Business Sense

Using a server effectively also has business and financial implications. Cobbled-together systems can become labor intensive to maintain when a rack goes down because of an aged component. Operating costs in years 4 through 6 can be 10 times higher than the initial procurement cost of the server, and refreshing your servers around the three-year mark has been shown to reduce overall costs. That is just the cost of operations; it does not account for the better outcomes and innovative solutions your employees will create when they are free to pursue non-maintenance tasks.

Returning to the original analogy, one size does NOT fit all. A Toyota Prius could tow a boat, but why not use an appropriate car or truck? Matching the right workload with the right server will increase performance, automate management, and improve security (i.e., Dell EMC PowerEdge servers with the 2nd generation AMD EPYC processor). This includes:

- More NVMe for better virtualization and software-defined solutions
- Increased cores per socket for hyper-converged infrastructure and virtual machines
- Lower latency and Gen 4 PCIe, with GPU slots, for data analytics, artificial intelligence, and machine learning

Red Light, Green Light, Go!

Watching a smart car beat a Mustang can be entertaining, but is it a pragmatic solution for towing boats or everyday commuting? Should you drop a 2nd generation AMD EPYC chip into a Naples-based server? We all get excited when a new version of a technology we love is introduced. The hardest part is waiting!
Dell EMC PowerEdge is releasing a portfolio of servers that are designed and optimized to leverage the full capabilities of the 2nd generation AMD EPYC processor. Be patient: sustained, balanced performance beats a souped-up server with power left on the table.

https://www.dellemc.com/resources/en-us/asset/analyst-reports/products/servers/idc-infobrief-amd-rome-optimized-platforms.pdf
https://www.delltechnologies.com/en-us/blog/the-6-myths-about-servers-that-almost-everyone-believes/
https://www.delltechnologies.com/en-us/blog/name-brand-or-commodity-server-research-reveals-answer/
This blog is co-authored by Claudio Fahey, Chief Solutions Architect, Artificial Intelligence and Analytics, Unstructured Data Solutions, Dell Technologies, and Jacci Cenci, Senior Technical Marketing Engineer, NVIDIA.

Over the last few years, Dell Technologies and NVIDIA have been helping our joint customers fast-track their Artificial Intelligence and Deep Learning initiatives. For those looking to leverage a pre-validated hardware and software stack for DL, we offer Dell EMC Ready Solutions for AI: Deep Learning with NVIDIA, which also features Dell EMC Isilon all-flash storage. For organizations that prefer to build their own solution, we offer the ultra-dense Dell EMC PowerEdge C-series with NVIDIA V100 Tensor Core GPUs, which allows scale-out AI solutions from four up to hundreds of GPUs per cluster. We also offer the Dell EMC DSS 8440 server, which supports up to 10 NVIDIA V100 GPUs or 16 NVIDIA T4 Tensor Core GPUs. Our collaboration is built on the philosophy of offering flexibility and informed choice across a broad portfolio that combines the best GPU-accelerated compute, scale-out storage, and networking.

To give organizations even more flexibility in how they deploy AI from sandbox to production, with breakthrough performance for large-scale AI, Dell Technologies and NVIDIA have recently collaborated on a new reference architecture for AI and DL workloads that combines the Dell EMC Isilon F800 all-flash scale-out NAS, Dell EMC PowerSwitch S5232F-ON switches, and NVIDIA DGX-2 systems.

Key components of the reference architecture include:

- Dell EMC Isilon all-flash scale-out NAS storage delivers the scale (up to 58 PB), performance (up to 945 GB/s), and concurrency (up to millions of connections) to eliminate the storage I/O bottleneck, keeping the most data-hungry compute layers fed to accelerate AI workloads at scale.
A single Isilon cluster may contain an all-flash tier for high performance and an HDD tier for lower cost, and files can be moved automatically across tiers to optimize performance and cost throughout the AI development life cycle.

- The PowerSwitch S5232F-ON is a 1 RU switch with 32 QSFP28 ports that provide 40 GbE and 100 GbE connectivity. This series supports RDMA over Converged Ethernet (RoCE); combined with GPUDirect RDMA, this allows the NIC to move data to and from GPU memory directly across the PCIe bus, without involving the CPU. Both RoCE v1 and v2 are supported.
- The NVIDIA DGX-2 system includes fully integrated hardware and software that is purpose-built for AI development and high-performance training at scale. Each DGX-2 system is powered by 16 NVIDIA V100 Tensor Core GPUs that are interconnected using NVIDIA NVSwitch technology, providing an ultra-high-bandwidth, low-latency fabric for inter-GPU communication.

The next figure shows the network metrics during the same ResNet-50 training on 48 GPUs. The total storage throughput was 4,501 MB/sec. Based on the 15-second average network utilization for the RoCE network links, it appears that the links were using less than 80 MB/sec (640 Mbps) during ResNet-50. However, this is extremely misleading. We measured the network utilization with millisecond precision and plotted it in the figure below. This shows periodic spikes of up to 60 Gbps per link per direction. For VGG-16, we measured peaks of 80 Gbps (not shown).

TensorFlow Storage Benchmark

To understand the limits of Isilon when used with TensorFlow, we created a TensorFlow application (TensorFlow Storage Benchmark) that only reads the TFRecord files (the same ones used for training). No preprocessing or GPU computation is performed; the only work is counting the number of bytes in each TFRecord. The application also has an option to synchronize all readers after each batch of records, forcing them to proceed at the same speed.
This option was enabled to better simulate a DL or ML training workload. The result of this benchmark is shown below. With this storage-only workload, the maximum read rate obtained from the eight Isilon nodes was 24,772 MB/sec. Because Isilon has been demonstrated to scale to 252 nodes, additional throughput can be obtained simply by adding Isilon nodes.

Conclusion

Here are some of the key findings from our testing of the Isilon, PowerSwitch, and NVIDIA DGX-2 reference architecture:

- Compelling performance results across industry-standard DL benchmarks from 16 through 48 GPUs, with no degradation in throughput
- Linear scalability from 16 to 48 GPUs while keeping the GPUs pegged at >97% utilization
- The Isilon F800 system can deliver more than 24 GB/sec of synchronous reads, which is typical of DL and ML training workloads

Dell EMC Isilon-based DL solutions deliver the capacity, performance, and high concurrency to eliminate storage I/O bottlenecks for AI. This provides a rock-solid foundation for production-ready, large-scale, enterprise-grade DL solutions with a future-proof scale-out architecture that meets your AI needs of today.

If you are interested in learning more, please see the Dell EMC Isilon, PowerSwitch and NVIDIA DGX-2 Systems for Deep Learning whitepaper. You'll find the complete reproducible benchmark methodology, hardware and software configuration, sizing guidance, performance measurement tools, and some useful scripts. Finally, check out NVIDIA GTC Digital to learn about the latest innovations.

Benchmark Methodology

To validate the new reference architecture, we ran industry-standard image classification benchmarks using a 22 TB dataset to simulate real-world training workloads. We used three DGX-2 systems (48 GPUs total) and eight Isilon F800 nodes connected through a pair of PowerSwitch S5232F-ON switches. Various benchmarks from the TensorFlow Benchmarks repository were executed.
This suite of benchmarks trains an image classification convolutional neural network (CNN) on labeled images; essentially, the system learns whether an image contains a cat, dog, car, train, etc. We used the well-known ILSVRC2012 image dataset (often referred to as ImageNet), which contains about 1.3 million training images in 148 GB and is commonly used by DL researchers for benchmarking and comparison studies. To approximate the performance of this reference architecture on datasets much larger than 148 GB, the dataset was duplicated 150 times, creating a 22 TB dataset.

To determine whether the network or storage impacted performance, we ran identical benchmarks on the original 148 GB dataset. After the first epoch, the entire dataset was cached in the DGX-2 system, and subsequent runs had zero storage I/O. These results are labeled Linux Cache in the next section.

Benchmark Results

There are a few conclusions that we can draw from the benchmark results shown in the figure below:

- Image throughput, and therefore storage throughput, scales linearly from 16 to 48 GPUs.
- There is no significant difference in image throughput when the data comes from Isilon instead of the Linux cache.

The following figure shows system metrics captured during three runs of ResNet-50 training on 48 GPUs. There are a few conclusions that we can draw from the GPU and CPU metrics:

- Each GPU had 97% utilization or higher, indicating that the GPUs were fully utilized.
- The maximum CPU core utilization on the DGX-2 system was 70%, which occurred with ResNet-50.
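The core of the TensorFlow Storage Benchmark described above is simply a loop that walks TFRecord files and counts payload bytes. The following is a minimal, stdlib-only sketch of that idea, not the benchmark itself: the real application uses TensorFlow's readers and an optional cross-reader synchronization barrier, real TFRecord files carry masked CRC32C checksums (written as zeros and skipped on read here), and the helper names are illustrative.

```python
import io
import struct

def write_record(fp, payload: bytes) -> None:
    # TFRecord framing: u64 little-endian length, u32 CRC of the length,
    # the payload bytes, then u32 CRC of the payload. This sketch writes
    # zero CRCs because the reader below skips checksum verification.
    fp.write(struct.pack("<Q", len(payload)))
    fp.write(struct.pack("<I", 0))      # length CRC (not computed here)
    fp.write(payload)
    fp.write(struct.pack("<I", 0))      # payload CRC (not computed here)

def count_payload_bytes(fp) -> int:
    # The benchmark's only "work": walk the file record by record and
    # count payload bytes, generating pure read I/O with no preprocessing.
    total = 0
    while True:
        header = fp.read(8)
        if len(header) < 8:             # end of file
            break
        (length,) = struct.unpack("<Q", header)
        fp.seek(4, io.SEEK_CUR)         # skip length CRC
        total += len(fp.read(length))
        fp.seek(4, io.SEEK_CUR)         # skip payload CRC
    return total

if __name__ == "__main__":
    buf = io.BytesIO()
    for payload in (b"a" * 100, b"b" * 2048, b"c" * 7):
        write_record(buf, payload)
    buf.seek(0)
    print(count_payload_bytes(buf))     # prints 2155
```

Because the per-record work is only byte counting, a run of this kind is bounded by storage and network rather than host compute, which is what makes it useful for probing the read limits reported above.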
We work with IDC annually to identify key trends that are expected to accelerate monitor innovation and contribute to a more efficient organization. Here are the five key outcomes from our Future of Work 2020 study, which will continue to develop over the next few years and shape an organization's monitor selection criteria.

Collaboration

As the lines between personal and professional life converge, organizations are placing increasing importance on remote work and creating a more flexible and collaborative workplace. IDC predicts that by 2024, 30% of G2000¹ firms will rely on a secure, highly integrated and collaborative ecosystem that helps companies function as borderless organizations².

Monitors are key enablers of smarter and faster collaboration. For example, large interactive monitors with touch functionality improve meeting productivity by supporting diverse collaboration needs, including communicating across different applications and annotating shared presentations or documents on a single screen.

Companies will be looking at different monitor solutions for employees' workspaces, various types of meeting spaces, and even remote offices to better connect and collaborate with their teams. Dell sees monitors offering users the ability to connect quickly, present and share content, and collaborate effortlessly.

Sustainability

According to the U.S. Bureau of Labor Statistics, millennials are forecast to comprise half of the workforce by 2020. The changing demographics of the future workforce will urge organizations to commit to environmental and social tenets, such as using energy-efficient products and increasing efforts to reduce their carbon footprint.

For millennials and Gen Z workers, factors such as environmental impact, social awareness and sustainability are important considerations when choosing an employer; in fact, 35% of workers will consider employers based on social and environmental factors by 2021³.
Dell is continually pushing the boundaries to find ways to reduce our environmental impact, and that includes developing energy-efficient monitors. We are ahead of the industry, with seven monitors registered as EPEAT Gold to date and many more awarded the ENERGY STAR® Most Efficient Mark in 2020. You can expect this number to increase in the coming months. In addition to advancing sustainability and attracting younger talent, there is also an economic benefit to investing in energy-efficient monitors: you can potentially save up to $52 USD annually in electricity bills.

Productivity

Optimizing employee performance to drive productivity has a direct correlation with business impact. New workloads involving data processing, visualization or AR-driven design will require employees to have larger display real estate so that they can see and do more.

Dell sees ultrawide monitors offering many benefits: employees can improve productivity by viewing multiple windows without scrolling or tab switching, and organizations can reduce overall IT spending and maintenance by eliminating the need for multiple monitors and cables at each desk. Many professionals, including financial analysts, architects and designers, take advantage of the increased display size to multitask, scroll through data and run high-processing applications, something smaller monitors can't do as seamlessly. IDC sees this trend reflected in customer and market behavior: people are transitioning their workspaces to include ultrawide monitors, with shipments of ultrawide monitors growing steadily from 2015 to 2019⁴.

Color

Maintaining color fidelity and precision is relevant to many industries, including medical imaging for accurate diagnosis and fields working with high-resolution content such as AR/VR, 3D and even mass marketing materials.
IDC expects that 20% of G2000 workers will have access to various forms of technology assistance, including robotics, AR/VR and wearables, to aid their work by 2025⁵.

The drive to create high-quality, engaging content will encourage companies to adopt color-accurate displays. We see a shift toward In-Plane Switching (IPS) technology, which provides a better visual experience and color reproduction compared with twisted nematic (TN) panels. According to IDC, monitors with IPS panels accounted for 50% of overall shipments globally in 2019⁶.

As monitors continue to evolve, they will need to support wider color fidelity and precision tasks to meet the latest industry standards and cater to the broader application of color across multiple industries. Dell will continue to push the boundaries in color technology, delivering monitors with the widest color coverage and most precise color accuracy.

Experiences

In the ongoing battle to retain top talent, organizations are placing increasing importance on delivering a superior employee experience (EX). According to IDC, by 2022, 35% of organizations will run active EX programs that incorporate modern, digital experiences to drive brand affinity⁷.

Academic studies⁸ have shown a direct correlation between positive employee experience and aesthetically designed workspaces. Monitors are integral to an organization's success in delivering superior user experiences, as the majority of knowledge workers spend on average eight hours a day in front of a screen. Monitors serve as the primary gateway to view, create and interact with digital content, as well as to collaborate with colleagues.

Dell knows that visually appealing monitors with better resolution, wider screens, richer colors, an ergonomic base and ultra-thin narrow-bezel designs provide a sleek and clean aesthetic, while offering a better viewing experience and greater productivity.
You can expect to see more monitor innovations from us later this year.

Conclusion

Future workloads will require monitors to capture more accurate color content, operate data-intensive applications more smoothly, and help employees collaborate smarter and faster. A standardized, one-size-fits-all approach to monitor selection is no longer effective. It is important to understand clearly how different roles in your organization can benefit from a more targeted monitor selection to drive the best employee experience, while also boosting employee satisfaction and productivity.

Interested in learning more? Check out the full IDC Trends paper, sponsored by Dell, The Future of Work: Accelerating Innovation with Monitors to Drive Business Outcomes, March 2020, here.

—————————————————————————–
¹ Forbes, Editors Pick (May 2019), Global 2000: The World's Largest Public Companies
² IDC FutureScape: Worldwide Future of Work 2020 Predictions, Doc #US44752319, October 2019
³ IDC FutureScape: Worldwide Future of Work 2020 Predictions, Doc #US44752319, October 2019
⁴ IDC Worldwide PC Monitor Tracker, Q2 2019
⁵ IDC FutureScape: Worldwide Future of Work 2020 Predictions, Doc #US44752319, October 2019
⁶ IDC Worldwide PC Monitor Tracker, Q1 2020
⁷ IDC FutureScape: Worldwide Customer Experience 2020 Predictions, Doc #US45583819, October 2019
⁸ Florida State University Study, Workplace Design: Facilitating Collaborative and Individual Work within the Creative Office Environment, 2015; Steffen Robert Giessner, Sut I Wong, Christoph van Ballen, and Vasilis Roufani, Working New Ways of Working to Attract Millennials, 2017; and Jacob Morgan, The Employee Experience Advantage: How to Win the War for Talent by Giving Employees the Workspaces They Want, the Tools They Need and a Culture They Can Celebrate, John Wiley & Sons, 2017, page 86
LOS ANGELES (AP) — An effort to bring people with disabilities into clearer focus on TV and movie screens is getting a boost from a major media company. NBCUniversal says that actors with disabilities will be included in auditions for each new production. The agreement was sought by the Ruderman Family Foundation, a disability rights advocate. This is the second pledge made to the foundation to boost auditions for people with disabilities. The first came from CBS Entertainment in 2019, and the foundation hopes other Hollywood studios follow suit. Jay Ruderman heads the Boston-based foundation and says what people see on screen can change their perception of those with disabilities.