We live in an age of explosive data growth, driven by social media, mobile messaging, and many other communication channels, and big data analytics now routinely deals with massive data sets. This recently mainstream technology has transformed the data analytics environment. Nearly every sector, from government and industry to education, has embraced big data. Data sets of 10 or 20 terabytes (TB) are becoming the norm rather than the exception.
The problem in traditional cluster computing
Industry applications of big data often require complex algorithms that run against memory-intensive workloads. Examples include seismic data in the oil and gas industry, real-time trading data in investment finance, and simulation data in scientific modeling. The list could go on, but the fact remains that these memory-intensive processes cannot be performed on commodity clusters.
Developed to drastically reduce the cost of high performance computing (HPC), traditional cluster computing based on distributed memory began to fail whenever an application's memory requirements exceeded the capacity of a single node. On the other hand, the exorbitant cost of mainframes and supercomputers posed a serious obstacle to the wider use of shared memory computing.
Numascale's solution: Computing through shared memory and I/O
Numascale, in business for the past four years, has engineered NumaConnect, which converts a set of standard servers with separate memories and I/O into a single, unified system, with the firm goal of providing the functionality of supercomputers or high-end mainframes at a fraction of the cost.
In the NumaConnect environment, commodity servers are linked to form a single multiprocessor system with shared access to the combined memory and I/O resources. At the heart of NumaConnect is NumaChip, which combines the cache-coherent shared memory control logic with an on-chip 7-way switch. The full details of this technology are available in The Numascale Solution Affordable Big Data Computing.
The system can scale to 4K nodes, and each node can contain multiple processors. The result is an affordable shared memory computing option for data-intensive applications. Without going into the technical nitty-gritty, NumaConnect enables programs to access any memory location and any memory-mapped I/O device in the multiprocessor system. Early adopters such as Statoil are sharing details of improved performance and reduced costs. You can read the case studies in The Numascale Solution Affordable Big Data Computing.
Numascale's technology strength
NumaConnect technology has a rich heritage that traces back to the development of IEEE standard 1596, the Scalable Coherent Interface (SCI). SCI established standards for scalability, a global shared address space, and cache-memory coherence.
Benefits of NumaConnect technology
What is significant about this new technology is that it achieves the benefits listed below at 1/20th to 1/30th of the cost of comparable high-end servers. Here are a few of the advantages of shared memory compared to clusters:
- Any processor in a unified cluster can access any data location through direct load and store operations. This capability means less code and easier debugging.
- Compilers can automatically exploit loop-level parallelism, leading to higher efficiency and reduced human intervention.
- System administration tasks are reduced, because a single unified system needs less maintenance than a collection of separate nodes.
- Any processor in the unified cluster can map and use any resource in a virtualized environment.
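The programming-model point in the first two items can be illustrated with a minimal sketch. It is written in Python with ordinary threads on a single machine, so it only stands in for the shared-memory style and is not NumaConnect code or a Numascale API. The function name `parallel_sum` and its parameters are hypothetical, chosen for illustration. Each worker reads its slice of one shared list directly and writes its result into a shared slot, with no explicit message passing between workers.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(data, workers=4):
    """Sum `data` with worker threads that all read the same list in place.

    Illustrative only: `parallel_sum` is a hypothetical helper, not part of
    any Numascale software.
    """
    n = len(data)
    chunk = (n + workers - 1) // workers  # ceiling division: items per worker
    partial = [0] * workers               # shared result slots, one per worker

    def work(w):
        start = w * chunk
        end = min(start + chunk, n)
        total = 0
        for i in range(start, end):  # empty range if this worker has no items
            total += data[i]         # direct read from the shared list
        partial[w] = total           # direct write to this worker's own slot

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(work, range(workers)))  # wait for every worker to finish

    return sum(partial)


if __name__ == "__main__":
    print(parallel_sum(list(range(1000)), workers=4))
```

In a distributed-memory cluster, the same computation would need the data to be partitioned across nodes and the partial results to be sent back explicitly. Note that in CPython the global interpreter lock keeps this sketch from showing a real speedup; it is meant to show only how little code the shared-memory style requires.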
Although all of the above functionality is available in high-end mainframes or servers, these systems come with an eyebrow-raising price tag that can be as much as 10 times higher per CPU core than commodity servers based on the x86-64 architecture.
The NumaConnect solution, according to the documented use cases, delivers all the capabilities of shared memory computing, resulting in smooth application development, the computation of large datasets, the execution of complex algorithms, and more scalability, all at a significantly lower cost. You can read the full details of this technology here: The Numascale Solution Affordable Big Data Computing.