Only collaboration can solve the world’s most pressing data challenges

by Muhammad Ahmad, Technology Leader, Office of the CTO at Seagate Technology

Software developers and hardware designers have a history of rivalry. Take renowned computer scientist Alan Kay, who once famously jibed that anyone “really serious” about software “should make their own hardware”. The remark was partly tongue-in-cheek, but the notion has found grounding. Consider Wirth’s Law: the adage that software is getting slower more rapidly than hardware is getting faster, so that over time the gains made by hardware are all but eroded. Wirth’s Law has been ascribed to various alleged faults in the software community – bloat, lazy developers, and worse.

Yet today this is changing. 

Looking forward, we face a staggering deluge of data. Data Age 2025, a Seagate-sponsored IDC report, concluded that by 2025 the global datasphere will grow to 175 zettabytes – a more than five-fold increase in seven years. If you were to burn every byte to DVD and stack the discs one on top of another, the stack would reach the moon and back – 23 times over.

It is a truly mind-bending amount of data. But many of our existing systems need to be upgraded if we are to benefit from this data, rather than be burdened by it. 

Fortunately, coordinated effort across the two once-rivalrous camps is making progress. Consider four of the most challenging areas of work: data access and data flow, the rise of multicloud, system information, and security and privacy. In each case, hardware and software are being developed in tandem, often via open-source projects – increasing transparency while broadening knowledge and understanding.

Improving data access and enhancing data flow

As the amount of data we have at our disposal grows exponentially, how well we can access it becomes more important. As capacity expands, we need faster read/write speeds at lower cost. To this end, researchers are innovating with NAND technology – flash memory that does not require power to hold data.

And when it comes to data flow, open-source projects such as Kafka, Redis, and Hive have opened up new possibilities for streaming, ingesting, and storing data. This growth has been facilitated by the development of configurable hardware components, or building blocks.
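The streaming model these projects share – producers publishing to named topics, consumers reading messages off in order – can be illustrated with a minimal, self-contained sketch. This is plain Python with a stdlib queue standing in for a real broker such as Kafka; the class, topic name, and readings are illustrative, not any real project’s API:

```python
from collections import defaultdict, deque

class MiniBroker:
    """Toy stand-in for a streaming broker such as Kafka:
    producers append to a named topic, consumers drain it in order."""
    def __init__(self):
        self.topics = defaultdict(deque)

    def publish(self, topic, message):
        self.topics[topic].append(message)

    def consume(self, topic):
        # Yield messages in arrival order (FIFO), draining the topic.
        while self.topics[topic]:
            yield self.topics[topic].popleft()

# Usage: stream simulated sensor readings through a topic, then aggregate.
broker = MiniBroker()
for reading in (21.0, 22.0, 23.0):
    broker.publish("drive-temperature", reading)

readings = list(broker.consume("drive-temperature"))
print(readings)                       # [21.0, 22.0, 23.0]
print(sum(readings) / len(readings))  # 22.0
```

A real deployment replaces the in-memory deque with durable, partitioned logs spread across machines – which is exactly where the configurable hardware building blocks come in.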

Moving to multicloud

Many enterprises are shifting from public cloud services to multicloud to handle the growing amount of data they hold, but they still expect to retain many of the features of their previous systems. ‘Scale-out’ storage systems such as Apache Hadoop and Ceph can bridge this gap. Both are also open source, so the knowledge behind them is shared and distributed throughout the community.

But the customisability of these systems comes from hardware. When lower latency is the priority, all-flash-array storage is a good option. When capacity matters more, modular hardware architecture lets enterprises configure the building blocks to suit.

Better system information

One of the most impactful recent trends in software is the development of autonomous system management. Popular orchestration systems such as Kubernetes can now be integrated with open-source monitoring tools such as Prometheus.

But this software relies upon hardware capable of observing various aspects of its performance – from temperature to vibration. As we see more hardware developments in the future, software will be able to rely on more accurate and more useful metrics. 
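The core of such autonomous management is a control loop: read hardware telemetry, compare it against limits, act. Here is a minimal sketch of that decision step – the thresholds, action names, and sensor values are all hypothetical; a real system would pull these metrics from a monitoring stack such as Prometheus rather than take them as arguments:

```python
def plan_action(temperature_c, vibration_g, temp_limit=55.0, vib_limit=0.5):
    """Decide a management action from hardware telemetry.
    Thresholds and action names are illustrative, not vendor specs."""
    if temperature_c > temp_limit:
        return "throttle-workload"   # run cooler before damage occurs
    if vibration_g > vib_limit:
        return "migrate-data"        # move data off a shaky drive
    return "healthy"

# Usage: feed in (simulated) sensor readings.
print(plan_action(61.0, 0.1))  # throttle-workload
print(plan_action(40.0, 0.8))  # migrate-data
print(plan_action(40.0, 0.1))  # healthy
```

The better the sensors the hardware exposes, the finer-grained these decisions can become – which is why richer metrics matter as much as smarter software.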

Security and privacy

The regulatory environment around holding data is far more stringent than it was just a few years ago, and it may well become even tighter. Software developers will be at the forefront of designing systems that protect security and preserve privacy, but hardware will underpin it.

RISC-V is one such development. It is an open-source instruction set architecture – an abstract model of a computer, effectively a free-to-use blueprint for low-cost, low-power, and high-security computer development.

Working in tandem

As someone who spends most of their time across both sides – developing software at a company best known for hardware – I have been fortunate enough to get a unique view of these developments.

It is heartening to see how experience in hardware design is yielding vital insights into data processing – and how lessons from the software world are being applied directly in hardware design. If the community continues to promote open-source development in all fields, everyone benefits.

In this sense, perhaps Alan Kay was only half right. Serious software developers should indeed learn hardware. But now, more than ever, the world is set to reap the benefits of flipping his logic the other way: serious hardware designers learning software.