Big data is only getting bigger. As more organisations rely on ballooning volumes of information, the accuracy and integrity of data have become more important than ever.
Whenever data is transferred, processed or stored, there are inherent risks of it being lost or corrupted – with potentially dire consequences. But with technologies such as ECC RAM, it’s possible to minimise these risks and ensure data is managed as safely as possible.
What is ECC RAM?
In our previous overview of RAM (random access memory) we explained how a computer system’s memory functions as a high-performance, temporary workspace for processing data.
ECC RAM, or error-correcting code RAM, is a specialised type of memory that identifies and fixes the most common errors which could otherwise lead to data corruption or system crashes. These are known as single-bit errors, and require some explanation themselves.
What are single-bit errors?
A bit is a single binary digit (a 1 or a 0), with eight bits forming a byte – historically the smallest unit of addressable memory, and typically enough to encode a single character such as a letter or digit. A single-bit error is when the electric charge in the memory cell holding a bit changes, flipping it from a 0 to a 1, or vice versa.
The causes of single-bit errors come in two main flavours: hard and soft. Hard single-bit errors stem from physical faults in the hardware itself, such as damage or wear brought on by temperature extremes, power variation and stress, and they tend to recur at the same location. Soft single-bit errors are transient and result from factors that are harder to observe, such as electrical or magnetic interference and even cosmic rays.
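To make the idea concrete, here is a minimal Python sketch of a single-bit error in one byte. The helper name `flip_bit` is purely illustrative, and the example assumes the byte holds an ASCII character.

```python
def flip_bit(value: int, position: int) -> int:
    """Return value with the bit at `position` inverted (0 = least significant bit)."""
    return value ^ (1 << position)


original = ord("A")                 # 65, stored as 01000001
corrupted = flip_bit(original, 1)   # bit 1 flips from 0 to 1

print(f"{original:08b} -> {chr(original)}")    # 01000001 -> A
print(f"{corrupted:08b} -> {chr(corrupted)}")  # 01000011 -> C
```

One flipped bit is enough to turn the letter A into the letter C, and flipping the same bit again restores the original value.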
Either way, the result of a single-bit error is the same. An error affecting a single binary digit might not sound like the end of the world, but a flipped bit could have a serious impact on important data.
What does ECC RAM prevent?
While a single-bit error could be harmless, or have a comparatively mild effect (like a wrongly coloured pixel in an image), it could equally result in a completely garbled file, or a system crash. In applications that process large volumes of sensitive or high-value data, even one single-bit error could be disastrous. A one that should be a zero could turn a stored number into a completely different value – something that you might not notice until some time later.
Ultimately, it could lead to loss of data, interrupted services, or inaccurate information being stored and displayed. ECC RAM guards against these outcomes by detecting single-bit errors and correcting them, ensuring the data is properly preserved.
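How much damage a flipped bit does depends on which bit it is. The sketch below assumes a value stored as a 64-bit IEEE 754 floating-point number, which is how Python represents a `float`. The helper `flip_bit_in_double` is a hypothetical name used only for illustration.

```python
import struct


def flip_bit_in_double(value: float, position: int) -> float:
    """Flip one bit (0 = least significant) in the 64-bit representation of a float."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    (result,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << position)))
    return result


balance = 100.0

barely_changed = flip_bit_in_double(balance, 0)   # lowest bit of the fraction
halved = flip_bit_in_double(balance, 52)          # lowest bit of the exponent: 50.0
wiped_out = flip_bit_in_double(balance, 62)       # highest bit of the exponent

print(barely_changed, halved, wiped_out)
```

The same kind of fault gives three very different results: a change too small to matter, a value that is exactly half of what it should be, and a value so close to zero that the original figure is effectively lost.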
How ECC RAM works
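Unlike normal RAM, an ECC memory module carries an additional memory chip that stores extra check bits alongside the data, typically 8 check bits for every 64 bits of data. Each time data is written, the system’s memory controller calculates the check bits; each time the data is read back, the controller recalculates them and compares the results. This builds on a method known as parity checking.
In simple parity checking, one additional bit, called a parity bit, is stored with each byte. The parity bit records whether the number of 1s in the byte is even (0) or odd (1). If the parity recalculated on reading doesn’t match what was previously recorded for a specific byte, an error has occurred. A single parity bit can only detect such an error, though, not locate it. ECC goes further by storing several check bits, each covering a different, overlapping group of data bits, so the pattern of mismatches pinpoints the flipped bit. That bit can then be flipped back to restore the original, uncorrupted data. Most ECC RAM can also detect, though not correct, two flipped bits in the same word.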
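The difference between detecting and correcting an error can be sketched in a few lines of Python. A single parity bit reveals that something changed, but not where; correction needs several check bits that cover overlapping groups of data bits. The example below uses Hamming(7,4), a small textbook code chosen here for illustration. Real ECC modules use wider codes implemented in hardware, and all function names are hypothetical.

```python
def parity_bit(bits: list[int]) -> int:
    """Even parity: 0 if the number of 1s is even, 1 if it is odd."""
    return sum(bits) % 2


def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits as 7 bits laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code: list[int]) -> tuple[list[int], int]:
    """Return (corrected codeword, 1-based position of the flipped bit, or 0 if none)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # the failed checks spell out the position
    if position:
        c[position - 1] ^= 1          # flip the faulty bit back
    return c, position


# Parity alone: the mismatch shows an error occurred, but not which bit flipped.
byte = [0, 1, 0, 0, 0, 0, 0, 1]
stored_parity = parity_bit(byte)
byte[6] ^= 1                                  # simulate a single-bit error
error_detected = parity_bit(byte) != stored_parity

# Hamming(7,4): the error is located and repaired.
codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                               # flip the bit at position 5
repaired, position = hamming74_correct(damaged)

print(error_detected, position, repaired == codeword)   # True 5 True
```

Each check bit watches a different group of positions, so every possible single-bit error produces a unique combination of failed checks, and that combination identifies the bit to flip back.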
ECC RAM vs non-ECC RAM
Compared to non-ECC memory, ECC memory has obvious advantages. Because of the built-in error-correcting capabilities, systems with ECC RAM experience far fewer memory-related failures than non-ECC RAM setups. In practice, this means less data corruption, fewer crashes and more uptime – key objectives for applications that process user data while offering high availability.
However, because the check bits have to be calculated and verified on every memory access, ECC may have a slight impact on memory performance. But this is hardly a major issue: the slight performance advantage of non-ECC memory is outweighed by the potential risks of a harmful single-bit error, so users who prioritise error minimisation and maximum uptime generally accept the minor performance hit.
Another obvious difference between ECC RAM and non-ECC RAM is the price. Because of the extra chip on each module, ECC memory is more expensive than normal RAM, and it only works with motherboards and CPUs that support it, which generally means costlier workstation and server platforms like Intel’s Xeon range. And ECC RAM generally can’t be combined with non-ECC RAM, so if you want ECC capabilities, you’ll need to pay for a full system’s worth of ECC memory.
So is ECC RAM worth it?
For business-critical server applications, the short answer is yes. While it can be frustrating when your home computer or laptop crashes due to an error, it’s unlikely to have serious long-term implications. But on a server handling sensitive customer details or financial transactions, even a single error holds the potential for catastrophe.
To protect against financial loss caused by corrupted data, or reputational damage caused by downtime in the aftermath of a system failure, ECC RAM is highly recommended for organisations that process large volumes of customer data online.
At Fasthosts, our Bare Metal servers come with ECC RAM as standard. With a Bare Metal server, you get all the advantages of your own dedicated hardware, combined with the features of our latest cloud hosting platform.