This remarkable article is an abstract of "GPFS Scans 10 Billion Files in 43 Minutes", written by Richard Freitas, Joseph Slember, Wayne Sawdon and Lawrence Chiu of the IBM Advanced Storage Laboratory, IBM Almaden Research Center, San Jose, CA.
Disk capacity has improved immensely over the last fifty-seven years. This is exemplified by the chart of areal density vs. year of introduction shown below. The annual growth rate of areal density has varied from 25% to 100% over this period and is now at 25-40%. The maximum capacity of disk drives has tracked this growth in areal density through changes in form factors, disk diameters, media, head technologies, etc. The growth rate is unlikely to reach 100% CAGR again, but the current rate will likely be sustained for the next few years.
Areal density trend
Unfortunately, the performance of the disk drive is not keeping pace with the rate of performance improvement shown by business and HPC systems. The maximum sustained bandwidth (MB/s) and transaction rate (IO/s) have followed a different path. The bandwidth is roughly proportional to the linear density. So, if linear density and track density grew at equal rates, one would expect the annual growth factor of linear density to be the square root of the annual growth factor of areal density. That would make it about 20% CAGR. But if you examine the recent history of maximum sustained disk bandwidth, you will see that its growth is more likely to fall within the range of 10-15%.
Generally, the track density has grown more quickly than the linear density. Currently a high performance disk drive would have a maximum sustained bandwidth of approximately 171 MB/s. The actual average bandwidth would depend on the workload and the location of data on the surface. Further, current projections do not show much change in this over the next few years.
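The square-root argument above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that areal density is the product of linear density and track density and that both grow at the same annual rate; the function name is illustrative and not taken from the paper.

```python
import math


def linear_density_growth(areal_growth: float) -> float:
    """Annual linear-density growth rate, assuming linear and track
    density grow equally (an assumption, as in the text above).

    areal_growth is a fractional annual rate, e.g. 0.40 for 40% CAGR.
    """
    # Areal growth factor = (linear growth factor) ** 2 under this assumption,
    # so the linear growth factor is the square root of the areal one.
    return math.sqrt(1.0 + areal_growth) - 1.0


# The article quotes a current areal-density growth rate of 25-40%.
for areal in (0.25, 0.40):
    linear = linear_density_growth(areal)
    print(f"areal {areal:.0%} CAGR -> linear {linear:.1%} CAGR")
```

At the upper end of the quoted range (40%), this gives roughly 18%, in line with the article's figure of about 20%. The observed bandwidth growth of 10-15% sits below that, which is consistent with track density having grown faster than linear density.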
Maximum sustained bandwidth trend
The disk transaction rate has an even more complicated story. The transaction rate is the inverse of the average access time. The access time is the sum of two components: the average disk latency time and the average disk seek time.
Average latency trend
The average disk latency is ½ the rotational time of the disk drive. As you can see from its recent history, shown in the figure above, the latency has settled at three values: 2, 3 and 4.1 milliseconds.
These are ½ the rotation periods at 15,000, 10,000 and 7,200 revolutions per minute (RPM), respectively. It is unlikely that there will be a disk rotational speed increase in the near future. In fact, the 15,000 RPM drive and perhaps the 10,000 RPM drive may disappear from the marketplace. A slower drive, such as one at 5,400 RPM, may appear in the enterprise space. If this reorganization of the disk rotational speed menu does occur, it will have been driven by the successful combination of solid state storage and slower disk drives into storage systems that provide the same or better performance, cost and power.
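The relationship between rotational speed and average latency can be reproduced directly. The following is a small sketch; the function name is an illustrative choice, and the 5,400 RPM entry is included only because the text mentions it as a possible future enterprise speed.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


for rpm in (15_000, 10_000, 7_200, 5_400):
    print(f"{rpm:>6} RPM: {average_rotational_latency_ms(rpm):.2f} ms")
```

For 7,200 RPM the computed value is about 4.17 ms, which the article quotes as 4.1 ms. A 5,400 RPM drive would come out at roughly 5.6 ms.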
Average seek time trend
The recent history of disk seek time is shown above. The seek time is due to the mechanical motion of the head when it is moved from one track to another. It is improving by about 5% CAGR. In general, this is a mature technology and is not likely to change dramatically in the future. It seems to be trending toward values at around 2 ms for high performance disk drives.
The combination of slow improvement in seek time and a near standstill, or even a possible increase, in average rotational latency leads to the conclusion that the average access time for enterprise disks is not likely to be better than 3-4 milliseconds, and that the average access time for commercial disk drives (7,200 RPM) is not likely to be better than 6-8 milliseconds. Access times greater than this are quite possible.
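To make the conclusion concrete, the access-time model described earlier (average seek time plus average rotational latency, with the transaction rate as its inverse) can be sketched as follows. The function names are illustrative, and the seek times are assumptions: 2 ms follows the trend the article describes for high performance drives, while 3 ms for a 7,200 RPM drive is a hypothetical value chosen only for illustration.

```python
def average_access_time_ms(seek_ms: float, rpm: float) -> float:
    """Average access time = average seek time + average rotational latency."""
    latency_ms = 60_000.0 / rpm / 2.0  # half a revolution, in milliseconds
    return seek_ms + latency_ms


def transactions_per_second(access_time_ms: float) -> float:
    """Transaction rate (IO/s) is the inverse of the average access time."""
    return 1000.0 / access_time_ms


# (label, assumed average seek time in ms, rotational speed in RPM)
drives = [
    ("enterprise 15,000 RPM", 2.0, 15_000),
    ("commercial 7,200 RPM", 3.0, 7_200),  # 3 ms seek is a hypothetical value
]

for label, seek_ms, rpm in drives:
    access = average_access_time_ms(seek_ms, rpm)
    rate = transactions_per_second(access)
    print(f"{label}: {access:.2f} ms access, about {rate:.0f} IO/s")
```

Under these assumptions a single enterprise drive tops out at a few hundred random IO/s and a commercial drive at well under two hundred, which is why the access-time floor matters for metadata-intensive workloads.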