
The Paulski Pages

G not Giga; B not Byte

Quick Intro:- The issue of hard drive capacity not being what "it ought" to be is a very common cause of confusion. Data sizes and data transfer rates can both be affected by how bytes are represented - or misrepresented if you prefer.

Quick Giga Check:- If you think your capacity in GB is too small it could actually be in GiB. So try multiplying the value by 1.07 (or 1.05 for MB). If the result checks out approximately to what you expected the original to have been then you are probably just suffering from SI-Binary confusion. We hope to help demystify this below.
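The quick check can be sketched in a few lines of Python. This is only an illustration: the function name and the 1% tolerance are our own assumptions, and the factors are the rounded ones quoted above.

```python
def looks_like_si_binary_confusion(reported, expected, prefix="G", tolerance=0.01):
    """Return True if a 'too small' reported value is probably in binary units.

    reported  -- the value shown by the software (suspected GiB or MiB)
    expected  -- the value you thought you should see (GB or MB)
    tolerance -- allowed relative mismatch; 1% is an arbitrary choice
    """
    # Rounded binary-to-SI factors from the text: 1.05 for MB, 1.07 for GB.
    factors = {"M": 1.05, "G": 1.07}
    scaled = reported * factors[prefix]
    return abs(scaled - expected) <= tolerance * expected


# A "500GB" drive that shows up as roughly 465 "GB".
print(looks_like_si_binary_confusion(465.0, 500.0))  # True
```

If the check fails, the shortfall is more than the SI-Binary difference can explain and something else (formatting overhead, hidden partitions and so on) is probably involved.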

Bytes and Bits (B versus b)

For all intents and purposes in the modern IT world a byte is the fundamental unit of binary data and is composed of eight binary digits, also known as bits. When these terms are abbreviated it is common for bytes to be represented by the upper case B and bits by the lower case b. Be wary that this is not a universal practice, nor are b and B international standard definitions. In data transfer, where counting bits has long been "the norm" and is particularly relevant when data compression is used, it is most common for bps to mean bits (and not bytes) per second, even though accompanying discussions tend to refer to the transmission of bytes rather than bits.
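As a small illustration of the b versus B difference, the sketch below converts a rate in bits per second into bytes per second. The function name is our own, and the figure ignores protocol overhead and compression, so treat it as an upper bound rather than a real-world throughput.

```python
BITS_PER_BYTE = 8


def bits_to_bytes_per_second(bits_per_second):
    """Convert a transfer rate in bits per second to bytes per second."""
    return bits_per_second / BITS_PER_BYTE


# An "8 Mbps" link, taking M as the SI mega (8,000,000 bits per second).
print(bits_to_bytes_per_second(8_000_000))  # 1000000.0
```

So a link advertised in Mbps moves at most one eighth of that number in megabytes per second.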

SI Prefixes (P versus p)

The SI (International System of Units) is the international standard for units of measurement, and it defines a set of prefixes for abbreviating very large and very small numbers. Among the prefixes that matter here, there is a new name (and a corresponding prefix) for each step up or down by a factor of 1000 (10^3). Case is important since P represents peta (10^15) and p represents pico (10^-12). The kilo prefix of k is too-commonly miswritten as K in the IT world when referring to kilobytes.
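To make the point about case concrete, here is a minimal lookup table of a few SI prefixes. The names SI_EXPONENTS and si_multiplier are our own, and the table is deliberately incomplete.

```python
# Powers of ten for a handful of SI prefixes. The keys are case-sensitive.
SI_EXPONENTS = {
    "p": -12,  # pico
    "m": -3,   # milli
    "k": 3,    # kilo (lower case k)
    "M": 6,    # mega
    "G": 9,    # giga
    "T": 12,   # tera
    "P": 15,   # peta
}


def si_multiplier(prefix):
    """Return the multiplier for an SI prefix symbol, e.g. 'k' gives 1000."""
    return 10 ** SI_EXPONENTS[prefix]


print(si_multiplier("P"))  # peta
print(si_multiplier("p"))  # pico
```

Looking up "K" in this table raises a KeyError, which is rather the point: the upper case K is not the SI symbol for kilo.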

SI and Binary Conflicts (G versus Gi)

Under the SI system successive prefixes represent multiplication by 1000, but in the binary world (for all sorts of historical and functional reasons) successive prefixes represent multiplication by 1024 (2^10). What is not good is that the SI prefixes have been hijacked to abbreviate binary data values. This has caused untold confusion since 1kB should represent 1000 bytes whereas in many computer-related situations it may in fact be representing 1024 bytes. Belatedly, the prefix Ki (for kibi or kilo-binary) has been promoted to distinguish between the different values. Thus 1KiB = 1.024kB, though the kibibyte has yet to supplant the kilobyte, correctly or incorrectly used, in common usage.

When one moves on from kilo to mega and on to giga and beyond, the differential becomes even more exaggerated since in the binary world there are 1024 "kilobytes" in a "megabyte" and 1024 of those "megabytes" in a "gigabyte". Thus converting between the SI gigabyte and the binary gibibyte takes a conversion factor of 1.024 x 1.024 x 1.024 = 1.073741824, a noticeably larger gap than the factor of 1.024 between KiB and kB.
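The way the factor compounds with each prefix step can be sketched as below. The function name and the "steps" parameter (1 for kilo, 2 for mega, 3 for giga and so on) are our own naming for illustration.

```python
def binary_to_si_factor(steps):
    """Factor by which a binary-prefixed unit exceeds its SI namesake.

    steps -- number of prefix steps above the bare byte:
             1 = kilo/kibi, 2 = mega/mebi, 3 = giga/gibi, 4 = tera/tebi
    """
    return (1024 / 1000) ** steps


names = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]
for steps, name in enumerate(names, start=1):
    factor = binary_to_si_factor(steps)
    # The reciprocal converts the other way, from the SI unit to the binary one.
    print(f"{name}: x{factor:.3f} binary to SI, x{1 / factor:.3f} SI to binary")
```

Each extra step multiplies the gap by a further 1.024, so the mismatch is about 2.4% at kilo and grows to nearly 10% at tera.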

Tabular

1KiB (2^10) = 1.024kB    and    1kB (10^3)  = 0.977KiB
1MiB (2^20) = 1.049MB    and    1MB (10^6)  = 0.954MiB
1GiB (2^30) = 1.074GB    and    1GB (10^9)  = 0.931GiB
1TiB (2^40) = 1.100TB    and    1TB (10^12) = 0.909TiB

Comments

The terms gibibyte (GiB) and so on were, in our opinion, not a great idea, particularly since the SI prefix letter is still used. To have used binaryGB or bGB would possibly have been better, since it is the G that changes meaning and not the B: the jump between each prefix is a factor of 1024 (2^10) each time and not 1000 (10^3). Ah well, it too would probably have been open to misinterpretation.

Hard drive manufacturers have quite reasonably used SI units to describe the size of their drives. The discrepancy between their GB and the "GB" reported by the operating system or other software has grown as drive sizes have moved from kB to MB and on through GB capacities. We suppose that no one could really criticise them if they described the sizes of their drives in bytes, but then again there would be those who would be just as unhappy at seeing the capacity of a 500GB drive written as 500,000,000,000 bytes.

Under Windows there is similar inconsistency. If you don't believe us then just right-click on a drive letter in My Computer and choose Properties. You will see the capacity written in MB or GB etc. as well as in actual bytes. You will no doubt note that the two figures are not simply the result of moving the decimal point.
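The mismatch between the byte count and the "GB" figure is easy to reproduce. The sketch below is our own illustration (the function name is hypothetical, and we use a notional drive of exactly 500,000,000,000 bytes rather than a real drive's reported size).

```python
def capacity_in_both_units(size_bytes):
    """Return (SI gigabytes, binary gibibytes) for a size given in bytes."""
    gb = size_bytes / 10**9   # SI: 1GB = 1,000,000,000 bytes
    gib = size_bytes / 2**30  # binary: 1GiB = 1,073,741,824 bytes
    return gb, gib


gb, gib = capacity_in_both_units(500_000_000_000)
print(f"{gb:.1f}GB (SI) is {gib:.1f}GiB (binary)")  # 500.0GB (SI) is 465.7GiB (binary)
```

The second figure is the one that tends to be displayed with a "GB" label, which is why it never lines up with the byte count by simply shifting the decimal point.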

Perhaps binary-bytes will catch on, though we doubt it. Were that the case it would be quite simple for the manufacturers and others to describe a "500gig" drive as 500GB (465GiB). Much more on the Binary Prefix at Wikipedia.


Web design by paulski.com - last updated 28th February 2010