I entitled this article “8 bits to a Byte” because people used to say that to me all the time and I would always wonder what the heck they meant. The context would always be around sizes of hard drives or speed of Internet. They would be sincere saying it, expecting that they were imparting great wisdom on me, but they might as well have said 12 hectares to a fortnight per kilowatt-hour. It meant nothing to me.
But after about 12 years of people saying it to me, I finally have honed the skill of actually using the phrase 8 bits to a Byte for good and not just to make someone feel dumb.
Before we dig into making sense of this, let’s just throw out some phrases so you feel some context to start.
- I have a 1 TeraByte drive in my computer.
- My iPhone has 128 GigaBytes of storage.
- My Internet is 10 Megabits per second.
- I have 16 GigaBytes of RAM in my laptop.
- I have gigabit Ethernet.
All of these sound like they’re related (and they are) but they’re slightly different units. When we’re done, you’ll be saying “8 bits to a Byte” with the best of them as a way to explain them to people. I should mention that you won’t be invited to parties any more, but that’s a risk you’ll have to accept.
First let’s break down Mega, Giga and Tera. Each of these is a prefix from the International System of Units. There’s a great chart in Wikipedia that shows the definition of all of the prefixes. I kinda love the chart because it shows you the year the prefix was adopted. Deca, hecto, and kilo (1000) were all adopted in 1795. It wasn’t until 1873 that Mega, which stands for million, or 1000², came into vogue. If we keep going up the chart, we see Giga (1000³) and Tera (1000⁴), which stand for billion and trillion (adopted in 1960). Things go a little crazy after that with yotta at the very top of the chart at 1000⁸, approved in 1991! Anyway, this whole raising 1000 to a power is referred to as base 10.
I might have gotten a little carried away there and off topic, but it’s cool to know that each step up from kiloByte to MegaByte to GigaByte to TeraByte is times 1000. That’s the prefix taken care of in all of our examples about Internet speeds and disk drive sizes and RAM. But what about that bit/Byte nonsense?
First let’s talk about bits. But before we do, let’s talk briefly about capitalization in the abbreviations. Up until now, I have been writing out bits and Bytes in words like Megabits and TeraBytes, but normally these would be abbreviated. When you’re abbreviating, bits has a lower case b and Bytes has an upper case B. So Mb (lower case b) means Megabits, where TB (upper case B) means TeraBytes.
So what the heck is a bit??? A bit is a basic unit in information theory that can be a 1 or a 0. That’s it. That’s the entire job of a bit: be a 1 or a 0. (Ref: wikipedia.org/wiki/Bit) We could go down some big rabbit holes about binary math but let’s not, just this once.
If a bit is a unit that is either a 1 or a 0, then what is a byte? You’re going to love this. A byte is a unit of digital information that commonly consists of … wait for it … 8 bits! It’s not an arbitrary number, luckily. In early computing, a byte was the smallest number of bits used to encode a single character of text. You can’t create a character with 1 bit, you need 8 of them. (Ref: wikipedia.org/wiki/Byte). Say it with me, 8 bits to a Byte!
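If you want to see those 8 bits for yourself, here is a minimal Python sketch. It assumes plain ASCII text, where one character fits in one byte; the variable names are just for illustration.

```python
# One ASCII character stored as one byte, shown as its 8 individual bits.
char = "A"
code_point = ord(char)               # the number behind the letter: 65
bits = format(code_point, "08b")     # that number as 8 binary digits
byte_count = len(char.encode("ascii"))

print(char, code_point, bits)        # A 65 01000001
print(byte_count, "byte =", len(bits), "bits")
```

Eight 1s and 0s, one letter. Characters outside ASCII can take more than one byte, which is a rabbit hole for another day.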
Now why the heck do you even care about this? I’m 700 words in and I haven’t defined a problem to be solved. Here’s an example. Disk drive sizes and RAM are usually explained in Bytes, not bits. But Internet speeds are usually in number of bits per second, not Bytes. Let’s say I need to upload a 40 MB (MegaByte) file and I have Internet speeds of 10 Mbps (Megabits per second). How long will it take to upload my file?
I can’t take 40 divided by 10 because the two numbers are in different units. It would be like dividing feet by meters per second and thinking the answer made any sense at all. Let’s start by converting the 40 MB file into Mb (Megabits). We know there’s 8 bits to a Byte, so we can multiply 40 MB × 8 bits/Byte = 320 Mb. So our file is 320 Mb in size, and we want to transfer it at 10 Mbps. Now we can divide 320 Mb by our 10 Mbps Internet connection and we know that theoretically it will take 32 seconds to upload the file. Except of course that the bits manage to travel to Pluto and back to get to the server, so it always takes longer than we think.
See? 8 bits to a Byte is your friend!
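The upload arithmetic above can be sketched in a few lines of Python. The function name `transfer_seconds` is made up for this example, and it computes the ideal time only: no protocol overhead, no latency, no detour via Pluto.

```python
BITS_PER_BYTE = 8

def transfer_seconds(file_megabytes: float, link_megabits_per_second: float) -> float:
    """Ideal transfer time in seconds for a file size in MB over a link speed in Mbps."""
    file_megabits = file_megabytes * BITS_PER_BYTE   # MB -> Mb
    return file_megabits / link_megabits_per_second  # Mb / (Mb per second) = seconds

print(transfer_seconds(40, 10))  # 32.0
```

The conversion to bits happens first so that both numbers are in the same unit before dividing.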
One of the reasons this gets confusing is that people have started to say “I have Gigabit Ethernet.” Well that doesn’t actually make any sense, does it? Gb does not have any time component in it. It’s like saying “My gas mileage is 32 gallons.” Gallons per what? It has no meaning. For some reason, in talking about throughput, common usage is to drop the “per second”. If you don’t have time to say “per second” maybe you should loosen up your schedule.
I hope that learning that 8 bits to a Byte will help you to be smug and all-knowing as you go forward in life. It sure makes me happy to finally understand it. But hey, just for fun, let’s make all of this harder, shall we?
Gibibyte
Remember way back when we had fun with SI units, I talked about how each of the jumps was a multiple of 1000? A kilobyte is 1000 bytes, megabyte is 1000², and gigabyte is 1000³? Well, what if instead of doing all this math in 1000s we started using base 2 instead of base 10 to describe these values? By that I mean, what if it was 2 raised to a power instead of 10 raised to a power?
We simply must take a side trip over to our good friends at NIST, the National Institute of Standards and Technology. And not just NIST, but physics.nist.gov which takes NIST nerdiness to a whole new level. The title of the page I’m linking to is “The NIST Reference on Constants, Units, and Uncertainty.” I think that’s the best title ever.
Our friends at NIST explain that in 1998 the Society of Nerds whose job it is to make you angry (also known as the International Electrotechnical Commission) established the word Kibibyte because in reality, people in computer science were using kilobyte to mean 1024 bytes instead of 1000 bytes. The NIST nerds explain by the way that the “bi” in Kibibyte is to be pronounced like a long “e” sound in English. So it’s pronounced KiBEEbyte.
The problem arose in the two different fields of computer science; those building hardware and memory vs those building networking and storage equipment. The memory nerds were using base 2, e.g. 2¹⁰ or 1024, while the networking nerds were using base 10, e.g. 10³ or 1000, and both groups were using the same term: kilobyte. Something had to be done!
Enter kibibyte for 1024 bytes, leaving kilobyte to maintain its status as 1000 bytes. They also needed to come up with a way to write these funny new words, and they chose to snuggle the “i” in kibibyte into the acronym, so we have KiB for kibibyte and MiB for mebibyte (also for Men in Black but that’s not important right now) and GiB for gibibyte. By the way, note that mebibyte is spelled me, not mi, but you still write MiB for mebibyte to make sure you’ll never remember how to spell it.
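To see how far apart the two families drift, here is a small Python sketch that lines up the decimal prefixes against their binary cousins. The dictionary names `DECIMAL` and `BINARY` are just labels chosen for this example.

```python
# Bytes per unit: SI (base 10) prefixes vs IEC (base 2) prefixes.
DECIMAL = {"kB": 1000**1, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
BINARY = {"KiB": 1024**1, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

for (dec_name, dec_bytes), (bin_name, bin_bytes) in zip(DECIMAL.items(), BINARY.items()):
    # How much bigger the binary unit is, as a percentage of the decimal one.
    gap_percent = (bin_bytes - dec_bytes) / dec_bytes * 100
    print(f"1 {bin_name} = {bin_bytes:,} bytes, 1 {dec_name} = {dec_bytes:,} bytes "
          f"({gap_percent:.1f}% apart)")
```

The gap starts at 2.4% for kilo vs kibi and keeps widening with every step up the chart.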
Now do you really care about this whole kibibyte nonsense? Well, maybe. If you’re on Windows and you ask for the properties on a file, the size will say something like “1.95MB (2,047,488 bytes)”. Now you know why it shows you both sizes. The 1.95MB is 2,047,488 divided by 1024², so it’s really mebibytes even though Windows labels it MB. Now isn’t that exciting?
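Here is that Windows example as a quick Python check, dividing the same byte count both ways. The variable names are invented for the example.

```python
size_bytes = 2_047_488

size_mib = size_bytes / 1024**2   # binary: mebibytes
size_mb = size_bytes / 1000**2    # decimal: megabytes

print(f"{size_mib:.2f} MiB")  # 1.95 MiB
print(f"{size_mb:.2f} MB")    # 2.05 MB
```

Same file, same bytes, two different numbers depending on which kind of "mega" you mean.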
I hope that you found this a little bit interesting and at least from now on, you can with confidence say, “Well actually, there are 8 bits to a Byte you know…”
Thanks for the headache, my synapses are all intertwined and my neurons are all aflutter.
So Kibi is a more accurate notation since the network nerds got lazy?
Nice work! To Rhomphaia: from my point of view, there was an issue with Kilobyte meaning 1024 bytes. Kilo is for 1000. The network guys only used the standard correctly and made clear that a different prefix was necessary for base 2 units.
I remember when I first noticed the difference in how these words were used. It was about the time that drive storage was regularly larger than a single gigabyte. Instead of continuing with describing the drives using sizes based on binary (bytes, kilobytes and megabytes) they started with decimal numbers and kept using the unit prefixes from binary because they sounded bigger and we didn’t have any small easy to use, readily available prefixes for the decimal numbers of that size yet. So – we should blame it on the marketing departments.
For more details:
A kilobyte changed from 1024 (2 to the 10th power) bytes to a thousand (10 to the 3rd power) bytes, and a gigabyte changed from 1,073,741,824 (2 to the 30th power) bytes to simply 1,000,000,000 (10 to the 9th power). The storage manufacturers save nearly 74 million bytes in each gigabyte (not to mention the overhead of hard drive file systems).
Early digital computers progressed from using 2 bit, 4 bit, 8 bit, 16 bit, 32 bit, 64 bit and 128 bit “words” in their operation. The file systems started out supporting only a limited number of file types. The earliest I ever used was set up for representing simple text files. They were constructed with 4 bit words combined to represent an 8 bit “byte”. At the time we were using ASCII characters, which came in either 128 character tables or expanded 256 character tables. Eventually more file types were developed and storage also increased dramatically. Each time storage got larger, more file types were developed using more and more space.
It’s a lot like 2, 3 and 4 car garages. The more space you have, the more things you can find to fill them up. 🙂