Using the C++ 17 std::byte data type
Platform-dependent data types are always a little bit disturbing. A programming language should have data types that don't change in size from one platform to another. Countless hours have been wasted in defining macros and other contraptions that try to ensure that a data type has a constant size across platforms. While they have succeeded to some extent, it has become ever more tedious with new microprocessors and wider data types, from 8-bit to 16-bit to 32-bit to 64-bit integers and beyond.
Even the lowly 8-bit byte or octet has not been immune to this. The fundamental problem with C and C++ (and with many other languages as well) has always been their original insistence on treating bytes and characters as somehow equivalent. This worked when everyone used ASCII, but that was a long time ago. With the need to use "wide" characters, and ultimately the move to Unicode, whose code points need up to 21 bits and are often stored in 32-bit units, it became clear that the idea of one byte equalling one character was no good anymore (it never was, of course, but we didn't know it yet).
So we have ended up in a situation where handling binary data and handling text have completely different requirements.
Beyond the practically uniform definition of a byte as an octet (in modern computer architectures), there has been confusion in C and C++ about signed and unsigned bytes. A signed char and an unsigned char are the same width (eight bits on practically every modern platform), but a signed char typically holds values from -128 to 127, which leaves only seven bits for the magnitude, while an unsigned char holds values from 0 to 255. On top of that, the plain char type of a given platform could be signed or unsigned, so the madness still continues. (It's even more complicated, but I don't want to go there, since there is a better way, so read on.)
Use std::byte for binary data
Just as it's better to forget about strings as arrays of characters, and use the C++ std::string type instead, it's better to adopt the std::byte data type for dealing with binary data.
The std::byte data type was introduced in C++ 17. It is limited on purpose, and somewhat peculiar in the sense that it only describes a collection of bits and some operations you can perform on them; it does not do double duty as a character type or an arithmetic type. So while you can initialize a variable of type std::byte with a value from 0 to 255 (inclusive), you end up with a bit pattern describing that value. If you want to use it for anything other than manipulating those bits, you will need to convert it to a numeric value, for example by using the std::to_integer function.
Restricting std::byte to a collection of bits stops you from attaching any semantics to the value. That task belongs to any class or function that actually knows how those bits map to integer (or even floating-point) values. For more information on the rationale of using std::byte, see Marc Gregoire's blog.
NOTE: If you think that you could just as well use std::string for binary data, that's not such a great idea (mostly because of the wrong semantics; it really does matter). See the Simplify C++ blog entry "std::string is not a Container for Raw Data" for details.
C++ 17 example of std::byte
Here is a quick C++ 17 example of using the std::byte data type for some lightweight operations on bytes. The bytes in question come from the world of MIDI System Exclusive messages, which are just small (from a dozen or so bytes to some hundreds of kilobytes) vectors of bytes that are passed around using the MIDI interface (old school serial with 5-pin DIN, or modern USB, or even Bluetooth).
The program produces an Identity Request message that you can send to a MIDI synthesizer. If it supports the Identity Request function, it will reply with a similar message that can be interpreted as an Identity Reply. These are known as Universal System Exclusive messages.
If you have a MIDI-capable synthesizer connected to your computer, you could try sending the bytes to it using Geert Bevin's excellent SendMIDI utility.
If you do that, be sure to leave off the initial F0 and the terminating F7 bytes, because SendMIDI will add them when you use its syx command. A suitable command would be:
    sendmidi dev "Your MIDI Port Name" hex syx 7e 7f 06 01
Here 7f is the device ID that addresses all devices. See the SendMIDI documentation for details.

    // Using the C++17 std::byte type to make a MIDI SysEx message.
    // Compile using clang on macOS with "clang++ -std=c++17 bytes.cpp -o bytes"
    // For more details, see:
    // - C++ Reference: https://en.cppreference.com/w/cpp/types/byte
    // - Marc Gregoire's blog: http://www.nuonsoft.com/blog/2018/06/03/c17-stdbyte/
    // - MMA reference: https://www.midi.org/specifications-old/item/table-4-universal-system-exclusive-messages
    // - Geert Bevin's SendMIDI: https://github.com/gbevin/SendMIDI

    #include <cstddef>
    #include <iomanip>
    #include <iostream>
    #include <vector>

    int main()
    {
        // Define the bytes that can be used to make up
        // the MIDI System Exclusive message that indicates
        // an Identity Request to send to a synthesizer.
        // It's convenient to use a vector of bytes instead of
        // individual variables of type std::byte,
        // but the initialization is kind of tedious.
        auto identityRequest = std::vector<std::byte> {
            std::byte { 0xf0 },  // System Exclusive initiator
            std::byte { 0x7e },  // Universal Non-Real-time message
            std::byte { 0x7f },  // Device ID: 7F addresses all devices
            std::byte { 0x06 },  // General Information command
            std::byte { 0x01 },  // Identity Request
            std::byte { 0xf7 }   // System Exclusive terminator
        };

        // Print the contents of the vector as two-digit
        // hex numbers. We need to convert each byte into an
        // integer, because std::byte is just a collection of bits.
        // std::to_integer is a template, so the target type is required.
        for (auto b : identityRequest)
        {
            std::cout << std::setw(2) << std::setfill('0') << std::hex
                      << std::to_integer<int>(b) << " ";
        }
        std::cout << std::endl;

        // If you send this MIDI message to your synthesizer,
        // for example using Geert Bevin's SendMIDI, it may
        // respond with an Identity Reply message.
    }

If you compile and run this program, you should see this output:
    f0 7e 7f 06 01 f7
Hopefully this was useful information if you need to deal with binary data in C++. As of this writing in 2022, most mainstream compilers seem to support nearly all C++ 17 features.
For a concise take on the most useful new features of "Modern C++" (especially if you have used C++ before, but haven't kept up with it) see the overview Welcome back to C++ - Modern C++ by Microsoft.