The rise of datacenters and cloud computing is bringing about a resurgence in the use of Multicast. Previously neglected by networking courses and confined to LANs, Multicast now has new relevance. As a brief introduction, this post covers the basic theory behind IPv4 and Ethernet Multicast and provides a code sample showing how it can be used in practice.
Multicast sits on a spectrum between Unicast and Broadcast (and can operate as Broadcast, but with a higher overhead). Unicast, point-to-point messaging is the dominant communication paradigm in networks today. In future networking environments, however, we may need to consider alternative communication paradigms.
When we think of computer networks, we tend to envisage point-to-point Unicast communication. For example, consider HTTP requests to a website: every host wishing to see the page makes an individual request for that page and receives an individual copy. Thus, if 1000 people wish to view the web page, 1000 copies are sent across the network. This duplication is acceptable for web-pages, as web content is generally static and web requests are created, and arrive, independently of one another[i].
In contrast, imagine a scenario in which requested data goes out of date quickly, or many hosts wish to receive the data simultaneously. Consider, for example, the live stream of a sports event or the delivery of timely stock market information to multiple hosts in a datacenter. In both of these cases, sending out multiple copies of the data is inefficient. Ideally we would address a message to multiple recipients and rely on the network to distribute the message accordingly.
Multicast is a form of networking in which messages can be delivered to one or more recipients. The Internet Protocol (IP) is used to deliver data across network boundaries[ii]. We use IP addresses to identify a host across multiple networks (see IP header figure below). If we note that an IP header only has enough space for one destination address, then how does Multicast provide support for multiple recipients[iii]?The solution is to map a single IP address to multiple recipients. Routers (and intelligent switches) have built-in support for maintaining these Multicast mappings. I will focus on behavior within a single LAN, rather than between routers.
When a packet destined for one of these Multicast addresses (which is mapped to multiple recipients) arrives at a router or switch, the packet is forwarded such that each link only receives one copy of the message at most. All hosts on a link receive all frames on that link and hosts simply filter out those frames not intended for them.
The key point is that as few copies as possible of the messages are sent. For instance, at the link layer, a frame only needs to be sent to links where group members reside[v].
In IPv4, the Class D addresses
220.127.116.11 are reserved for Multicast. Each IP address represents a group of recipients, thus there is space for 248,720,625 different sets of recipients (Multicast groups). In practice, however, routers and switches generally do not have enough memory to support this number of groups simultaneously, and there are a number of reserved Multicast addresses used by various protocols[vi].
These Multicast mappings are maintained using the Internet Group Messaging Protocol (IGMP). When a recipient (host) wishes to join a group, it sends a membership report message to the “all routers” group (
18.104.22.168). When the router receives the report, it maps the interface on which it received the report to the group IP address, and any messages sent to the group IP address are forwarded to the host via the specified interface. Thus, the router only needs to know of one group member on each interface.
The router periodically (roughly every minute) sends out a query to the “all hosts” group (
22.214.171.124), to check whether there are any hosts still wishing to receive Multicast messages. All hosts in the network receive this query, but only those hosts wishing to remain members reply with a membership report. Hosts can also leave at any time by sending an explicit leave group message.
All IGMP messages are IP datagrams with the following format:There are several types of IGMP message:
- 0x11 = Membership Query:
- General Query, used to learn which groups have members on an attached network.
- Group-Specific Query, used to learn if a particular group has any members on an attached network.
- 0x16 = Version 2 Membership Report: Used by hosts to declare membership
- 0x17 = Leave Group: Used by hosts to leave a group
The Data Link Layer (Layer 2)
As with all IP traffic, Multicast relies on the data link layer (layer 2) for delivery. So far, I have only covered the network layer (layer 3) implementation, but it is important to also understand the layer 2 implementation.
To deliver to an end host, an IP Multicast address must be converted into an Ethernet address because Ethernet addresses are required for link layer delivery. Ethernet has its own rules for handling Multicast frames, as it can be used independently of IPv4 Multicast[viii].
An Ethernet address consists of 6 bytes, split into two three-octet identifiers. The first three bytes indicate the Organizationally Unique Identifier (OUI), which is generally that of the manufacturer of the network card. The second three bytes indicate the Network Interface Controller (NIC) Specific[ix]. Any packet with an OUI of
01-00-5E is considered to be a Multicast address, meaning that Ethernet Multicast addresses fall into the range
Generally, a switch will only forward frames on an interface where it knows the destination Ethernet address resides. In contrast, a Multicast destination address in an Ethernet frame does not map to a known host in the network, so the switch needs to treat a Multicast frame differently than a standard Ethernet frame. A switch should forward all Multicast addressed frames on all interfaces because the Multicast address OUI starts with
01, which signifies the frame is a Broadcast frame.
By making use of the Broadcast octet, Multicast can be supported on both vanilla and feature-rich switches. A dumb bridge might just forward all packets on all interfaces; a standard off-the-shelf switch with no knowledge of Multicast will forward the Multicast packets on all interfaces due to the presence of the Broadcast octet. An intelligent switch, however, will maintain its own Multicast group mappings to use for forwarding.
By maintaining a mapping of Multicast addresses and only forwarding Multicast frames on links where group members are present, an intelligent switch can avoid needlessly congesting links in the layer 2 network with Multicast frames. Intelligent switches are therefore highly valued in datacenter Multicast deployments.
Intelligent switches can snoop IGMP messages as they pass through. They maintain their own mapping of interfaces to groups (in some cases MAC addresses to groups), and send IGMP reports upstream. By performing this IGMP snooping, switches avoid the congestion caused by many IGMP reports being sent from hosts to the router (this has partly been addressed in IGMPv3 which supports multiple groups per report).
A sharp observer will notice that Ethernet is a 48-bit address space, while IPv4 is a 32-bit address space. What does this mean for IPv4 to Ethernet address conversion? The first four bits of the IPv4 address are fixed for class-D addresses, thus we are left with 28 bits. We use the lower 23 bits of the IPv4 address to convert into the lower 23 bits of the Ethernet address (the first bit of the NIC specific bytes of the Ethernet address must be 0, hence 23 bits rather than 24). This leaves five (28 bits – 23 bits) unused bits in the IPv4 address, meaning each Multicast Ethernet address can represent 32 IPv4 Multicast groups. On receipt of a Multicast frame, the host will analyze the address in the IP header (if present) and filter the unintended messages received for groups for which it is not a member[x].
Multicast in Action (code example)
Generally, best-effort protocols such as UDP and RTP are used with Multicast, as re-transmission causes duplicate packet receipt for the other recipients. All hosts must share the ability to decrypt and encrypt traffic, meaning many applications that wish to use Multicast have to generate a shared key for all recipients.
Multicast is not supported on the internet due to security concerns, largely stemming from fear of Denial of Service attacks[xi]. Multicast is generally supported out-of-the-box on off-the-shelf routers like the ones found in the average home. This code example demonstrates rudimentary Multicast communication between nodes within a network.
To compile and run, execute the following command on two or more machines on your LAN:
javac Node.java && java Node;
You should see messages being sent and received by each node using Multicast!
For the full code listing, see the following Multicast gist.
Multicast allows us to distribute data to multiple recipients simultaneously. Hosts join a group, and from that point onwards they are sent copies of all messages destined for that group. Any host on the network can attempt to join one of these groups by sending a special IGMP message to the network’s router to receive that group’s messages.
There may be efficiency benefits to using Multicast in environments where large numbers of sensors or sensor-enabled devices communicate, such as Ubiquitous (and Personal Area) Networks (UPANs). Imagine, for example, a controller device Multicasting information to a set of sensor devices worn around the body.
In the short term, Big Data and Cloud environments, categorized by large numbers of co-located servers housed in data-centers, can take advantage of Multicast communication. This need for timely data analysis by large numbers of machines is currently driving the resurgence of Multicast.
If you are interested in the specifics of Multicast please see the references below. For a more detailed overview, however, have a look at the following links:
[viii] There are two types of Multicast: layer 3 and layer 2 Multicast. Layer 2 Multicast can be used independently of layer 3 Multicast. Both implementations use IGMP, necessitating the conversion from IP address to MAC address. Layer 3 Multicast has too large a scope to discuss here.