MAC Flaps – why are they bad? | Network Development Team

What is a MAC Flap?

A MAC flap occurs when a switch receives frames with the same source MAC address on two different interfaces. If this makes no sense, perhaps a quick summary of how switching at layer 2 works will help.

Switches learn where hosts are by examining the source MAC address of each frame received on a port and populating their MAC address-table with an entry for that MAC address and port. Say a device ‘A’ with MAC aaaa.aaaa.aaaa (hereafter aaaa) sends a frame to device ‘B’ with MAC address bbbb. Assume A is on port 0/1 and B is on port 0/2. The switch populates its MAC address-table something like:

Port		Host
0/1		aaaa

and floods the frame out of all other ports. When B replies the MAC address table becomes:

Port		Host
0/1		aaaa
0/2		bbbb

and the switch forwards the frame to port 0/1 – there is no need to flood now since the location of A is known.
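
The learn, flood and forward steps above can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions (one VLAN, no ageing timer), and the names `LearningSwitch` and `receive` are made up for illustration; they are not taken from any switch software.

```python
class LearningSwitch:
    """Toy layer 2 switch: learns source MACs, then forwards or floods."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src, dst):
        """Learn the source, then return the ports the frame is sent out of."""
        self.table[src] = in_port
        out_port = self.table.get(dst)
        if out_port is None:
            # Unknown destination: flood out of every port except the ingress.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is believed to be on the ingress port: nothing to send.
            return []
        return [out_port]


switch = LearningSwitch(["0/1", "0/2", "0/3"])
first = switch.receive("0/1", src="aaaa", dst="bbbb")  # A -> B, B not yet known
reply = switch.receive("0/2", src="bbbb", dst="aaaa")  # B -> A, A already learned
print(first)
print(reply)
print(switch.table)
```

The first frame is flooded to ports 0/2 and 0/3 because bbbb is unknown; the reply goes out of port 0/1 only, and the table ends up with the two entries shown above.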

If the switch were to then receive a frame on port 0/2 with a source MAC address of aaaa, there would be a clash and the switch would log something like this:

1664321: Nov 14 11:18:16 UTC: %MAC_MOVE-SP-4-NOTIF:
Host aaaa.aaaa.aaaa in vlan A is flapping between
port 0/1 and port 0/2

and the MAC address-table would become:

Port		Host
0/1
0/2		bbbb
0/2		aaaa

What happens when B tries to send A a frame now? The switch won’t flood the frame as it knows a destination and it won’t send the frame back down the link – it gets dropped.
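
To make the drop concrete, here is a rough Python sketch of the two decisions involved: noticing that a MAC has moved, and deciding what to do with a frame whose destination appears to sit on the port it arrived on. The helper names `learn` and `decide` are hypothetical, and the model ignores VLANs and ageing.

```python
def learn(table, in_port, src):
    """Record src on in_port; return the previous port if the MAC moved."""
    old_port = table.get(src)
    table[src] = in_port
    if old_port is not None and old_port != in_port:
        return old_port
    return None


def decide(table, in_port, dst):
    """Return 'flood', 'drop', or the egress port for a frame sent to dst."""
    out_port = table.get(dst)
    if out_port is None:
        return "flood"
    if out_port == in_port:
        # The switch believes dst lives on the port the frame arrived on,
        # so it does not send the frame back down that link.
        return "drop"
    return out_port


# State after A and B have both been learned normally.
table = {"aaaa": "0/1", "bbbb": "0/2"}

# A frame with source MAC aaaa now turns up on port 0/2.
moved_from = learn(table, "0/2", "aaaa")
if moved_from is not None:
    print(f"Host aaaa is flapping between port {moved_from} and port 0/2")

# B, on port 0/2, tries to send A a frame.
learn(table, "0/2", "bbbb")
result = decide(table, "0/2", "aaaa")
print(result)
```

With the table in its post-flap state, `decide` returns `drop` for B's frame: the destination is known, so there is no flood, and the only candidate egress port is the one the frame came in on.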

Lab time…

Let’s see if we can mimic this. This isn’t an easy thing to replicate so please forgive the artificial nature of the lab. I configured a switch with three hosts directly connected on VLAN 30. The hosts could ping each other and the MAC address-table was as follows:


3750-1#show mac address-table dynamic vlan 30
          Mac Address Table
-------------------------------------------

Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
  30    0008.7c82.5409    DYNAMIC     Fa1/0/1
  30    001a.2f22.d0c2    DYNAMIC     Fa1/0/2
  30    0024.97f0.3a70    DYNAMIC     Fa1/0/3
Total Mac Addresses for this criterion: 3

Host A had an IP of 192.168.30.1 and was on port 1. Host B was 192.168.30.30 and on port 2. Host C was 192.168.30.254 and on port 3.

So, ping with host A:

Host A# ping 192.168.30.254
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.30.254,
timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5),
round-trip min/avg/max = 1/201/1000 ms

Ping with host B:

Host B#ping 192.168.30.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.30.254,
timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5),
round-trip min/avg/max = 1/2/8 ms

Next I manually set host A to have the same MAC address as host B (001a.2f22.d0c2). The results? Host B lost connectivity for a few seconds.

Host A# configure terminal
Host A(config)# interface vlan 30
Host A(config-if)# mac-address 001a.2f22.d0c2
Host A(config-if)# end
Host A# ping 192.168.30.254
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.30.254,
timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

Here is the switch MAC address-table after the clone:

3750-1#show mac address-table dynamic vlan 30
          Mac Address Table
-------------------------------------------

Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
  30    0008.7c82.5409    DYNAMIC     Fa1/0/1
  30    001a.2f22.d0c2    DYNAMIC     Fa1/0/1
  30    0024.97f0.3a70    DYNAMIC     Fa1/0/3
Total Mac Addresses for this criterion: 3
3750-1#
*Mar 17 04:22:02.620: %SW_MATM-4-MACFLAP_NOTIF:
Host 001a.2f22.d0c2 in vlan 30 is flapping between
port Fa1/0/2 and port Fa1/0/1
3750-1#
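
If you keep snapshots of the table, spotting the move can be automated. The following is a minimal sketch that assumes the row layout shown in the two outputs above; `parse_mac_table` and `moved_macs` are illustrative names, and the regular expression has only been checked against rows of this shape.

```python
import re

# Matches rows such as "  30    001a.2f22.d0c2    DYNAMIC     Fa1/0/2".
ROW = re.compile(
    r"^\s*(\d+)\s+([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+(\S+)\s+(\S+)\s*$",
    re.IGNORECASE,
)


def parse_mac_table(text):
    """Return {(vlan, mac): port} for every table row found in text."""
    entries = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            vlan, mac, _entry_type, port = match.groups()
            entries[(int(vlan), mac.lower())] = port
    return entries


def moved_macs(before, after):
    """Return {(vlan, mac): (old_port, new_port)} for entries that changed port."""
    return {
        key: (before[key], after[key])
        for key in before.keys() & after.keys()
        if before[key] != after[key]
    }


# Rows copied from the two outputs in this post.
BEFORE = """
  30    0008.7c82.5409    DYNAMIC     Fa1/0/1
  30    001a.2f22.d0c2    DYNAMIC     Fa1/0/2
  30    0024.97f0.3a70    DYNAMIC     Fa1/0/3
"""
AFTER = """
  30    0008.7c82.5409    DYNAMIC     Fa1/0/1
  30    001a.2f22.d0c2    DYNAMIC     Fa1/0/1
  30    0024.97f0.3a70    DYNAMIC     Fa1/0/3
"""

moves = moved_macs(parse_mac_table(BEFORE), parse_mac_table(AFTER))
print(moves)
```

For the lab data this reports a single move: 001a.2f22.d0c2 in VLAN 30 went from Fa1/0/2 to Fa1/0/1, which matches the flap message the switch logged.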

Here is what happened to Host B:

Host B#ping 192.168.30.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.30.254,
timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Host B#ping 192.168.30.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.30.254,
timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5),
round-trip min/avg/max = 1/2/8 ms
Host B#ping 192.168.30.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.30.254,
timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5),
round-trip min/avg/max = 1/1/1 ms

Yes, this is the same impact you would have if two hosts had the same MAC on your network – there is a reason they need to be unique!

What does all this mean?

When you have an annexe VLAN [1] the backbone can be thought of as a series of Layer 2 switches for that VLAN. The ‘Broadcast Domain’ stretches over the entire Backbone. This means the CPU of every host (including our core switches) on a VLAN will receive every broadcast from every other host. This is not ideal, but it is the only way we can offer the same subnet at multiple sites in this generation of the backbone. Another term sometimes used is ‘Failure Domain’. That is, a failure in part of the VLAN could impact the entire core. It is because of this risk to other units that we are keen to make sure annexe VLANs are tightly managed.

[1] These are known as Layer 2 end-to-end VLANs as there is no routing involved. We have called them ‘switched’ VLANs in the past. VLANs with a Layer 3 interface or SVI on the backbone are known as Layer 3 Routed VLANs.

To return to the issues MAC flaps will cause on your network: each switch in the backbone has a MAC address-table for your VLAN. If for some reason your MAC addresses appear from different locations you will get dropped packets, and our logs will fill up with messages. Those messages cause issues when we raise a support case with Cisco, as our network appears to have loops.
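
As a rough idea of how such log messages could be tallied per host, here is a small Python sketch. It assumes each message sits on a single line (the examples in this post are wrapped for display) and only knows the two message formats quoted earlier; `count_flaps` is a hypothetical helper, not part of any logging tool.

```python
import re
from collections import Counter

# Covers the two message mnemonics quoted in this post.
FLAP = re.compile(
    r"%(?:SW_MATM-4-MACFLAP_NOTIF|MAC_MOVE-SP-4-NOTIF):\s*"
    r"Host (?P<mac>\S+) in vlan (?P<vlan>\S+) "
    r"is flapping between port (?P<port_a>\S+) and port (?P<port_b>\S+)"
)


def count_flaps(lines):
    """Count flap messages per (vlan, mac) pair."""
    counts = Counter()
    for line in lines:
        match = FLAP.search(line)
        if match:
            counts[(match.group("vlan"), match.group("mac"))] += 1
    return counts


# The two messages from this post, joined onto single lines,
# plus a placeholder line that should be ignored.
log_lines = [
    "1664321: Nov 14 11:18:16 UTC: %MAC_MOVE-SP-4-NOTIF: "
    "Host aaaa.aaaa.aaaa in vlan A is flapping between port 0/1 and port 0/2",
    "*Mar 17 04:22:02.620: %SW_MATM-4-MACFLAP_NOTIF: "
    "Host 001a.2f22.d0c2 in vlan 30 is flapping between port Fa1/0/2 and port Fa1/0/1",
    "an unrelated line that is not a flap message",
]

flap_counts = count_flaps(log_lines)
for (vlan, mac), count in flap_counts.most_common():
    print(vlan, mac, count)
```

Sorting by count makes the noisiest hosts stand out, which is usually the first thing you want to know when tracking down the source of a flap.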

What could cause it?

There are three common causes that we see:

Local loops
NAC
Wireless

1. Local Loops

If you don’t run STP then you are far more likely to suffer from network loops. Here are a couple of resources: STP is your friend and Implementing Spanning Tree. The issue with an annexe VLAN is that a local loop is no longer so local and could cause problems everywhere, both for you and others.
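
To see why a loop produces MAC flaps, consider a toy simulation of one broadcast frame in a triangle of three switches with no spanning tree. Everything here is an assumption made for illustration: the topology, the switch and port names, and the event limit that stops the frame circulating forever (in a real loop nothing stops it).

```python
from collections import deque

# Triangle of three switches, no STP: (switch, port) -> (peer switch, peer port).
LINKS = {
    ("S1", "to_S2"): ("S2", "to_S1"),
    ("S2", "to_S1"): ("S1", "to_S2"),
    ("S1", "to_S3"): ("S3", "to_S1"),
    ("S3", "to_S1"): ("S1", "to_S3"),
    ("S2", "to_S3"): ("S3", "to_S2"),
    ("S3", "to_S2"): ("S2", "to_S3"),
}
PORTS = {
    "S1": ["host", "to_S2", "to_S3"],  # the host hangs off S1
    "S2": ["to_S1", "to_S3"],
    "S3": ["to_S1", "to_S2"],
}


def simulate_broadcast(src_mac, first_switch, first_port, max_events=12):
    """Flood one broadcast frame around the loop and record every MAC move."""
    tables = {name: {} for name in PORTS}
    flaps = []  # (switch, old_port, new_port)
    queue = deque([(first_switch, first_port)])
    events = 0
    while queue and events < max_events:
        switch, in_port = queue.popleft()
        events += 1
        old_port = tables[switch].get(src_mac)
        if old_port is not None and old_port != in_port:
            flaps.append((switch, old_port, in_port))
        tables[switch][src_mac] = in_port
        # Broadcast: flood out of every port except the ingress.
        # Copies sent to the host port leave the simulation.
        for port in PORTS[switch]:
            if port != in_port and (switch, port) in LINKS:
                queue.append(LINKS[(switch, port)])
    return flaps


flaps = simulate_broadcast("aaaa", "S1", "host")
for switch, old_port, new_port in flaps:
    print(f"{switch}: aaaa moved from {old_port} to {new_port}")
```

Within a handful of hops every switch in the triangle has seen the host's MAC arrive on a second port, including S1, which now believes its own directly attached host lives somewhere across the loop.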

2. NAC

There is a legitimate but ill-advised network design which can cause issues. If you have a L2 NAC which forces all traffic through itself then it is possible that a frame will need to leave site A, get switched through to site B only to return to site A, all with the same MAC address. See the image below. I’ve represented the Backbone as one red switch and the ingress and egress ports as tunnel entrances and exits. This design mustn’t be used with the current generation of the backbone.

[Image: NAC issue]

3. Wireless

We used to run OWL and eduroam (Phase 1) over two VLANs which spanned the entire core. Due to the issues I’ve mentioned we changed this last year. Now the VLANs are local to the FroDos and routed through the core. Prior to doing this it was possible to roam between access points connected to different FroDos and cause MAC flaps.

What should I do next?

We’re going to keep an eye on the logs and will let Units know if they are causing MAC flaps. We’ll work with you as far as possible to locate the source of the issue and get things stable. If you aren’t yet running STP, can I urge you to consider doing so. The new backbone is still some years off, so for the good of everyone we need to work together to reduce this. For units which cannot resolve this we may need to look at reverting to a fully routed connection, with each Annexe having its own subnet.

Do get in touch if you have any questions.

Gaurav says:

2015-12-09 at 13:39

Hello,

What should the behavior be in the case below?

MAC M1 is statically configured to forward to interface xe1 in VLAN V1. Now packets arrive on xe2 with source MAC M1. Is the packet flooded in the VLAN? (All ports are part of VLAN V1.)

There will be no MAC move since I have added the entry statically, but will the packet be flooded?

Maciej says:

2015-06-10 at 14:42

@Miguel
Configure the Port-channel10 interface with these commands:

switchport trunk allowed vlan 1,11,12
switchport mode trunk

And show us the output of the “show interfaces trunk” command before and after the change.

bijith says:

2014-08-06 at 08:09

Hi
What about a MAC address flapping between a pair of switches? For example, this MAC address is flapping between the pair of switches:

SWITCH01#sh mac address-table address 00AA.02aa.1234
          Mac Address Table
-------------------------------------------

Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
        00AA.02aa.1234    DYNAMIC     Gi0/15

SWITCH02#sh mac address-table address 00AA.02aa.1234
          Mac Address Table
-------------------------------------------

Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
        00AA.02aa.1234    DYNAMIC     Gi0/13

Miguel Herrera says:

2012-01-23 at 14:43

I have a Cisco C3560 switch (the core switch) and want to set up an EtherChannel between it and a Cisco Small Business SGE2010 switch. When I connect the two, the Port-channel10 interface comes up for a moment, but after a while this message is displayed: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.xxxx.xxxx in vlan1 is flapping between port PO10 and port Gix/x

The configuration of the C3560 is:
Building configuration…

Current configuration : 2742 bytes
!
version 12.2
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname SW-CORE
!
boot-start-marker
boot-end-marker
!
enable secret 5 $1$nCgi$sf5IS0qv4ftx/jKN5gT70.
!
no aaa new-model
system mtu routing 1500
ip subnet-zero
ip routing
!
!
spanning-tree mode pvst
spanning-tree extend system-id
!
vlan internal allocation policy ascending
!
!
interface Port-channel10
description Enlace SW1-48P
switchport trunk encapsulation dot1q
switchport mode trunk
!
interface Port-channel2
description Enlace SW2-48P
switchport trunk encapsulation dot1q
switchport mode trunk
!
interface Port-channel3
description Enlace SW3-24P
switchport trunk encapsulation dot1q
switchport mode trunk
!
interface GigabitEthernet0/1
!
interface GigabitEthernet0/2
!
interface GigabitEthernet0/3
!
interface GigabitEthernet0/4
!
interface GigabitEthernet0/5
!
interface GigabitEthernet0/6
!
interface GigabitEthernet0/7
!
interface GigabitEthernet0/8
!
interface GigabitEthernet0/9
!
interface GigabitEthernet0/10
!
interface GigabitEthernet0/11
!
interface GigabitEthernet0/12
!
interface GigabitEthernet0/13
!
interface GigabitEthernet0/14
!
interface GigabitEthernet0/15
!
interface GigabitEthernet0/16
!
interface GigabitEthernet0/17
!
interface GigabitEthernet0/18
!
interface GigabitEthernet0/19
description Enlace SW1
channel-group 10 mode active
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,11,12
switchport mode trunk
!
interface GigabitEthernet0/20
description Enlace SW1
channel-group 10 mode active
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,11,12
switchport mode trunk
!
interface GigabitEthernet0/21
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,11,12
switchport mode trunk
!
interface GigabitEthernet0/22
description Enlace SW2
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,11,12
switchport mode trunk
!
interface GigabitEthernet0/23
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,11,12
switchport mode trunk
!
interface GigabitEthernet0/24
description Enlace SW3-P28
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,11,12
switchport mode trunk
!
interface GigabitEthernet0/25
!
interface GigabitEthernet0/26
!
interface GigabitEthernet0/27
!
interface GigabitEthernet0/28
!
interface Vlan1
ip address 192.0.1.115 255.255.255.0
!
interface Vlan11
description Red PayRoll
ip address 192.0.11.1 255.255.255.0
!
interface Vlan12
description Red Server
ip address 192.0.12.1 255.255.255.0
!
ip classless
ip route 0.0.0.0 0.0.0.0 192.0.1.2
ip http server
!
!
control-plane
!
!
line con 0
line vty 0 4
password cisco
login
length 0
line vty 5 15
password cisco
login
!
end

SW-CORE#

The SGE2010 switch supports only the LACP protocol, which is configured in a single step, which is why I set the core switch to active mode.

How can I solve my problem? Is there a configuration error on my core switch? Please help.

rob says:

2011-12-22 at 15:58

Great article, answered my question.

OUCS Backbone Network Naming and Numbering Conventions | OUCS Networks Team says:

2011-08-15 at 19:28

[…] it has the disadvantage of creating a large L2 (failure) domain which is a Very Bad Idea. See http://blogs.it.ox.ac.uk/networks/2011/02/04/mac-flaps-why-are-they-bad/ for more on this. Some Annexes have their own L3 connection which is less convenient but better […]
