Last week I invited Neeraj Malhotra, a Principal Engineer at Cisco, to present at the Bay Area Network Operators Group (BANOG) on how EVPN can be used to build multi-tenant data center fabrics. Neeraj gave a great presentation and I’m sharing his slides below.
EVPN has many use cases, and this presentation focuses on EVPN as a control plane in the data center to support host/VM mobility and active/active multihoming. I’m currently looking at EVPN as an alternative to the proprietary MLAG technology in my data centers.
The presentation abstract and slides are below. Enjoy.
EVPN-IRB (Integrated Routing and Bridging) is a technology that leverages BGP EVPN as a common overlay control plane to enable VPN routing and bridging services over an MPLS or IP underlay fabric. Point-to-multipoint bridging service enables VLANs to be stretched across a data center IP or MPLS fabric, while VPN routing service enables inter-subnet routing across these stretched subnets. It hence allows for flexible workloads with seamless VM mobility across the stretched subnet.
This talk will provide a tutorial of relevant EVPN constructs and procedures used to enable overlay bridged and routed connectivity between tenant workloads in a data center and compare three main design choices with respect to overlay routing architecture:
– Centralized EVPN-IRB: with centralized first-hop any-cast GW on the border leafs OR DCI / DC Edge routers
– Asymmetric EVPN-IRB: with distributed first-hop any-cast GW on the ToRs
– Symmetric EVPN-IRB: with distributed first-hop any-cast GW on the ToRs
It will further focus on a symmetric EVPN-IRB design with distributed any-cast GW, and go through detailed packet walks to get a good feel for how an EVPN-IRB based DC fabric works to provide any-to-any L2 and L3 overlay connectivity.
In this post I will walk you through some steps you can take to troubleshoot BGP neighbor adjacency. These steps become even more helpful when you have access to only one side of the link (for example, when you are trying to run BGP with a service provider). We will focus on some of the reasons that may prevent two BGP routers from forming a relationship, and will demonstrate along the way how the BGP state machine moves from Idle to Established.
I’m using here two Cisco CSR1000v routers as my BGP speakers but the tips below apply in general to any router from any vendor.
In the diagram above I have two routers, R1 and R2, with two parallel physical links between them. The two routers want to peer via BGP using their loopback addresses, which is a common way to do load sharing between two routers. However, the BGP adjacency is not coming up and is stuck in the Idle state, as you can see from the output below:
R2#sh ip bgp sum
BGP router identifier 188.8.131.52, local AS number 200
BGP table version is 1, main routing table version 1
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
184.108.40.206 4 100 0 0 1 0 0 never Idle
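If you want to check this output from a script rather than by eye, here is a minimal Python sketch that splits one neighbor row of the summary into fields. It relies on the last column (State/PfxRcd) showing a state name such as Idle while the session is down and a prefix count once it is Established. The function name and the dictionary keys are my own, not anything the router provides.

```python
def parse_summary_line(line):
    """Split one neighbor row of 'show ip bgp summary' into named fields.

    The last column (State/PfxRcd) holds a prefix count when the session
    is Established and a state name (Idle, Active, ...) otherwise.
    """
    # Split into at most 10 fields so a state such as "Idle (Admin)"
    # stays together in the last one.
    fields = line.split(None, 9)
    if len(fields) != 10:
        raise ValueError("unexpected summary row: %r" % line)
    last = fields[9].strip()
    established = last.isdigit()
    return {
        "neighbor": fields[0],
        "remote_as": int(fields[2]),
        "up_down": fields[8],
        "established": established,
        "state": "Established" if established else last,
        "prefixes": int(last) if established else None,
    }


row = parse_summary_line("184.108.40.206 4 100 0 0 1 0 0 never Idle")
print(row["neighbor"], row["state"], row["established"])
```

Any row whose state is not Established is a candidate for the steps that follow.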
Follow the steps below to verify that your configurations are complete and that there are no connectivity issues between the two routers:
1- First test and verify that R1 is reachable from R2 and vice versa. Issue a ping from R2, sourced from its loopback0 interface, with R1’s loopback0 address as the destination, as shown below:
R2#ping 220.127.116.11 source loopback 0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 18.104.22.168, timeout is 2 seconds:
Packet sent with a source address of 22.214.171.124
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
If your ping fails then you have a connectivity problem and you need to fix that before continuing this process. You want to make sure that the static routes are there on each router and that each router is able to ARP for the other router’s IP address.
2- You also need to verify that there is no firewall between R1 and R2 blocking TCP port 179, which is the port BGP uses to establish the connection. A quick way to check for a firewall between R1 and R2 is to use the telnet command with port 179 as the destination port. Perform this test from both routers and don’t forget to source the traffic from the loopback interface:
As you can see from the output above, I got a “Connection refused by remote host” response when I tried to telnet from R2 to R1. This simply means that no device in the middle is blocking the traffic, and that R1 itself is rejecting the request, since 179 is not a standard port for telnet. If a firewall in the middle were blocking the traffic, you would get an “Unable to connect to remote host” response instead.
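The same test can be approximated from a Linux or other host that sits on the path, when you don't have a router CLI handy. The sketch below is only an illustration: `probe_tcp` and its return labels are names I made up, and the mapping of results to causes follows the reasoning above rather than any guarantee (a firewall can also be configured to send resets).

```python
import socket


def probe_tcp(host, port=179, source_ip=None, timeout=3.0):
    """Try a TCP connection and classify the outcome.

    "open"        - the three-way handshake completed
    "refused"     - a TCP reset came back, so the path itself is working
    "timeout"     - no answer at all, which hints at a silent drop
    "unreachable" - any other socket error (no route, bad source, ...)
    """
    # Binding to source_ip plays the role of "source loopback 0" on the router.
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "unreachable"
```

For example, `probe_tcp("192.0.2.1", 179, source_ip="192.0.2.2")` would test port 179 on a peer at 192.0.2.1 from a local address of 192.0.2.2. Both addresses are placeholders from the documentation range, not the ones used in this lab.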
3- Now that we have verified that there are no connectivity problems, let’s focus on the BGP configurations. I will turn on “debug ip bgp x.x.x.x” on the router which shows me that the router is failing to establish a TCP connection with its peer. A good show command to use at this point is “show ip bgp neighbor x.x.x.x”
R2#show ip bgp nei 126.96.36.199
BGP neighbor is 188.8.131.52, remote AS 100, external link
BGP version 4, remote router ID 0.0.0.0
Address tracking is enabled, the RIB does have a route to 184.108.40.206
Connections established 0; dropped 0
Last reset never
External BGP neighbor not directly connected.
Transport(tcp) path-mtu-discovery is enabled
Graceful-Restart is disabled
No active TCP connection
The output above tells me two important things. First, the RIB does have a route to reach the peer, which confirms that the router has the static routes needed to reach its peer’s loopback address.
The second important thing this output shows is this line: “External BGP neighbor not directly connected”. By default only directly connected eBGP peers are allowed to establish a relationship. In order to change this default behavior, I have to add the “neighbor x.x.x.x disable-connected-check” command on both routers.
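To make the idea of "directly connected" concrete, here is a rough Python sketch of the kind of test involved: is the neighbor's address inside one of the subnets configured on a local interface? This is a simplification for illustration, not how IOS implements the check, and the /24 mask and the 192.0.2.x loopback addresses are assumptions (only 192.168.32.20 appears in this post).

```python
import ipaddress


def is_directly_connected(neighbor_ip, connected_networks):
    """Return True if neighbor_ip falls inside any locally connected subnet."""
    addr = ipaddress.ip_address(neighbor_ip)
    return any(addr in ipaddress.ip_network(net) for net in connected_networks)


# Hypothetical view from R2: one physical subnet plus its own loopback.
r2_connected = ["192.168.32.0/24", "192.0.2.2/32"]

print(is_directly_connected("192.168.32.10", r2_connected))  # peer's physical address
print(is_directly_connected("192.0.2.1", r2_connected))      # peer's loopback address
```

A peer's loopback is reached through a static route rather than a connected subnet, which is why loopback-to-loopback eBGP peering trips this check.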
4- Even after disabling the directly-connected check, the BGP relationship is still not coming up, so my next step is to enable “debug ip tcp transactions” to see if that tells me why the TCP connection is failing:
R1#deb ip tcp transactions
TCP special event debugging is on
*Mar 13 05:53:52.522: Reserved port 0 in Transport Port Agent for TCP IP type 0
*Mar 13 05:53:52.522: TCP: connection attempt to port 179
*Mar 13 05:53:52.522: TCP: sent RST to 192.168.32.20:27730 from 220.127.116.11:179
The last line in the output above is interesting. It shows that R1 (18.104.22.168:179) is sending a TCP reset to R2 (192.168.32.20), which means that it was R2 that initiated the TCP session. It also reveals that R2 sourced the connection request from its physical interface, which is the default behavior in eBGP. But since I want the routers to peer using the loopback addresses, and each router expects to receive the connection request from its peer’s loopback address, I need to add the “neighbor x.x.x.x update-source loopback0” command to BGP on both ends.
5- Now if I look at “debug ip bgp”, I can see that the TCP session is getting established and BGP is transitioning from Idle -> Connect -> OpenSent -> OpenConfirm as shown below:
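If you are going through a lot of debug output, the comparison I just did by eye can be scripted. The Python sketch below pulls the addresses out of a “sent RST” line and checks whether the remote address is one of the configured neighbors. The regular expression is written against the single debug line shown above, and the neighbor address 192.0.2.2 is a placeholder, so treat both as assumptions.

```python
import re

# Pattern written against the one "sent RST" line shown in this post.
RST_LINE = re.compile(
    r"TCP: sent RST to (?P<remote_ip>[\d.]+):(?P<remote_port>\d+) "
    r"from (?P<local_ip>[\d.]+):(?P<local_port>\d+)"
)


def parse_rst(line):
    """Return the endpoints named in a 'sent RST' debug line, or None."""
    match = RST_LINE.search(line)
    if match is None:
        return None
    return {
        "remote_ip": match.group("remote_ip"),
        "remote_port": int(match.group("remote_port")),
        "local_ip": match.group("local_ip"),
        "local_port": int(match.group("local_port")),
    }


def source_matches_neighbor(line, configured_neighbors):
    """True if the reset went to an address that is a configured neighbor."""
    info = parse_rst(line)
    return info is not None and info["remote_ip"] in configured_neighbors


line = ("*Mar 13 05:53:52.522: TCP: sent RST to 192.168.32.20:27730 "
        "from 220.127.116.11:179")
print(parse_rst(line))
print(source_matches_neighbor(line, {"192.0.2.2"}))  # hypothetical loopback of R2
```

When the second call reports no match, the peer is connecting from an address the local router was not configured to expect, which is the situation described above.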
*Mar 13 06:49:28.979: BGP: 22.214.171.124 passive open to 126.96.36.199
*Mar 13 06:49:28.979: BGP: Fetched peer 188.8.131.52 from tcb
*Mar 13 06:49:28.979: BGP: 184.108.40.206 passive went from Idle to Connect
*Mar 13 06:49:28.979: BGP: ses global 220.127.116.11 (0x7F028066E270:0) pas Receive OPEN
*Mar 13 06:49:28.979: BGP: ses global 18.104.22.168 (0x7F028066E270:0) pas Send OPEN
*Mar 13 06:49:28.979: BGP: 22.214.171.124 passive went from Connect to OpenSent
*Mar 13 06:49:28.979: BGP: 126.96.36.199 passive went from OpenSent to OpenConfirm
*Mar 13 06:49:28.980: %BGP-3-NOTIFICATION: received from neighbor 188.8.131.52 passive 2/2 (peer in wrong AS) 2 bytes 00C8
*Mar 13 06:49:28.980: BGP: ses global 184.108.40.206 (0x7F028066E270:0) pas Receive NOTIFICATION 2/2 (peer in wrong AS) 2 bytes 00C8
*Mar 13 06:49:28.980: %BGP-5-NBR_RESET: Neighbor 220.127.116.11
*Mar 13 06:49:28.980: BGP: 18.104.22.168 passive went from OpenConfirm to Closing
*Mar 13 06:49:28.980: BGP: 22.214.171.124 passive went from Closing to Idle
When BGP is in the OpenConfirm state it’s one step away from reaching its final state (ESTABLISHED) and while in the OpenConfirm state BGP waits to hear a KEEPALIVE from its peer before it moves to the Established state. As you can see from the output above, after reaching OpenConfirm BGP instead closes the connection and transitions back to Idle because it receives an error (peer in wrong AS).
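The path through the state machine can be sketched as a small lookup table in Python. This is a deliberately simplified model: the event names are my own, the Active state and all timers are left out, and the IOS-specific Closing step seen in the debug is not modeled. Any event that is not expected in the current state, including a NOTIFICATION, sends the session back to Idle.

```python
# Simplified BGP session state machine. Event names are illustrative only.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",        # our OPEN goes out here
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
}


def step(state, event):
    """Return the next state; unexpected events drop the session to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Feed a list of events through the machine and return every state visited."""
    path = [state]
    for event in events:
        state = step(state, event)
        path.append(state)
    return path


# The successful case.
print(run(["start", "tcp_established", "open_received", "keepalive_received"]))
# The failure in the debug above: a NOTIFICATION arrives in OpenConfirm.
print(run(["start", "tcp_established", "open_received", "notification_received"]))
```

The second run ends in Idle, which matches what the debug shows once the “peer in wrong AS” notification is received.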
This is a clear indication that the AS number configured on R1 is wrong, so I will fix that and issue a “clear ip bgp *” command to restart the process.
And now, after I corrected the AS number in the configs on R1, the BGP state machine transitions to Established as you see below, and the two peers can start exchanging routing updates and keepalives.
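For reference, the “2/2” and “00C8” in the notification can be decoded with a few lines of Python. The error code and OPEN subcode names come from the BGP specification (RFC 4271, plus RFC 5492 for subcode 7). Reading the two data bytes as an AS number is my interpretation of this debug output rather than something the specification requires, and the function name is made up.

```python
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

# Subcodes that apply when the error code is 2 (OPEN Message Error).
OPEN_SUBCODES = {
    1: "Unsupported Version Number",
    2: "Bad Peer AS",
    3: "Bad BGP Identifier",
    4: "Unsupported Optional Parameter",
    6: "Unacceptable Hold Time",
    7: "Unsupported Capability",
}


def decode_notification(code_subcode, data_hex):
    """Decode a 'code/subcode' pair and the hex data that follows it."""
    code, subcode = (int(part) for part in code_subcode.split("/"))
    error = ERROR_CODES.get(code, "Unknown")
    detail = OPEN_SUBCODES.get(subcode, "Unknown") if code == 2 else "not decoded here"
    # Assumption: for Bad Peer AS the data bytes carry the rejected AS number.
    value = int(data_hex, 16)
    return error, detail, value


print(decode_notification("2/2", "00C8"))
```

Here 00C8 is hexadecimal for 200, which is R2's local AS in the summary output at the top of this post. In other words, the peer is rejecting AS 200 because it was configured to expect a different remote AS.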
R2#sh ip bgp neighbors 126.96.36.199
BGP neighbor is 188.8.131.52, remote AS 100, external link
BGP version 4, remote router ID 184.108.40.206
BGP state = Established, up for 00:00:42
Obviously that’s not everything, and there are other reasons that could prevent a BGP relationship from being established, but I wanted to discuss the most common ones that I have seen in the field. Do you have something to share? Please respond and share below.
Here are the final configs for R1 and R2 for your reference.