Thursday, February 12, 2015

Everything You Always Wanted to Know About Openstack network with vlan setup



Apparently some readers have had visualization problems with this page (colors and images). I'm not willing to debug Blogger, so here is a link to a public doc containing the same information about OpenStack debugging with a VLAN network setup.


Everything You Always Wanted to Know About OpenStack Networking*

* But Were Afraid to Ask

AKA debugging an OpenStack VLAN setup



Disclaimer

Here is a tentative guide to testing and debugging, mostly focused on networking, in the OpenStack cloud world.
We have spent a huge amount of time looking at packet dumps in order to distill this information for you, in the belief that, following the recipes outlined in the following pages, you will have an easier time!
Keep in mind that this comes more from day-by-day debugging than from a structured plan, so I have tried to separate the pieces according to the architecture I have in mind... but it is, and will remain, a work in progress.

Reference setup:

The setup is the following:
  1. compute node - Ubuntu server 14 - 4 ethernet interfaces mapped on em1-4 (3 used)
  2. controller - compute node - Ubuntu server 14 - 4 ethernet interfaces mapped on em1-4  (3 used)
  3. network node - Ubuntu server 14 - 4 ethernet interfaces mapped on em1-4  (3 used)
The networking configuration is implemented with the neutron service and is based on a VLAN approach, so as to obtain complete L2 separation in a multi-tenant environment.
Follow the OpenStack guide to configure the services (the configuration files used in this case and a few configuration scripts are in the appendix).

Preliminary checks

Once you have agreed with your network administrators on the switch configuration (if you have no direct access to the switches), double-check the port configuration for the VLAN IDs:
Capture an LLDP packet (0x88cc) from each host and for each interface:
# tcpdump -vvv -s 1500 ether proto 0x88cc -i em1
(wait for a packet and then CTRL-c)
This command will give you some information about the switch you are connected to and the VLAN configuration. NB: if the port is in trunk mode you may get the same result as if the port had no VLAN settings. A loop to repeat this check on all interfaces is sketched after the examples below.
An example of the output of the command for an interface attached to a port that is configured as access:
tcpdump: WARNING: em1: no IPv4 address assigned
tcpdump: listening on em1, link-type EN10MB (Ethernet), capture size 1500 bytes
12:33:03.255101 LLDP, length 351
[...]
System Name TLV (5), length 13: stackdr2.GARR
 0x0000:  7374 6163 6b64 7232 2e47 4152 52
[...]
Port Description TLV (4), length 21: GigabitEthernet2/0/31
[...]
Organization specific TLV (127), length 6: OUI Ethernet bridged (0x0080c2)
 Port VLAN Id Subtype (1)
 port vlan id (PVID): 320
[...]
1 packet captured
1 packet received by filter
0 packets dropped by kernel

An example of the output of the command for an interface attached to a port that is configured as trunk:
# tcpdump -vvv -s 1500 ether proto 0x88cc -i em3
tcpdump: WARNING: em3: no IPv4 address assigned
tcpdump: listening on em3, link-type EN10MB (Ethernet), capture size 1500 bytes
12:32:11.513135 LLDP, length 349
[...]
System Name TLV (5), length 13: stackdr2.GARR
[...]
Port Description TLV (4), length 20: GigabitEthernet2/0/3
[...]
 Port VLAN Id Subtype (1)
   port vlan id (PVID): 1
[...]^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel
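If you want to repeat this check quickly on every interface, a small loop along these lines can help (a sketch: it assumes the switch actually emits LLDP frames and that the interface names em1-em4 match this setup; timeout avoids waiting forever on silent ports):
for i in em1 em2 em3 em4; do
  echo "=== $i ==="
  # wait at most 70s for one LLDP frame, then print the interesting TLVs
  timeout 70 tcpdump -nn -v -s 1500 -c 1 -i "$i" ether proto 0x88cc 2>/dev/null \
    | grep -E 'System Name|Port Description|PVID'
done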

Check Interfaces

On the compute nodes, use the following command to see information about the interfaces (IPs, VLAN IDs) and to check whether the interfaces are up:
# ip a
One good initial sanity check is to make sure that your interfaces are up:
# ip a |grep em[1,3] |grep state
2: em3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
6: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
37: br-em3: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
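Another quick check (a sketch using the interface and bridge names of this setup) is to print just the state and MTU of the devices involved, since an MTU mismatch or a DOWN member is a common culprit:
for i in em1 em3 br-em3 br-int br-ex; do
  echo -n "$i: "
  # one line per device: keep only the "mtu N" and "state X" fields
  ip -o link show "$i" 2>/dev/null | grep -o 'mtu [0-9]*\|state [A-Z]*' | tr '\n' ' '
  echo
done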

Troubleshooting Open vSwitch

Open vSwitch is a multilayer virtual switch. Full documentation can be found on the project website. In practice you need to ensure that the required bridges (br-int, br-ex, br-em1, br-em3, etc.) exist and have the proper ports connected to them, using the ovs-vsctl and ovs-ofctl commands.
To list the bridges on a system (VLAN networks are trunked through the em3 network interface):
# ovs-vsctl list-br
br-em3
br-ex
br-int

 

Example: on the network node (the same logic applies on the compute nodes).
Let's check the chain of ports and bridges. The bridge br-em3 contains the physical network interface em3 (trunk network) and the virtual interface phy-br-em3, which is patched to int-br-em3 on br-int:
# ovs-vsctl list-ports br-em3
em3
phy-br-em3

# ovs-vsctl show
Bridge "br-em3"
       Port "em3"
           Interface "em3"
       Port "phy-br-em3"
           Interface "phy-br-em3"
               type: patch
               options: {peer="int-br-em3"}
       Port "br-em3"
           Interface "br-em3"
               type: internal

br-int contains int-br-em3, which pairs with phy-br-em3 to reach the physical network used to connect to the compute nodes, the tap devices that connect to the DHCP agents, and the qr- interfaces that connect to the virtual routers:
# ovs-vsctl list-ports br-int
int-br-em3
int-br-ex
qr-9ae4acd4-92
qr-ae75168a-67
qr-e323976e-2b
qr-e3debf8d-ee
tap1474f18d-a9
tap7c29ce27-4e
tapc974ab53-25
tapd9762af3-4b

# ovs-vsctl show
Bridge br-int
       fail_mode: secure
       Port "tapd9762af3-4b"
           tag: 5
           Interface "tapd9762af3-4b"
               type: internal
       Port int-br-ex
           Interface int-br-ex
               type: patch
               options: {peer=phy-br-ex}
      [...]
Port "qr-9ae4acd4-92"
           tag: 1
           Interface "qr-9ae4acd4-92"
               type: internal
       Port br-int
           Interface br-int
               type: internal
       Port "tap1474f18d-a9"
           tag: 3
           Interface "tap1474f18d-a9"
               type: internal
And the br-ex section of the ovs-vsctl show output:
Bridge br-ex
       Port br-ex
           Interface br-ex
               type: internal
       Port "em4"
           Interface "em4"
       Port phy-br-ex
           Interface phy-br-ex
               type: patch
               options: {peer=int-br-ex}
If any of these links is missing or incorrect, it suggests a configuration error.
NB: you can also check that the VLAN tags are translated correctly along the whole chain with the ovs-ofctl commands, e.g. (more details follow):
# ovs-ofctl dump-flows br-int            
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=6718.658s, table=0, n_packets=0, n_bytes=0, idle_age=6718, priority=3,in_port=1,dl_vlan=325 actions=mod_vlan_vid:4,NORMAL
cookie=0x0, duration=6719.335s, table=0, n_packets=0, n_bytes=0, idle_age=6719, priority=3,in_port=1,dl_vlan=327 actions=mod_vlan_vid:3,NORMAL
cookie=0x0, duration=6720.508s, table=0, n_packets=3, n_bytes=328, idle_age=6715, priority=3,in_port=1,dl_vlan=328 actions=mod_vlan_vid:1,NORMAL
cookie=0x0, duration=5840.156s, table=0, n_packets=139, n_bytes=13302, idle_age=972, priority=3,in_port=1,dl_vlan=320 actions=mod_vlan_vid:5,NORMAL
cookie=0x0, duration=6719.906s, table=0, n_packets=58, n_bytes=6845, idle_age=6464, priority=3,in_port=1,dl_vlan=324 actions=mod_vlan_vid:2,NORMAL
cookie=0x0, duration=6792.845s, table=0, n_packets=555, n_bytes=100492, idle_age=9, priority=2,in_port=1 actions=drop
cookie=0x0, duration=6792.025s, table=0, n_packets=555, n_bytes=97888, idle_age=9, priority=2,in_port=2 actions=drop
cookie=0x0, duration=6793.667s, table=0, n_packets=203, n_bytes=22402, idle_age=4535, priority=1 actions=NORMAL
cookie=0x0, duration=6793.605s, table=23, n_packets=0, n_bytes=0, idle_age=6793, priority=0 actions=drop

Bridges can be added with ovs-vsctl add-br, and ports can be added to bridges with ovs-vsctl add-port.
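For reference, a missing bridge/port pair like the ones above would be recreated with something along these lines (names follow this setup; the int-br-em3/phy-br-em3 patch ports are normally created by the neutron Open vSwitch agent itself, not by hand):
# ovs-vsctl add-br br-em3
# ovs-vsctl add-port br-em3 em3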

Troubleshoot neutron traffic

Refer to the Cloud Administrator Guide for a variety of networking scenarios and their connection paths. We use the Open vSwitch (OVS) backend.
See the following figure for reference.
  1. The instance generates a packet and sends it through the virtual NIC inside the instance, such as eth0.
  2. The packet transfers to a Test Access Point (TAP) device on the compute host, such as tap1d40b89c-fe. You can find out which TAP device is being used by looking at the /etc/libvirt/qemu/instance-xxxxxxxx.xml file.
Below is an example with the interesting parts highlighted:
<domain type='kvm'>
 <name>instance-00000015</name>
 <uuid>cc2b7876-6d3a-4b78-b817-ed36146a9b9e</uuid>
[....]
<controller type='pci' index='0' model='pci-root'/>
   <interface type='bridge'>
     <mac address='fa:16:3e:a4:56:3d'/>
     <source bridge='qbrff8e411e-6e'/>
     <target dev='tapff8e411e-6e'/>
     <model type='virtio'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
   </interface>
   <serial type='file'>
     <source
[....]
Another way of finding the device name is to use the neutron commands; to get the port ID associated with IP address 192.168.4.102:
# neutron port-list | grep 192.168.4.102 | cut -d \| -f 2
    ff8e411e-6e08-499f-b9a5-0beca2c94b85
Fig.: Neutron network paths (see the networking scenarios chapter of the Cloud Administrator Guide for more details).
Looking also at the neutron part and highlighting the VLAN configuration, we have something like the following (I recycled the image, so br-eth1 corresponds to br-emXX in my setup and ethYY to emZZ, but the flow is the point I want to stress here):

  1. The TAP device is connected to the integration bridge, br-int. This bridge connects all the instance TAP devices and any other bridges on the system. int-br-eth1 is one half of a veth pair connecting to the bridge br-eth1, which handles VLAN networks trunked over the physical Ethernet device eth1.
  2. The TAP devices and veth devices are normal Linux network devices and may be inspected with the usual tools, such as ip and tcpdump. Open vSwitch internal devices are only visible within the Open vSwitch environment.
# tcpdump -i int-br-em3
tcpdump: int-br-em3: No such device exists
(SIOCGIFHWADDR: No such device)
  1. To watch packets on internal interfaces you need to create a dummy network device and add it to the bridge containing the internal interface you want to snoop on. Then tell Open vSwitch to mirror all traffic to or from the internal port onto this dummy port, so that running tcpdump on the dummy interface shows you the traffic on the internal port.
  2. Capture packets from an internal interface on the integration bridge, br-int (advanced):
    1. Create and bring up a dummy interface, snooper0:
# ip link add name snooper0 type dummy
# ip link set dev snooper0 up
    2. Add device snooper0 to bridge br-int:
# ovs-vsctl add-port br-int snooper0
    3. Create a mirror of, for example, the int-br-em3 interface to snooper0 (all on one line; the command returns the UUID of the mirror port):
# ovs-vsctl -- set Bridge br-int mirrors=@m  -- --id=@snooper0 get Port snooper0  -- --id=@int-br-em3 get Port int-br-em3  -- --id=@m create Mirror name=mymirror select-dst-port=@int-br-em3 select-src-port=@int-br-em3 output-port=@snooper0
dcce2c59-be1a-4f2d-b00b-9d906c77ee8a
    4. From here you can see the traffic going through int-br-em3 with tcpdump -i snooper0.
    5. Clean up the mirrors when done:
# ovs-vsctl clear Bridge br-int mirrors
# ovs-vsctl del-port br-int snooper0
# ip link delete dev snooper0
On the integration bridge, networks are distinguished using internal VLAN ids (unrelated to the segmentation IDs used in the network definition and on the physical wire) regardless of how the networking service defines them. This allows instances on the same host to communicate directly without transiting the rest of the virtual, or physical, network. On the br-int, incoming packets are translated from external tags to internal tags. Other translations also happen on the other bridges and will be discussed later.
  1. To discover which internal VLAN tag is in use for a given external VLAN, use the ovs-ofctl command:
    1. Find the external VLAN tag of the network you're interested in with:
# neutron net-show --fields provider:segmentation_id <network name>
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| provider:network_type     | vlan                                 |
| provider:segmentation_id  | 324                                 |
+---------------------------+--------------------------------------+
    2. Grep for the provider:segmentation_id, 324 in this case, in the output of ovs-ofctl dump-flows br-int:
# ovs-ofctl dump-flows br-int|grep vlan=324
cookie=0x0, duration=105039.122s, table=0, n_packets=5963, n_bytes=482203, idle_age=1104, hard_age=65534, priority=3,in_port=1,dl_vlan=324 actions=mod_vlan_vid:1,NORMAL
    3. Here you can see that packets received on port ID 1 with the VLAN tag 324 are modified to carry the internal VLAN tag 1. Digging a little deeper, you can confirm that port 1 is in fact int-br-em3:
# ovs-ofctl show br-int
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000029a51549b40
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
1(int-br-em3): addr:52:40:bd:b3:88:9c
    config:     0
    state:      0
    speed: 0 Mbps now, 0 Mbps max
2(qvof3b63d31-a0): addr:4e:db:74:04:53:4d
    config:     0
    state:      0
    current:    10GB-FD COPPER
    speed: 10000 Mbps now, 0 Mbps max
3(qvo65fb5ad8-b5): addr:92:75:b8:03:cc:1d
    config:     0
    state:      0
    current:    10GB-FD COPPER
    speed: 10000 Mbps now, 0 Mbps max
4(qvoa6e8c6e3-1c): addr:82:22:71:c5:4e:f8
    config:     0
    state:      0
    current:    10GB-FD COPPER
    speed: 10000 Mbps now, 0 Mbps max
5(qvo1d40b89c-fe): addr:5e:e3:15:53:e5:16
    config:     0
    state:      0
    current:    10GB-FD COPPER
    speed: 10000 Mbps now, 0 Mbps max
6(qvoff8e411e-6e): addr:02:a9:38:d6:88:22
    config:     0
    state:      0
    current:    10GB-FD COPPER
    speed: 10000 Mbps now, 0 Mbps max
LOCAL(br-int): addr:02:9a:51:54:9b:40
    config:     0
    state:      0
    speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

  1. (NB: this is NOT valid if you are using GRE tunnels.) VLAN-based networks exit the integration bridge via the veth interface int-br-em3 (int-br-eth1 in the picture) and arrive on the bridge br-em3 (br-eth1) on the other member of the veth pair, phy-br-em3 (phy-br-eth1). Packets on this interface arrive with internal VLAN tags and are translated to external tags in the reverse of the process described above:
# ovs-ofctl dump-flows br-em3|grep 324
cookie=0x0, duration=105402.89s, table=0, n_packets=7374, n_bytes=905197, idle_age=1468, hard_age=65534, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:324,NORMAL
    1. Packets, now tagged with the external VLAN tag, then exit onto the physical network via em3 (eth1). The layer-2 switch port this interface is connected to must be configured as a trunk for the VLAN IDs used, and the next hop for the packet must be on the same layer-2 network.
  1. The packet is then received on the network node. Note that any traffic to the l3-agent or dhcp-agent will be visible only within their network namespace. Watching any interfaces outside those namespaces, even those that carry the network traffic, will only show broadcast packets such as Address Resolution Protocol (ARP) requests; unicast traffic to the router or DHCP address will not be seen. See Dealing with Network Namespaces for details on how to run commands within these namespaces.
  2. Alternatively, it is possible to configure VLAN-based networks to use external routers rather than the l3-agent shown here, so long as the external router is on the same VLAN:
    1. VLAN-based networks are received as tagged packets on a physical network interface, eth1 in this example. Just as on the compute node, this interface is a member of the br-eth1 bridge.
    2. GRE-based networks will be passed to the tunnel bridge br-tun, which behaves just like the GRE interfaces on the compute node.
  3. Next, the packets from either input go through the integration bridge, again just as on the compute node.
  4. The packet then makes it to the l3-agent. This is actually another TAP device within the router's network namespace. Router namespaces are named in the form qrouter-<router-uuid>. Running ip a within the namespace will show the TAP device name, qr-e6256f7d-31 in this example:
  5. # ip netns exec qrouter-e521f9d0-a1bd-4ff4-bc81-78a60dd88fe5 ip a|grep state
    10: qr-e6256f7d-31: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue \
       state UNKNOWN
    11: qg-35916e1f-36: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 \
       qdisc pfifo_fast state UNKNOWN qlen 500
    28: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
  6. The qg-<n> interface in the l3-agent router namespace sends the packet on to its next hop through device eth2 on the external bridge br-ex. This bridge is constructed similarly to br-eth1 and may be inspected in the same way.
  7. This external bridge also includes a physical network interface, eth2 in this example, which finally lands the packet on the external network destined for an external router or destination.
  8. DHCP agents running on OpenStack networks run in namespaces similar to the l3-agents. DHCP namespaces are named qdhcp-<uuid> and have a TAP device on the integration bridge. Debugging of DHCP issues usually involves working inside this network namespace.

Debug a problem along the Path

Ping is your best friend! From an instance (a minimal command sequence is sketched after this list):
  1. See whether you can ping an external host, such as 8.8.8.8 (a Google DNS server, which is almost always up).
  2. If you can't, try the IP address of the compute node where the virtual machine is hosted.
  3. If you can ping this IP, then the problem is somewhere between the compute node and that compute node's gateway.
  4. If you can't, the problem is between the instance and the compute node. Also check the bridge connecting the compute node's main NIC with the vnet NIC of the VM.
  5. Launch a second instance and see whether the two instances can ping each other. If they can, the issue might be related to the firewall on the compute node. See below for iptables debugging.
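A minimal sequence from inside the instance, following the steps above (the addresses in angle brackets are placeholders for your own setup):
ping -c 3 8.8.8.8                # step 1: an external host
ping -c 3 <compute-node-ip>      # steps 2-4: the compute node hosting the VM
ping -c 3 <other-instance-ip>    # step 5: another instance on the same network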

tcpdump

This is your second best friend when troubleshooting network issues. Running tcpdump at several points along the network path should help you find where the problem is.
For example, run the following command:
tcpdump -i any -n -v \
 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
on:
  1. An external server outside of the cloud (193.206.159.201 in this example)
  2. A compute node
  3. An instance running on that compute node
In the captures below, the instance's private address is 192.168.4.103 and it appears on the external network as 90.147.91.10.
Next, open a new shell to the instance and then ping the external host where tcpdump is running. If the network path to the external server and back is fully functional, you see something like the following:
On the external server:
$ tcpdump -i any -n -v \
 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
10:20:23.517242 IP (tos 0x0, ttl 64, id 65416, offset 0, flags [none], proto ICMP (1), length 84)
   193.206.159.201 > 90.147.91.10: ICMP echo reply, id 1606, seq 28, length 64

The external server received the ping request and sent back a ping reply.

On the compute node you can follow the traffic along the path:
  1. On the tap device connecting the VM to the Linux bridge (see the previous section for how to find the tap device name):
# tcpdump -i tap88ab3af7-7d -n -v \
 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: tap88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on tap88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:36:31.000419 IP (tos 0x0, ttl 64, id 1469, offset 0, flags [DF], proto ICMP (1), length 84)
   192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 1, length 64
  2. On the two sides of the veth pair between the Linux bridge and the OVS br-int:
# tcpdump -i qbr88ab3af7-7d -n -v \
 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: qbr88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on qbr88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:36:59.035767 IP (tos 0x0, ttl 64, id 1497, offset 0, flags [DF], proto ICMP (1), length 84)
   192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 29, length 64
root@compute:~# tcpdump -i qvb88ab3af7-7d -n -v \
 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: qvb88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on qvb88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:37:18.058899 IP (tos 0x0, ttl 64, id 1516, offset 0, flags [DF], proto ICMP (1), length 84)
   192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 48, length 64

  3. And finally on the outgoing interface (em1 in this example):
# tcpdump -i em1 -n -v \
 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: em1: no IPv4 address assigned
tcpdump: listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
10:37:49.099383 IP (tos 0x0, ttl 64, id 1547, offset 0, flags [DF], proto ICMP (1), length 84)
   192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 79, length 64

On the instance:
# tcpdump -i any -n -v \
 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
09:27:04.801759 IP (tos 0x0, ttl 64, id 36704, offset 0, flags [DF], proto ICMP (1), length 84)
   192.168.4.103 > 192.168.21.107: ICMP echo request, id 1693, seq 27, length 64

NB: it can be useful to show the VLAN tag when debugging traffic. To do this use:
# tcpdump -i <iface> -Uw - | tcpdump -en -r - vlan <id>
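For example, to verify on the trunk interface that traffic for segmentation ID 324 really leaves em3 with the expected tag (just an instance of the command above, using a VLAN from this setup):
# tcpdump -i em3 -Uw - | tcpdump -en -r - vlan 324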

iptables and security rules

OpenStack Compute automatically manages iptables, including forwarding packets to and from instances on a compute node, forwarding floating IP traffic, and managing security group rules.
# iptables-save
shows you all the rules.
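For a single instance it is usually easier to filter the dump by the beginning of its neutron port ID (a sketch; ff8e411e is the port from the libvirt XML example above, and with the OVS hybrid firewall driver the per-instance chains and rules contain that prefix):
# iptables-save | grep ff8e411e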

Example of setup of security rules

To show the security rules:
# nova secgroup-list-rules default
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range  | Source Group |
+-------------+-----------+---------+-----------+--------------+
|             |           |         |           | default      |
|             |           |         |           | default      |
+-------------+-----------+---------+-----------+--------------+

To set up a rule that lets ICMP traffic through:
# nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range  | Source Group |
+-------------+-----------+---------+-----------+--------------+
| icmp        | -1        | -1      | 0.0.0.0/0 |              |
|             |           |         |           | default      |
|             |           |         |           | default      |
+-------------+-----------+---------+-----------+--------------+
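Similarly, to allow SSH (TCP port 22) from anywhere:
# nova secgroup-add-rule default tcp 22 22 0.0.0.0/0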

Troubleshooting DNS

The SSH server does a reverse DNS lookup on the IP address you are connecting from, so if you can use SSH to log into an instance but it takes on the order of a minute, you might have a DNS issue.
A quick way to check whether DNS is working is to resolve a hostname inside your instance by using the host command. If DNS is working, you should see:
# host garr.it
garr.it mail is handled by 15 lx1.dir.garr.it.
garr.it mail is handled by 20 lx5.dir.garr.it.
Note: if you're running the CirrOS image, it doesn't have the host program installed; in that case you can use ping to try to access a machine by hostname and see whether it resolves.
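If resolution from the instance is slow or failing, it can also help to query the dnsmasq serving that network directly, from its dhcp namespace on the network node (a sketch using the namespace and DHCP address shown in the namespaces section below, and assuming dnsmasq is providing DNS for that subnet):
# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b host garr.it 192.168.4.100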

Dealing with Network Namespaces

Linux network namespaces are a kernel feature the networking service uses to support multiple isolated layer-2 networks with overlapping IP address ranges. Your network nodes run their dhcp-agents and l3-agents in isolated namespaces. NB: network interfaces and traffic on those interfaces are not visible in the default namespace.
L3-agent router namespaces are named qrouter-<router_uuid>, and dhcp-agent namespaces are named qdhcp-<net_uuid>.
To see whether you are using namespaces, run ip netns:
# ip netns
qrouter-80fdf884-37c3-4d33-a340-cd1a09510e59
qdhcp-c3cfc51b-f07c-47ae-bdb4-b029035c08d7
qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b
qrouter-edcb7cb5-37fd-4b31-81c5-cee1bda75369
qdhcp-286f2844-6b76-42e5-9664-ab5123bde2d5
qrouter-3618b020-4f3c-4a72-8c02-e25db0c4769d
qdhcp-c8a29266-e9ac-45e0-be6d-79c32f501194
qrouter-301f264a-8ef1-413d-b252-c0886fc2c815
qrouter-9d378195-ee93-45f0-b27f-2bd48b774f5a
qdhcp-13c334c1-ad39-4c51-b396-953430059b22
This output shows a network node with five networks running dhcp-agents, each network also being served by an l3-agent router.
A list of existing networks and their UUIDs can be obtained by running neutron net-list with administrative credentials.
# neutron net-list
+--------------------------------------+----------------------+-----------------------------------------------------+
| id                                   | name                 | subnets                                             |
+--------------------------------------+----------------------+-----------------------------------------------------+
| 13c334c1-ad39-4c51-b396-953430059b22 | int-net-324          | edd7678a-277c-477e-a5ac-84258e6b1794 192.168.1.0/24 |
| 286f2844-6b76-42e5-9664-ab5123bde2d5 | inaf-net             | dbf5bd19-de67-4b84-a97b-8e322f9343dc 192.168.3.0/24 |
| 99e9c208-b72a-427f-97f6-2443cdd6de9c | ext-net-flat-319     | e0ef8d6f-3fa9-4a05-ae2c-5ec229357f4b 90.147.91.0/24 |
| b4ef2523-bebe-4dbe-b5b7-82983fec6be8 | ext-net-flat-319-bis | 91ccda54-2af1-4a59-bf08-8bb0821c1c08 90.147.91.0/24 |
| c3cfc51b-f07c-47ae-bdb4-b029035c08d7 | int-net-328          | 0d36feb3-4c83-4867-a227-fb972564125c 192.168.8.0/24 |
| c8a29266-e9ac-45e0-be6d-79c32f501194 | ingv-net             | 915f9929-e49b-4a95-a193-c71227ff870d 192.168.2.0/24 |
| f7bff056-1d27-4c12-a917-6ffe2925a44b | enea-net             | d9d1ba30-4a14-4aab-a95f-4ed2c3f895d3 192.168.4.0/24 |

Once you've determined which namespace you need to work in, you can use any of the debugging tools mentioned earlier by prefixing the command with ip netns exec <namespace>.
For example, to see what network interfaces exist in the first qdhcp namespace returned above, do this:
# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
   inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
61: tapd9762af3-4b: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
   link/ether fa:16:3e:b8:2e:0c brd ff:ff:ff:ff:ff:ff
   inet 192.168.4.100/24 brd 192.168.4.255 scope global tapd9762af3-4b
      valid_lft forever preferred_lft forever
   inet6 fe80::f816:3eff:feb8:2e0c/64 scope link
      valid_lft forever preferred_lft forever

    
From this you can see that the DHCP server on that network is using the tapd9762af3-4b device and has the IP address 192.168.4.100.
The usual commands mentioned previously can be run in the same way.
Note: it is also possible to run a shell and have an interactive session within the namespace, e.g.:
# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b bash
root@network:~# ifconfig
lo        Link encap:Local Loopback  
         inet addr:127.0.0.1  Mask:255.0.0.0
         inet6 addr: ::1/128 Scope:Host
         UP LOOPBACK RUNNING  MTU:65536  Metric:1
         RX packets:0 errors:0 dropped:0 overruns:0 frame:0
         TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:0
         RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

tapd9762af3-4b Link encap:Ethernet  HWaddr fa:16:3e:b8:2e:0c  
         inet addr:192.168.4.100  Bcast:192.168.4.255  Mask:255.255.255.0
         inet6 addr: fe80::f816:3eff:feb8:2e0c/64 Scope:Link
         UP BROADCAST RUNNING  MTU:1500  Metric:1
         RX packets:22 errors:0 dropped:0 overruns:0 frame:0
         TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:0
         RX bytes:1788 (1.7 KB)  TX bytes:738 (738.0 B)
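Any of the debugging tools above can also be combined with the namespace prefix; for example, to watch DHCP traffic reaching the dhcp-agent of that network (a sketch using the tap device from the listing above):
# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b tcpdump -ln -i tapd9762af3-4b 'port 67 or port 68'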

Mapping of physnet vs network inside neutron db

Sometimes there is an error, unclear from the logs' point of view, claiming that no suitable resources could be found at VM creation time. It can be related to a problem in the neutron DB. To find out:
  1. Check that the nova services are running on the compute nodes and the controller:
# nova service-list
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host       | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
| 1  | nova-compute     | compute    | nova     | enabled | up    | 2015-02-12T13:52:45.000000 | -               |
| 2  | nova-cert        | controller | internal | enabled | up    | 2015-02-12T13:52:40.000000 | -               |
| 3  | nova-consoleauth | controller | internal | enabled | up    | 2015-02-12T13:52:40.000000 | -               |
| 4  | nova-scheduler   | controller | internal | enabled | up    | 2015-02-12T13:52:45.000000 | -               |
| 5  | nova-conductor   | controller | internal | enabled | up    | 2015-02-12T13:52:44.000000 | -               |
| 6  | nova-compute     | controller | nova     | enabled | up    | 2015-02-12T13:52:46.000000 | -               |
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
  2. Check that there are enough hardware resources:
# nova hypervisor-stats                    
+----------------------+--------+
| Property             | Value  |
+----------------------+--------+
| count                | 2      |
| current_workload     | 0      |
| disk_available_least | 1130   |
| free_disk_gb         | 1274   |
| free_ram_mb          | 367374 |
| local_gb             | 1454   |
| local_gb_used        | 180    |
| memory_mb            | 386830 |
| memory_mb_used       | 19456  |
| running_vms          | 6      |
| vcpus                | 80     |
| vcpus_used           | 9      |
+----------------------+--------+
  3. Check that there is no problem in the mapping between physnets and networks in the neutron DB (here trunknet is our VLAN-tagged physical network):
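The check below is a query against the neutron database; one way to get there from the controller is something like the following (user, host and credentials depend on your installation):
# mysql -u root -p -h controller neutron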
select * from ml2_vlan_allocations;
+------------------+---------+-----------+
| physical_network | vlan_id | allocated |
+------------------+---------+-----------+
| trunknet         |     319 |         0 |
| trunknet         |     320 |         0 |
| trunknet         |     321 |         0 |
| trunknet         |     322 |         0 |
| trunknet         |     323 |         0 |
| trunknet         |     324 |         0 |
| trunknet         |     325 |         0 |
| trunknet         |     326 |         0 |
| trunknet         |     327 |         0 |
| trunknet         |     328 |         0 |
+------------------+---------+-----------+

Debugging with logs - where Are the Logs?

Below is a quick summary table of the service log locations; more details can be found in "OpenStack log locations".
Node type            Service                        Log location
Cloud controller     nova-*                         /var/log/nova
Cloud controller     glance-*                       /var/log/glance
Cloud controller     cinder-*                       /var/log/cinder
Cloud controller     keystone-*                     /var/log/keystone
Cloud controller     neutron-*                      /var/log/neutron
Cloud controller     horizon                        /var/log/apache2/
All nodes            misc (swift, dnsmasq)          /var/log/syslog
Compute nodes        libvirt                        /var/log/libvirt/libvirtd.log
Compute nodes        VM console (boot-up messages)  /var/lib/nova/instances/instance-<instance id>/console.log
Block Storage nodes  cinder-volume                  /var/log/cinder/cinder-volume.log
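A quick way to scan for recent problems across the OpenStack logs on a node (a sketch; adjust the paths to the services actually running there):
# grep -riE 'error|trace' /var/log/nova /var/log/neutron | tail -n 50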

Backup + Recovery (for Real)

This section describes only how to back up the configuration files and databases that the various OpenStack components need to run. It does not describe how to back up objects inside Object Storage or data contained inside Block Storage.

Database Backups

The cloud controller is the MySQL server hosting the databases for nova, glance, cinder, and keystone. To create a database backup:
# mysqldump -u <admin user> -h controller -p --all-databases > openstack.sql
To back up a single database (e.g. nova) you can run:
# mysqldump -u <admin user> -h controller -p nova > nova.sql
You can easily automate this process. The following script dumps the entire MySQL database and deletes any backups older than seven days:
#!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-$(hostname)-$(date +%Y%m%d).sql.gz"
# Dump the entire MySQL database
/usr/bin/mysqldump -u root -p123grid --all-databases | gzip > "$filename"
# Delete backups older than 7 days
find "$backup_dir" -ctime +7 -type f -delete
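The script can then be scheduled, for example nightly via a cron entry like the following (the script path and cron file name are assumptions; adjust them to wherever you saved the script):
# /etc/cron.d/openstack-db-backup (assumed file)
0 2 * * * root /usr/local/bin/mysql_backup.sh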

File System Backups

Compute
The /etc/nova directory on both the cloud controller and the compute nodes should be backed up.
/var/lib/nova is another directory to back up.
Note: there is no point in backing up the /var/lib/nova/instances subdirectory on compute nodes, which contains the KVM images of running instances, unless you need to maintain backup copies of all instances.
Image Catalog and Delivery
/etc/glance and /var/log/glance should be backed up.
/var/lib/glance should also be backed up.
There are two ways to ensure stability with this directory. The first is to make sure this directory is run on a RAID array. If a disk fails, the directory is available. The second way is to use a tool such as rsync to replicate the images to another server:
# rsync -az --progress /var/lib/glance/images backup-server:/var/lib/glance/images/
Identity
/etc/keystone and /var/log/keystone follow the same rules as other components.
/var/lib/keystone should not contain any data being used.

Recovering Backups

Recovering backups is a simple process.
  1. Ensure that the service you are recovering is not running, e.g. in the case of nova:
# stop nova-cert
# stop nova-consoleauth
# stop nova-novncproxy
# stop nova-objectstore
# stop nova-scheduler

  2. Import a previously backed-up database:
# mysql -u root -p --one-database neutron < /root/mysqldump_20150210
  3. Restore the backed-up nova directories:
# mv /etc/nova{,.orig}
# cp -a /path/to/backup/nova /etc/

  4. Start everything back up:
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy nova-scheduler; do start $i; done

Appendix


Debugging Nova

Knowing the flow for provisioning an instance can be useful to identify where in the chain a problem occurs (a quick check command follows the list):
  1. The dashboard or CLI gets the user credentials and authenticates with the Identity Service via REST API.
  2. The Identity Service authenticates the user with the user credentials, and then generates and sends back an auth-token which will be used for sending the request to other components through REST-call.
  3. The dashboard or CLI converts the new instance request specified in launch instance or nova-boot form to a REST API request and sends it to nova-api.
  4. nova-api receives the request and sends a request to the Identity Service for validation of the auth-token and access permission.
  5. The Identity Service validates the token and sends updated authentication headers with roles and permissions.
  6. nova-api checks for conflicts with nova-database.
  7. nova-api creates initial database entry for a new instance.
  8. nova-api sends the rpc.call request to nova-scheduler expecting to get updated instance entry with host ID specified.
  9. nova-scheduler picks up the request from the queue.
  10. nova-scheduler interacts with nova-database to find an appropriate host via filtering and weighing.
  11. nova-scheduler returns the updated instance entry with the appropriate host ID after filtering and weighing.
  12. nova-scheduler sends the rpc.cast request to nova-compute for launching an instance on the appropriate host.
  13. nova-compute picks up the request from the queue.
  14. nova-compute sends the rpc.call request to nova-conductor to fetch the instance information such as host ID and flavor (RAM, CPU, Disk).
  15. nova-conductor picks up the request from the queue.
  16. nova-conductor interacts with nova-database.
  17. nova-conductor returns the instance information.
  18. nova-compute picks up the instance information from the queue.
  19. nova-compute performs the REST call by passing the auth-token to glance-api. Then, nova-compute uses the Image ID to retrieve the Image URI from the Image Service, and loads the image from the image storage.
  20. glance-api validates the auth-token with keystone.
  21. nova-compute gets the image metadata.
  22. nova-compute performs the REST-call by passing the auth-token to Network API to allocate and configure the network so that the instance gets the IP address.
  23. neutron-server validates the auth-token with keystone.
  24. nova-compute retrieves the network info.
  25. nova-compute performs the REST call by passing the auth-token to Volume API to attach volumes to the instance.
  26. cinder-api validates the auth-token with keystone.
  27. nova-compute retrieves the block storage info.
  28. nova-compute generates data for the hypervisor driver and executes the request on the hypervisor (via libvirt or API).
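When something in this chain fails, the error usually ends up in the fault field of the instance (in addition to the scheduler/compute logs listed in the log table above); a quick check:
# nova show <instance-id> | grep -iE 'status|fault'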

Configuration options


All the details about configuration options can be found at http://docs.openstack.org/juno/config-reference/content/index.html


