Recently we had some latency and timeout issues with our multi-DC Cassandra clusters. These only started after we created the second datacenter.
The development team had been tasked with making sure they were using a DC-aware load balancer and local consistency levels, so that they would only talk to the local DC rather than the second DC, which was hosted with a different cloud provider in a different country.
Using OpsCenter I could see that some reads were going directly to the new datacenter, but the development team could not find where they were coming from.
So it was time to use ngrep, which describes itself as “like GNU grep applied to the network layer”. It is perfect for seeing which IPs are connecting to Cassandra on port 9042.
First, you may need to install ngrep on your nodes:
apt install ngrep
Then it is ready to run:
ngrep '' -d any -x dst port 9042 and dst host xxx.xxx.xxx.xxx
The '' (two single quotes, an empty pattern) means grep for anything. If you want to look for a particular string in the network packets you can put it here, but for what we are doing we don’t want to limit the packets by any character string
-d any ensures that all devices are checked
-x dumps the packet contents as hexadecimal as well as ASCII
dst port 9042 limits the packets displayed to the ones with a destination port of 9042
dst host xxx.xxx.xxx.xxx limits the packets to the ones arriving at this node, where xxx.xxx.xxx.xxx is the IP address of the node you are running this on
This will now start displaying all the incoming packets to this node on port 9042. That is the CQL native port used by clients, not the port the nodes use to talk to each other.
This will give you messages like this:
T zzz.zzz.zzz.zzz:42768 -> xxx.xxx.xxx.xxx:9042 [AP]
In this instance zzz.zzz.zzz.zzz is the host sending the CQL request. You will also see the CQL being sent in the message, so you should now have a good idea where the requests are coming from.
There is one complication if you are using DataStax OpsCenter: the DataStax agent uses CQL to talk to Cassandra on the local node, and OpsCenter itself also connects to the cluster. To remove these packets, use the following command:
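On a busy node the output scrolls past quickly, so it can help to count packets per source IP. The following is only a rough sketch: count_cql_sources is a made-up helper name, and it assumes the header lines look exactly like the sample above (a leading T, then source, ->, destination).

```shell
# Hypothetical helper: summarise ngrep header lines by source IP.
# Reads ngrep output on stdin and assumes header lines of the form
#   T zzz.zzz.zzz.zzz:42768 -> xxx.xxx.xxx.xxx:9042 [AP]
# Payload lines from -x do not start with "T" and are ignored.
count_cql_sources() {
    awk '$1 == "T" && $3 == "->" {
             src = $2
             sub(/:[0-9]+$/, "", src)   # strip the ephemeral source port
             count[src]++
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}
```

One way to use it is to save the ngrep output to a file for a minute or so and then run count_cql_sources < capture.txt, which prints one line per source IP with the busiest first.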
ngrep '' -d any -x dst port 9042 and dst host xxx.xxx.xxx.xxx and not src host xxx.xxx.xxx.xxx and not src host yyy.yyy.yyy.yyy
Here not src host xxx.xxx.xxx.xxx drops the traffic from the agent on the local node, and yyy.yyy.yyy.yyy is the IP address of OpsCenter.
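If you have several hosts to ignore, typing the filter by hand gets error-prone. A small sketch of a helper that builds the same filter string is shown here. cql_filter is a hypothetical name, not part of ngrep, and the IP addresses in the usage line are placeholders.

```shell
# Hypothetical helper: build the filter used above.
# First argument is this node's IP; any further arguments are source hosts
# to ignore (for example the node itself for the agent, and OpsCenter).
cql_filter() {
    node_ip=$1
    shift
    # 9042 is the CQL native port, as in the commands above.
    filter="dst port 9042 and dst host $node_ip"
    for ignored in "$@"; do
        filter="$filter and not src host $ignored"
    done
    printf '%s\n' "$filter"
}

# Example usage (needs root, so it is left commented out):
# ngrep '' -d any -x "$(cql_filter 10.0.0.1 10.0.0.1 10.0.0.50)"
```

With only the node IP given, the function prints the basic filter from the first command, so the same helper covers both cases.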