netstat -s: Meanings of network statistics

1. Find the meanings

It's not easy to find a document that explains the exact meaning of each statistic in the output of netstat -s. One way to find out is to read the netstat source code, specifically the statistics part of it (statistics.c in net-tools). Take "incoming packets delivered" in the "Ip" section of the command output as an example. On the surface, this sounds contradictory: is it a counter for incoming packets, or a counter for something delivered, i.e. sent or outgoing? To know exactly what it is, search the source code for that string and notice

{"InDelivers", N_("%llu incoming packets delivered"), number},
Then look up InDelivers in RFC 2011, where we see
The total number of input datagrams successfully delivered to IP user-protocols (including ICMP).
as the description of ipInDelivers. So this counter is for incoming, not outgoing, packets. But what does "delivered" mean here? What are "IP user-protocols"? In RFC 986 we see
The Current IP addresses and IP user protocol numbers can be found in [4].
in which Reference [4] points to RFC 960, and there we can confirm that the so-called "IP user-protocols" are the protocols carried on top of IP, each identified by an IP protocol number. These are mainly transport layer (layer 4 in the OSI model) protocols such as TCP and UDP, plus ICMP, which rides on IP in the same way although it is usually classed with the network layer. So "delivered" simply means successfully passed up from the IP or network layer (layer 3) to the protocol above it.
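
On Linux, the counter names used in the source code (InDelivers and so on) also appear in /proc/net/snmp, so the raw values can be read by name, which avoids guessing from the human-readable labels. The following is a minimal Python sketch, not part of netstat itself. The function name parse_proc_net_snmp is made up for illustration, and the SAMPLE text is an abridged, illustrative excerpt: its numbers are taken from the output shown in this note, except DefaultTTL, which is a placeholder.

```python
def parse_proc_net_snmp(text):
    """Parse /proc/net/snmp-style text into {section: {counter: value}}."""
    stats = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Lines come in pairs: a line of counter names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        section, names = header.split(":", 1)
        value_section, numbers = values.split(":", 1)
        if section != value_section:
            raise ValueError(f"mismatched sections: {section!r} vs {value_section!r}")
        stats[section] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return stats


# Abridged, illustrative sample (DefaultTTL is a placeholder value).
SAMPLE = """\
Ip: Forwarding DefaultTTL InReceives InDelivers OutRequests
Ip: 2 64 7522922517 5797033274 5215103737
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 3661310826 2873 0 3620677413
"""

stats = parse_proc_net_snmp(SAMPLE)
print(stats["Ip"]["InDelivers"])  # the counter behind "incoming packets delivered"
```

On a real Linux host, the same function could be fed the contents of the file, e.g. parse_proc_net_snmp(open("/proc/net/snmp").read()).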

2. What to focus on

If you suspect that a network is not highly reliable, which statistics in the output of netstat -s should you focus on? The following, taking IP and UDP stats as examples, is a starting point.

Ip:
Forwarding: 2
7522922517 total packets received
0 forwarded
0 incoming packets discarded
5797033274 incoming packets delivered
5215103737 requests sent out
3810 outgoing packets dropped <- 3810/5215103737=.00000073, relative to "requests sent out", this ratio is OK
60 dropped because of missing route <- 60/5215103737=.00000001, low, OK
272481 fragments dropped after timeout <- 272481/7522922517=.000036, relative to "total packets received", this ratio is borderline
2030052935 reassemblies required
304163692 packets reassembled ok
311741515 packet reassemblies failed <- 311741515/304163692=1.025, relative to "packets reassembled ok", this ratio is very high
334447065 fragments received ok
75193 fragments failed <- 75193/334447065=.000225, relative to "fragments received ok", this ratio is borderline or may be a little high
2332099696 fragments created

Udp:
3661310826 packets received
2873 packets to unknown port received <- 2873/3661310826=.0000008, relative to "packets received", this is OK
0 packet receive errors
3620677413 packets sent
0 receive buffer errors
3810 send buffer errors <- 3810/3620677413=.000001, relative to "packets sent", this is borderline
Two things matter here. First, find the correct base number to calculate each ratio. For example, "fragments dropped after timeout" should be compared to "total packets received" because both are counters for incoming packets, even though the former has no word implying so; to confirm, check the source code and the RFCs. ("Packet reassemblies failed" is compared to "packets reassembled ok" here even though one counts reassembly operations and the other counts packets. As long as you use the same ratio across different servers, the comparison still makes sense.)

Second, find a known-good server in a similar environment and compare the same ratios on it. My judgment of "OK" or "borderline" may not be appropriate for your servers.
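
The ratio arithmetic above can be reproduced with a short script, which also makes it easier to run the same checks on a second server. This is only a sketch in Python: the counter values are copied from the sample output in this note, the error/base pairings are the ones chosen above, and the names COUNTERS, CHECKS and error_ratios are made up for illustration.

```python
# Counters copied from the sample output above, keyed by (section, label).
COUNTERS = {
    ("Ip", "total packets received"): 7522922517,
    ("Ip", "requests sent out"): 5215103737,
    ("Ip", "outgoing packets dropped"): 3810,
    ("Ip", "dropped because of missing route"): 60,
    ("Ip", "fragments dropped after timeout"): 272481,
    ("Ip", "packets reassembled ok"): 304163692,
    ("Ip", "packet reassemblies failed"): 311741515,
    ("Ip", "fragments received ok"): 334447065,
    ("Ip", "fragments failed"): 75193,
    ("Udp", "packets received"): 3661310826,
    ("Udp", "packets to unknown port received"): 2873,
    ("Udp", "packets sent"): 3620677413,
    ("Udp", "send buffer errors"): 3810,
}

# (section, error counter, base counter), following the pairings in the text.
CHECKS = [
    ("Ip", "outgoing packets dropped", "requests sent out"),
    ("Ip", "dropped because of missing route", "requests sent out"),
    ("Ip", "fragments dropped after timeout", "total packets received"),
    ("Ip", "packet reassemblies failed", "packets reassembled ok"),
    ("Ip", "fragments failed", "fragments received ok"),
    ("Udp", "packets to unknown port received", "packets received"),
    ("Udp", "send buffer errors", "packets sent"),
]


def error_ratios(counters, checks):
    """Return {(section, error label): error count / base count}."""
    ratios = {}
    for section, error, base in checks:
        denominator = counters[(section, base)]
        # Skip a check whose base is zero instead of dividing by zero.
        if denominator == 0:
            continue
        ratios[(section, error)] = counters[(section, error)] / denominator
    return ratios


RATIOS = error_ratios(COUNTERS, CHECKS)
for (section, error), value in RATIOS.items():
    print(f"{section:4} {error:34} {value:.8f}")
```

Filling COUNTERS from a known-good server and printing the two sets of ratios side by side gives the comparison suggested above.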

Ref:
https://github.com/ecki/net-tools/blob/master/statistics.c
https://datatracker.ietf.org/doc/html/rfc2011#page-3
https://blog.packagecloud.io/monitoring-tuning-linux-networking-stack-sending-data/

2023-10