Posted 17 April, 2017
One of those things I never spent much time on was reading about NFS. There is no need to, it just kinda works out of the box. Most of the time, that is. The headache starts when it stops working, or grinds your cluster to a halt. One of the first suggestions you will find is to check /proc/net/rpc/nfsd. But that on its own does not help you much further, so I recently started monitoring its content. Let’s see what is what.
Getting the information is easy :
cat /proc/net/rpc/nfsd
rc 0 225891120 1142621245
fh 94 0 0 0 0
io 2419795101 674116438
th 16 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
ra 32 599168798 0 0 0 0 0 0 0 0 0 7053691
net 1369894951 0 1369046785 21585
rpc 1369784542 0 0 0 0
proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
proc3 22 84 253414826 11612033 88162237 147421518 109847 606883959 183127783 11620524 290559 110168 0 11837554 213037 2453880 4605041 36226 12203037 1305364 112 56 20349016
proc4 2 1 548713
proc4ops 59 0 0 0 104291 0 0 0 0 0 218574 3 0 0 0 0 3 0 0 0 0 0 0 548712 0 1 0 330138 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
It’s a bit much, so let’s go line by line :
The reply cache temporarily stores replies to recently handled requests. When a reply gets lost due to a bad connection (or other reasons) and the client retransmits the request, NFS can answer from the cache instead of executing the operation again (a performance gain), and the client does not see an error. This matters most for non-idempotent operations, those that cannot safely be repeated multiple times; think : removing a file over NFS. While this seems a simple enough cache, the reported data is however a bit complex :
<hits> <misses> <nocache> 0 225891120 1142621245
This is how the reply cache looks on a busy server (over a weekend) :
On most servers I monitor I see almost no misses, only nocache; I have not seen any hits so far. There is an interesting article about the size/use of the reply cache (pdf).
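If you want to graph this yourself, the rc line is easy to split; a minimal Python sketch, using the field order listed above (function name is my own) :

```python
def parse_reply_cache(nfsd_text):
    """Parse the 'rc' line of /proc/net/rpc/nfsd into a dict."""
    for line in nfsd_text.splitlines():
        if line.startswith("rc "):
            hits, misses, nocache = (int(v) for v in line.split()[1:4])
            return {"hits": hits, "misses": misses, "nocache": nocache}
    return None  # no rc line found

# The sample line from the output above:
sample = "rc 0 225891120 1142621245"
stats = parse_reply_cache(sample)
total = sum(stats.values())
print("nocache ratio: %.1f%%" % (100.0 * stats["nocache"] / total))
```

On my servers this ratio sits well above 80%, matching the "only nocache" observation.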
File handles are small pieces of memory that keep track of which file is opened. However, on all the machines I monitor these are all zero except for the first value, stale file handles. A stale file handle happens when a handle references a location that has been recycled. This also occurs when the server loses connection and applications are still using files that are no longer accessible. All other values are not used.
<stale> <total_lookups> <anonlookups> <dirnocache> <nodirnocache> 94 0 0 0 0
I believe a large number of stale file handles is a bad sign; so far I haven’t seen it happen.
The io line is simple : it is the total number of bytes read from disk (raid) and the total number of bytes written to disk since the last restart.
<read> <write> 2419795101 674116438
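Since these counters are cumulative, a monitoring tool has to sample twice and take the delta to get a throughput figure. A small sketch of that idea (function names are my own; note that on some kernels these counters appear to be 32-bit and can wrap, so a delta may come out negative right after a wrap) :

```python
import time

def parse_io(nfsd_text):
    """Return (bytes_read, bytes_written) from the 'io' line."""
    for line in nfsd_text.splitlines():
        if line.startswith("io "):
            r, w = (int(v) for v in line.split()[1:3])
            return r, w
    raise ValueError("no io line found")

def sample_io(path="/proc/net/rpc/nfsd"):
    with open(path) as f:
        return parse_io(f.read())

def throughput(sample_fn=sample_io, interval=5.0):
    """Rough read/write bytes per second over one sampling interval."""
    r1, w1 = sample_fn()
    time.sleep(interval)
    r2, w2 = sample_fn()
    return (r2 - r1) / interval, (w2 - w1) / interval
```

Tools like librenms do essentially this, just with the delta taken between polling runs instead of a sleep.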
Example workload :
This shows information on the threads of the NFS server process; a large part of the values, however, have been removed from the upstream kernel as of 2.6.29.
<threads> <fullcnt> <deprecated_histogram> 16 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
For tuning how many threads are needed you could look at : /proc/fs/nfsd/pool_stats, explained here
The read-ahead cache is used when sequential blocks are expected to be requested : when a certain block is read, the next few blocks have a large chance to be requested as well, so the NFS thread caches these in advance. I found no documentation explicitly saying so, however, so perhaps I’m wrong.
<cachesize> <10%> <..> <100%> <notfound> 32 599168798 0 0 0 0 0 0 0 0 0 7053691
In real-life examples I only see the 10% bucket and notfound increase, with most hits being in the 10% bucket.
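The ra line splits into a cache size, ten depth buckets (how deep into the cache an entry was found, 10% steps) and a notfound counter; a small parsing sketch following that field order (function name is my own) :

```python
def parse_read_ahead(nfsd_text):
    """Split the 'ra' line into cache size, depth histogram, and misses."""
    for line in nfsd_text.splitlines():
        if line.startswith("ra "):
            vals = [int(v) for v in line.split()[1:]]
            return {"cachesize": vals[0],
                    "depth_buckets": vals[1:11],  # 10% .. 100% deep
                    "notfound": vals[11]}
    return None

# The sample line from the output above:
sample = "ra 32 599168798 0 0 0 0 0 0 0 0 0 7053691"
print(parse_read_ahead(sample))
```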
The net line holds statistics on network use.
<netcount> <UDPcount> <TCPcount> <TCPconnect> 1369894951 0 1369046785 21585
These values depend on how you use NFS (UDP or TCP) and how active the NFS server is. The following graph is an example of UDP/TCP plotted over time. Clearly I prefer TCP over UDP connections. (TCP connects should follow the TCP count line.) Seeing a lot of UDP traffic is suspicious : NFSv3 normally defaults to TCP, while NFSv2 did use UDP.
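A quick sanity check on these four counters can be scripted; a sketch using the field order above (names are my own) :

```python
def parse_net(nfsd_text):
    """Parse the 'net' line: total packets, UDP, TCP, TCP connections."""
    for line in nfsd_text.splitlines():
        if line.startswith("net "):
            netcnt, udp, tcp, tcpconn = (int(v) for v in line.split()[1:5])
            return {"netcount": netcnt, "udp": udp, "tcp": tcp,
                    "tcpconnect": tcpconn}
    return None

# The sample line from the output above:
stats = parse_net("net 1369894951 0 1369046785 21585")
if stats["udp"] > stats["tcp"]:
    print("warning: mostly UDP traffic, check client mount options")
```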
NFS under Linux relies on RPC calls to route requests between client and server.
<count> <badcnt> <badfmt> <badauth> <badclnt> 1369784542 0 0 0 0
So far I haven’t seen many bad calls.
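Alerting on these is trivial; a sketch (function name is my own, and as far as I can tell badcnt is simply the sum of the three bad-* counters that follow it) :

```python
def parse_rpc(nfsd_text):
    """Parse the 'rpc' line; badcnt should be badfmt+badauth+badclnt."""
    for line in nfsd_text.splitlines():
        if line.startswith("rpc "):
            keys = ("count", "badcnt", "badfmt", "badauth", "badclnt")
            return dict(zip(keys, (int(v) for v in line.split()[1:6])))
    return None

# The sample line from the output above:
stats = parse_rpc("rpc 1369784542 0 0 0 0")
if stats["badcnt"]:
    print("bad RPC calls seen:", stats)
```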
Proc 2 holds the stats generated for NFS clients using the v2 protocol. This protocol is pretty old (described in 1989) and I could not find a good reason to use it over the v3 or v4 protocols. Except for some tests, I don’t think any current distribution still uses it. In fact, in CentOS 7 the proc2 line has been dropped entirely (and, I assume, support as well).
The operations, however, haven’t changed that much in the v3 protocol, so it’s still worth looking at what they mean.
Proc 3 subsequently holds the stats of the v3 protocol. This protocol was described in 1995 and is the most commonly used, so it’s most likely you’ll find decent statistics here. I only describe values that differ from the v2 protocol.
Working NFS server :
As you see, by far most calls are reads/writes. Remarkably, there are a lot of commits; those are called after async writes … (this graph is not database work)
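To label the 22 counters on the proc3 line you need the operation order; the list below is the order I see in the kernel’s output (double-check it against your own kernel version before trusting graphs built on it) :

```python
NFSV3_OPS = ("null getattr setattr lookup access readlink read write "
             "create mkdir symlink mknod remove rmdir rename link "
             "readdir readdirplus fsstat fsinfo pathconf commit").split()

def parse_proc3(nfsd_text):
    """Map the 22 NFSv3 counters to their operation names."""
    for line in nfsd_text.splitlines():
        if line.startswith("proc3 "):
            vals = [int(v) for v in line.split()[2:]]  # skip 'proc3' and count
            return dict(zip(NFSV3_OPS, vals))
    return None

# The sample line from the output above:
sample = ("proc3 22 84 253414826 11612033 88162237 147421518 109847 "
          "606883959 183127783 11620524 290559 110168 0 11837554 213037 "
          "2453880 4605041 36226 12203037 1305364 112 56 20349016")
top = sorted(parse_proc3(sample).items(), key=lambda kv: kv[1], reverse=True)[:5]
print(top)
```

On this sample, read, getattr and write dominate, exactly as the graph shows.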
Proc 4 is the newest protocol (2000) and one major change is the use of compounds. Instead of sending 3 operations to open a file (lookup, open, read), in v4 this can be done in one compound. This means the line can be shortened to the two types of calls : null (do nothing) and compound (do something). Since null is only used for debugging, a separate proc4ops line is generated.
These are the operations in v4, once the compound is ‘unpacked’. This is where things become really complex : besides v4 there are also the v4.1 and v4.2 extensions to the protocol. You will see the value count differ quite a bit between (current) distributions. I found 40, 59 and 71 values in my environment. For my librenms script I used the 59 count. The 40 values are v4, the 59 include v4.1 and the 71 include v4.2; however, you will need to dig into the kernel code to see whether the value order has changed. I assume it hasn’t.
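Before labelling proc4ops fields, a script should at least check which variant it is dealing with; a minimal sketch of that check, based on my own 40/59/71 observation above (function name is my own, and the mapping to protocol versions is my assumption, not something I verified in the kernel source) :

```python
def proc4ops_count(nfsd_text):
    """Return the advertised field count of the proc4ops line, or None."""
    for line in nfsd_text.splitlines():
        if line.startswith("proc4ops "):
            return int(line.split()[1])
    return None

# Rough guess at what each count means, from the distributions I looked at:
VARIANTS = {40: "v4", 59: "v4 + v4.1", 71: "v4 + v4.1 + v4.2"}

count = proc4ops_count("proc4ops 59 0 0 0 104291 0 0 0 0 0 218574 3")
print(count, VARIANTS.get(count, "unknown layout, check your kernel"))
```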
And that’s it… not that I’m an expert now, but at least I know what is happening, based on the charts generated. A while ago we noticed tons of getattr/readdir(plus) calls, and this way we knew some code was scanning some huge (500m+) directories.