Almost every Linux user relies on the following commands in day-to-day system administration.
- ps
- uptime
- free
- uname
These are only a few from a long list. The amount of information these commands provide is remarkable, and it reflects the actual state of the system. But where does this information come from? Another striking fact is that these commands report real-time data, which means that each time you run one, the output is likely to be slightly different.
This means they fetch their information from a source that is highly dynamic, and that is also authoritative enough to supply real, up-to-date data on every call.
When you talk about computers, you inevitably talk about operating systems. And when you talk about an operating system, you are talking about the kernel. Linus Torvalds said in one of the documentary films about Linux:
To kind of explain what Linux is, you have to explain what an operating system is, and the thing about operating system is that you are never ever supposed to see it, because nobody really uses an operating system. People use programs on their computer. And the only mission in life, of an operating system is to help those programs run. So an operating system never does anything on its own. It's only waiting for the programs to ask for certain resources or ask for a certain file on the disk, or ask for the programs to connect to outside world, then the operating system kicks in and tries to make it easy for people to write programs.
-Linus Torvalds
Creator, Linux Kernel
The explanation given by Linus Torvalds is simple and to the point (who can explain what Linux is better than him?). Because the kernel is what manages the system and makes resources available to programs, only the kernel knows the current resource utilization and the rest of the system's current state.
So when a program like ps or top needs details about the current state of the system, it has to ask the kernel, because the kernel is the authoritative source.
The proc file system was created to make this easier. It is a mechanism for accessing the underlying kernel data structures, and it also lets you modify some kernel parameters at run time.
In other words, /proc is how the kernel passes information to processes. It is an interface through which a user can interact with the kernel and get information about the processes running on the system. Remember that the /proc file system also lets you change some parameters on the fly (on the running system, with immediate effect).
The /proc file system is well documented in the proc man page. You can read it by running the command below on a Linux system.
    # man proc
The first line you will encounter in that man page is:
proc - process information pseudo-file system
The man page goes on to describe it: "The proc file system is a pseudo-file system which is used as an interface to kernel data structures. It is commonly mounted at /proc. Most of it is read-only, but some files allow kernel variables to be changed."
Now that we have some idea of what the /proc file system is, let's look at some interesting facts about it.
- The proc file system is managed entirely by the kernel and is not stored on disk like other file systems.
- Its contents live in memory (RAM): they are generated from kernel data structures rather than read from a disk.
- Most of the files in /proc are 0 bytes in size (this is quite interesting).
The most interesting part is that almost all files inside the /proc file system are 0 bytes in size. What confuses many users is that, although their size is 0, they still contain data when viewed. How is this possible?
Linux can handle many different types of file systems because of something called VFS (Virtual File System). It acts as a single interface for reading from and writing to different types of file systems. I have described VFS briefly in my article about NFS.
VFS gives the Linux kernel one simple interface to the different file systems underneath it.
The /proc file system is also accessed through VFS. Because of this, when a user reads a file inside /proc, the proc file system generates the content of that file from information held in the kernel. This is why most entries show a size of 0 bytes when you list the /proc directory, yet are populated dynamically when you access them.
Let's check this in practice. Linux has a command called file, which determines the type of a file by examining its contents. It reports "empty" if the file is empty.
Let's try it on /proc/meminfo, the file used to fetch current memory information from the kernel.
    root@workstation:~# file /proc/meminfo
    /proc/meminfo: empty
See that? The output says the file is empty. But let's try to read the file with a tool like cat, less or vi.
    root@workstation:~# cat /proc/meminfo
    MemTotal:         502712 kB
    MemFree:           47672 kB
    Buffers:           16136 kB
    Cached:           110768 kB
    SwapCached:            0 kB
    Active:           357388 kB
    Inactive:          48264 kB
    Active(anon):     278804 kB
    Inactive(anon):      332 kB
Note: I have not shown the full output, as it is quite long.
So when you access the content, the current values are filled in by the kernel. This is why files inside /proc give you the most current and accurate status of the system.
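To see the mismatch between the reported size and the real content for yourself, here is a small sketch. It assumes a stat command that supports the -c option (GNU coreutils or BusyBox); the variable names are only for illustration.

```shell
f=/proc/meminfo

# Size according to the file's metadata (what ls -l and file look at).
reported=$(stat -c %s "$f")

# Bytes actually produced when the file is read. Piping through cat
# forces a real read instead of trusting the metadata.
actual=$(cat "$f" | wc -c)

echo "reported size: $reported bytes"
echo "bytes read:    $actual bytes"
```

The first number comes from the metadata, the second from actually reading the file, which is the point at which the kernel generates the content.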
We saw earlier that the proc man page defines proc as a "process information pseudo-file system". This is because it contains details about all currently running processes. Let's look at the directory listing of /proc.
    root@workstation:/proc# ll
    total 4
    dr-xr-xr-x 110 root     root        0 Dec  8 17:06 ./
    drwxr-xr-x  23 root     root     4096 Nov 19 04:55 ../
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 10/
    dr-xr-xr-x   9 whoopsie whoopsie    0 Dec  8 17:06 1020/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 11/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1122/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1123/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1127/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1128/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1129/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1153/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 12/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1233/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 1290/
    dr-xr-xr-x   9 root     root        0 Dec  8 17:06 13/
    dr-xr-xr-x   9 root     ubuntu      0 Dec  8 17:07 1393/
As the listing above shows, /proc contains a lot of numbered directories. Each one is named after the PID of a running process. For example, if an Apache process is running with PID 2334, there will be a directory /proc/2334.
The most interesting thing about these directories is that they appear and disappear dynamically as processes start and stop. Each directory named after a PID contains detailed information about the current state of that process.
Let's see what a PID directory in /proc contains. Shown below is the content of the directory for one PID (an nginx worker process).
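The appearing and disappearing can be watched with a throwaway process. This is only a sketch: the sleep command and the variable names are chosen for illustration.

```shell
# Start a short-lived background process and remember its PID.
sleep 30 &
pid=$!

# While the process is alive, the kernel exposes /proc/<pid>.
if [ -d "/proc/$pid" ]; then during=present; else during=absent; fi
echo "while running: /proc/$pid is $during"

# Stop the process and reap it, so that the kernel can drop its entry.
kill "$pid"
wait "$pid" 2>/dev/null || true

if [ -d "/proc/$pid" ]; then after=present; else after=absent; fi
echo "after exit:    /proc/$pid is $after"
```

Nothing creates or deletes the directory explicitly; it simply mirrors the kernel's process table.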
    root@workstation:/proc/27140# ls
    attr       cgroup      comm             cwd      fd      latency   map_files  mountinfo   net        oom_adj        pagemap      sched      smaps  statm    task
    autogroup  clear_refs  coredump_filter  environ  fdinfo  limits    maps       mounts      ns         oom_score      personality  schedstat  stack  status   wchan
    auxv       cmdline     cpuset           exe      io      loginuid  mem        mountstats  numa_maps  oom_score_adj  root         sessionid  stat   syscall
Let's now discuss what some of the important files inside that PID directory are for. Discussing all of them is beyond the scope of this article, and I would need to understand them better myself before writing about them :)
- /proc/<pid>/exe: This entry points to the executable that the process is running. It is a symbolic link to the location of the binary. In our case this is /usr/sbin/nginx, since we are looking at the directory of an nginx worker process. The command below confirms this.
    root@workstation:/proc/27140# ll exe
    lrwxrwxrwx 1 www-data www-data 0 Dec 10 01:17 exe -> /usr/sbin/nginx*
- /proc/<pid>/cmdline: This contains the command line that was used to start the process. It is what the ps command shows in its last column.
    root@workstation:/proc/27140# cat cmdline
    nginx: master process /usr/sbin/nginx
- /proc/<pid>/fd/: This directory contains one entry for each file descriptor the process has open. Since we are looking at an nginx process, it should have its log files, sockets (basically connections) and so on. Let's look at the contents of the fd directory.
    root@workstation:/proc/27140/fd# ll
    lrwx------ 1 root root 64 Dec 10 01:18 10 -> socket:[41479]
    lrwx------ 1 root root 64 Dec 10 01:18 11 -> socket:[41480]
    lrwx------ 1 root root 64 Dec 10 01:18 12 -> socket:[41481]
    lrwx------ 1 root root 64 Dec 10 01:18 13 -> socket:[41482]
    lrwx------ 1 root root 64 Dec 10 01:18 14 -> socket:[41483]
    l-wx------ 1 root root 64 Dec 10 01:18 2 -> /var/log/nginx/error.log
    lr-x------ 1 root root 64 Dec 10 01:18 3 -> /proc/27137/auxv
    lrwx------ 1 root root 64 Dec 10 01:18 4 -> socket:[41476]
    l-wx------ 1 root root 64 Dec 10 01:18 5 -> /var/log/nginx/access.log
- /proc/<pid>/maps: This file lists the memory regions mapped into the process's address space, along with the file behind each file-backed region. Most of the entries are usually the executable itself and the shared libraries (.so files) used by the process. Let's see what our nginx process has in its maps file.
    root@workstation:/proc/27140# cat maps
    00400000-004ba000 r-xp 00000000 08:01 25127    /usr/sbin/nginx
    006b9000-006ba000 r--p 000b9000 08:01 25127    /usr/sbin/nginx
    006ba000-006ce000 rw-p 000ba000 08:01 25127    /usr/sbin/nginx
    006ce000-006dd000 rw-p 00000000 00:00 0
    00e78000-00ed3000 rw-p 00000000 00:00 0        [heap]
    7f46e882b000-7f46e8837000 r-xp 00000000 08:01 21884    /lib/x86_64-linux-gnu/libnss_files-2.15.so
    7f46e8837000-7f46e8a36000 ---p 0000c000 08:01 21884    /lib/x86_64-linux-gnu/libnss_files-2.15.so
    7f46e8a36000-7f46e8a37000 r--p 0000b000 08:01 21884    /lib/x86_64-linux-gnu/libnss_files-2.15.so
    7f46e8a37000-7f46e8a38000 rw-p 0000c000 08:01 21884    /lib/x86_64-linux-gnu/libnss_files-2.15.so
- /proc/<pid>/status: This file holds a readable summary of the process, including its memory usage. It also contains the PID and the state of the process (sleeping, running and so on), as well as the parent process ID, group IDs, user IDs, etc.
    root@workstation:/proc/27140# cat status
    Name:       nginx
    State:      S (sleeping)
    Tgid:       27138
    Pid:        27138
    PPid:       1
    TracerPid:  0
    Uid:        0  0  0  0
    Gid:        0  0  0  0
    FDSize:     64
    Groups:     0
    VmPeak:     76848 kB
Note: I have not shown the full output of most of the commands shown above.
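The per-process files described above can be read for any process you have permission to inspect. The sketch below uses the current shell's own PID ($$) so that it runs without root; the variable names are only illustrative.

```shell
pid=$$   # PID of the current shell; substitute any PID you may inspect

# exe is a symbolic link to the binary the process is running.
exe=$(readlink "/proc/$pid/exe")
echo "executable: $exe"

# cmdline separates arguments with NUL bytes, so turn them into spaces.
cmd=$(tr '\0' ' ' < "/proc/$pid/cmdline")
echo "command line: $cmd"

# Each entry in fd/ is a symbolic link named after an open descriptor.
fd_count=$(ls "/proc/$pid/fd" | wc -l)
echo "open file descriptors: $fd_count"

# status is a list of "Key: value" lines, so awk can pick out fields.
name=$(awk '/^Name:/ {print $2}' "/proc/$pid/status")
state=$(awk '/^State:/ {print $2}' "/proc/$pid/status")
ppid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
echo "name=$name state=$state parent=$ppid"
```

Because cmdline is NUL-separated, a plain cat prints the arguments run together, which is why the tr step is there.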
Let's now look at some other important files in /proc, outside the per-PID directories. The files discussed below give detailed information about the current state of the system. Most commands that report process status, uptime, load average and so on get their information by reading files inside /proc. How can you confirm this?
You can confirm it with strace, a tool normally used for debugging programs. strace prints the system calls a program makes, which includes every file it opens. The output is rather untidy because it shows a lot of detail. Try the command below and you will see that commands like uptime, ps, free and top read files inside /proc.
    # strace ps
Commands like uptime also show you the current load average. This detail is fetched from the file below.
    root@workstation# cat /proc/loadavg
    0.00 0.01 0.05 1/115 27562
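The five fields of /proc/loadavg are easy to split in the shell. The field meanings in the comments follow the proc man page; the variable names are my own.

```shell
# Fields: 1-, 5- and 15-minute load averages, then
# "runnable/total" scheduling entities, then the most recently created PID.
read -r load1 load5 load15 entities last_pid < /proc/loadavg

running=${entities%/*}   # part before the slash
total=${entities#*/}     # part after the slash

echo "load averages: $load1 (1 min), $load5 (5 min), $load15 (15 min)"
echo "runnable entities: $running of $total, last PID: $last_pid"
```

The three load averages are the same numbers that uptime prints at the end of its output line.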
- /proc/meminfo: This file contains the current memory details of the running system. It shows memory usage in full: how much is used for caching and how much is in real use, total memory, currently free memory and so on. This is where the free command gets the memory details it shows.
    root@workstation:/proc# cat meminfo
    MemTotal:         502712 kB
    MemFree:          197672 kB
    Buffers:           10796 kB
    Cached:           141996 kB
    SwapCached:            0 kB
    Active:           168432 kB
    Inactive:          75304 kB
    Active(anon):      90988 kB
    Inactive(anon):      280 kB
    Active(file):      77444 kB
    Inactive(file):    75024 kB
    Unevictable:           0 kB
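As a rough illustration of how a tool like free can derive its numbers, the sketch below pulls a few fields out of /proc/meminfo with awk. It is not how free is actually implemented, only the same idea in a few lines.

```shell
# Print a few fields converted from kB to MiB.
awk '/^(MemTotal|MemFree|Buffers|Cached):/ {
    printf "%-10s %8.1f MiB\n", $1, $2 / 1024
}' /proc/meminfo

# Pick out single values for further arithmetic.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)

echo "total minus free: $(( (mem_total_kb - mem_free_kb) / 1024 )) MiB"
```

Note that "total minus free" still includes buffers and cache, which is the distinction between cached and real use mentioned above.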
- /proc/version: This file contains the Linux kernel version along with details of the compiler and the build, which here also identify the distribution.
    root@workstation:/proc# cat /proc/version
    Linux version 3.8.0-29-generic (buildd@panlong) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013
- /proc/diskstats: This file contains statistics for the disk devices: reads completed, writes completed, sectors read, sectors written, time spent reading and so on. This is the file from which commands like iostat fetch their information. Each line in the output shown here has 14 fields.
    root@workstation:/proc# cat diskstats
    8 0 sda 48088 175 2460234 433304 43196 36838 2571832 1694864 0 497496 2127260
    8 1 sda1 47924 175 2458922 432540 36743 36838 2571832 1688772 0 491412 2120404
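A few of those fields can be pulled out with awk. The sketch below assumes the column layout of the output shown above (device name in column 3, followed by the counters); newer kernels append extra columns at the end, which does not affect the ones used here. The function name disk_summary is made up for this example.

```shell
# Columns used (1-based): 3 = device name, 4 = reads completed,
# 6 = sectors read, 8 = writes completed, 10 = sectors written.
disk_summary() {
    awk '{ printf "%s reads=%s sectors_read=%s writes=%s sectors_written=%s\n", $3, $4, $6, $8, $10 }' "$@"
}

# Summarize the live counters when the file is available.
if [ -r /proc/diskstats ]; then
    disk_summary /proc/diskstats
fi
```

Tools like iostat read these counters twice and work with the difference, since the values only ever count up from boot.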
- /proc/modules: This file contains the list of kernel modules currently loaded on your system. If you run strace on lsmod (which lists the modules currently loaded in the kernel), you will see that it reads /proc/modules to get the details.
- /proc/cpuinfo: This file contains complete details about the processor, such as the processor flags, processor speed and processor model name.
- /proc/filesystems: This file lists the file systems currently supported by the kernel. The list itself includes the proc file system, which appears among the special file systems marked nodev (not backed by a block device).
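These three files are plain text, so ordinary tools are enough to query them. The sketch below is only an illustration; the exact layout of /proc/cpuinfo differs between architectures, so treat the processor count as an assumption that holds on common x86 and ARM systems.

```shell
# One "processor" line per logical CPU on most architectures.
cpus=$(grep -c '^processor' /proc/cpuinfo || true)
echo "logical processors: $cpus"

# File systems that need no block device carry "nodev" in the first column.
nodev_fs=$(awk '$1 == "nodev" {print $2}' /proc/filesystems)
echo "nodev file systems:" $nodev_fs

# /proc/modules has one line per loaded module (what lsmod formats).
if [ -r /proc/modules ]; then
    echo "loaded modules: $(wc -l < /proc/modules)"
fi
```

You should find proc itself in the nodev list.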
I would recommend exploring more files in the /proc directory, as explaining all of them is beyond my reach and the scope of this article. If you find anything interesting, please share it with us in the comments.
As discussed earlier, some files inside /proc can be modified (files with write permission). This is a mechanism the Linux kernel provides so that users can change system behavior and kernel parameters at run time.
Extreme care must be taken while modifying files in the /proc directory. Do this only if you know what you are doing, otherwise the system can become unstable. Most of these files accept either boolean values or a predefined set of values, so don't write values they do not accept.
The directory /proc/sys contains the files that are writable at run time. You change one by redirecting the required value into the file. Let's look at some of the files inside /proc/sys/kernel. This directory contains the hostname, domain name, etc.
- /proc/sys/kernel/hostname: Host name of the system
- /proc/sys/kernel/domainname: Domain to which the host belongs
When you set a host name with the hostname command, the kernel's value changes and the new name shows up in /proc/sys/kernel/hostname. You can also write a new name directly to this file. Please remember that setting the hostname this way does not make it permanent.
Anything you change on the fly by writing to files under /proc/sys is temporary, which means the change is lost at shutdown. To make it permanent, you need to set the corresponding sysctl parameter (for example in /etc/sysctl.conf) or use the location recommended by your distribution. For example, on Red Hat style systems you edit the /etc/sysconfig/network file to make your hostname permanent.
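The sysctl parameter names correspond directly to paths under /proc/sys: the dots become slashes. The helper below, sysctl_path, is a made-up name for this sketch, and it deliberately ignores the corner case of names that contain literal dots (such as some network interface names).

```shell
# Map a sysctl name like net.ipv4.ip_forward to its file under /proc/sys.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

sysctl_path net.ipv4.ip_forward

# Reading the file gives the same value that sysctl reports for the name.
f=$(sysctl_path kernel.hostname)
if [ -r "$f" ]; then
    echo "kernel.hostname = $(cat "$f")"
fi
```

This is why a line such as net.ipv4.ip_forward = 1 in /etc/sysctl.conf has the same effect at boot as writing 1 to the matching /proc/sys file by hand.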
Another writable directory is /proc/sys/net/ipv4/. It contains the files that can be used to change the networking behavior of your Linux system. Most system administrators will be familiar with this directory, as it contains the ip_forward file, which is modified to enable IP forwarding (to make Linux act as a router).
Some interesting files inside this directory are mentioned below.
- /proc/sys/net/ipv4/ip_forward: This accepts two values, 1 or 0. 1 means IP forwarding is enabled, and 0 means it is disabled. To enable IP forwarding on the fly, run the command below as root.
    # echo 1 > /proc/sys/net/ipv4/ip_forward
- /proc/sys/net/ipv4/ipfrag_high_thresh: The maximum amount of memory, in bytes, that may be used to reassemble IP fragments. Data that is too large to send in one packet is broken into fragments, which the receiver has to reassemble. Once this limit is reached, incoming fragments are dropped, so the sending machine has to resend them.
- /proc/sys/net/ipv4/icmp_echo_ignore_all: When set to 1, the host ignores all ping (ICMP echo) requests. It takes either 0 (disabled) or 1 (enabled).
- /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts: When set to 1, the host ignores ICMP echo requests sent to broadcast addresses.
- /proc/sys/net/ipv4/ip_default_ttl: This file can be modified to change the default TTL value. TTL (time to live) is the maximum number of hops a packet may pass through between the source and the destination.
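Before changing any of these, it helps to look at the current values. Reading needs no special privileges. The function name show_ipv4_tunables is invented for this sketch, and some of the files may be missing inside restricted containers, which the code allows for.

```shell
show_ipv4_tunables() {
    for name in ip_forward ipfrag_high_thresh icmp_echo_ignore_all \
                icmp_echo_ignore_broadcasts ip_default_ttl; do
        f=/proc/sys/net/ipv4/$name
        if [ -r "$f" ]; then
            printf '%-28s %s\n' "$name" "$(cat "$f")"
        else
            printf '%-28s (not available here)\n' "$name"
        fi
    done
}

show_ipv4_tunables

# Changing a value works the same way as for ip_forward and needs root:
#   echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_all
```

As with everything under /proc/sys, a value written this way lasts only until the next reboot.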
If you are interested in how TTL works, I would recommend reading my article on traceroute and how it works.
Read: How does traceroute work
Some other interesting files inside /proc/sys/net/ipv4/, are mentioned below.
- tcp_rmem
- tcp_window_scaling
- tcp_wmem
They are used to tune TCP performance in Linux. I would recommend reading about each of them to understand their use.
As I said before, it is quite difficult to explain all the files in /proc. So explore more files inside this directory to understand their use. Also, please don't forget to share anything interesting you find through the comments (this will help me as well as our readers learn more about them).
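For orientation, tcp_rmem and tcp_wmem each hold three numbers: the minimum, default and maximum buffer size in bytes. The sketch below just prints them with labels. The function name show_tcp_buffers and its optional directory argument are conveniences made up for this example.

```shell
# Print the three values of tcp_rmem and tcp_wmem with labels.
# An alternative directory can be passed as the first argument.
show_tcp_buffers() {
    dir=${1:-/proc/sys/net/ipv4}
    for name in tcp_rmem tcp_wmem; do
        f=$dir/$name
        [ -r "$f" ] || continue
        read -r min default max < "$f"
        echo "$name: min=$min default=$default max=$max bytes"
    done
}

show_tcp_buffers
```

tcp_window_scaling, in contrast, is a simple on/off switch holding 0 or 1.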
Another interesting entry inside /proc is self (/proc/self). It is always a symbolic link to the directory of the current process. This means that if you run ls -l on /proc/self, the target of the symbolic link is different each time. The link points to the process directory of whichever process looks at /proc/self (in our case, the ls -l process).
Each time you run ls -l on /proc/self, ls is a new process with a new PID, so it shows a different link target.
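This is easy to check with readlink, which prints the target of a symbolic link. Every readlink invocation below is a separate process, so each one sees its own PID. The variable names are only for illustration.

```shell
# Each command substitution starts a new readlink process,
# and /proc/self resolves to that process's own PID.
first=$(readlink /proc/self)
second=$(readlink /proc/self)

echo "first look:  /proc/self -> $first"
echo "second look: /proc/self -> $second"

# A process can also read its own files through the link.
awk '/^Name:/ {print "the reader was:", $2}' /proc/self/status
```

The last line reports the name of the awk binary itself, because awk is the process that opened /proc/self/status.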
Some points to remember about the /proc file system in Linux
- proc is a special file system and is not associated with any hard drive device.
- Files inside /proc are not real files. They act as an interface to kernel data structures and process information. Because they are not real files, properties like file size do not apply to them (hence they are shown as zero bytes).
- The contents of /proc files are generated dynamically when requested. Because of this, the data fetched from /proc is the most recent data the kernel has.
- Certain files inside /proc can be modified to change the behaviour of a running kernel, for example the files under /proc/sys/.
- Most system monitoring commands, like ps, top and free, read files inside /proc to fetch their information.
- Complete details of a running process can be fetched from its /proc/<pid> directory.
- A special entry called /proc/self can be used by a program to find details about its own process.
As I always say, please let me know through the comments if you find any mistake in this article, because correcting it will help me as well as other readers.