The Relationship Between Linux File Descriptors and Inodes

Last updated: June 25, 2025 (morning)

File descriptors

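On Linux, a file descriptor (fd) is an abstract reference to an open file. In form it is a non-negative integer: when a process opens an existing file or creates a new one, the kernel returns a file descriptor to the process.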

The open system call below returns a file descriptor, which is stored in fd:

fd = open(pathname, flags, mode);

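Some well-known file descriptors:

  • STDIN, file descriptor 0: standard input, which reads from the keyboard by default;

  • STDOUT, file descriptor 1: standard output, which writes to the terminal by default;

  • STDERR, file descriptor 2: standard error, which also writes to the terminal by default.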

Writing a string to standard output

The file descriptor of standard output is 1.

#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    const char *msg = "Hello, world!\n";

    /* Invoke the write system call directly on fd 1 (standard output). */
    syscall(SYS_write, 1, msg, strlen(msg));

    return 0;
}

Compile and run:

$ ./a.out
Hello, world!

inode

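An inode stores a file's metadata, such as its size, owner, permissions, and timestamps. The stat command prints this metadata for a file: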

stat C
File: C
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: 801h/2049d Inode: 786444 Links: 2
Access: (0775/drwxrwxr-x) Uid: ( 1000/ zaralts) Gid: ( 1000/ zaralts)
Access: 2022-04-09 16:44:06.279746438 +0800
Modify: 2022-04-07 22:51:11.594725534 +0800
Change: 2022-04-07 22:51:11.594725534 +0800
Birth: -

The df -i command shows the total number of inodes and the number already in use for each mounted filesystem:

 df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
udev 493942 484 493458 1% /dev
tmpfs 500214 1088 499126 1% /run
/dev/sda1 2621440 227759 2393681 9% /
tmpfs 500214 1 500213 1% /dev/shm
tmpfs 500214 5 500209 1% /run/lock
tmpfs 500214 18 500196 1% /sys/fs/cgroup
tmpfs 500214 25 500189 1% /run/user/121
tmpfs 500214 49 500165 1% /run/user/1000

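ll -i lists every entry in the current directory together with its inode number (ll is a common shell alias for ls with long-listing options).

The inode number is in the first column.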

ll -i
528761 drwxr-xr-x 22 zaralts zaralts 4096 Apr  8 21:17 ./
524289 drwxr-xr-x 3 root root 4096 Apr  2 16:31 ../
524792 -rw------- 1 zaralts zaralts 5848 Apr  8 15:36 .bash_history
528764 -rw-r--r-- 1 zaralts zaralts 220 Apr  2 16:31 .bash_logout
528762 -rw-r--r-- 1 zaralts zaralts 3771 Apr  2 16:31 .bashrc

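  • Every process has its own file descriptor table. Each entry records information about a single file descriptor, including a reference to an open file description.
  • The kernel maintains a system-wide table of all open files (the open file description table). Its entries are called open file descriptions, or open file handles, and each one stores everything related to one open file, such as the current file offset and a reference to the file's i-node object.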

/proc/pid/fd

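This directory contains the file descriptors held by a given process.
In the listing below, the number after socket: is the corresponding inode number.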

/proc/92044/fd$ ll
total 0
dr-x------ 2 zaralts zaralts 0 Apr  9 21:54 ./
dr-xr-xr-x 9 zaralts zaralts 0 Apr  9 21:17 ../
lr-x------ 1 zaralts zaralts 64 Apr  9 21:54 0 -> /dev/null
lrwx------ 1 zaralts zaralts 64 Apr  9 21:54 1 -> 'socket:[39859]'
lrwx------ 1 zaralts zaralts 64 Apr  9 21:54 10 -> 'socket:[477522]'

/proc/sys/fs/inode-state

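  • The first value is the number of inodes the system has allocated (nr_inodes)
  • The second value is the number of free inodes (nr_free_inodes)
  • The third value (preshrink) is nonzero when nr_inodes > inode-max
  • The remaining four are dummy values (always 0)

$ cat /proc/sys/fs/inode-state
139193 25844 0 0 0 0 0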


Everything is a file

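On Linux, everything is a file. Many operations, such as opening a file, creating a process, creating a pipe, establishing a new network connection, and even creating a timer (timerfd_create) or an event object (eventfd), increase nr_files, the kernel's count of allocated file structures.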

The relationship between file descriptors and inodes

file-max

file-max defines the system-wide limit on the number of open files across all processes:

This file defines a system-wide limit on the number of
open files for all processes. System calls that fail when
encountering this limit fail with the error ENFILE. (See
also setrlimit(2), which can be used by a process to set
the per-process limit, RLIMIT_NOFILE, on the number of
files it may open.) If you get lots of error messages in
the kernel log about running out of file handles (open
file descriptions) (look for "VFS: file-max limit <number>
reached"), try increasing this value:

echo 100000 > /proc/sys/fs/file-max

Privileged processes (CAP_SYS_ADMIN) can override the
file-max limit.

Check the current file-max limit:

$ cat /proc/sys/fs/file-max
394568

file-nr

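/proc/sys/fs/file-nr is a read-only file. Its three values are the number of allocated file handles (open file descriptions, not per-process file descriptors), the number of allocated but unused file handles, and the maximum number of file handles.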

$ cat /proc/sys/fs/file-nr
11616 0 394568

The Linux manual page describes this file as follows:

/proc/sys/fs/file-nr
This (read-only) file contains three numbers: the number
of allocated file handles (i.e., the number of open file
descriptions; see open(2)); the number of free file
handles; and the maximum number of file handles (i.e., the
same value as /proc/sys/fs/file-max). If the number of
allocated file handles is close to the maximum, you should
consider increasing the maximum. Before Linux 2.6, the
kernel allocated file handles dynamically, but it didn't
free them again. Instead the free file handles were kept
in a list for reallocation; the "free file handles" value
indicates the size of that list. A large number of free
file handles indicates that there was a past peak in the
usage of open file handles. Since Linux 2.6, the kernel
does deallocate freed file handles, and the "free file
handles" value is always zero.

nr_open

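The ceiling on the number of files a single process can open, that is, the upper bound to which RLIMIT_NOFILE can be raised: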

/proc/sys/fs/nr_open (since Linux 2.6.25)
This file imposes a ceiling on the value to which the
RLIMIT_NOFILE resource limit can be raised (see
getrlimit(2)). This ceiling is enforced for both
unprivileged and privileged process. The default value in
this file is 1048576. (Before Linux 2.6.25, the ceiling
for RLIMIT_NOFILE was hard-coded to the same value.)

Check the current value:

$ cat /proc/sys/fs/nr_open
1048576

Call chain analysis

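  • get_nr_files returns the current value of nr_files
  • Before allocating a file structure, alloc_empty_file checks whether nr_files has reached the maximum, files_stat.max_files; processes with CAP_SYS_ADMIN are exempt from this check
  • If the limit has been reached, the allocation fails with ENFILE and the kernel logs: "VFS: file-max limit %lu reached\n"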

The get_nr_files function

static long get_nr_files(void)
{
    return percpu_counter_read_positive(&nr_files);
}

The alloc_empty_file function

struct file *alloc_empty_file(int flags, const struct cred *cred)
{
    static long old_max;
    struct file *f;

    /*
     * Privileged users can go above max_files
     */
    if (get_nr_files() >= files_stat.max_files && !capable(CAP_SYS_ADMIN)) {
        /*
         * percpu_counters are inaccurate. Do an expensive check before
         * we go and fail.
         */
        if (percpu_counter_sum_positive(&nr_files) >= files_stat.max_files)
            goto over;
    }

    f = __alloc_file(flags, cred);
    if (!IS_ERR(f))
        percpu_counter_inc(&nr_files);

    return f;

over:
    /* Ran out of filps - report that */
    if (get_nr_files() > old_max) {
        pr_info("VFS: file-max limit %lu reached\n", get_max_files());
        old_max = get_nr_files();
    }
    return ERR_PTR(-ENFILE);
}

Original post: http://gls.show/p/f1bb17a0/
Author: 郭佳明
Published: January 1, 1970