System calls for working with files and directories in Linux
The following article presents the most common system calls used to perform input/output operations on files, as well as the operations for handling files and directories in the Linux operating system.
The operating system internally assigns a descriptor, or identifier, to each open file (a small non-negative integer). When a file is opened or created, the system returns a file descriptor to the process that executed the call. Each process has its own file descriptors. By convention, the first three file descriptors are already open when a process starts: descriptor 0 identifies the standard input, 1 the standard output and 2 the standard error. The remaining descriptors are used by the process when it opens an ordinary file, a pipe, a special file or a directory. There are five system calls that generate file descriptors: creat, open, fcntl, dup and pipe.
Opening or creating a file can be done using the system call open. The syntax is:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *path, int flags, ... /* mode_t mod */);
This function returns the file descriptor or, in case of an error, -1. It takes two or three arguments: the third argument is used only when a new file is created, so two arguments are enough to open an existing file. The function returns the smallest available file descriptor, which can then be used in the system calls read, write, lseek and close. The process that executes the call must have read and/or write permission on the file, checked against its effective UID and effective GID, according to the value of the argument flags.
The file pointer is placed on the first byte in the file. The argument flags is formed by a bitwise OR of constants defined in the fcntl.h header; exactly one of O_RDONLY, O_WRONLY or O_RDWR must be included.
O_RDONLY
Opens the file for reading.
O_WRONLY
Opens the file for writing.
O_RDWR
The file is opened for reading and
writing.
O_APPEND
Every write is appended to the end of the file.
O_CREAT
The file is created if it does not already exist.
O_EXCL
If the file exists and O_CREAT is set, the call to open fails.
O_NONBLOCK
In the case of pipes and special files, this causes the open system call and any future I/O operations to never block.
O_TRUNC
If the file exists, all of its content is deleted (the file is truncated to length 0).
O_SYNC
Forces each call to write to wait until the data has been physically written to the disk. Though it slows down the system, it can be useful in critical situations.
The third argument, mod, is a bitwise OR of any combination of the constants in the following list:
S_IRUSR, S_IWUSR, S_IXUSR
Owner: read, write, execute.
S_IRGRP, S_IWGRP, S_IXGRP
Group: read, write, execute.
S_IROTH, S_IWOTH, S_IXOTH
Others: read, write, execute.
These constants define the access rights for a file and they are declared in the sys/stat.h header.
A new file can be created by:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int creat(const char *path, mode_t mod);
The function returns the file descriptor or, in case of an error, the value -1. This call is equivalent to:
open(path, O_WRONLY | O_CREAT | O_TRUNC, mod);
The argument path specifies the name of the file, while mod defines the access rights. If the file does not exist, a new i-node is allocated and a link to it is made from the directory in which it is created. The process that executes the call must have write permission in that directory, checked against its effective UID and effective GID. The new file gets the access rights specified in the second argument, less any bits cleared by the file creation mask (see umask). The call returns the smallest file descriptor available. The file is opened for writing and its initial size is 0. The access time and the modification time are updated in the i-node. If the file already exists (permission to search the directory is needed), it loses its contents and is opened for writing. Its ownership and access permissions are not modified, so the second argument is ignored.
When we want to read a certain number of bytes starting from the current position in a file, we use the read call. The syntax is:
#include <unistd.h>
ssize_t read(int fd, void* buf, size_t noct);
The function returns the number of bytes read, 0 at end of file (EOF) and -1 if an error occurred. It reads at most noct bytes from the open file referred to by the descriptor fd and puts them into the buffer buf; the return value can be smaller than noct, for example near the end of the file. The file pointer (current position) is advanced automatically by the number of bytes read. The process that executes a read operation waits until the system has put the data from the disk into the buffer.
To write a certain number of bytes into a file starting from the current position we use the write call. Its syntax is:
#include <unistd.h>
ssize_t write(int fd, const void* buf, size_t noct);
The function returns the number of bytes written, or the value -1 in case of an error. It writes noct bytes from the buffer buf into the file that has fd as its descriptor.
It is interesting to note that the actual writing onto the disk is delayed. It is done at the initiative of the kernel, without informing the user when it happens. If the process that made the call, or another process, reads data that has not been written to the disk yet, the kernel serves it from the cache buffers. Delayed writing is faster, but it has three disadvantages:
a) a disk error or a system crash may cause the loss of all the data that has not yet reached the disk
b) the process that initiated a write operation cannot be informed if a write error occurs later
c) the physical order of the write operations cannot be controlled.
To eliminate these disadvantages, the O_SYNC flag is used in some cases. But since this slows down the system, and considering the reliability of today's systems, it is usually better to rely on the mechanism that uses cache buffers.
To close a file, and thus release the descriptor assigned to it, we use the system call close.
#include <unistd.h>
int close(int fd);
The function returns 0 in case of success and -1 in case of an error. When a process terminates, all of its open files are closed automatically.
The current position in a file can be set in an absolute or relative way by calling the lseek function. Read and write operations are done relative to the current position in the file. The syntax for lseek is:
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int ref);
The function returns the displacement of the new current position from the beginning of the file, or -1 in case of an error. No I/O operation is performed and the function does not send any commands to the disk controller. If ref is set to SEEK_SET, the positioning is done relative to the beginning of the file (the first byte in the file is at position 0). If ref is SEEK_CUR, the positioning is done relative to the current position. If ref is SEEK_END, the positioning is done relative to the end of the file. The system calls open, creat, write and read also change the current position implicitly: open and creat set it to 0, while read and write advance it by the number of bytes transferred. If a file was opened with the symbolic constant O_APPEND, the position is moved to the end of the file before each write operation.
The link system call can be used to link an existing file into another directory (or into the same directory). Making such a link in fact means giving a new name, or path, to an existing file. The link system call creates a hard link. Symbolic links are created with the symlink system call.
The syntax of the two calls is:
#include <unistd.h>
int link(const char* oldpath, const char* newpath);
int symlink(const char* oldpath, const char* newpath);
Both functions return 0 in case of success and -1 in case of an error. For link, the argument oldpath has to be a path to an existing file; symlink does not check that its target exists. Traditionally only the root had the right to make a hard link to a directory; Linux does not allow hard links to directories at all.
To delete a link (a path) in a directory we can use the unlink system call. Its syntax is:
#include <unistd.h>
int unlink(const char* path);
The function returns 0 in case of success and -1 otherwise. The function deletes the directory entry given by path and decrements the hard link counter in the i-node of the file. If the number of links of a file becomes 0 and no process still has the file open, the space occupied by the file and its i-node is freed. Traditionally only the root could unlink a directory; on Linux unlink fails on a directory, and directories are removed with rmdir.
In
order to obtain more details about a file the following system calls can be
used: stat, lstat or fstat.
#include <sys/types.h>
#include <sys/stat.h>
int stat(const char* path, struct stat* buf);
int lstat(const char* path, struct stat* buf);
int fstat(int df, struct stat* buf);
These three functions return 0 in case of success and -1 in case of an error. The first two take the name of a file as an input parameter and fill the structure pointed to by buf with information read from its i-node. The fstat function is similar, but it works on files that are already open and for which the file descriptor is known. The difference between stat and lstat is that, in case of a symbolic link, stat returns information about the linked (referred) file, while lstat returns information about the symbolic link itself. The struct stat structure is described in the sys/stat.h header and has the following fields:
struct stat {
    mode_t  st_mode;    /* file type & rights */
    ino_t   st_ino;     /* i-node */
    dev_t   st_dev;     /* device number (file system) */
    nlink_t st_nlink;   /* nr of links */
    uid_t   st_uid;     /* owner ID */
    gid_t   st_gid;     /* group ID */
    off_t   st_size;    /* ordinary file size */
    time_t  st_atime;   /* last time it was accessed */
    time_t  st_mtime;   /* last time it was modified */
    time_t  st_ctime;   /* last time settings were changed */
    dev_t   st_rdev;    /* device number, for special files */
    long    st_blksize; /* optimal size of the I/O block */
    long    st_blocks;  /* nr of 512 byte blocks allocated */
};
The Linux command that uses these functions most frequently is ls. Type declarations for the members of this structure can be found in the sys/stat.h header. The type and access rights of the file are encoded in the st_mode field and can be determined using the following macros:
Table 1. Macros for obtaining the type of a file
Macro | Meaning
S_ISREG(st_mode) | Regular file.
S_ISDIR(st_mode) | Directory file.
S_ISCHR(st_mode) | Special device of type character.
S_ISBLK(st_mode) | Special device of type block.
S_ISFIFO(st_mode) | Pipe file or FIFO.
S_ISLNK(st_mode) | Symbolic link.
The information contained in the st_mode field can also be decoded by testing the result of a bitwise AND between st_mode and one of the following constants (bit masks): S_IFIFO, S_IFCHR, S_IFBLK, S_IFDIR, S_IFREG, S_IFLNK (file types, compared after masking st_mode with S_IFMT), S_ISUID (suid bit set), S_ISGID (sgid bit set), S_ISVTX (sticky bit set), S_IRUSR (read right for the owner), S_IWUSR (write right for the owner), S_IXUSR (execution right for the owner), etc.
When a file is opened with the system call open, the kernel verifies the access rights against the effective UID and the effective GID of the process. There are some cases, though, when a process needs to verify these rights based on the real UID and GID. This is useful when a process runs with other access rights because of the suid or sgid bit. Even though a process may have root rights during execution, sometimes it is necessary to test whether the real user can access the file. For this we can use access, which verifies the access rights of a file based on the real UID and GID. The syntax for this system call is:
#include <unistd.h>
int access(const char* path, int mod);
The function returns 0 if the access right exists and -1 otherwise. The argument mod is either F_OK (the file exists) or a bitwise OR of R_OK (permission to read), W_OK (permission to write) and X_OK (execution right).
To enhance security when files are created, the Linux operating system offers a default mask that clears some access rights. The mask is encoded in the same way as the access rights in the i-node of a file. When a file is created, the bits that are set to 1 in the mask clear the corresponding bits in the argument that specifies the access rights. The mask does not affect the system call chmod, so processes can explicitly set access rights independently of the umask mask. The syntax for the call is:
#include <sys/types.h>
#include <sys/stat.h>
mode_t umask(mode_t mask);
The function returns the value of the previous mask. The effect of the call is shown below:
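#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
int main(void) /* test umask */
{
    int fd;
    umask(022); /* clear the write bit for group and others */
    if ((fd = creat("temp", 0666)) == -1)
        perror("creat");
    system("ls -l temp");
    return 0;
}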
The result will be of the following form:
-rw-r--r-- temp
Note that the write permission for the group and for the other users was automatically cleared, while that of the owner was kept.
To modify the access rights of an existing file we use:
#include <sys/types.h>
#include <sys/stat.h>
int chmod(const char* path, mode_t mod);
The function returns 0 in case of success and -1 otherwise. The chmod call sets the access rights of the file specified by path to the rights specified by the mod argument. To be able to modify the access rights, the effective UID of the process has to be identical to the owner of the file, or the process must have root rights.
The mod argument can be specified using the symbolic constants defined in the sys/stat.h header. Several of them can be combined with a bitwise OR operation:
Table 2. Bit masks for testing the access rights of a file
Mode | Description
S_ISUID | Sets the suid bit.
S_ISGID | Sets the sgid bit.
S_ISVTX | Sets the sticky bit.
S_IRWXU | Read, write, execute rights for the owner, obtained from: S_IRUSR | S_IWUSR | S_IXUSR
S_IRWXG | Read, write, execute rights for the group, obtained from: S_IRGRP | S_IWGRP | S_IXGRP
S_IRWXO | Read, write, execute rights for others, obtained from: S_IROTH | S_IWOTH | S_IXOTH
The chown system call is used to modify the owner (UID) and the group (GID) that a certain file belongs to. The syntax of the function is:
#include <sys/types.h>
#include <unistd.h>
int chown(const char* path, uid_t owner, gid_t grp);
The function returns 0 in case of success and -1 in case of an error. Calling this function changes the owner and the group of the file specified by the argument path to the values specified by the arguments owner and grp. Only the root user can change the owner of a file; ordinary users cannot do so even for their own files, but they can change the GID of their own files to that of any group they belong to.
There are three members of the structure stat that refer to time. They are presented in the following table:
Table 3. Timing information associated with a file
Field | Description | Operation
st_atime | Last time the data in the file was accessed | Read
st_mtime | Last time the data in the file was modified | Write
st_ctime | Last change of the settings in the i-node | chmod, chown
The difference between the time the file was last modified and the time the i-node was last changed is that the first refers to the moment when the contents of the file were modified, while the second refers to the moment when the information in the i-node was last modified. The two are distinct because the information in the i-node is kept separately from the contents of the file. System calls that change the i-node are those that modify the access rights of a file, change the UID, change the number of links, etc. The system does not keep the time when the i-node was last accessed. This is why the system calls access and stat do not change any of these times.
The access time and the last modification time of any kind of file can be changed by calling one of the system calls presented below:
#include <sys/time.h>
int utimes(const char* path, const struct timeval* times);
int lutimes(const char* path, const struct timeval* times);
int futimes(int fd, const struct timeval* times);
The functions return 0 in case of success and -1 otherwise. Only the owner of a file or the root can change the times associated with a file. The parameter times is the address (pointer) of an array of two timeval structures, holding the access time and the modification time, in that order. The fields of the timeval structure are:
struct timeval {
    time_t      tv_sec;  /* seconds passed since 1.01.1970 */
    suseconds_t tv_usec; /* microseconds */
};
To obtain the current time in the form required by the timeval structure, we can use the gettimeofday function. To convert between a calendar date and time and the seconds value stored in the timeval structure, the function ctime can be used, or any other function belonging to the same family (for more details see the textbook).
A directory can be read by anyone who has read permission for it, but only the kernel can write to it. To the user, a directory appears as a succession of structures named directory entries. A directory entry contains, among other information, the name of the file and its i-node number. To read the directory entries one after the other we can use the following functions:
#include <sys/types.h>
#include <dirent.h>
DIR* opendir(const char* pathname);
struct dirent* readdir(DIR* dp);
void rewinddir(DIR* dp);
int closedir(DIR* dp);
The opendir
function opens a directory. It returns a valid pointer if the opening was
successful and NULL otherwise.
The readdir
function, at every call, reads another directory entry from the current
directory. The first readdir will
read the first directory entry; the second call will read the next entry and so
on. In case of a successful reading the function will return a valid pointer to
a structure of type dirent and NULL
otherwise (in case it reached the end of the directory, for example).
The rewinddir
function repositions the file pointer to the first directory entry (the
beginning of the directory).
The closedir
function closes a previously opened directory. In case of an error it returns
the value -1.
The structure dirent is defined in the dirent.h file. It contains at least two elements:
struct dirent {
    ino_t d_ino;                   /* i-node number (named d_fileno on BSD systems) */
    char  d_name[MAXNAMLEN + 1];   /* file name */
};
Write a program that creates a file containing a gap of about 4 KB of unwritten space. Such files are called files with holes.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

char buf1[] = "LAB ";
char buf2[] = "OS Linux";

int main(void)
{
    int fd;

    if ((fd = creat("file.gol", 0666)) < 0) {
        perror("Creation error");
        exit(1);
    }
    if (write(fd, buf1, sizeof(buf1)) < 0) {
        perror("Writing error");
        exit(2);
    }
    /* Move past the data written so far; the gap becomes the hole. */
    if (lseek(fd, 4096, SEEK_SET) < 0) {
        perror("Positioning error");
        exit(3);
    }
    if (write(fd, buf2, sizeof(buf2)) < 0) {
        perror("Writing error");
        exit(2);
    }
    close(fd);
    return 0;
}
Trace the execution of the
program with the help of the following commands:
ls -l
stat file.gol
od -c file.gol
Write a program that copies the contents of an existing file into another file. The names of the two files should be read from the command line. You should assume that any of the calls read or write may fail.
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BUFSIZE 512

int main(int argc, char** argv)
{
    int from, to, nr, nw, n;
    char buf[BUFSIZE];

    if (argc != 3) {
        fprintf(stderr, "Usage: %s source destination\n", argv[0]);
        exit(1);
    }
    if ((from = open(argv[1], O_RDONLY)) < 0) {
        perror("Error opening source file");
        exit(1);
    }
    if ((to = creat(argv[2], 0666)) < 0) {
        perror("Error creating destination file");
        exit(2);
    }
    while ((nr = read(from, buf, sizeof(buf))) != 0) {
        if (nr < 0) {
            perror("Error reading source file");
            exit(3);
        }
        /* write may transfer fewer bytes than requested, so repeat
           until the whole block has been written */
        nw = 0;
        do {
            if ((n = write(to, &buf[nw], nr - nw)) < 0) {
                perror("Error writing destination file");
                exit(4);
            }
            nw += n;
        } while (nw < nr);
    }
    close(from);
    close(to);
    return 0;
}
Write
a program that displays the contents of a directory, specifying the type for
each of its files. The name for the directory should be an input parameter.
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

void listDir(char *dirName)
{
    DIR* dir;
    struct dirent *dirEntry;
    struct stat inode;
    char name[1000];

    dir = opendir(dirName);
    if (dir == NULL) {
        perror("Error opening directory");
        exit(1);
    }
    while ((dirEntry = readdir(dir)) != NULL) {
        snprintf(name, sizeof(name), "%s/%s", dirName, dirEntry->d_name);
        if (lstat(name, &inode) < 0) {
            perror(name);
            continue;
        }
        /* test the type of file */
        if (S_ISDIR(inode.st_mode))
            printf("dir ");
        else if (S_ISREG(inode.st_mode))
            printf("file ");
        else if (S_ISLNK(inode.st_mode))
            printf("lnk ");
        printf(" %s\n", dirEntry->d_name);
    }
    closedir(dir);
}

int main(int argc, char **argv)
{
    if (argc != 2) {
        printf("USAGE: %s dir_name\n", argv[0]);
        exit(0);
    }
    printf("The contents of the directory are:\n");
    listDir(argv[1]);
    return 0;
}