System call
In computing, a system call is the programmatic way in which a computer program requests a service from the kernel of the operating system on which it is executed. This may include hardware-related services, creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.
In most systems, system calls can only be made from userspace processes, while in some systems, OS/360 and successors for example, privileged system code also issues system calls.
Privileges
The architecture of most modern processors, with the exception of some embedded systems, involves a security model. For example, the rings model specifies multiple privilege levels under which software may be executed: a program is usually limited to its own address space so that it cannot access or modify other running programs or the operating system itself, and is usually prevented from directly manipulating hardware devices.
However, many applications need access to these components, so system calls are made available by the operating system to provide well-defined, safe implementations for such operations. The operating system executes at the highest level of privilege, and allows applications to request services via system calls, which are often initiated via interrupts. An interrupt automatically puts the CPU into some elevated privilege level, and then passes control to the kernel, which determines whether the calling program should be granted the requested service. If the service is granted, the kernel executes a specific set of instructions over which the calling program has no direct control, returns the privilege level to that of the calling program, and then returns control to the calling program.
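The sequence described above (enter a higher privilege level, let the kernel decide, run kernel code, drop back) can be sketched as a toy model. This is only an illustration: real privilege levels are enforced by the processor, and every name here (ToyCPU, kernel_entry, the user-ID check) is a hypothetical stand-in rather than any actual kernel interface.

```python
# Toy model only: real privilege levels are enforced by CPU hardware,
# not by Python objects. All names here are hypothetical.

USER, KERNEL = "user", "kernel"


class ToyCPU:
    def __init__(self):
        self.mode = USER
        self.log = []          # records every mode the CPU passes through


def kernel_entry(cpu, caller_uid, owner_uid, service):
    """Simulate one trap into the kernel for a single service request."""
    saved_mode = cpu.mode
    cpu.mode = KERNEL                      # the trap raises the privilege level
    cpu.log.append(cpu.mode)
    try:
        # The kernel, not the caller, decides whether to grant the request.
        # Assumed policy: only the owner or user ID 0 may use the service.
        if caller_uid != 0 and caller_uid != owner_uid:
            return "EPERM"
        return service()                   # kernel code the caller cannot alter
    finally:
        cpu.mode = saved_mode              # drop back to the caller's level
        cpu.log.append(cpu.mode)


cpu = ToyCPU()
granted = kernel_entry(cpu, caller_uid=1000, owner_uid=1000, service=lambda: "ok")
denied = kernel_entry(cpu, caller_uid=1000, owner_uid=0, service=lambda: "ok")
```

In this sketch the caller never runs with the elevated mode itself: the mode is raised on entry and restored on every exit path, whether the request was granted or refused.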
The library as an intermediary
Generally, systems provide a library or API that sits between normal programs and the operating system. On Unix-like systems, that API is usually part of an implementation of the C library, such as glibc, that provides wrapper functions for the system calls, often named the same as the system calls they invoke. On Windows NT, that API is part of the Native API, in the ntdll.dll library; this is an undocumented API used by implementations of the regular Windows API and directly used by some system programs on Windows. The library's wrapper functions expose an ordinary function calling convention for using the system call, and make the system call more modular. Here, the primary function of the wrapper is to place all the arguments to be passed to the system call in the appropriate processor registers, and to set a unique system call number for the kernel to call. In this way the library, which exists between the OS and the application, increases portability.
The call to the library function itself does not cause a switch to kernel mode and is usually a normal subroutine call. The actual system call does transfer control to the kernel. For example, in Unix-like systems, fork and execve are C library functions that in turn execute instructions that invoke the fork and exec system calls. Making the system call directly in the application code is more complicated and may require embedded assembly code as well as knowledge of the low-level binary interface for the system call operation, which may be subject to change over time and thus not be part of the application binary interface; the library functions are meant to abstract this away.
On exokernel-based systems, the library is especially important as an intermediary. On exokernels, libraries shield user applications from the very low-level kernel API, and provide abstractions and resource management.
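As a rough illustration of the difference between a wrapper and a direct request, the sketch below asks for the process ID twice: once through Python's os.getpid wrapper and once by number through the C library's generic syscall() function, reached via ctypes. The numbers in GETPID_NR are those of the 64-bit Linux x86-64 and AArch64 interfaces; the helper name raw_getpid is hypothetical, and on any other platform the sketch deliberately does nothing.

```python
import ctypes
import os
import platform
import sys

# getpid numbers for two 64-bit Linux ABIs; other platforms are left out
# on purpose because system call numbers are not portable.
GETPID_NR = {"x86_64": 39, "aarch64": 172}


def raw_getpid():
    """Invoke getpid by number via libc's syscall() entry point.

    Returns None when the platform is not covered by GETPID_NR.
    """
    if not sys.platform.startswith("linux") or sys.maxsize <= 2**32:
        return None                       # not 64-bit Linux: numbers differ
    nr = GETPID_NR.get(platform.machine())
    if nr is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(ctypes.c_long(nr))


wrapped = os.getpid()      # ordinary library wrapper
direct = raw_getpid()      # same request, identified by number
```

The wrapper call is the same on every platform, while the by-number version needs a per-architecture table, which is the portability cost the library is meant to hide.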
IBM operating systems descended from OS/360 and DOS/360, including z/OS and z/VSE, implement system calls through a library of assembly language macros. This reflects their origin at a time when programming in assembly language was more common than high-level language usage. IBM system calls are therefore not directly executable by high-level language programs, but require a callable assembly language wrapper subroutine.
Examples and tools
On Unix, Unix-like and other POSIX-compliant operating systems, popular system calls are open, read, write, close, wait, exec, fork, exit, and kill. Many modern operating systems have hundreds of system calls. For example, Linux and OpenBSD each have over 300 different calls, NetBSD has close to 500, FreeBSD has over 500, Windows 7 has close to 700, while Plan 9 has 51.
Tools such as strace, ftrace and truss allow a process to execute from start and report all system calls the process invokes, or can attach to an already running process and intercept any system call made by said process if the operation does not violate the permissions of the user. This special ability of the program is usually also implemented with a system call, e.g. strace is implemented with ptrace or system calls on files in procfs.
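Several of the calls listed above can be exercised from Python's os module, whose functions are thin wrappers over the corresponding C library functions on Unix-like systems. The following is a minimal sketch, assuming a Unix-like host; the helper names and the file name demo.txt are made up for the example.

```python
import os
import tempfile


def file_round_trip(payload: bytes) -> bytes:
    """Write payload to a temporary file and read it back via descriptors."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")    # hypothetical file name
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)  # open
    try:
        os.write(fd, payload)                     # write
    finally:
        os.close(fd)                              # close
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, len(payload))          # read
    finally:
        os.close(fd)
    os.unlink(path)
    os.rmdir(directory)
    return data


def child_exit_status(code: int) -> int:
    """Fork a child that exits immediately and return its exit status."""
    pid = os.fork()                               # fork
    if pid == 0:
        os._exit(code)                            # exit, in the child only
    _, status = os.waitpid(pid, 0)                # wait, in the parent
    return os.WEXITSTATUS(status)
```

Running a script like this under a tracing tool such as strace should show the underlying calls, although the exact names reported can differ from the wrapper names (for example, a wrapper may be implemented with a newer variant of the same call).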
Typical implementations
Implementing system calls requires a transfer of control from user space to kernel space, which involves some sort of architecture-specific feature. A typical way to implement this is to use a software interrupt or trap. Interrupts transfer control to the operating system kernel, so software simply needs to set up some register with the system call number needed, and execute the software interrupt.
This is the only technique provided for many RISC processors, but CISC architectures such as x86 support additional techniques. For example, the x86 instruction set contains the instructions SYSCALL/SYSRET and SYSENTER/SYSEXIT. These are "fast" control transfer instructions that are designed to quickly transfer control to the kernel for a system call without the overhead of an interrupt. Linux 2.5 began using this on the x86, where available; formerly it used the INT instruction, where the system call number was placed in the EAX register before interrupt 0x80 was executed.
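The number-in-a-register convention can be illustrated with a toy dispatcher. This is a simulation, not kernel code: the register file is a dictionary, the handlers are ordinary functions, and names such as int_0x80 and SYSCALL_TABLE are hypothetical. It assumes the legacy 32-bit x86 Linux convention, in which 4 and 20 are the numbers for write and getpid and the first three arguments travel in EBX, ECX and EDX; the argument registers are not stated in the text above and are added here as background.

```python
import errno

output = []   # stands in for a device the toy kernel writes to


def sys_write(fd, buf, count):
    """Toy write: record the data and report how many bytes were taken."""
    output.append((fd, buf[:count]))
    return min(count, len(buf))


def sys_getpid():
    return 4242          # made-up process ID for the toy kernel


# number -> (handler, argument count); numbers follow 32-bit x86 Linux
SYSCALL_TABLE = {4: (sys_write, 3), 20: (sys_getpid, 0)}
ARG_REGISTERS = ("ebx", "ecx", "edx")


def int_0x80(regs):
    """Dispatch on the number in EAX and leave the result in EAX."""
    entry = SYSCALL_TABLE.get(regs["eax"])
    if entry is None:
        regs["eax"] = -errno.ENOSYS      # unknown system call number
        return regs
    handler, nargs = entry
    args = [regs[name] for name in ARG_REGISTERS[:nargs]]
    regs["eax"] = handler(*args)
    return regs


regs = int_0x80({"eax": 4, "ebx": 1, "ecx": b"hi\n", "edx": 3})
```

Because the caller only supplies a number and arguments, it cannot choose which kernel code runs beyond picking an entry from the table, and an unrecognised number yields an error value rather than a jump to an arbitrary address.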
An older mechanism is the call gate, originally used in Multics and later on the Intel x86. It allows a program to call a kernel function directly using a safe control transfer mechanism, which the operating system sets up in advance. This approach has been unpopular on x86, presumably because it requires a far call, which uses x86 memory segmentation and therefore hurts portability, and because of the faster instructions mentioned above.
On the IA-64 architecture, the EPC (Enter Privileged Code) instruction is used. The first eight system call arguments are passed in registers, and the rest are passed on the stack.
In the IBM System/360 mainframe family, and its successors, a Supervisor Call instruction, with the number in the instruction rather than in a register, implements a system call for legacy facilities in most of IBM's own operating systems, and for all system calls in Linux. In IBM's own operating systems, the Program Call instruction is used for newer facilities. In particular, PC is used when the caller might be in SRB mode.
Categories of system calls
System calls can be grouped roughly into six major categories:
- Process control
- * create process
- * terminate process
- * load, execute
- * get/set process attributes
- * wait for time, wait event, signal event
- * allocate and free memory
- File management
- * create file, delete file
- * open, close
- * read, write, reposition
- * get/set file attributes
- Device management
- * request device, release device
- * read, write, reposition
- * get/set device attributes
- * logically attach or detach devices
- Information maintenance
- * get/set time or date
- * get/set system data
- * get/set process, file, or device attributes
- Communication
- * create, delete communication connection
- * send, receive messages
- * transfer status information
- * attach or detach remote devices
- Protection
- * get/set file permissions
Processor mode and context switching
In a multithreaded process, system calls can be made from multiple threads. The handling of such calls is dependent on the design of the specific operating system kernel and the application runtime environment. The following list shows typical models followed by operating systems:
- Many-to-one model: All system calls from any user thread in a process are handled by a single kernel-level thread. This model has a serious drawback: any blocking system call can freeze all the other threads. Also, since only one thread can access the kernel at a time, this model cannot utilize multiple processor cores.
- One-to-one model: Every user thread gets attached to a distinct kernel-level thread during a system call. This model solves the above problem of blocking system calls. It is found in all major Linux distributions, macOS, iOS, recent Windows and Solaris versions.
- Many-to-many model: In this model, a pool of user threads is mapped to a pool of kernel threads. All system calls from a user thread pool are handled by the threads in their corresponding kernel thread pool.
- Hybrid model: This model implements both the many-to-many and one-to-one models, depending on the choice made by the kernel. This is found in old versions of IRIX, HP-UX and Solaris.
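The practical benefit of the one-to-one model can be observed from a threaded program. The sketch below assumes CPython 3.8 or later, where each threading.Thread is backed by a native operating system thread; the function name blocking_read_demo is hypothetical. One thread issues a read on an empty pipe, which blocks in the kernel until data arrives, while the main thread stays free to run and eventually supplies the data.

```python
import os
import threading


def blocking_read_demo():
    """Let one thread block in read() while the main thread carries on."""
    read_fd, write_fd = os.pipe()
    result = {}

    def reader():
        result["reader_tid"] = threading.get_native_id()
        result["data"] = os.read(read_fd, 16)   # blocks until data is written

    worker = threading.Thread(target=reader)
    worker.start()

    # The main thread is not held up by the reader's pending read.
    result["main_tid"] = threading.get_native_id()
    os.write(write_fd, b"wake")                 # lets the reader's read return
    worker.join()

    os.close(read_fd)
    os.close(write_fd)
    return result
```

The two native thread IDs recorded in the result differ, reflecting that each user thread has its own kernel-level thread. Under a many-to-one model, the blocked read would instead stall the only kernel thread, and the write that unblocks it could never be issued.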