07 April 2023
io_uring is a new asynchronous I/O API for Linux created by Jens Axboe from Facebook.
It aims to provide an API without the limitations of similar interfaces:
- read(2)/write(2) are synchronous
- aio_read(3)/aio_write(3) provide asynchronous functionality, but only support files opened with O_DIRECT or in unbuffered mode
- select(2)/poll(2)/epoll(7) work well with sockets but do not behave as expected with regular files (they are always “ready”)
To get a more consistent API across file descriptors (sockets and regular files) we can use libuv (which I will probably explore in the future) or liburing/io_uring (the star of the show).
As the name suggests, it uses ring buffers as the main interface for kernel-user space communication.
There are two ring buffers, one for submission of requests (submission queue or SQ) and the other that informs you about completion of those requests (completion queue or CQ).
These ring buffers are shared between kernel and user space.
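To make the ring buffer idea more concrete, here is a minimal sketch in C of a single-producer/single-consumer ring driven by a head and a tail index. It is a toy model, not the kernel's actual memory layout: the names toy_ring, toy_ring_push and toy_ring_pop are made up for illustration, and it leaves out the memory barriers a real ring shared between kernel and user space would need.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ENTRIES 8u                  /* must be a power of two */
#define RING_MASK (RING_ENTRIES - 1u)

static_assert((RING_ENTRIES & RING_MASK) == 0,
              "ring size must be a power of two");

/* Toy single-producer/single-consumer ring (hypothetical, for illustration).
 * A real shared ring also needs memory barriers around head/tail updates. */
struct toy_ring {
    uint32_t head;                       /* advanced by the consumer */
    uint32_t tail;                       /* advanced by the producer */
    uint64_t slots[RING_ENTRIES];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
int toy_ring_push(struct toy_ring *r, uint64_t value) {
    if (r->tail - r->head == RING_ENTRIES)
        return -1;
    r->slots[r->tail & RING_MASK] = value;   /* index wraps via the mask */
    r->tail++;                               /* publish the new entry */
    return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
int toy_ring_pop(struct toy_ring *r, uint64_t *value) {
    if (r->head == r->tail)
        return -1;
    *value = r->slots[r->head & RING_MASK];
    r->head++;                               /* release the slot */
    return 0;
}
```

In this picture, user space plays the producer on the submission queue and the consumer on the completion queue, while the kernel takes the opposite roles. The indices only ever grow, and the mask turns them into slot positions, which is why the size has to be a power of two in this sketch.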
The basic flow looks like this:
- Set up the ring buffers with io_uring_setup() and then map them into user space with two mmap(2) calls
- Use the io_uring_enter() syscall to signal that SQEs (submission queue entries) are ready to be processed
- io_uring_enter() can also wait for requests to be processed by the kernel before it returns, so you know you’re ready to read off the completion queue for results
Ordering in the CQ may not correspond to the request order in the SQ. Requests can be processed in parallel, and their results are added to the CQ as they become available. This is done for performance reasons: if one file is on an HDD and another on an SSD, we don’t want the HDD request to block the faster SSD request.
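Since completions can arrive in any order, each request needs a way to be matched back to its result. io_uring does this with a user_data field that is copied unchanged from the SQE to its CQE. The sketch below simulates that idea in plain C without touching the kernel: toy_sqe, toy_cqe, toy_complete and the latency_ms field are hypothetical stand-ins, and "the kernel" is just a sort by simulated latency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 16

/* Hypothetical stand-ins for a submission and a completion entry. */
struct toy_sqe { uint64_t user_data; unsigned latency_ms; };
struct toy_cqe { uint64_t user_data; int res; };

/* Orders requests by simulated latency, fastest first. */
static int by_latency(const void *a, const void *b) {
    const struct toy_sqe *x = a, *y = b;
    return (x->latency_ms > y->latency_ms) - (x->latency_ms < y->latency_ms);
}

/* Simulated kernel: completes the n submitted requests fastest first and
 * copies each request's user_data tag into its completion entry.
 * Returns the number of completions written to cq. */
size_t toy_complete(const struct toy_sqe *sq, size_t n, struct toy_cqe *cq) {
    struct toy_sqe pending[TOY_MAX];

    assert(n <= TOY_MAX);
    memcpy(pending, sq, n * sizeof *sq);
    qsort(pending, n, sizeof *pending, by_latency);

    for (size_t i = 0; i < n; i++) {
        cq[i].user_data = pending[i].user_data;  /* tag travels with the request */
        cq[i].res = 0;                           /* pretend every request succeeded */
    }
    return n;
}
```

If a slow HDD read is submitted first with tag 1 and a fast SSD read second with tag 2, the completion for tag 2 comes out first, and the tag is what tells the caller which request it belongs to.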
There is a polling mode available, in which the kernel polls for new entries in the submission queue. This avoids the syscall overhead of calling io_uring_enter() every time you submit entries for processing.
Because of the shared ring buffers between the kernel and user space, io_uring can be a zero-copy system.
Most sources indicate that the kernel interface was added in Linux kernel version 5.1. But from what I saw in the Linux git repository, linux/io_ring is only present in Linux 6.0 (does anyone know where it might be declared in previous versions?).
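Whatever the exact version turns out to be, a program can also check at runtime whether the running kernel accepts io_uring_setup(). The following is a minimal sketch, assuming a Linux system whose headers ship linux/io_uring.h and define __NR_io_uring_setup; the helper name probe_io_uring is made up for this example, and it calls the syscall directly through syscall(2) instead of going through a library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <linux/io_uring.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: tries to create a ring with `entries` SQ slots.
 * Returns 0 if io_uring_setup() succeeded (the kernel then fills in *p),
 * or a negative errno value, for example -ENOSYS on a kernel without
 * io_uring or -EPERM where it is blocked by a sandbox. */
int probe_io_uring(unsigned entries, struct io_uring_params *p) {
    assert(p != NULL);
    memset(p, 0, sizeof *p);             /* params must start out zeroed */

    long fd = syscall(__NR_io_uring_setup, entries, p);
    if (fd < 0)
        return -errno;

    close((int)fd);                      /* this was only a probe */
    return 0;
}
```

On success, fields such as p->sq_entries, p->cq_entries and p->features describe the ring the kernel would have created, which is a more direct signal than comparing version numbers.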
There is also a liburing library that provides an API to interact with the kernel interface easily from userspace.
I will eventually try to interact with io_uring using Go, so keep an eye on future articles if that interests you.