Video for Linux Two API Specification

Draft 0.12

Michael H Schimek

            
          

Bill Dirks

Hans Verkuil

This document is copyrighted © 1999-2006 by Bill Dirks, Michael H. Schimek and Hans Verkuil.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the appendix entitled "GNU Free Documentation License".

Programming examples can be used and distributed without restrictions.


Table of Contents
Introduction
1. Common API Elements
1.1. Opening and Closing Devices
1.1.1. Device Naming
1.1.2. Related Devices
1.1.3. Multiple Opens
1.1.4. Shared Data Streams
1.1.5. Functions
1.2. Querying Capabilities
1.3. Application Priority
1.4. Video Inputs and Outputs
1.5. Audio Inputs and Outputs
1.6. Tuners and Modulators
1.6.1. Tuners
1.6.2. Modulators
1.6.3. Radio Frequency
1.6.4. Satellite Receivers
1.7. Video Standards
1.8. Controls
1.9. Data Formats
1.9.1. Data Format Negotiation
1.9.2. Image Format Enumeration
1.10. Cropping and Scaling
1.11. Streaming Parameters
2. Image Formats
2.1. Standard Image Formats
2.2. Colorspaces
2.3. RGB Formats
2.4. YUV Formats
V4L2_PIX_FMT_GREY ('GREY') -- Grey-scale image.
V4L2_PIX_FMT_YUYV ('YUYV') -- Packed format with ½ horizontal chroma resolution, also known as YUV 4:2:2.
V4L2_PIX_FMT_UYVY ('UYVY') -- Variation of V4L2_PIX_FMT_YUYV with different order of samples in memory.
V4L2_PIX_FMT_Y41P ('Y41P') -- Packed format with ¼ horizontal chroma resolution, also known as YUV 4:1:1.
V4L2_PIX_FMT_YVU420 ('YV12'), V4L2_PIX_FMT_YUV420 ('YU12') -- Planar formats with ½ horizontal and vertical chroma resolution, also known as YUV 4:2:0.
V4L2_PIX_FMT_YVU410 ('YVU9'), V4L2_PIX_FMT_YUV410 ('YUV9') -- Planar formats with ¼ horizontal and vertical chroma resolution, also known as YUV 4:1:0.
V4L2_PIX_FMT_YUV422P ('422P') -- Format with ½ horizontal chroma resolution, also known as YUV 4:2:2. Planar layout as opposed to V4L2_PIX_FMT_YUYV.
V4L2_PIX_FMT_YUV411P ('411P') -- Format with ¼ horizontal chroma resolution, also known as YUV 4:1:1. Planar layout as opposed to V4L2_PIX_FMT_Y41P.
V4L2_PIX_FMT_NV12 ('NV12'), V4L2_PIX_FMT_NV21 ('NV21') -- Formats with ½ horizontal and vertical chroma resolution, also known as YUV 4:2:0. One luminance and one chrominance plane with alternating chroma samples as opposed to V4L2_PIX_FMT_YVU420.
2.5. Compressed Formats
2.6. Reserved Format Identifiers
3. Input/Output
3.1. Read/Write
3.2. Streaming I/O (Memory Mapping)
3.3. Streaming I/O (User Pointers)
3.4. Asynchronous I/O
3.5. Buffers
3.5.1. Timecodes
3.6. Field Order
4. Device Types
4.1. Video Capture Interface
4.1.1. Querying Capabilities
4.1.2. Supplemental Functions
4.1.3. Image Format Negotiation
4.1.4. Reading Images
4.2. Video Overlay Interface
4.2.1. Querying Capabilities
4.2.2. Supplemental Functions
4.2.3. Setup
4.2.4. Overlay Window
4.2.5. Enabling Overlay
4.3. Video Output Interface
4.3.1. Querying Capabilities
4.3.2. Supplemental Functions
4.3.3. Image Format Negotiation
4.3.4. Writing Images
4.4. Codec Interface
4.5. Effect Devices Interface
4.6. Raw VBI Data Interface
4.6.1. Querying Capabilities
4.6.2. Supplemental Functions
4.6.3. Raw VBI Format Negotiation
4.6.4. Reading and writing VBI images
4.7. Sliced VBI Data Interface
4.7.1. Querying Capabilities
4.7.2. Supplemental Functions
4.7.3. Sliced VBI Format Negotiation
4.7.4. Reading and writing sliced VBI data
4.8. Teletext Interface
4.9. Radio Interface
4.9.1. Querying Capabilities
4.9.2. Supplemental Functions
4.9.3. Programming
4.10. RDS Interface
I. Function Reference
V4L2 close() -- Close a V4L2 device
V4L2 ioctl() -- Program a V4L2 device
ioctl VIDIOC_CROPCAP -- Information about the video cropping and scaling abilities.
ioctl VIDIOC_ENUMAUDIO -- Enumerate audio inputs
ioctl VIDIOC_ENUMAUDOUT -- Enumerate audio outputs
ioctl VIDIOC_ENUM_FMT -- Enumerate image formats
ioctl VIDIOC_ENUMINPUT -- Enumerate video inputs
ioctl VIDIOC_ENUMOUTPUT -- Enumerate video outputs
ioctl VIDIOC_ENUMSTD -- Enumerate supported video standards
ioctl VIDIOC_G_AUDIO, VIDIOC_S_AUDIO -- Query or select the current audio input and its attributes
ioctl VIDIOC_G_AUDOUT, VIDIOC_S_AUDOUT -- Query or select the current audio output
ioctl VIDIOC_G_MPEGCOMP, VIDIOC_S_MPEGCOMP -- Get or set compression parameters
ioctl VIDIOC_G_CROP, VIDIOC_S_CROP -- Get or set the current cropping rectangle
ioctl VIDIOC_G_CTRL, VIDIOC_S_CTRL -- Get or set the value of a control
ioctl VIDIOC_G_FBUF, VIDIOC_S_FBUF -- Get or set frame buffer overlay parameters.
ioctl VIDIOC_G_FMT, VIDIOC_S_FMT, VIDIOC_TRY_FMT -- Get or set the data format, try a format.
ioctl VIDIOC_G_FREQUENCY, VIDIOC_S_FREQUENCY -- Get or set tuner or modulator radio frequency
ioctl VIDIOC_G_INPUT, VIDIOC_S_INPUT -- Query or select the current video input
ioctl VIDIOC_G_JPEGCOMP, VIDIOC_S_JPEGCOMP -- 
ioctl VIDIOC_G_MODULATOR, VIDIOC_S_MODULATOR -- Get or set modulator attributes
ioctl VIDIOC_G_OUTPUT, VIDIOC_S_OUTPUT -- Query or select the current video output
ioctl VIDIOC_G_PARM, VIDIOC_S_PARM -- Get or set streaming parameters
ioctl VIDIOC_G_PRIORITY, VIDIOC_S_PRIORITY -- Query or request the access priority associated with a file descriptor
ioctl VIDIOC_G_SLICED_VBI_CAP -- Query sliced VBI capabilities
ioctl VIDIOC_G_STD, VIDIOC_S_STD -- Query or select the video standard of the current input
ioctl VIDIOC_G_TUNER, VIDIOC_S_TUNER -- Get or set tuner attributes
ioctl VIDIOC_LOG_STATUS -- Log driver status information
ioctl VIDIOC_OVERLAY -- Start or stop video overlay
ioctl VIDIOC_QBUF, VIDIOC_DQBUF -- Exchange a buffer with the driver
ioctl VIDIOC_QUERYBUF -- Query the status of a buffer
ioctl VIDIOC_QUERYCAP -- Query device capabilities
ioctl VIDIOC_QUERYCTRL, VIDIOC_QUERYMENU -- Enumerate controls and menu control items
ioctl VIDIOC_QUERYSTD -- Sense the video standard received by the current input
ioctl VIDIOC_REQBUFS -- Initiate Memory Mapping or User Pointer I/O
ioctl VIDIOC_STREAMON, VIDIOC_STREAMOFF -- Start or stop streaming I/O
V4L2 mmap() -- Map device memory into application address space
V4L2 munmap() -- Unmap device memory
V4L2 open() -- Open a V4L2 device
V4L2 poll() -- Wait for some event on a file descriptor
V4L2 read() -- Read from a V4L2 device
V4L2 select() -- Synchronous I/O multiplexing
V4L2 write() -- Write to a V4L2 device
5. V4L2 Driver Programming
6. History
6.1. Differences between V4L and V4L2
6.1.1. Opening and Closing Devices
6.1.2. Querying Capabilities
6.1.3. Video Sources
6.1.4. Tuning
6.1.5. Image Properties
6.1.6. Audio
6.1.7. Frame Buffer Overlay
6.1.8. Cropping
6.1.9. Reading Images, Memory Mapping
6.1.10. Reading Raw VBI Data
6.1.11. Miscellaneous
6.2. History of the V4L2 API
6.2.1. Early Versions
6.2.2. V4L2 Version 0.16 1999-01-31
6.2.3. V4L2 Version 0.18 1999-03-16
6.2.4. V4L2 Version 0.19 1999-06-05
6.2.5. V4L2 Version 0.20 1999-09-10
6.2.6. V4L2 Version 0.20 incremental changes
6.2.7. V4L2 Version 0.20 2000-11-23
6.2.8. V4L2 Version 0.20 2002-07-25
6.2.9. V4L2 in Linux 2.5.46, 2002-10
6.2.10. V4L2 2003-06-19
6.2.11. V4L2 2003-11-05
6.2.12. V4L2 in Linux 2.6.6, 2004-05-09
6.2.13. V4L2 in Linux 2.6.8
6.2.14. V4L2 spec erratum 2004-08-01
6.2.15. V4L2 in Linux 2.6.14
6.2.16. V4L2 in Linux 2.6.15
6.2.17. V4L2 spec erratum 2005-11-27
6.2.18. V4L2 spec erratum 2006-01-10
6.2.19. V4L2 spec erratum 2006-02-03
6.3. Relation of V4L2 to other Linux multimedia APIs
6.3.1. X Video Extension
6.3.2. Digital Video
6.3.3. Audio Interfaces
A. Video For Linux Two Header File
B. Video Capture Example
C. GNU Free Documentation License
C.1. 0. PREAMBLE
C.2. 1. APPLICABILITY AND DEFINITIONS
C.3. 2. VERBATIM COPYING
C.4. 3. COPYING IN QUANTITY
C.5. 4. MODIFICATIONS
C.6. 5. COMBINING DOCUMENTS
C.7. 6. COLLECTIONS OF DOCUMENTS
C.8. 7. AGGREGATION WITH INDEPENDENT WORKS
C.9. 8. TRANSLATION
C.10. 9. TERMINATION
C.11. 10. FUTURE REVISIONS OF THIS LICENSE
C.12. Addendum
Bibliography
List of Tables
1-1. Control IDs
2-1. struct v4l2_pix_format
2-2. enum v4l2_colorspace
2-3. Packed RGB Image Formats
2-4. Reserved Image Formats
3-1. struct v4l2_buffer
3-2. enum v4l2_buf_type
3-3. Buffer Flags
3-4. enum v4l2_memory
3-5. struct v4l2_timecode
3-6. Timecode Types
3-7. Timecode Flags
3-8. enum v4l2_field
4-1. struct v4l2_window
4-2. struct v4l2_clip[22]
4-3. struct v4l2_rect
4-4. struct v4l2_vbi_format
4-5. Raw VBI Format Flags
4-6. struct v4l2_sliced_vbi_format
4-7. Sliced VBI services
4-8. struct v4l2_sliced_vbi_data
1. struct v4l2_cropcap
2. struct v4l2_rect
1. struct v4l2_fmtdesc
2. Image Format Description Flags
1. struct v4l2_input
2. Input Types
3. Input Status Flags
1. struct v4l2_output
2. Output Type
1. struct v4l2_standard
2. struct v4l2_fract
3. typedef v4l2_std_id
4. Video Standards (based on [ITU470])
1. struct v4l2_audio
2. Audio Capability Flags
3. Audio Modes
1. struct v4l2_audioout
1. struct v4l2_mpeg_compression
1. struct v4l2_crop
1. struct v4l2_control
1. struct v4l2_framebuffer
2. Frame Buffer Capability Flags
3. Frame Buffer Flags
1. struct v4l2_format
1. struct v4l2_frequency
1. struct v4l2_jpegcompression
2. JPEG Markers Flags
1. struct v4l2_modulator
2. Modulator Audio Transmission Flags
1. struct v4l2_streamparm
2. struct v4l2_captureparm
3. struct v4l2_outputparm
4. Streaming Parameters Capabilities
5. Capture Parameters Flags
1. enum v4l2_priority
1. struct v4l2_sliced_vbi_cap
2. Sliced VBI services
1. struct v4l2_tuner
2. enum v4l2_tuner_type
3. Tuner and Modulator Capability Flags
4. Tuner Audio Reception Flags
5. Tuner Audio Modes
6. Tuner Audio Matrix
1. struct v4l2_capability
2. Device Capabilities Flags
1. struct v4l2_queryctrl
2. struct v4l2_querymenu
3. enum v4l2_ctrl_type
4. Control Flags
1. struct v4l2_requestbuffers
6-1. V4L Device Types, Names and Numbers
List of Figures
1-1. Cropping and Scaling
3-1. Field Order, Top Field First Transmitted
3-2. Field Order, Bottom Field First Transmitted
4-1. Line synchronization
4-2. ITU-R 525 line numbering (M/NTSC and M/PAL)
4-3. ITU-R 625 line numbering
List of Examples
1-1. Information about the current video input
1-2. Switching to the first video input
1-3. Information about the current audio input
1-4. Switching to the first audio input
1-5. Information about the current video standard
1-6. Listing the video standards supported by the current input
1-7. Selecting a new video standard
1-8. Enumerating all controls
1-9. Changing controls
1-10. Resetting the cropping parameters
1-11. Simple downscaling
1-12. Current scaling factor and pixel aspect
2-1. ITU-R Rec. BT.601 color conversion
2-2. V4L2_PIX_FMT_BGR24 4 × 4 pixel image
2-1. V4L2_PIX_FMT_GREY 4 × 4 pixel image
2-1. V4L2_PIX_FMT_YUYV 4 × 4 pixel image
2-1. V4L2_PIX_FMT_UYVY 4 × 4 pixel image
2-1. V4L2_PIX_FMT_Y41P 8 × 4 pixel image
2-1. V4L2_PIX_FMT_YVU420 4 × 4 pixel image
2-1. V4L2_PIX_FMT_YVU410 4 × 4 pixel image
2-1. V4L2_PIX_FMT_YUV422P 4 × 4 pixel image
2-1. V4L2_PIX_FMT_YUV411P 4 × 4 pixel image
2-1. V4L2_PIX_FMT_NV12 4 × 4 pixel image
3-1. Mapping buffers
3-2. Initiating streaming I/O with user pointers

Introduction

[to do]

If you have questions or ideas regarding the API, please try the Video4Linux mailing list: https://listman.redhat.com/mailman/listinfo/video4linux-list

For documentation related requests contact the maintainer at mschimek@gmx.at.

The latest version of this document and the DocBook SGML sources are currently hosted at http://v4l2spec.bytesex.org and http://linuxtv.org/downloads/video4linux/API/V4L2_API.


Chapter 1. Common API Elements

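Programming a V4L2 device consists of these steps:

  • Opening the device

  • Changing device properties, selecting a video and audio input, video standard, picture brightness and so on

  • Negotiating a data format

  • Negotiating an input/output method

  • The actual input/output loop

  • Closing the device

In practice most steps are optional and can be executed out of order, depending on the V4L2 device type; the details are given in Chapter 4. This chapter discusses the basic concepts applicable to all devices.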


1.1. Opening and Closing Devices

1.1.1. Device Naming

V4L2 drivers are implemented as kernel modules, loaded manually by the system administrator or automatically when a device is first opened. The driver modules plug into the "videodev" kernel module. It provides helper functions and a common application interface specified in this document.

Each driver thus loaded registers one or more device nodes with major number 81 and a minor number between 0 and 255. Assigning minor numbers to V4L2 devices is entirely up to the system administrator; this is primarily intended to solve conflicts between devices.[1] The module options to select minor numbers are named after the device special file with a "_nr" suffix, for example "video_nr" for /dev/video video capture devices. The number is an offset to the base minor number associated with the device type.[2] When the driver supports multiple devices of the same type, more than one minor number can be assigned, separated by commas:

> insmod mydriver.o video_nr=0,1 radio_nr=0,1

In /etc/modules.conf this may be written as:

alias char-major-81-0 mydriver
alias char-major-81-1 mydriver
alias char-major-81-64 mydriver              (1)
options mydriver video_nr=0,1 radio_nr=0,1   (2)
          
(1)
When an application attempts to open a device special file with major number 81 and minor number 0, 1, or 64, load "mydriver" (and the "videodev" module it depends upon).
(2)
Register the first two video capture devices with minor numbers 0 and 1 (base number is 0), and the first two radio devices with minor numbers 64 and 65 (base 64).
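The relation between a "_nr" module option and the resulting minor number can be sketched as follows. The base numbers 0, 64 and 224 are those named in this section; the helper minor_from_nr() is hypothetical, written only for illustration, and is not part of the "videodev" module or any driver.

```c
/* Base minor numbers named in this section (inherited from V4L). */
enum {
        VIDEO_BASE_MINOR = 0,
        RADIO_BASE_MINOR = 64,
        VBI_BASE_MINOR   = 224
};

/* Hypothetical helper, not part of any driver API: map the value of
   a "_nr" module option to a minor number. Returns -1 when the
   offset is negative or the result leaves the range 0 ... 255. */
static int
minor_from_nr (int base_minor, int nr)
{
        int minor = base_minor + nr;

        if (nr < 0 || minor > 255)
                return -1;

        return minor;
}
```

With radio_nr=0,1 the second radio device gets minor_from_nr (RADIO_BASE_MINOR, 1), that is 65, matching callout (2).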

When no minor number is given as module option the driver supplies a default. Chapter 4 recommends the base minor numbers to be used for the various device types. Obviously minor numbers must be unique. When the number is already in use the offending device will not be registered.

By convention system administrators create various character device special files with these major and minor numbers in the /dev directory. The names recommended for the different V4L2 device types are listed in Chapter 4.

The creation of character special files (with mknod) is a privileged operation and devices cannot be opened by major and minor number. That means applications cannot reliably scan for loaded or installed drivers. The user must enter a device name, or the application can try the conventional device names.

Under the device filesystem (devfs) the minor number options are ignored. V4L2 drivers (or by proxy the "videodev" module) automatically create the required device files in the /dev/v4l directory using the conventional device names above.


1.1.2. Related Devices

Devices can support several related functions. For example video capturing, video overlay and VBI capturing are related because these functions share, among other things, the same video input and tuner frequency. V4L and earlier versions of V4L2 used the same device name and minor number for video capturing and overlay, but different ones for VBI. Experience showed this approach has several problems[3], and to make things worse the V4L videodev module used to prohibit multiple opens of a device.

As a remedy the present version of the V4L2 API relaxes the concept of device types with specific names and minor numbers. For compatibility with old applications, drivers must still register different minor numbers to assign a default function to the device. But if related functions are supported by the driver they must be available under all registered minor numbers. The desired function can be selected after opening the device as described in Chapter 4.

Imagine a driver supporting video capturing, video overlay, raw VBI capturing, and FM radio reception. It registers three devices with minor numbers 0, 64 and 224 (this numbering scheme is inherited from the V4L API). Regardless of whether /dev/video (81, 0) or /dev/vbi (81, 224) is opened, the application can select any one of the video capturing, overlay or VBI capturing functions. Without programming (e.g. reading from the device with dd or cat) /dev/video captures video images, while /dev/vbi captures raw VBI data. /dev/radio (81, 64) is invariably a radio device, unrelated to the video functions. Being unrelated does not imply the devices can be used at the same time, however. The open() function may very well return an EBUSY error code.

Besides video input or output the hardware may also support audio sampling or playback. If so, these functions are implemented as OSS or ALSA PCM devices and possibly an OSS or ALSA audio mixer. The V4L2 API makes no provisions yet to find these related devices. If you have an idea please write to the Video4Linux mailing list: https://listman.redhat.com/mailman/listinfo/video4linux-list.


1.1.3. Multiple Opens

In general, V4L2 devices can be opened more than once. When this is supported by the driver, users can for example start a "panel" application to change controls like brightness or audio volume, while another application captures video and audio. In other words, panel applications are comparable to an OSS or ALSA audio mixer application. When a device supports multiple functions like capturing and overlay simultaneously, multiple opens allow concurrent use of the device by forked processes or specialized applications.

Multiple opens are optional, although drivers should permit at least concurrent accesses without data exchange, i. e. panel applications. This implies open() can return an EBUSY error code when the device is already in use, as well as ioctl() functions initiating data exchange (namely the VIDIOC_S_FMT ioctl), and the read() and write() functions.

Merely opening a V4L2 device does not grant exclusive access.[4] Initiating data exchange however assigns the right to read or write the requested type of data, and to change related properties, to this file descriptor. Applications can request additional access privileges using the priority mechanism described in Section 1.3.


1.1.4. Shared Data Streams

V4L2 drivers should not support multiple applications reading or writing the same data stream on a device by copying buffers, time multiplexing or similar means. This is better handled by a proxy application in user space. When the driver supports stream sharing anyway it must be implemented transparently. The V4L2 API does not specify how conflicts are solved.


1.1.5. Functions

To open and close V4L2 devices applications use the open() and close() functions, respectively. Devices are programmed using the ioctl() function as explained in the following sections.
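A minimal sketch of opening a device, assuming a Linux system with the usual C library headers. The helper open_v4l2_device() is hypothetical; it merely checks that the file is a character device with major number 81, as described in Section 1.1.1, and makes no claim about what else an application should verify.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>
#include <unistd.h>

#define V4L2_MAJOR 81 /* major number given in Section 1.1.1 */

/* Hypothetical helper: open a device special file and verify it is
   a character device with major number 81. Returns a file
   descriptor, or -1 with errno set. */
static int
open_v4l2_device (const char *name)
{
        struct stat st;
        int fd;

        fd = open (name, O_RDWR);
        if (-1 == fd)
                return -1; /* errno may be EBUSY, see Section 1.1.3 */

        if (-1 == fstat (fd, &st)
            || !S_ISCHR (st.st_mode)
            || V4L2_MAJOR != major (st.st_rdev)) {
                close (fd);
                errno = ENODEV; /* not a V4L2 device node */
                return -1;
        }

        return fd;
}
```

An application would call open_v4l2_device ("/dev/video"), program the device with ioctl(), and finally pass the descriptor to close().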


1.2. Querying Capabilities

Because V4L2 covers a wide variety of devices not all aspects of the API are equally applicable to all types of devices. Furthermore devices of the same type have different capabilities and this specification permits the omission of a few complicated and less important parts of the API.

The VIDIOC_QUERYCAP ioctl is available to check if the kernel device is compatible with this specification, and to query the functions and I/O methods supported by the device. Other features can be queried by calling the respective ioctl, for example VIDIOC_ENUMINPUT to learn about the number, types and names of video connectors on the device. Although abstraction is a major objective of this API, the ioctl also allows driver specific applications to reliably identify the driver.

All V4L2 drivers must support VIDIOC_QUERYCAP. Applications should always call this ioctl after opening the device.


1.3. Application Priority

When multiple applications share a device it may be desirable to assign them different priorities. Contrary to the traditional "rm -rf /" school of thought a video recording application could for example block other applications from changing video controls or switching the current TV channel. Another objective is to permit low priority applications to work in the background, where they can be preempted by user controlled applications and automatically regain control of the device at a later time.

Since these features cannot be implemented entirely in user space V4L2 defines the VIDIOC_G_PRIORITY and VIDIOC_S_PRIORITY ioctls to query and request the access priority associated with a file descriptor. Opening a device assigns a medium priority, compatible with earlier versions of V4L2 and drivers not supporting these ioctls. Applications requiring a different priority will usually call VIDIOC_S_PRIORITY after verifying the device with the VIDIOC_QUERYCAP ioctl.

Ioctls changing driver properties, such as VIDIOC_S_INPUT, return an EBUSY error code after another application obtained higher priority. An event mechanism to notify applications about asynchronous property changes has been proposed but not added yet.


1.4. Video Inputs and Outputs

Video inputs and outputs are physical connectors of a device. These can be for example RF connectors (antenna/cable), CVBS a.k.a. Composite Video, S-Video or RGB connectors. Only video and VBI capture devices have inputs, and only output devices have outputs; each such device has at least one. Radio devices have no video inputs or outputs.

To learn about the number and attributes of the available inputs and outputs applications can enumerate them with the VIDIOC_ENUMINPUT and VIDIOC_ENUMOUTPUT ioctls, respectively. The struct v4l2_input returned by the VIDIOC_ENUMINPUT ioctl also contains signal status information applicable when the current video input is queried.

The VIDIOC_G_INPUT and VIDIOC_G_OUTPUT ioctls return the index of the current video input or output. To select a different input or output applications call the VIDIOC_S_INPUT and VIDIOC_S_OUTPUT ioctls. Drivers must implement all the input ioctls when the device has one or more inputs, and all the output ioctls when the device has one or more outputs.

Example 1-1. Information about the current video input

struct v4l2_input input;
int index;

if (-1 == ioctl (fd, VIDIOC_G_INPUT, &index)) {
        perror ("VIDIOC_G_INPUT");
        exit (EXIT_FAILURE);
}

memset (&input, 0, sizeof (input));
input.index = index;

if (-1 == ioctl (fd, VIDIOC_ENUMINPUT, &input)) {
        perror ("VIDIOC_ENUMINPUT");
        exit (EXIT_FAILURE);
}

printf ("Current input: %s\n", input.name);
      

Example 1-2. Switching to the first video input

int index;

index = 0;

if (-1 == ioctl (fd, VIDIOC_S_INPUT, &index)) {
        perror ("VIDIOC_S_INPUT");
        exit (EXIT_FAILURE);
}
      

1.5. Audio Inputs and Outputs

Audio inputs and outputs are physical connectors of a device. Video capture devices have inputs, output devices have outputs, zero or more each. Radio devices have no audio inputs or outputs. They have exactly one tuner which in fact is an audio source, but this API associates tuners with video inputs or outputs only, and radio devices have none of these.[5] A connector on a TV card to loop back the received audio signal to a sound card is not considered an audio output.

Audio and video inputs and outputs are associated. Selecting a video source also selects an audio source. This is most evident when the video and audio source is a tuner. Further audio connectors can combine with more than one video input or output. Assuming two composite video inputs and two audio inputs exist, there may be up to four valid combinations. The relation of video and audio connectors is defined in the audioset field of the respective struct v4l2_input or struct v4l2_output, where each bit represents the index number, starting at zero, of one audio input or output.
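Interpreting the audioset bits can be sketched with two small helpers. Both audioset_contains() and audioset_count() are hypothetical names introduced for illustration; they operate on a plain 32 bit value as found in the audioset field.

```c
#include <stdint.h>

/* Returns 1 when the audio input or output with the given index may
   be combined with the video connector whose audioset is given,
   0 otherwise. Bit 0 stands for audio index 0, bit 1 for index 1
   and so on. */
static int
audioset_contains (uint32_t audioset, unsigned int audio_index)
{
        if (audio_index >= 32)
                return 0;

        return (int) ((audioset >> audio_index) & 1);
}

/* Counts the audio connectors available for one video connector. */
static unsigned int
audioset_count (uint32_t audioset)
{
        unsigned int n = 0;

        for (; audioset != 0; audioset >>= 1)
                n += audioset & 1;

        return n;
}
```

In the example from the text, two composite inputs each reporting an audioset of 0x3 allow two audio inputs apiece, four combinations in total.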

To learn about the number and attributes of the available inputs and outputs applications can enumerate them with the VIDIOC_ENUMAUDIO and VIDIOC_ENUMAUDOUT ioctls, respectively. The struct v4l2_audio returned by the VIDIOC_ENUMAUDIO ioctl also contains signal status information applicable when the current audio input is queried.

The VIDIOC_G_AUDIO and VIDIOC_G_AUDOUT ioctls report the current audio input and output, respectively. Note that, unlike VIDIOC_G_INPUT and VIDIOC_G_OUTPUT, these ioctls return a structure as VIDIOC_ENUMAUDIO and VIDIOC_ENUMAUDOUT do, not just an index.

To select an audio input and change its properties applications call the VIDIOC_S_AUDIO ioctl. To select an audio output (which presently has no changeable properties) applications call the VIDIOC_S_AUDOUT ioctl.

Drivers must implement all input ioctls when the device has one or more inputs, all output ioctls when the device has one or more outputs. When the device has any audio inputs or outputs the driver must set the V4L2_CAP_AUDIO flag in the struct v4l2_capability returned by the VIDIOC_QUERYCAP ioctl.

Example 1-3. Information about the current audio input

struct v4l2_audio audio;

memset (&audio, 0, sizeof (audio));

if (-1 == ioctl (fd, VIDIOC_G_AUDIO, &audio)) {
        perror ("VIDIOC_G_AUDIO");
        exit (EXIT_FAILURE);
}

printf ("Current input: %s\n", audio.name);
      

Example 1-4. Switching to the first audio input

struct v4l2_audio audio;

memset (&audio, 0, sizeof (audio)); /* clear audio.mode, audio.reserved */

audio.index = 0;

if (-1 == ioctl (fd, VIDIOC_S_AUDIO, &audio)) {
        perror ("VIDIOC_S_AUDIO");
        exit (EXIT_FAILURE);
}
      

1.6. Tuners and Modulators

1.6.1. Tuners

Video input devices can have one or more tuners demodulating a RF signal. Each tuner is associated with one or more video inputs, depending on the number of RF connectors on the tuner. The type field of the respective struct v4l2_input returned by the VIDIOC_ENUMINPUT ioctl is set to V4L2_INPUT_TYPE_TUNER and its tuner field contains the index number of the tuner.

Radio devices have exactly one tuner with index zero, no video inputs.

To query and change tuner properties applications use the VIDIOC_G_TUNER and VIDIOC_S_TUNER ioctls, respectively. The struct v4l2_tuner returned by VIDIOC_G_TUNER also contains signal status information applicable when the tuner of the current video input, or a radio tuner, is queried. Note that VIDIOC_S_TUNER does not switch the current tuner, even when there is more than one. The tuner is solely determined by the current video input. Drivers must support both ioctls and set the V4L2_CAP_TUNER flag in the struct v4l2_capability returned by the VIDIOC_QUERYCAP ioctl when the device has one or more tuners.


1.6.2. Modulators

Video output devices can have one or more modulators, uh, modulating a video signal for radiation or connection to the antenna input of a TV set or video recorder. Each modulator is associated with one or more video outputs, depending on the number of RF connectors on the modulator. The type field of the respective struct v4l2_output returned by the VIDIOC_ENUMOUTPUT ioctl is set to V4L2_OUTPUT_TYPE_MODULATOR and its modulator field contains the index number of the modulator. This specification does not define radio output devices.

To query and change modulator properties applications use the VIDIOC_G_MODULATOR and VIDIOC_S_MODULATOR ioctls. Note that VIDIOC_S_MODULATOR does not switch the current modulator, even when there is more than one. The modulator is solely determined by the current video output. Drivers must support both ioctls and set the V4L2_CAP_TUNER (sic) flag in the struct v4l2_capability returned by the VIDIOC_QUERYCAP ioctl when the device has one or more modulators.


1.6.3. Radio Frequency

To get and set the tuner or modulator radio frequency applications use the VIDIOC_G_FREQUENCY and VIDIOC_S_FREQUENCY ioctls, which both take a pointer to a struct v4l2_frequency. These ioctls are used for TV and radio devices alike. Drivers must support both ioctls when the tuner or modulator ioctls are supported, or when the device is a radio device.


1.6.4. Satellite Receivers

To be discussed. See also proposals by Peter Schlaf, video4linux-list@redhat.com on 23 Oct 2002, subject: "Re: [V4L] Re: v4l2 api".


1.7. Video Standards

Video devices typically support one or more different video standards or variations of standards. Each video input and output may support another set of standards. This set is reported by the std field of struct v4l2_input and struct v4l2_output returned by the VIDIOC_ENUMINPUT and VIDIOC_ENUMOUTPUT ioctl, respectively.

V4L2 defines one bit for each analog video standard currently in use worldwide, and sets aside bits for driver defined standards, e. g. hybrid standards to watch NTSC video tapes on PAL TVs and vice versa. Applications can use the predefined bits to select a particular standard, although presenting the user a menu of supported standards is preferred. To enumerate and query the attributes of the supported standards applications use the VIDIOC_ENUMSTD ioctl.

Many of the defined standards are actually just variations of a few major standards. The hardware may in fact not distinguish between them, or do so internally and switch automatically. Therefore enumerated standards also contain sets of one or more standard bits.

Assume a hypothetical tuner capable of demodulating B/PAL, G/PAL and I/PAL signals. The first enumerated standard is a set of B and G/PAL, switched automatically depending on the selected radio frequency in UHF or VHF band. Enumeration gives a "PAL-B/G" or "PAL-I" choice. Similarly a Composite input may collapse standards, enumerating "PAL-B/G/H/I", "NTSC-M" and "SECAM-D/K".[6]

To query and select the standard used by the current video input or output applications call the VIDIOC_G_STD and VIDIOC_S_STD ioctls, respectively. The received standard can be sensed with the VIDIOC_QUERYSTD ioctl. Note that the parameter of all these ioctls is a pointer to a v4l2_std_id type (a standard set), not an index into the standard enumeration.[7] Drivers must implement all video standard ioctls when the device has one or more video inputs or outputs.

Special rules apply to USB cameras, where the notion of video standards makes little sense. More generally they apply to any capture device (and correspondingly any output device)

  • which is incapable of capturing fields or frames at the nominal rate of the video standard, or

  • where timestamps refer to the instant the field or frame was received by the driver, not the capture time, or

  • where sequence numbers refer to the frames received by the driver, not the captured frames.

Here the driver shall set the std field of struct v4l2_input and struct v4l2_output to zero, and the VIDIOC_G_STD, VIDIOC_S_STD, VIDIOC_QUERYSTD and VIDIOC_ENUMSTD ioctls shall return the EINVAL error code.[8]

Example 1-5. Information about the current video standard

v4l2_std_id std_id;
struct v4l2_standard standard;

if (-1 == ioctl (fd, VIDIOC_G_STD, &std_id)) {
        /* Note when VIDIOC_ENUMSTD always returns EINVAL this
           is no video device or it falls under the USB exception,
           and VIDIOC_G_STD returning EINVAL is no error. */

        perror ("VIDIOC_G_STD");
        exit (EXIT_FAILURE);
}

memset (&standard, 0, sizeof (standard));
standard.index = 0;

while (0 == ioctl (fd, VIDIOC_ENUMSTD, &standard)) {
        if (standard.id & std_id) {
               printf ("Current video standard: %s\n", standard.name);
               exit (EXIT_SUCCESS);
        }

        standard.index++;
}

/* The loop ends only when VIDIOC_ENUMSTD fails: either with EINVAL
   at the end of the enumeration, meaning the current standard was
   not listed, or with another error. Both cases are failures. */

perror ("VIDIOC_ENUMSTD");
exit (EXIT_FAILURE);
      

Example 1-6. Listing the video standards supported by the current input

struct v4l2_input input;
struct v4l2_standard standard;

memset (&input, 0, sizeof (input));

if (-1 == ioctl (fd, VIDIOC_G_INPUT, &input.index)) {
        perror ("VIDIOC_G_INPUT");
        exit (EXIT_FAILURE);
}

if (-1 == ioctl (fd, VIDIOC_ENUMINPUT, &input)) {
        perror ("VIDIOC_ENUMINPUT");
        exit (EXIT_FAILURE);
}

printf ("Current input %s supports:\n", input.name);

memset (&standard, 0, sizeof (standard));
standard.index = 0;

while (0 == ioctl (fd, VIDIOC_ENUMSTD, &standard)) {
        if (standard.id & input.std)
                printf ("%s\n", standard.name);

        standard.index++;
}

/* EINVAL indicates the end of the enumeration, which cannot be
   empty unless this device falls under the USB exception. */

if (errno != EINVAL || standard.index == 0) {
        perror ("VIDIOC_ENUMSTD");
        exit (EXIT_FAILURE);
}
      

Example 1-7. Selecting a new video standard

struct v4l2_input input;
v4l2_std_id std_id;

memset (&input, 0, sizeof (input));

if (-1 == ioctl (fd, VIDIOC_G_INPUT, &input.index)) {
        perror ("VIDIOC_G_INPUT");
        exit (EXIT_FAILURE);
}

if (-1 == ioctl (fd, VIDIOC_ENUMINPUT, &input)) {
        perror ("VIDIOC_ENUMINPUT");
        exit (EXIT_FAILURE);
}

if (0 == (input.std & V4L2_STD_PAL_BG)) {
        fprintf (stderr, "Oops. B/G PAL is not supported.\n");
        exit (EXIT_FAILURE);
}

/* Note this is also supposed to work when only B
   or G/PAL is supported. */

std_id = V4L2_STD_PAL_BG;

if (-1 == ioctl (fd, VIDIOC_S_STD, &std_id)) {
        perror ("VIDIOC_S_STD");
        exit (EXIT_FAILURE);
}
      

1.8. Controls

Devices typically have a number of user-settable controls such as brightness, saturation and so on, which would be presented to the user on a graphical user interface. But, different devices will have different controls available, and furthermore, the range of possible values, and the default value will vary from device to device. The control ioctls provide the information and a mechanism to create a nice user interface for these controls that will work correctly with any device.

All controls are accessed using an ID value. V4L2 defines several IDs for specific purposes. Drivers can also implement their own custom controls using V4L2_CID_PRIVATE_BASE and higher values. The pre-defined control IDs have the prefix V4L2_CID_, and are listed in Table 1-1. The ID is used when querying the attributes of a control, and when getting or setting the current value.

Generally applications should present controls to the user without assumptions about their purpose. Each control comes with a name string the user is supposed to understand. When the purpose is non-intuitive the driver writer should provide a user manual, a user interface plug-in or a driver specific panel application. Predefined IDs were introduced to change a few controls programmatically, for example to mute a device during a channel switch.

Drivers may enumerate different controls after switching the current video input or output, tuner or modulator, or audio input or output. Different in the sense of other bounds, another default and current value, step size or other menu items. A control with a certain custom ID can also change name and type.[9] Control values are stored globally, they do not change when switching except to stay within the reported bounds. They also do not change e. g. when the device is opened or closed, when the tuner radio frequency is changed or generally never without application request. Since V4L2 specifies no event mechanism, panel applications intended to cooperate with other panel applications (be they built into a larger application, as a TV viewer) may need to regularly poll control values to update their user interface.[10]

Table 1-1. Control IDs

ID | Type | Description
V4L2_CID_BASE | | First predefined ID, equal to V4L2_CID_BRIGHTNESS.
V4L2_CID_BRIGHTNESS | integer | Picture brightness, or more precisely, the black level. Will not turn up the intelligence of the program you're watching.
V4L2_CID_CONTRAST | integer | Picture contrast or luma gain.
V4L2_CID_SATURATION | integer | Picture color saturation or chroma gain.
V4L2_CID_HUE | integer | Hue or color balance.
V4L2_CID_AUDIO_VOLUME | integer | Overall audio volume. Note some drivers also provide an OSS or ALSA mixer interface.
V4L2_CID_AUDIO_BALANCE | integer | Audio stereo balance. Minimum corresponds to all the way left, maximum to right.
V4L2_CID_AUDIO_BASS | integer | Audio bass adjustment.
V4L2_CID_AUDIO_TREBLE | integer | Audio treble adjustment.
V4L2_CID_AUDIO_MUTE | boolean | Mute audio, i. e. set the volume to zero, however without affecting V4L2_CID_AUDIO_VOLUME. Like ALSA drivers, V4L2 drivers must mute at load time to avoid excessive noise. Actually the entire device should be reset to a low power consumption state.
V4L2_CID_AUDIO_LOUDNESS | boolean | Loudness mode (bass boost).
V4L2_CID_BLACK_LEVEL | integer | Another name for brightness (not a synonym of V4L2_CID_BRIGHTNESS). [?]
V4L2_CID_AUTO_WHITE_BALANCE | boolean | Automatic white balance (cameras).
V4L2_CID_DO_WHITE_BALANCE | button | This is an action control. When set (the value is ignored), the device will do a white balance and then hold the current setting. Contrast this with the boolean V4L2_CID_AUTO_WHITE_BALANCE, which, when activated, keeps adjusting the white balance.
V4L2_CID_RED_BALANCE | integer | Red chroma balance.
V4L2_CID_BLUE_BALANCE | integer | Blue chroma balance.
V4L2_CID_GAMMA | integer | Gamma adjust.
V4L2_CID_WHITENESS | integer | Whiteness for grey-scale devices. This is a synonym for V4L2_CID_GAMMA.
V4L2_CID_EXPOSURE | integer | Exposure (cameras). [Unit?]
V4L2_CID_AUTOGAIN | boolean | Automatic gain/exposure control.
V4L2_CID_GAIN | integer | Gain control.
V4L2_CID_HFLIP | boolean | Mirror the picture horizontally.
V4L2_CID_VFLIP | boolean | Mirror the picture vertically.
V4L2_CID_HCENTER | integer | Horizontal image centering.
V4L2_CID_VCENTER | integer | Vertical image centering. Centering is intended to physically adjust cameras. For image cropping see Section 1.10, for clipping Section 4.2.
V4L2_CID_LASTP1 | | End of the predefined control IDs (currently V4L2_CID_VCENTER + 1).
V4L2_CID_PRIVATE_BASE | | ID of the first custom (driver specific) control. Applications depending on particular custom controls should check the driver name and version, see Section 1.2.

Applications can enumerate the available controls with the VIDIOC_QUERYCTRL and VIDIOC_QUERYMENU ioctls, get and set a control value with the VIDIOC_G_CTRL and VIDIOC_S_CTRL ioctls. Drivers must implement VIDIOC_QUERYCTRL, VIDIOC_G_CTRL and VIDIOC_S_CTRL when the device has one or more controls, VIDIOC_QUERYMENU when it has one or more menu type controls.

Example 1-8. Enumerating all controls

struct v4l2_queryctrl queryctrl;
struct v4l2_querymenu querymenu;

static void
enumerate_menu (void)
{
        printf ("  Menu items:\n");

        memset (&querymenu, 0, sizeof (querymenu));
        querymenu.id = queryctrl.id;

        for (querymenu.index = queryctrl.minimum;
             querymenu.index <= queryctrl.maximum;
             querymenu.index++) {
                if (0 == ioctl (fd, VIDIOC_QUERYMENU, &querymenu)) {
                        printf ("  %s\n", querymenu.name);
                } else {
                        perror ("VIDIOC_QUERYMENU");
                        exit (EXIT_FAILURE);
                }
        }
}

memset (&queryctrl, 0, sizeof (queryctrl));

for (queryctrl.id = V4L2_CID_BASE;
     queryctrl.id < V4L2_CID_LASTP1;
     queryctrl.id++) {
        if (0 == ioctl (fd, VIDIOC_QUERYCTRL, &queryctrl)) {
                if (queryctrl.flags & V4L2_CTRL_FLAG_DISABLED)
                        continue;

                printf ("Control %s\n", queryctrl.name);

                if (queryctrl.type == V4L2_CTRL_TYPE_MENU)
                        enumerate_menu ();
        } else {
                if (errno == EINVAL)
                        continue;

                perror ("VIDIOC_QUERYCTRL");
                exit (EXIT_FAILURE);
        }
}

for (queryctrl.id = V4L2_CID_PRIVATE_BASE;;
     queryctrl.id++) {
        if (0 == ioctl (fd, VIDIOC_QUERYCTRL, &queryctrl)) {
                if (queryctrl.flags & V4L2_CTRL_FLAG_DISABLED)
                        continue;

                printf ("Control %s\n", queryctrl.name);

                if (queryctrl.type == V4L2_CTRL_TYPE_MENU)
                        enumerate_menu ();
        } else {
                if (errno == EINVAL)
                        break;

                perror ("VIDIOC_QUERYCTRL");
                exit (EXIT_FAILURE);
        }
}
        

Example 1-9. Changing controls

struct v4l2_queryctrl queryctrl;
struct v4l2_control control;

memset (&queryctrl, 0, sizeof (queryctrl));
queryctrl.id = V4L2_CID_BRIGHTNESS;

if (-1 == ioctl (fd, VIDIOC_QUERYCTRL, &queryctrl)) {
        if (errno != EINVAL) {
                perror ("VIDIOC_QUERYCTRL");
                exit (EXIT_FAILURE);
        } else {
                printf ("V4L2_CID_BRIGHTNESS is not supported\n");
        }
} else if (queryctrl.flags & V4L2_CTRL_FLAG_DISABLED) {
        printf ("V4L2_CID_BRIGHTNESS is not supported\n");
} else {
        memset (&control, 0, sizeof (control));
        control.id = V4L2_CID_BRIGHTNESS;
        control.value = queryctrl.default_value;

        if (-1 == ioctl (fd, VIDIOC_S_CTRL, &control)) {
                perror ("VIDIOC_S_CTRL");
                exit (EXIT_FAILURE);
        }
}

memset (&control, 0, sizeof (control));
control.id = V4L2_CID_CONTRAST;

if (0 == ioctl (fd, VIDIOC_G_CTRL, &control)) {
        control.value += 1;

        /* The driver may clamp the value or return ERANGE, ignored here */

        if (-1 == ioctl (fd, VIDIOC_S_CTRL, &control)
            && errno != ERANGE) {
                perror ("VIDIOC_S_CTRL");
                exit (EXIT_FAILURE);
        }
/* Ignore if V4L2_CID_CONTRAST is unsupported */
} else if (errno != EINVAL) {
        perror ("VIDIOC_G_CTRL");
        exit (EXIT_FAILURE);
}

control.id = V4L2_CID_AUDIO_MUTE;
control.value = 1; /* TRUE: silence */

/* Errors ignored */
ioctl (fd, VIDIOC_S_CTRL, &control);
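
Because a driver may either clamp an out-of-range value or return ERANGE, an application may prefer to bring the value into range itself before calling VIDIOC_S_CTRL. The following is only a sketch of that idea and not part of the API: struct ctrl_bounds and clamp_control_value() are hypothetical names, and the bounds are assumed to have been copied from the minimum, maximum and step fields returned by VIDIOC_QUERYCTRL.

```c
#include <assert.h>

/* Hypothetical stand-in for the bounds reported by VIDIOC_QUERYCTRL
   (the minimum, maximum and step fields of struct v4l2_queryctrl). */
struct ctrl_bounds {
        int minimum;
        int maximum;
        int step;
};

/* Returns a value within [minimum;maximum] that lies on the step grid
   starting at minimum and is close to the requested value. */
static int
clamp_control_value (const struct ctrl_bounds *b, int value)
{
        int steps;

        assert (b->minimum <= b->maximum);

        if (value <= b->minimum)
                return b->minimum;
        if (value > b->maximum)
                value = b->maximum;
        if (b->step <= 1)
                return value;

        /* Round to the nearest step, counting from minimum. */
        steps = (value - b->minimum + b->step / 2) / b->step;
        value = b->minimum + steps * b->step;

        /* The maximum itself need not lie on the grid. */
        if (value > b->maximum)
                value -= b->step;

        return value;
}
```

The result would be stored in the value field of struct v4l2_control. The driver remains the final authority, so the error handling shown in Example 1-9 is still needed.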
        

1.9. Data Formats

1.9.1. Data Format Negotiation

Different devices exchange different kinds of data with applications, for example video images, raw or sliced VBI data, RDS datagrams. Even within one kind many different formats are possible, in particular an abundance of image formats. Although drivers must provide a default and the selection persists across closing and reopening a device, applications should always negotiate a data format before engaging in data exchange. Negotiation means the application asks for a particular format and the driver selects and reports the best the hardware can do to satisfy the request. Of course applications can also just query the current selection.

A single mechanism exists to negotiate all data formats using the aggregate struct v4l2_format and the VIDIOC_G_FMT and VIDIOC_S_FMT ioctls. Additionally the VIDIOC_TRY_FMT ioctl can be used to examine what the hardware could do, without actually selecting a new data format. The data formats supported by the V4L2 API are covered in the respective device section in Chapter 4. For a closer look at image formats see Chapter 2.

The VIDIOC_S_FMT ioctl is a major turning-point in the initialization sequence. Prior to this point multiple panel applications can access the same device concurrently to select the current input, change controls or modify other properties. The first VIDIOC_S_FMT assigns a logical stream (video data, VBI data etc.) exclusively to one file descriptor.

Exclusive means no other application, more precisely no other file descriptor, can grab this stream or change device properties inconsistent with the negotiated parameters. A video standard change for example, when the new standard uses a different number of scan lines, can invalidate the selected image format. Therefore only the file descriptor owning the stream can make invalidating changes. Accordingly multiple file descriptors which grabbed different logical streams prevent each other from interfering with their settings. When for example video overlay is about to start or already in progress, simultaneous video capturing may be restricted to the same cropping and image size.

When applications omit the VIDIOC_S_FMT ioctl its locking side effects are implied by the next step, the selection of an I/O method with the VIDIOC_REQBUFS ioctl or implicitly with the first read() or write() call.

Generally only one logical stream can be assigned to a file descriptor, the exception being drivers permitting simultaneous video capturing and overlay using the same file descriptor for compatibility with V4L and earlier versions of V4L2. Switching the logical stream or returning into "panel mode" is possible by closing and reopening the device. Drivers may support a switch using VIDIOC_S_FMT.

All drivers exchanging data with applications must support the VIDIOC_G_FMT and VIDIOC_S_FMT ioctl. Implementation of the VIDIOC_TRY_FMT is highly recommended but optional.


1.9.2. Image Format Enumeration

Apart from the generic format negotiation functions a special ioctl to enumerate all image formats supported by video capture, overlay or output devices is available.[11]

The VIDIOC_ENUM_FMT ioctl must be supported by all drivers exchanging image data with applications.

Important: Drivers are not supposed to convert image formats in kernel space. They must enumerate only formats directly supported by the hardware. If necessary driver writers should publish an example conversion routine or library for integration into applications.


1.10. Cropping and Scaling

Some video capture devices can take a subsection of the complete picture and shrink or enlarge it to an image of arbitrary size. We call these abilities cropping and scaling. Although not quite correct, "cropping" shall also refer to the inverse process: output devices showing an image in only a region of the picture, and/or scaled from a source image of different size.

To crop and scale this API defines a source and target rectangle. On a video capture and overlay device the source is the received video picture, the target is the captured or overlaid image. On a video output device the source is the image passed by the application and the target is the generated video picture. The remainder of this section refers only to video capture drivers, the definitions apply to output drivers accordingly.

Figure 1-1. Cropping and Scaling

It is assumed the driver can capture a subsection of the picture within an arbitrary capture window. Its bounds are defined by struct v4l2_cropcap, giving the coordinates of the top, left corner and width and height of the window in pixels. Origin and units of the coordinate system in the analog domain are arbitrarily chosen by the driver writer.[12]

The source rectangle is defined by struct v4l2_crop, giving the coordinates of its top, left corner, width and height using the same coordinate system as struct v4l2_cropcap. The source rectangle must lie completely within the capture window. Further each driver defines a default source rectangle. The center of this rectangle shall align with the center of the active picture area of the video signal, and cover what the driver writer considers the complete picture. The source rectangle is set to the default when the driver is first loaded, but not later.

The target rectangle is given either by the width and height fields of struct v4l2_pix_format or the width and height fields of the struct v4l2_rect w substructure of struct v4l2_window.

In principle cropping and scaling always happens. When the device supports scaling but not cropping, applications will be unable to change the cropping rectangle. It remains at the defaults all the time. When the device supports cropping but not scaling, changing the image size will also affect the cropping size in order to maintain a constant scaling factor. The position of the cropping rectangle is only adjusted to move the rectangle completely inside the capture window.

When cropping and scaling is supported applications can change both the source and target rectangle. Various hardware limitations must be expected, for example discrete scaling factors, different scaling abilities in horizontal and vertical direction, limitations of the image size or the cropping alignment. Therefore as usual drivers adjust the requested parameters against hardware capabilities and return the actual values selected. An important difference, because two rectangles are defined, is that the last rectangle changed shall take priority, and the driver may also adjust the opposite rectangle.

Suppose scaling is restricted to a factor 1:1 or 2:1 in either direction and the image size must be a multiple of 16 × 16 pixels. Let the cropping rectangle be set to the upper limit, 640 × 400 pixels at offset 0, 0. Let a video capture application request an image size of 300 × 225 pixels, assuming video will be scaled down from the "full picture" accordingly. The driver will set the image size to the closest possible values 304 × 224, then choose the cropping rectangle closest to the requested size, that is 608 × 224 (224 × 2:1 would exceed the limit 400). The offset 0, 0 is still valid, thus unmodified. Given the default cropping rectangle reported by VIDIOC_CROPCAP the application can easily propose another offset to center the cropping rectangle. Now the application may insist on covering an area using an aspect closer to the original request. Sheepishly it asks for a cropping rectangle of 608 × 456 pixels. The present scaling factors limit cropping to 640 × 384, so the driver returns the cropping size 608 × 384 and accordingly adjusts the image size to 304 × 192.
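
The arithmetic of this example can be replayed in a few lines of C. The sketch below models one dimension of the hypothetical device described above; it is not how any particular driver works. struct axis, set_image_size() and set_crop_size() are invented names, and the rounding rule and the way the scaling factor is chosen are assumptions picked so that the numbers of the text come out.

```c
#include <assert.h>
#include <stdlib.h>

#define IMAGE_ALIGN 16          /* image size granularity assumed in the text */

/* One dimension (horizontal or vertical) of the hypothetical device. */
struct axis {
        int limit;              /* upper limit of the cropping size */
        int crop;               /* current cropping size */
        int image;              /* current image size */
};

static int
round_to_multiple (int value, int unit)
{
        int n = (value + unit / 2) / unit;

        return (n < 1 ? 1 : n) * unit;
}

/* The image size was changed last, so it takes priority: fix the image
   size, then pick the permitted cropping size (factor 1:1 or 2:1)
   closest to the current one. */
static void
set_image_size (struct axis *a, int requested)
{
        int best;

        assert (a->limit >= IMAGE_ALIGN);

        a->image = round_to_multiple (requested, IMAGE_ALIGN);
        if (a->image > a->limit)
                a->image = a->limit / IMAGE_ALIGN * IMAGE_ALIGN;

        best = a->image;                        /* 1:1 always fits */
        if (a->image * 2 <= a->limit
            && abs (a->image * 2 - a->crop) < abs (best - a->crop))
                best = a->image * 2;            /* 2:1 is closer */

        a->crop = best;
}

/* The cropping size was changed last: choose the factor closest to
   requested : image, round the cropping size to what that factor
   permits, then derive the image size from it. */
static void
set_crop_size (struct axis *a, int requested)
{
        int factor = (abs (requested - a->image * 2)
                      < abs (requested - a->image)) ? 2 : 1;
        int unit = IMAGE_ALIGN * factor;
        int crop = round_to_multiple (requested, unit);
        int max = a->limit / unit * unit;

        a->crop = crop > max ? max : crop;
        a->image = a->crop / factor;
}
```

Starting with a cropping size of 640 × 400, calling set_image_size() with 300 and 225 and then set_crop_size() with 608 and 456 on the horizontal and vertical axis respectively walks through the two steps of the example.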

Eventually some crop or scale parameters are locked, for example when the driver supports simultaneous video capturing and overlay, another application already started overlay and the cropping parameters cannot be changed anymore. Also VIDIOC_TRY_FMT cannot change the cropping rectangle. In these cases the driver has to approach the closest values possible without adjusting the opposite rectangle.

The struct v4l2_cropcap, which also reports the pixel aspect ratio, can be obtained with the VIDIOC_CROPCAP ioctl. To get or set the current cropping rectangle applications call the VIDIOC_G_CROP or VIDIOC_S_CROP ioctl, respectively. All video capture and output devices must support the VIDIOC_CROPCAP ioctl. The VIDIOC_G_CROP and VIDIOC_S_CROP ioctls must be supported only when the cropping rectangle can be changed.

Note as usual the cropping parameters remain unchanged across closing and reopening a device. Applications should ensure the parameters are suitable before starting I/O.

Example 1-10. Resetting the cropping parameters

(A video capture device is assumed.)

struct v4l2_cropcap cropcap;
struct v4l2_crop crop;

memset (&cropcap, 0, sizeof (cropcap));
cropcap.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

if (-1 == ioctl (fd, VIDIOC_CROPCAP, &cropcap)) {
        perror ("VIDIOC_CROPCAP");
        exit (EXIT_FAILURE);
}

memset (&crop, 0, sizeof (crop));
crop.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
crop.c = cropcap.defrect; 

/* Ignore if cropping is not supported (EINVAL) */

if (-1 == ioctl (fd, VIDIOC_S_CROP, &crop)
    && errno != EINVAL) {
        perror ("VIDIOC_S_CROP");
        exit (EXIT_FAILURE);
}
      

Example 1-11. Simple downscaling

(A video capture device is assumed.)

struct v4l2_cropcap cropcap;
struct v4l2_format format;

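reset_cropping_parameters ();   /* assumed to wrap Example 1-10 */

/* Query the default rectangle used below */

memset (&cropcap, 0, sizeof (cropcap));
cropcap.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

if (-1 == ioctl (fd, VIDIOC_CROPCAP, &cropcap)) {
        perror ("VIDIOC_CROPCAP");
        exit (EXIT_FAILURE);
}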

/* Scale down to 1/4 size of full picture */

memset (&format, 0, sizeof (format)); /* defaults */

format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

format.fmt.pix.width = cropcap.defrect.width >> 1;
format.fmt.pix.height = cropcap.defrect.height >> 1;
format.fmt.pix.pixelformat = V4L2_PIX_FMT_YUYV;

if (-1 == ioctl (fd, VIDIOC_S_FMT, &format)) {
        perror ("VIDIOC_S_FMT");
        exit (EXIT_FAILURE);
}

/* We could check now what we got, the exact scaling factor
   or if the driver can scale at all. At mere 2:1 the cropping
   rectangle was probably not changed. */
        

Example 1-12. Current scaling factor and pixel aspect

(A video capture device is assumed.)

struct v4l2_cropcap cropcap;
struct v4l2_crop crop;
struct v4l2_format format;
double hscale, vscale;
double aspect;
int dwidth, dheight;

memset (&cropcap, 0, sizeof (cropcap));
cropcap.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

if (-1 == ioctl (fd, VIDIOC_CROPCAP, &cropcap)) {
        perror ("VIDIOC_CROPCAP");
        exit (EXIT_FAILURE);
}

memset (&crop, 0, sizeof (crop));
crop.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

if (-1 == ioctl (fd, VIDIOC_G_CROP, &crop)) {
        if (errno != EINVAL) {
                perror ("VIDIOC_G_CROP");
                exit (EXIT_FAILURE);
        }

        /* Cropping not supported */
        crop.c = cropcap.defrect;
}

memset (&format, 0, sizeof (format));
format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

if (-1 == ioctl (fd, VIDIOC_G_FMT, &format)) {
        perror ("VIDIOC_G_FMT");
        exit (EXIT_FAILURE);
}

hscale = format.fmt.pix.width / (double) crop.c.width;
vscale = format.fmt.pix.height / (double) crop.c.height;

aspect = cropcap.pixelaspect.numerator /
         (double) cropcap.pixelaspect.denominator;
aspect = aspect * hscale / vscale;

/* Aspect corrected display size */

dwidth = format.fmt.pix.width / aspect;
dheight = format.fmt.pix.height;
        

1.11. Streaming Parameters

Streaming parameters are intended to optimize the video capture process as well as I/O. Presently applications can request a high quality capture mode with the VIDIOC_S_PARM ioctl.

The current video standard determines a nominal number of frames per second. If less than this number of frames is to be captured or output, applications can request frame skipping or duplicating on the driver side. This is especially useful when using the read() or write() functions, which are not augmented by timestamps or sequence counters, and to avoid unnecessary data copying.

Finally these ioctls can be used to determine the number of buffers used internally by a driver in read/write mode. For implications see the section discussing the read() function.

To get and set the streaming parameters applications call the VIDIOC_G_PARM and VIDIOC_S_PARM ioctl, respectively. They take a pointer to a struct v4l2_streamparm, which contains a union holding separate parameters for input and output devices.

These ioctls are optional, drivers need not implement them. If they are not implemented, they return the EINVAL error code.


Chapter 2. Image Formats

The V4L2 API was primarily designed for devices exchanging image data with applications. The v4l2_pix_format structure defines the format and layout of an image in memory. Image formats are negotiated with the VIDIOC_S_FMT ioctl. (The explanations here focus on video capturing and output, for overlay frame buffer formats see also VIDIOC_G_FBUF.)

Table 2-1. struct v4l2_pix_format

__u32 | width | Image width in pixels.
__u32 | height | Image height in pixels.
Applications set these fields to request an image size, drivers return the closest possible values. In case of planar formats the width and height applies to the largest plane. To avoid ambiguities drivers must return values rounded up to a multiple of the scale factor of any smaller planes. For example when the image format is YUV 4:2:0, width and height must be multiples of two.
__u32 | pixelformat | The pixel format or type of compression, set by the application. This is a little endian four character code. V4L2 defines standard RGB formats in Table 2-3, YUV formats in Section 2.4, and reserved codes in Table 2-4.
enum v4l2_field | field | Video images are typically interlaced. Applications can request to capture or output only the top or bottom field, or both fields interlaced or sequentially stored in one buffer or alternating in separate buffers. Drivers return the actual field order selected. For details see Section 3.6.
__u32 | bytesperline | Distance in bytes between the leftmost pixels in two adjacent lines.

Both applications and drivers can set this field to request padding bytes at the end of each line. Drivers however may ignore the value requested by the application, returning width times bytes per pixel or a larger value required by the hardware. That implies applications can just set this field to zero to get a reasonable default.

Video hardware may access padding bytes, therefore they must reside in accessible memory. Consider cases where padding bytes after the last line of an image cross a system page boundary. Input devices may write padding bytes, the value is undefined. Output devices ignore the contents of padding bytes.

When the image format is planar the bytesperline value applies to the largest plane and is divided by the same factor as the width field for any smaller planes. For example the Cb and Cr planes of a YUV 4:2:0 image have half as many padding bytes following each line as the Y plane. To avoid ambiguities drivers must return a bytesperline value rounded up to a multiple of the scale factor.

__u32 | sizeimage | Size in bytes of the buffer to hold a complete image, set by the driver. Usually this is bytesperline times height. When the image consists of variable length compressed data this is the maximum number of bytes required to hold an image.
enum v4l2_colorspace | colorspace | This information supplements the pixelformat and must be set by the driver, see Section 2.2.
__u32 | priv | Reserved for custom (driver defined) additional information about formats. When not used drivers and applications must set this field to zero.
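
The rule for planar formats (bytesperline applies to the largest plane and is divided by the width factor for smaller planes) can be turned into plane offsets. The sketch below does this for a YUV 4:2:0 image, assuming the planes follow each other without gaps. struct yuv420_layout and compute_yuv420_layout() are hypothetical names; applications should still trust the sizeimage value returned by the driver over any size computed this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the planes of a YUV 4:2:0 image, relative to the
   start of the buffer. The Y plane starts at offset 0. */
struct yuv420_layout {
        size_t chroma_bytesperline;     /* bytesperline of both chroma planes */
        size_t chroma1_offset;          /* first chroma plane */
        size_t chroma2_offset;          /* second chroma plane */
        size_t total;                   /* bytes occupied by all three planes */
};

static struct yuv420_layout
compute_yuv420_layout (uint32_t bytesperline, uint32_t height)
{
        struct yuv420_layout l;

        /* Drivers return values rounded up to the scale factor 2. */
        assert (bytesperline % 2 == 0 && height % 2 == 0);

        /* The chroma planes have half the width and half the height. */
        l.chroma_bytesperline = bytesperline / 2;
        l.chroma1_offset = (size_t) bytesperline * height;
        l.chroma2_offset = l.chroma1_offset
                + l.chroma_bytesperline * (height / 2);
        l.total = l.chroma2_offset
                + l.chroma_bytesperline * (height / 2);

        return l;
}
```

For a 640 × 480 image without padding (bytesperline 640) the three planes occupy 640 × 480 × 3 / 2 bytes in total.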

2.1. Standard Image Formats

In order to exchange images between drivers and applications, it is necessary to have standard image data formats which both sides will interpret the same way. V4L2 includes several such formats, and this section is intended to be an unambiguous specification of the standard image data formats in V4L2.

V4L2 drivers are not limited to these formats, however. Driver-specific formats are possible. In that case the application may depend on a codec to convert images to one of the standard formats when needed. But the data can still be stored and retrieved in the proprietary format. For example, a device may support a proprietary compressed format. Applications can still capture and save the data in the compressed format, saving much disk space, and later use a codec to convert the images to the X Window System screen format when the video is to be displayed.

Even so, ultimately, some standard formats are needed, so the V4L2 specification would not be complete without well-defined standard formats.

The V4L2 standard formats are mainly uncompressed formats. The pixels are always arranged in memory from left to right, and from top to bottom. The first byte of data in the image buffer is always for the leftmost pixel of the topmost row. Following that is the pixel immediately to its right, and so on until the end of the top row of pixels. Following the rightmost pixel of the row there may be zero or more bytes of padding to guarantee that each row of pixel data has a certain alignment. Following the pad bytes, if any, is data for the leftmost pixel of the second row from the top, and so on. The last row has just as many pad bytes after it as the other rows.

In V4L2 each format has an identifier which looks like V4L2_PIX_FMT_XXX, defined in the videodev2.h header file. These identifiers represent four character codes which are also listed below, however they are not the same as those used in the Windows world.
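
How four characters map to the 32 bit pixelformat value is easiest to see in code. The videodev2.h header provides the v4l2_fourcc() macro for this purpose; the sketch below re-implements the same little endian construction as a function so that it compiles without kernel headers. The names fourcc() and fourcc_to_string() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The first character ends up in the least significant byte. */
static uint32_t
fourcc (char a, char b, char c, char d)
{
        return  (uint32_t) (unsigned char) a
             | ((uint32_t) (unsigned char) b << 8)
             | ((uint32_t) (unsigned char) c << 16)
             | ((uint32_t) (unsigned char) d << 24);
}

/* Writes the four characters and a terminating NUL into name[5],
   for example to print a pixelformat returned by the driver. */
static void
fourcc_to_string (uint32_t code, char name[5])
{
        int i;

        assert (name != NULL);

        for (i = 0; i < 4; i++)
                name[i] = (char) ((code >> (8 * i)) & 0xff);

        name[4] = '\0';
}
```

With this construction the code 'YUYV' of V4L2_PIX_FMT_YUYV corresponds to the value 0x56595559.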


2.2. Colorspaces

[intro]

Gamma Correction

[to do]

E'R = f(R)

E'G = f(G)

E'B = f(B)

Construction of luminance and color-difference signals

[to do]

E'Y = CoeffR E'R + CoeffG E'G + CoeffB E'B

(E'R - E'Y) = E'R - CoeffR E'R - CoeffG E'G - CoeffB E'B

(E'B - E'Y) = E'B - CoeffR E'R - CoeffG E'G - CoeffB E'B

Re-normalized color-difference signals

The color-difference signals are scaled back to unity range [-0.5;+0.5]:

KB = 0.5 / (1 - CoeffB)

KR = 0.5 / (1 - CoeffR)

PB = KB (E'B - E'Y) = -0.5 (CoeffR / (1 - CoeffB)) E'R - 0.5 (CoeffG / (1 - CoeffB)) E'G + 0.5 E'B

PR = KR (E'R - E'Y) = 0.5 E'R - 0.5 (CoeffG / (1 - CoeffR)) E'G - 0.5 (CoeffB / (1 - CoeffR)) E'B

Quantization

[to do]

Y' = (Lum. Levels - 1) · E'Y + Lum. Offset

CB = (Chrom. Levels - 1) · PB + Chrom. Offset

CR = (Chrom. Levels - 1) · PR + Chrom. Offset

Rounding to the nearest integer and clamping to the range [0;255] finally yields the digital color components Y'CbCr stored in YUV images.
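
The steps above, from luminance construction to quantization, can be sketched as one function parameterized by the luminance coefficients. This is an illustration under stated assumptions, not a reference implementation: rgb_to_ycbcr() and struct ycbcr are hypothetical names, CoeffG is assumed to equal 1 - CoeffR - CoeffB, and the quantization uses 220 luminance levels with offset 16 and 225 chrominance levels with offset 128, as in Rec. 601.

```c
#include <assert.h>

struct ycbcr {
        int y, cb, cr;          /* quantized Y'CbCr, range [0;255] */
};

static int
round_and_clamp (double x)
{
        if (x < 0.0)
                return 0;
        if (x > 255.0)
                return 255;

        return (int) (x + 0.5); /* round to nearest, x is not negative */
}

/* er, eg, eb: gamma corrected components E'R, E'G, E'B in range [0;1]. */
static struct ycbcr
rgb_to_ycbcr (double coeff_r, double coeff_b,
              double er, double eg, double eb)
{
        double coeff_g = 1.0 - coeff_r - coeff_b;       /* assumption */
        double ey = coeff_r * er + coeff_g * eg + coeff_b * eb;
        double pb = 0.5 / (1.0 - coeff_b) * (eb - ey);  /* KB (E'B - E'Y) */
        double pr = 0.5 / (1.0 - coeff_r) * (er - ey);  /* KR (E'R - E'Y) */
        struct ycbcr out;

        assert (coeff_r > 0.0 && coeff_g > 0.0 && coeff_b > 0.0);

        out.y  = round_and_clamp (219.0 * ey + 16.0);
        out.cb = round_and_clamp (224.0 * pb + 128.0);
        out.cr = round_and_clamp (224.0 * pr + 128.0);

        return out;
}
```

Called with the Rec. 601 coefficients 0.299 and 0.114, white maps to Y' = 235 with both chroma components at 128, and black to Y' = 16.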

Example 2-1. ITU-R Rec. BT.601 color conversion

Forward Transformation

int ER, EG, EB;         /* gamma corrected RGB input [0;255] */
int Y1, Cb, Cr;         /* output [0;255] */

double r, g, b;         /* temporaries */
double y1, pb, pr;

int
clamp (double x)
{
        int r = (int) (x + 0.5);  /* round to nearest (x < 0 clamps to 0) */

        if (r < 0)         return 0;
        else if (r > 255)  return 255;
        else               return r;
}

r = ER / 255.0;
g = EG / 255.0;
b = EB / 255.0;

y1  =  0.299  * r + 0.587 * g + 0.114  * b;
pb  = -0.169  * r - 0.331 * g + 0.5    * b;
pr  =  0.5    * r - 0.419 * g - 0.081  * b;

Y1 = clamp (219 * y1 + 16);
Cb = clamp (224 * pb + 128);
Cr = clamp (224 * pr + 128);

/* or shorter */

y1 = 0.299 * ER + 0.587 * EG + 0.114 * EB;

Y1 = clamp ( (219 / 255.0)                    *       y1  + 16);
Cb = clamp (((224 / 255.0) / (2 - 2 * 0.114)) * (EB - y1) + 128);
Cr = clamp (((224 / 255.0) / (2 - 2 * 0.299)) * (ER - y1) + 128);
      

Inverse Transformation

int Y1, Cb, Cr;         /* gamma pre-corrected input [0;255] */
int ER, EG, EB;         /* output [0;255] */

double r, g, b;         /* temporaries */
double y1, pb, pr;

int
clamp (double x)
{
        int r = (int) (x + 0.5);  /* round to nearest (x < 0 clamps to 0) */

        if (r < 0)         return 0;
        else if (r > 255)  return 255;
        else               return r;
}

y1 = (Y1 - 16) / 219.0;         /* back to range [0;1] */
pb = (Cb - 128) / 224.0;        /* back to range [-0.5;+0.5] */
pr = (Cr - 128) / 224.0;

r = 1.0 * y1 + 0     * pb + 1.402 * pr;
g = 1.0 * y1 - 0.344 * pb - 0.714 * pr;
b = 1.0 * y1 + 1.772 * pb + 0     * pr;

ER = clamp (r * 255); /* [ok? one should prob. limit y1,pb,pr] */
EG = clamp (g * 255);
EB = clamp (b * 255);
      

Table 2-2. enum v4l2_colorspace

Identifier | Value | Description | Chromaticities[a]: Red | Green | Blue | White Point | Gamma Correction | Luminance E'Y | Quantization: Y' | Cb, Cr
V4L2_COLORSPACE_SMPTE170M | 1 | NTSC/PAL according to SMPTE170M, ITU601 | x = 0.630, y = 0.340 | x = 0.310, y = 0.595 | x = 0.155, y = 0.070 | x = 0.3127, y = 0.3290, Illuminant D65 | E' = 4.5 I for I ≤ 0.018, 1.099 I^0.45 - 0.099 for 0.018 < I | 0.299 E'R + 0.587 E'G + 0.114 E'B | 219 E'Y + 16 | 224 PB,R + 128
V4L2_COLORSPACE_SMPTE240M | 2 | 1125-Line (US) HDTV, see SMPTE240M | x = 0.630, y = 0.340 | x = 0.310, y = 0.595 | x = 0.155, y = 0.070 | x = 0.3127, y = 0.3290, Illuminant D65 | E' = 4 I for I ≤ 0.0228, 1.1115 I^0.45 - 0.1115 for 0.0228 < I | 0.212 E'R + 0.701 E'G + 0.087 E'B | 219 E'Y + 16 | 224 PB,R + 128
V4L2_COLORSPACE_REC709 | 3 | HDTV and modern devices, see ITU709 | x = 0.640, y = 0.330 | x = 0.300, y = 0.600 | x = 0.150, y = 0.060 | x = 0.3127, y = 0.3290, Illuminant D65 | E' = 4.5 I for I ≤ 0.018, 1.099 I^0.45 - 0.099 for 0.018 < I | 0.2125 E'R + 0.7154 E'G + 0.0721 E'B | 219 E'Y + 16 | 224 PB,R + 128
V4L2_COLORSPACE_BT878 | 4 | Broken Bt878 extents[b], ITU601 | ? | ? | ? | ? | ? | 0.299 E'R + 0.587 E'G + 0.114 E'B | 237 E'Y + 16 | 224 PB,R + 128 (probably)
V4L2_COLORSPACE_470_SYSTEM_M | 5 | M/NTSC[c] according to ITU470, ITU601 | x = 0.67, y = 0.33 | x = 0.21, y = 0.71 | x = 0.14, y = 0.08 | x = 0.310, y = 0.316, Illuminant C | ? | 0.299 E'R + 0.587 E'G + 0.114 E'B | 219 E'Y + 16 | 224 PB,R + 128
V4L2_COLORSPACE_470_SYSTEM_BG | 6 | 625-line PAL and SECAM systems according to ITU470, ITU601 | x = 0.64, y = 0.33 | x = 0.29, y = 0.60 | x = 0.15, y = 0.06 | x = 0.313, y = 0.329, Illuminant D65 | ? | 0.299 E'R + 0.587 E'G + 0.114 E'B | 219 E'Y + 16 | 224 PB,R + 128
V4L2_COLORSPACE_JPEG | 7 | JPEG Y'CbCr, see JFIF, ITU601 | ? | ? | ? | ? | ? | 0.299 E'R + 0.587 E'G + 0.114 E'B | 256 E'Y + 16[d] | 256 PB,R + 128
V4L2_COLORSPACE_SRGB | 8 | [?] | x = 0.640, y = 0.330 | x = 0.300, y = 0.600 | x = 0.150, y = 0.060 | x = 0.3127, y = 0.3290, Illuminant D65 | E' = 4.5 I for I ≤ 0.018, 1.099 I^0.45 - 0.099 for 0.018 < I | n/a
Notes:
a. The coordinates of the color primaries are given in the CIE system (1931)
b. The ubiquitous Bt878 video capture chip quantizes E'Y to 238 levels, yielding a range of Y' = 16 … 253, unlike Rec. 601 Y' = 16 … 235. This is not a typo in the Bt878 documentation, it has been implemented in silicon. The chroma extents are unclear.
c. No identifier exists for M/PAL which uses the chromaticities of M/NTSC, the remaining parameters are equal to B and G/PAL.
d. Note JFIF quantizes Y'PBPR in range [0;+1] and [-0.5;+0.5] to 257 levels, however Y'CbCr signals are still clamped to [0;255].

2.3. RGB Formats

These formats are designed to match the pixel formats of typical PC graphics frame buffers. They occupy 8, 16, 24 or 32 bits per pixel. These are all packed-pixel formats, meaning all the data for a pixel lie next to each other in memory.

When one of these formats is used, drivers shall report the colorspace V4L2_COLORSPACE_SRGB.

Table 2-3. Packed RGB Image Formats

Identifier | Code | Byte 0 | Byte 1 | Byte 2 | Byte 3
Bit | | 7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0
V4L2_PIX_FMT_RGB332 | 'RGB1' | b1 b0 g2 g1 g0 r2 r1 r0 | | |
V4L2_PIX_FMT_RGB555 | 'RGBO' | g2 g1 g0 r4 r3 r2 r1 r0 | ? b4 b3 b2 b1 b0 g4 g3 | |
V4L2_PIX_FMT_RGB565 | 'RGBP' | g2 g1 g0 r4 r3 r2 r1 r0 | b4 b3 b2 b1 b0 g5 g4 g3 | |
V4L2_PIX_FMT_RGB555X | 'RGBQ' | ? b4 b3 b2 b1 b0 g4 g3 | g2 g1 g0 r4 r3 r2 r1 r0 | |
V4L2_PIX_FMT_RGB565X | 'RGBR' | b4 b3 b2 b1 b0 g5 g4 g3 | g2 g1 g0 r4 r3 r2 r1 r0 | |
V4L2_PIX_FMT_BGR24 | 'BGR3' | b7 b6 b5 b4 b3 b2 b1 b0 | g7 g6 g5 g4 g3 g2 g1 g0 | r7 r6 r5 r4 r3 r2 r1 r0 |
V4L2_PIX_FMT_RGB24 | 'RGB3' | r7 r6 r5 r4 r3 r2 r1 r0 | g7 g6 g5 g4 g3 g2 g1 g0 | b7 b6 b5 b4 b3 b2 b1 b0 |
V4L2_PIX_FMT_BGR32 | 'BGR4' | b7 b6 b5 b4 b3 b2 b1 b0 | g7 g6 g5 g4 g3 g2 g1 g0 | r7 r6 r5 r4 r3 r2 r1 r0 | ? ? ? ? ? ? ? ?
V4L2_PIX_FMT_RGB32 | 'RGB4' | r7 r6 r5 r4 r3 r2 r1 r0 | g7 g6 g5 g4 g3 g2 g1 g0 | b7 b6 b5 b4 b3 b2 b1 b0 | ? ? ? ? ? ? ? ?

Bit 7 is the most significant bit. ? = undefined bit, ignored on output, random value on input.
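
The bit layouts in Table 2-3 translate directly into shifts and masks. The following sketch unpacks V4L2_PIX_FMT_RGB565 and V4L2_PIX_FMT_RGB332 pixels exactly as the table defines them. The struct and function names are hypothetical, and a driver may use a different component order, as discussed in this section.

```c
#include <stdint.h>

struct rgb { uint8_t r, g, b; };   /* hypothetical helper type */

/* V4L2_PIX_FMT_RGB565 as defined in Table 2-3:
 * byte 0 = g2 g1 g0 r4 r3 r2 r1 r0
 * byte 1 = b4 b3 b2 b1 b0 g5 g4 g3 */
static struct rgb unpack_rgb565(const uint8_t *p)
{
    struct rgb c;

    c.r = p[0] & 0x1f;                                    /* 5 bits */
    c.g = (uint8_t)(((p[1] & 0x07) << 3) | (p[0] >> 5));  /* 6 bits */
    c.b = p[1] >> 3;                                      /* 5 bits */
    return c;
}

/* V4L2_PIX_FMT_RGB332 as defined in Table 2-3:
 * byte 0 = b1 b0 g2 g1 g0 r2 r1 r0 */
static struct rgb unpack_rgb332(const uint8_t *p)
{
    struct rgb c;

    c.r = p[0] & 0x07;          /* 3 bits */
    c.g = (p[0] >> 3) & 0x07;   /* 3 bits */
    c.b = p[0] >> 6;            /* 2 bits */
    return c;
}
```

The returned components keep their native bit depth; scaling them to 8 bits is left to the caller.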

Example 2-2. V4L2_PIX_FMT_BGR24 4 × 4 pixel image

Byte Order. Each cell is one byte.

start + 0:  B00 G00 R00 B01 G01 R01 B02 G02 R02 B03 G03 R03
start + 12: B10 G10 R10 B11 G11 R11 B12 G12 R12 B13 G13 R13
start + 24: B20 G20 R20 B21 G21 R21 B22 G22 R22 B23 G23 R23
start + 36: B30 G30 R30 B31 G31 R31 B32 G32 R32 B33 G33 R33
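
The byte order above implies a simple address calculation. The sketch below, with a hypothetical function name, locates the three bytes of a pixel in a V4L2_PIX_FMT_BGR24 image. It assumes the caller knows the row stride in bytes (the bytesperline field of struct v4l2_pix_format), which is 12 in the unpadded 4 × 4 example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns a pointer to the B, G, R bytes (in that
 * order) of pixel (x, y). bytesperline is the distance in bytes between
 * the starts of two consecutive rows. */
static const uint8_t *bgr24_pixel(const uint8_t *start, size_t bytesperline,
                                  size_t x, size_t y)
{
    return start + y * bytesperline + x * 3;
}
```

For the example image, pixel (x = 1, y = 2) is found at start + 27, where B21 is stored.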

Important: Drivers may interpret these formats differently.

The V4L2_PIX_FMT_RGB555, V4L2_PIX_FMT_RGB565, V4L2_PIX_FMT_RGB555X and V4L2_PIX_FMT_RGB565X formats are uncommon. Video and display hardware typically supports variants with the order of color components reversed, i.e. blue towards the least significant bit and red towards the most significant bit. Although the original authors presumably had the common formats in mind, the definitions were always very clear and cannot simply be regarded as erroneous.

If V4L2_PIX_FMT_RGB332 was defined by analogy with the 15 and 16 bit formats, it too might be interpreted differently, as "rrrgggbb" rather than "bbgggrrr".

Finally, some drivers, most prominently the BTTV driver, might interpret V4L2_PIX_FMT_RGB32 as the big-endian variant of V4L2_PIX_FMT_BGR32, consisting of bytes "?RGB" in memory. V4L2 never defined such a format. The lack of an X suffix on the symbol suggests it was intended this way, and a new symbol and four-character code should have been used instead.

Until these issues are solved, application writers are advised that drivers might interpret these formats either way.
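
An application that has to cope with both readings of V4L2_PIX_FMT_RGB32 could isolate the difference in a single accessor. This is only a sketch: the names are hypothetical, and how the application decides which layout a driver produces (for example a user setting) is outside the scope of this specification.

```c
#include <stdint.h>

struct rgb8 { uint8_t r, g, b; };  /* hypothetical helper type */

/* Read one V4L2_PIX_FMT_RGB32 pixel (4 bytes).
 * pad_first == 0: bytes are R G B ?, as defined in Table 2-3.
 * pad_first != 0: bytes are ? R G B, the layout some drivers
 *                 might produce instead. */
static struct rgb8 read_rgb32(const uint8_t *p, int pad_first)
{
    const uint8_t *c = pad_first ? p + 1 : p;  /* skip a leading pad byte */
    struct rgb8 out = { c[0], c[1], c[2] };

    return out;
}
```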


2.4. YUV Formats

Table of Contents
V4L2_PIX_FMT_GREY ('GREY') -- Grey-scale image.
V4L2_PIX_FMT_YUYV ('YUYV') -- Packed format with ½ horizontal chroma resolution, also known as YUV 4:2:2.
V4L2_PIX_FMT_UYVY ('UYVY') -- Variation of V4L2_PIX_FMT_YUYV with different order of samples in memory.
V4L2_PIX_FMT_Y41P ('Y41P') -- Packed format with ¼ horizontal chroma resolution, also known as YUV 4:1:1.
V4L2_PIX_FMT_YVU420 ('YV12'), V4L2_PIX_FMT_YUV420 ('YU12') -- Planar formats with ½ horizontal and vertical chroma resolution, also known as YUV 4:2:0.
V4L2_PIX_FMT_YVU410 ('YVU9'), V4L2_PIX_FMT_YUV410 ('YUV9') -- Planar formats with ¼ horizontal and vertical chroma resolution, also known as YUV 4:1:0.
V4L2_PIX_FMT_YUV422P ('422P') -- Format with ½ horizontal chroma resolution, also known as YUV 4:2:2. Planar layout as opposed to V4L2_PIX_FMT_YUYV.
V4L2_PIX_FMT_YUV411P ('411P') -- Format with ¼ horizontal chroma resolution, also known as YUV 4:1:1. Planar layout as opposed to V4L2_PIX_FMT_Y41P.
V4L2_PIX_FMT_NV12 ('NV12'), V4L2_PIX_FMT_NV21 ('NV21') -- Formats with ½ horizontal and vertical chroma resolution, also known as YUV 4:2:0. One luminance and one chrominance plane with alternating chroma samples as opposed to V4L2_PIX_FMT_YVU420.

YUV is the format native to TV broadcast and composite video signals. It separates the brightness information (Y) from the color information (U and V, or Cb and Cr). The color information consists of red and blue color difference signals; the green component can then be reconstructed by subtracting the red and blue contributions from the brightness component. See Section 2.2 for conversion examples. YUV was chosen because early television transmitted only brightness information. To add color in a way compatible with existing receivers, a new signal carrier was added to transmit the color difference signals. In addition, in YUV formats the U and V components usually have lower resolution than the Y component. This is an analog video compression technique that takes advantage of a property of the human visual system, which is more sensitive to brightness information than to color information.
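
The reconstruction of the green component mentioned above can be sketched as follows. The example assumes the ITU601 luminance weights from the colorspace table (0.299, 0.587, 0.114), gamma-corrected values in the range 0 to 1, and unscaled color differences B' - Y' and R' - Y'. The function name is hypothetical, and no quantization is applied.

```c
/* ITU601 luminance weights: E'Y = KR E'R + KG E'G + KB E'B */
#define KR 0.299
#define KG 0.587
#define KB 0.114

/* Hypothetical helper: recovers R', G', B' from luma and the two
 * unscaled color difference signals. */
static void color_diff_to_rgb(double y, double b_minus_y, double r_minus_y,
                              double *r, double *g, double *b)
{
    *r = y + r_minus_y;                      /* undo R' - Y' */
    *b = y + b_minus_y;                      /* undo B' - Y' */
    *g = (y - KR * *r - KB * *b) / KG;       /* remove red and blue from luma */
}
```

Green needs no signal of its own because it is fully determined by the luma and the other two components.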

V4L2_PIX_FMT_GREY ('GREY')

Name

V4L2_PIX_FMT_GREY -- Grey-scale image.

Description

This is a grey-scale image. It is really a degenerate Y'CbCr format which simply contains no Cb or Cr data.

Example 2-1. V4L2_PIX_FMT_GREY 4 × 4 pixel image

Byte Order. Each cell is one byte.

start + 0:  Y'00 Y'01 Y'02 Y'03
start + 4:  Y'10 Y'11 Y'12 Y'13
start + 8:  Y'20 Y'21 Y'22 Y'23
start + 12: Y'30 Y'31 Y'32 Y'33

V4L2_PIX_FMT_YUYV ('YUYV')

Name

V4L2_PIX_FMT_YUYV -- Packed format with ½ horizontal chroma resolution, also known as YUV 4:2:2.

Description

In this format each group of four bytes represents two pixels and contains two Y samples, one Cb and one Cr. Each Y belongs to one of the pixels, while the Cb and Cr are shared by both pixels. The Cb and Cr components therefore have half the horizontal resolution of the Y component. V4L2_PIX_FMT_YUYV is known in the Windows environment as YUY2.
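
The sharing of chroma samples can be sketched in code. The example below assumes the byte order Y0, Cb, Y1, Cr within each four-byte group, as the format name suggests. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct ycbcr { uint8_t y, cb, cr; };  /* hypothetical helper type */

/* Fetch pixel x from one row of V4L2_PIX_FMT_YUYV data.
 * Pixels 2n and 2n+1 live in the same 4-byte group and share Cb and Cr. */
static struct ycbcr yuyv_pixel(const uint8_t *row, size_t x)
{
    const uint8_t *group = row + (x / 2) * 4;
    struct ycbcr p;

    p.y  = group[(x % 2) * 2];   /* byte 0 for even x, byte 2 for odd x */
    p.cb = group[1];
    p.cr = group[3];
    return p;
}
```

Two neighboring pixels such as x = 0 and x = 1 return different Y values but identical Cb and Cr.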

Example 2-1. V4L2_PIX_FMT_YUYV 4 × 4 pixel image

Byte Order. Each cell is one byte.