V4L2 Video Image Format Specification

Video for Linux Two - Image Data Formats

Bill Dirks - June 26, 2003

In order to exchange images between drivers and applications, it is necessary to have standard image data formats which both sides will interpret the same way. V4L2 includes several such formats, and this document is intended to be an unambiguous specification for the standard image data formats in V4L2.

V4L2 drivers are not limited to these formats, however. Driver-specific formats are possible. In that case the application may depend on a codec driver to convert images to one of the standard formats when needed, but the data can still be stored and retrieved in the proprietary format. For example, a device may support a proprietary compressed format. Applications can still capture and save the data in the compressed format, saving much disk space, and later use a codec device driver to convert the images to the X Windows screen format when the video is to be displayed.

Even so, ultimately, some standard formats are needed, so the V4L2 specification would not be complete without well-defined standard formats.

About the V4L2 Standard Formats

The V4L2 standard formats are all uncompressed formats.

The pixels are always arranged in memory from left to right, and from top to bottom. The first byte of data in the image buffer is always for the leftmost pixel of the topmost row. Following that is the pixel immediately to its right, and so on until the end of the top row of pixels. Following the rightmost pixel of the row there may be zero or more bytes of padding to guarantee that each row of pixel data has a certain alignment. Following the pad bytes, if any, is data for the leftmost pixel of the second row from the top, and so on. The last row has just as many pad bytes after it as the other rows.
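For formats that use a whole number of bytes per pixel, the row layout described above reduces to a single offset computation. The following is a minimal sketch; the function name pixel_offset and the parameter name stride (the length of one row in bytes, pad bytes included) are illustrative and are not taken from the V4L2 headers.

```c
#include <stddef.h>

/* Byte offset of pixel (x, y) from the start of the image buffer.
 * stride is the length of one row in bytes, pad bytes included
 * (hypothetical parameter name). bytes_per_pixel is 1, 2, 3 or 4. */
static size_t pixel_offset(size_t x, size_t y,
                           size_t bytes_per_pixel, size_t stride)
{
    return y * stride + x * bytes_per_pixel;
}
```

With three bytes per pixel and 16-byte rows, for example, pixel (3, 0) starts at byte 9 and pixel (0, 1) starts at byte 16.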

The formats fall into two broad categories, the RGB formats and YUV formats. The YUV formats all use the YCbCr color space used in the ITU-R601 and ITU-R656 digital video standards. There is more information about the YCbCr color space later in this document.

In V4L2, each format has an identifier of the form V4L2_PIX_FMT_XXX, defined in videodev.h.

The rest of this document describes each standard format.

In order to make the specifications endianness independent, the following diagrams show the order of the data in memory on a byte by byte basis. Each cell of the diagrams is one byte. The bytes are arranged in memory from left to right, top to bottom. Possible pad bytes after each row are not shown.

The RGB Formats

These formats are designed to match the pixel formats of typical PC graphics frame buffers. Formats are defined at 8, 16, 24, and 32 bits per pixel (bpp). These are all packed-pixel formats, meaning all the data for a pixel are next to each other in memory.

V4L2_PIX_FMT_RGB555,
V4L2_PIX_FMT_RGB565

A 4x4 image. Each cell is one byte.
p00	q00	p01	q01	p02	q02	p03	q03
p10	q10	p11	q11	p12	q12	p13	q13
p20	q20	p21	q21	p22	q22	p23	q23
p30	q30	p31	q31	p32	q32	p33	q33

Each pixel is two bytes, denoted here as p and q. For RGB 5-5-5, each pair of bytes contains five bits of red, five bits of green, five bits of blue, and one extra bit. The value of the extra bit is undefined. For RGB 5-6-5 there are six green bits and no extra bits. The RGB bits are arranged in p and q like this:

RGB 5-5-5
bit	(MSB) 7	6	5	4	3	2	1	0 (LSB)
p	G2	G1	G0	R4	R3	R2	R1	R0
q	?	B4	B3	B2	B1	B0	G4	G3

RGB 5-6-5
bit	7	6	5	4	3	2	1	0
p	G2	G1	G0	R4	R3	R2	R1	R0
q	B4	B3	B2	B1	B0	G5	G4	G3
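The bit tables above can be turned into shift-and-mask code. The sketch below follows those tables literally; the function names are hypothetical, and the results are the raw 5-bit or 6-bit component values, not values scaled to 0...255.

```c
#include <stdint.h>

/* RGB 5-6-5: p holds G2..G0 and R4..R0, q holds B4..B0 and G5..G3. */
static void unpack_rgb565(uint8_t p, uint8_t q,
                          uint8_t *r, uint8_t *g, uint8_t *b)
{
    *r = p & 0x1F;
    *g = (uint8_t)(((q & 0x07) << 3) | (p >> 5));
    *b = q >> 3;
}

/* RGB 5-5-5: bit 7 of q is undefined, so it is masked off. */
static void unpack_rgb555(uint8_t p, uint8_t q,
                          uint8_t *r, uint8_t *g, uint8_t *b)
{
    *r = p & 0x1F;
    *g = (uint8_t)(((q & 0x03) << 3) | (p >> 5));
    *b = (q >> 2) & 0x1F;
}
```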

V4L2_PIX_FMT_RGB555X,
V4L2_PIX_FMT_RGB565X

A 4x4 image. Each cell is one byte.
p00	q00	p01	q01	p02	q02	p03	q03
p10	q10	p11	q11	p12	q12	p13	q13
p20	q20	p21	q21	p22	q22	p23	q23
p30	q30	p31	q31	p32	q32	p33	q33

RGB 5-5-5
bit	(MSB) 7	6	5	4	3	2	1	0 (LSB)
p	?	B4	B3	B2	B1	B0	G4	G3
q	G2	G1	G0	R4	R3	R2	R1	R0

RGB 5-6-5
bit	7	6	5	4	3	2	1	0
p	B4	B3	B2	B1	B0	G5	G4	G3
q	G2	G1	G0	R4	R3	R2	R1	R0
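Compared with the non-X formats, the tables above simply swap the contents of p and q. One way to read such a pixel is to join the two bytes into a 16-bit word with p as the high byte and then extract the fields. This is a sketch for RGB 5-6-5 only, with a hypothetical function name; it returns the raw 5-bit and 6-bit values.

```c
#include <stdint.h>

/* RGB565X: p holds B4..B0 and G5..G3, q holds G2..G0 and R4..R0.
 * Treating p as the high byte gives the word B[15:11] G[10:5] R[4:0]. */
static void unpack_rgb565x(uint8_t p, uint8_t q,
                           uint8_t *r, uint8_t *g, uint8_t *b)
{
    unsigned word = ((unsigned)p << 8) | q;

    *r = (uint8_t)(word & 0x1F);
    *g = (uint8_t)((word >> 5) & 0x3F);
    *b = (uint8_t)((word >> 11) & 0x1F);
}
```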

V4L2_PIX_FMT_BGR24

A 4x4 image. Each cell is one byte.
B00	G00	R00	B01	G01	R01	B02	G02	R02	B03	G03	R03
B10	G10	R10	B11	G11	R11	B12	G12	R12	B13	G13	R13
B20	G20	R20	B21	G21	R21	B22	G22	R22	B23	G23	R23
B30	G30	R30	B31	G31	R31	B32	G32	R32	B33	G33	R33

Each pixel is three bytes. B is first, then G then R.

V4L2_PIX_FMT_RGB24

A 4x4 image. Each cell is one byte.
R00	G00	B00	R01	G01	B01	R02	G02	B02	R03	G03	B03
R10	G10	B10	R11	G11	B11	R12	G12	B12	R13	G13	B13
R20	G20	B20	R21	G21	B21	R22	G22	B22	R23	G23	B23
R30	G30	B30	R31	G31	B31	R32	G32	B32	R33	G33	B33

Each pixel is three bytes. R is first, then G then B.

V4L2_PIX_FMT_BGR32

A 4x4 image. Each cell is one byte.
B00	G00	R00	?	B01	G01	R01	?	B02	G02	R02	?	B03	G03	R03	?
B10	G10	R10	?	B11	G11	R11	?	B12	G12	R12	?	B13	G13	R13	?
B20	G20	R20	?	B21	G21	R21	?	B22	G22	R22	?	B23	G23	R23	?
B30	G30	R30	?	B31	G31	R31	?	B32	G32	R32	?	B33	G33	R33	?

Each pixel is four bytes. B is first, then G then R, then an extra byte. The value of the extra byte is undefined.

V4L2_PIX_FMT_RGB32

A 4x4 image. Each cell is one byte.
R00	G00	B00	?	R01	G01	B01	?	R02	G02	B02	?	R03	G03	B03	?
R10	G10	B10	?	R11	G11	B11	?	R12	G12	B12	?	R13	G13	B13	?
R20	G20	B20	?	R21	G21	B21	?	R22	G22	B22	?	R23	G23	B23	?
R30	G30	B30	?	R31	G31	B31	?	R32	G32	B32	?	R33	G33	B33	?

Each pixel is four bytes. R is first, then G then B, then an extra byte. The value of the extra byte is undefined.
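Because the 24-bit and 32-bit formats differ only in component order and in the presence of the extra byte, converting between them is a matter of copying bytes. As an illustration, this sketch converts one row of BGR32 pixels to RGB24; the function name is hypothetical, and the caller is assumed to step through rows using whatever row padding each buffer has.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one row of `width` BGR32 pixels to RGB24.
 * The fourth byte of each source pixel is undefined and is skipped. */
static void bgr32_row_to_rgb24(const uint8_t *src, uint8_t *dst,
                               size_t width)
{
    for (size_t x = 0; x < width; x++) {
        dst[3 * x + 0] = src[4 * x + 2]; /* R */
        dst[3 * x + 1] = src[4 * x + 1]; /* G */
        dst[3 * x + 2] = src[4 * x + 0]; /* B */
    }
}
```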

V4L2_PIX_FMT_RGB332

A 4x4 image. Each cell is one byte.
p00	p01	p02	p03
p10	p11	p12	p13
p20	p21	p22	p23
p30	p31	p32	p33

Each pixel is one byte. This format is intended for use with 8-bit colormap displays. Each byte contains three bits of red, three bits of green, and two bits of blue. The RGB bits are arranged in the bytes like this:

bit	7	6	5	4	3	2	1	0
p	B1	B0	G2	G1	G0	R2	R1	R0
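Read off the table above, the three components can be extracted as follows. This is a small sketch with a hypothetical function name; it returns the raw 3-bit and 2-bit values.

```c
#include <stdint.h>

/* RGB 3-3-2: bits 7..6 are blue, bits 5..3 are green, bits 2..0 are red. */
static void unpack_rgb332(uint8_t p, uint8_t *r, uint8_t *g, uint8_t *b)
{
    *r = p & 0x07;
    *g = (p >> 3) & 0x07;
    *b = p >> 6;
}
```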

YUV Formats

These formats are designed to be compatible with devices that use ITU-R601 or ITU-R656 digital video internally. They use the YCbCr color space. YCbCr is a modified YUV format. In YCbCr, Y ranges from 16 (corresponding to 0.0) to 235 (corresponding to 1.0, or full brightness). Cb and Cr range from 16 (corresponding to -0.5) to 240 (corresponding to +0.5), with 128 corresponding to 0.0. To convert from YCbCr to RGB, where R, G and B should range from 0 to 255, use the following transforms:

Y = (255/219)(Y - 16)

U = (127/112)(Cb - 128)

V = (127/112)(Cr - 128)

That gives a Y as 0...255, and U and V as -127...+127. Convert to RGB:

R = Y + 1.402V

G = Y - 0.344U - 0.714V

B = Y + 1.772U

If you are writing a color space conversion routine take note: Due to image filtering, brightness controls, and other common video operations, it is normal that YCbCr values can go out of range. It is also normal for the computed R, G, or B values to be below 0 or above 255, even if YCbCr were in their legal range. It is necessary for a conversion algorithm to clamp all the result values to their legal range.
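The two steps above, followed by the clamping this paragraph calls for, might be sketched as below. Floating point is used for clarity rather than speed, and the function names are hypothetical.

```c
#include <stdint.h>

/* Round to nearest and clamp to the legal 0...255 range. */
static uint8_t clamp_u8(double v)
{
    if (v < 0.0)
        return 0;
    if (v > 255.0)
        return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert one YCbCr sample to 8-bit RGB using the transforms above. */
static void ycbcr_to_rgb(uint8_t y8, uint8_t cb, uint8_t cr,
                         uint8_t *r, uint8_t *g, uint8_t *b)
{
    double y = (255.0 / 219.0) * (y8 - 16);
    double u = (127.0 / 112.0) * (cb - 128);
    double v = (127.0 / 112.0) * (cr - 128);

    *r = clamp_u8(y + 1.402 * v);
    *g = clamp_u8(y - 0.344 * u - 0.714 * v);
    *b = clamp_u8(y + 1.772 * u);
}
```

An out-of-range input such as Y = 0 or Y = 255 with neutral chroma simply clamps to black or white.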

The inverse transform to convert RGB into YCbCr can be derived (how are your linear algebra skills?) and is as follows:

Y = 0.2990R + 0.5870G + 0.1140B

U = -0.1687R - 0.3313G + 0.5000B

V = 0.5000R - 0.4187G - 0.0813B

This gives Y as 0...255, and U and V as -127...+127. Then convert to YCbCr ranges:

Y = (219/255)Y + 16

Cb = (112/127)U + 128

Cr = (112/127)V + 128
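A floating-point sketch of this inverse conversion is given below, using luma weights 0.299, 0.587 and 0.114, which sum to 1. The function names are hypothetical, and results are rounded to nearest and clamped to 0...255.

```c
#include <stdint.h>

/* Round to nearest and clamp to 0...255. */
static uint8_t to_u8(double v)
{
    if (v < 0.0)
        return 0;
    if (v > 255.0)
        return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert one 8-bit RGB pixel to YCbCr. */
static void rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b,
                         uint8_t *y8, uint8_t *cb, uint8_t *cr)
{
    double y =  0.2990 * r + 0.5870 * g + 0.1140 * b;
    double u = -0.1687 * r - 0.3313 * g + 0.5000 * b;
    double v =  0.5000 * r - 0.4187 * g - 0.0813 * b;

    *y8 = to_u8((219.0 / 255.0) * y + 16.0);
    *cb = to_u8((112.0 / 127.0) * u + 128.0);
    *cr = to_u8((112.0 / 127.0) * v + 128.0);
}
```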

The purpose of using this color space is to separate the brightness information (Y) from the color information (U and V or Cb and Cr). It is a property of the human visual system that brightness information is more important, and color information can be partially discarded with little loss of perceptual quality. Therefore the YUV formats always use fewer Cb's and Cr's than Y's. There is always one Y per pixel. The YUV formats differ by how much color information is discarded, and by how the Y's, Cb's and Cr's are arranged in memory.

V4L2_PIX_FMT_YUYV,
V4L2_PIX_FMT_UYVY,
V4L2_PIX_FMT_VYUY,
V4L2_PIX_FMT_YVYU

A 4x4 YUYV image. Each cell is one byte.
Y00	Cb00	Y01	Cr00	Y02	Cb02	Y03	Cr02
Y10	Cb10	Y11	Cr10	Y12	Cb12	Y13	Cr12
Y20	Cb20	Y21	Cr20	Y22	Cb22	Y23	Cr22
Y30	Cb30	Y31	Cr30	Y32	Cb32	Y33	Cr32

In these formats every four bytes hold two pixels: two Y's, a Cb and a Cr. Each Y belongs to one of the pixels, and the Cb and Cr belong to both pixels. As you can see, the Cr and Cb components have half the horizontal resolution of the Y component. The other three formats hold the same data in a different byte order: V4L2_PIX_FMT_UYVY is Cb-Y-Cr-Y, V4L2_PIX_FMT_VYUY is Cr-Y-Cb-Y, and V4L2_PIX_FMT_YVYU is Y-Cr-Y-Cb. V4L2_PIX_FMT_YUYV is known in the Windows environment as YUY2.
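To make the sharing of chroma concrete, here is a sketch that fetches the three components for one pixel of a V4L2_PIX_FMT_YUYV row. The function name is hypothetical; the other byte orders would only change the indices within each four-byte group.

```c
#include <stddef.h>
#include <stdint.h>

/* Fetch Y, Cb and Cr for pixel x of a YUYV row.
 * Each 4-byte group Y-Cb-Y-Cr covers pixels 2k and 2k+1. */
static void yuyv_get(const uint8_t *row, size_t x,
                     uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    const uint8_t *group = row + (x / 2) * 4;

    *y  = group[(x % 2) * 2]; /* byte 0 or byte 2 */
    *cb = group[1];           /* shared by both pixels */
    *cr = group[3];           /* shared by both pixels */
}
```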

V4L2_PIX_FMT_Y41P

An 8x4 image. Each cell is one byte.
Cb00	Y00	Cr00	Y01	Cb04	Y02	Cr04	Y03	Y04	Y05	Y06	Y07
Cb10	Y10	Cr10	Y11	Cb14	Y12	Cr14	Y13	Y14	Y15	Y16	Y17
Cb20	Y20	Cr20	Y21	Cb24	Y22	Cr24	Y23	Y24	Y25	Y26	Y27
Cb30	Y30	Cr30	Y31	Cb34	Y32	Cr34	Y33	Y34	Y35	Y36	Y37

In this format each 12 bytes is eight pixels. In the twelve bytes are two CbCr pairs and eight Y's. The first CbCr pair goes with the first four Y's, and the second CbCr pair goes with the other four Y's. The Cb and Cr components have one fourth the horizontal resolution of the Y component.
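The byte positions within a 12-byte group are less regular than in the 4:2:2 formats, so an index sketch may help. It follows the diagram above (Cb, Y, Cr, Y, Cb, Y, Cr, Y, then four Y's); the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fetch Y, Cb and Cr for pixel x of a Y41P row.
 * Each 12-byte group covers eight pixels. */
static void y41p_get(const uint8_t *row, size_t x,
                     uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    const uint8_t *group = row + (x / 8) * 12;
    size_t i = x % 8;

    if (i < 4) {
        *y  = group[2 * i + 1]; /* Y0..Y3 at bytes 1, 3, 5, 7 */
        *cb = group[0];
        *cr = group[2];
    } else {
        *y  = group[i + 4];     /* Y4..Y7 at bytes 8..11 */
        *cb = group[4];
        *cr = group[6];
    }
}
```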

V4L2_PIX_FMT_YVU420,
V4L2_PIX_FMT_YUV420

A 4x4 image. Each cell is one byte.
Y00	Y01	Y02	Y03
Y10	Y11	Y12	Y13
Y20	Y21	Y22	Y23
Y30	Y31	Y32	Y33

Cr00	Cr02
Cr20	Cr22

Cb00	Cb02
Cb20	Cb22

These are planar formats, as opposed to a packed format. The three components are separated into three sub-images or planes. The Y plane is first. The Y plane has one byte per pixel. For V4L2_PIX_FMT_YVU420, the Cr plane immediately follows the Y plane in memory. The Cr plane is half the width and half the height of the Y plane (and of the image). Each Cr belongs to four pixels, a two-by-two square of the image. For example, Cr00 belongs to Y00, Y01, Y10, and Y11. Following the Cr plane is the Cb plane, just like the Cr plane. V4L2_PIX_FMT_YUV420 is the same except the Cb plane comes first, then the Cr plane.

If the Y plane has pad bytes after each row, then the Cr and Cb planes have half as many pad bytes after their rows. In other words, two Cx rows (including padding) are exactly as long as one Y row (including padding).
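Given the padding rule above, the position of any chroma sample follows from the Y row length alone. The sketch below assumes an even Y row length and an even image height; the function and parameter names are hypothetical. Plane 0 is whichever chroma plane directly follows Y (Cr for YVU420, Cb for YUV420), and plane 1 is the other.

```c
#include <stddef.h>

/* Byte offset of the chroma sample covering pixel (x, y).
 * y_stride is the Y row length in bytes, padding included. */
static size_t yuv420_chroma_offset(size_t x, size_t y, size_t y_stride,
                                   size_t height, int plane)
{
    size_t c_stride = y_stride / 2;           /* half a Y row */
    size_t c_size = c_stride * (height / 2);  /* one chroma plane */
    size_t base = y_stride * height + (size_t)plane * c_size;

    return base + (y / 2) * c_stride + x / 2;
}
```

For the unpadded 4x4 image in the diagram, the first chroma plane starts at byte 16 and the second at byte 20.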

V4L2_PIX_FMT_YVU410,
V4L2_PIX_FMT_YUV410

A 4x4 image. Each cell is one byte.
Y00	Y01	Y02	Y03
Y10	Y11	Y12	Y13
Y20	Y21	Y22	Y23
Y30	Y31	Y32	Y33

Cr00

Cb00

These are planar formats, as opposed to a packed format. The three components are separated into three sub-images or planes. The Y plane is first. The Y plane has one byte per pixel. For V4L2_PIX_FMT_YVU410, the Cr plane immediately follows the Y plane in memory. The Cr plane is ¼ the width and ¼ the height of the Y plane (and of the image). Each Cr belongs to 16 pixels, a four-by-four square of the image. Following the Cr plane is the Cb plane, just like the Cr plane. V4L2_PIX_FMT_YUV410 is the same, except the Cb plane comes first, then the Cr plane.

If the Y plane has pad bytes after each row, then the Cr and Cb planes have ¼ as many pad bytes after their rows. In other words, four Cx rows (including padding) are exactly as long as one Y row (including padding).
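The same rule gives the plane positions and the total buffer size for the 4:1:0 formats. This sketch assumes the Y row length and the image height are both multiples of four; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Offsets of the two chroma planes and the total size, in bytes. */
struct yuv410_layout {
    size_t first_chroma;   /* Cr for YVU410, Cb for YUV410 */
    size_t second_chroma;  /* Cb for YVU410, Cr for YUV410 */
    size_t total;
};

/* y_stride is the Y row length in bytes, padding included. */
static struct yuv410_layout yuv410_get_layout(size_t y_stride, size_t height)
{
    struct yuv410_layout l;
    size_t c_size = (y_stride / 4) * (height / 4);

    l.first_chroma = y_stride * height;
    l.second_chroma = l.first_chroma + c_size;
    l.total = l.second_chroma + c_size;
    return l;
}
```

For the unpadded 4x4 image in the diagram this gives offsets 16 and 17 and a total of 18 bytes.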

V4L2_PIX_FMT_YUV422P

A 4x4 image. Each cell is one byte.
Y00	Y01	Y02	Y03
Y10	Y11	Y12	Y13
Y20	Y21	Y22	Y23
Y30	Y31	Y32	Y33

Cb00	Cb02
Cb10	Cb12
Cb20	Cb22
Cb30	Cb32

Cr00	Cr02
Cr10	Cr12
Cr20	Cr22
Cr30	Cr32

This format is not commonly used. This is a planar version of the YUYV format. The three components are separated into three sub-images or planes. The Y plane is first. The Y plane has one byte per pixel. The Cb plane immediately follows the Y plane in memory. The Cb plane is half the width of the Y plane (and of the image). Each Cb belongs to two pixels. For example, Cb00 belongs to Y00, Y01. Following the Cb plane is the Cr plane, just like the Cb plane.
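Since this format carries the same samples as YUYV, one row can be repacked by interleaving the three planes. The sketch below assumes an even width; the function name is hypothetical, and the caller is assumed to pass pointers to the start of the same row in each plane.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave one row of planar 4:2:2 data into packed YUYV.
 * dst must have room for 2 * width bytes. */
static void yuv422p_row_to_yuyv(const uint8_t *y, const uint8_t *cb,
                                const uint8_t *cr, uint8_t *dst,
                                size_t width)
{
    for (size_t x = 0; x + 1 < width; x += 2) {
        dst[2 * x + 0] = y[x];
        dst[2 * x + 1] = cb[x / 2];
        dst[2 * x + 2] = y[x + 1];
        dst[2 * x + 3] = cr[x / 2];
    }
}
```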

V4L2_PIX_FMT_YUV411P

A 4x4 image. Each cell is one byte.
Y00	Y01	Y02	Y03
Y10	Y11	Y12	Y13
Y20	Y21	Y22	Y23
Y30	Y31	Y32	Y33

Cb00
Cb10
Cb20
Cb30

Cr00
Cr10
Cr20
Cr30

This format is not commonly used. This is a planar format similar to the 422 planar format except with half as many chroma samples. The three components are separated into three sub-images or planes. The Y plane is first. The Y plane has one byte per pixel. The Cb plane immediately follows the Y plane in memory. The Cb plane is ¼ the width of the Y plane (and of the image). Each Cb belongs to 4 pixels all on the same row. For example, Cb00 belongs to Y00, Y01, Y02 and Y03. Following the Cb plane is the Cr plane, just like the Cb plane.
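A chroma lookup for this layout can be sketched as follows. The text does not state a padding rule for this format, so the sketch assumes each chroma row is one quarter the length of a Y row, and that the Y row length is a multiple of four. The function and parameter names are hypothetical; plane 0 is Cb and plane 1 is Cr.

```c
#include <stddef.h>

/* Byte offset of the chroma sample covering pixel (x, y).
 * Assumption: chroma rows are y_stride / 4 bytes long. */
static size_t yuv411p_chroma_offset(size_t x, size_t y, size_t y_stride,
                                    size_t height, int plane)
{
    size_t c_stride = y_stride / 4;
    size_t c_size = c_stride * height;  /* chroma planes are full height */
    size_t base = y_stride * height + (size_t)plane * c_size;

    return base + y * c_stride + x / 4;
}
```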

V4L2_PIX_FMT_NV12

A 4x4 image. Each cell is one byte.
Y00	Y01	Y02	Y03
Y10	Y11	Y12	Y13
Y20	Y21	Y22	Y23
Y30	Y31	Y32	Y33

Cb00	Cr00	Cb02	Cr02
Cb20	Cr20	Cb22	Cr22

This is a two-plane version of the YUV420 format. The three components are separated into two sub-images or planes. The Y plane is first. The Y plane has one byte per pixel. Immediately following that in memory is a combined CbCr plane. The CbCr plane is the same width, in bytes, as the Y plane, but is half as tall. Each CbCr pair belongs to four pixels. For example, Cb00/Cr00 belongs to Y00, Y01, Y10, Y11.

If the Y plane has pad bytes after each row, then the CbCr plane has the same number of pad bytes after each of its rows.
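Because the CbCr rows are as long as the Y rows, locating the chroma pair for a pixel needs only the one row length. This is a sketch with a hypothetical function name, assuming an even image height; the matching Cr byte is at the returned offset plus one.

```c
#include <stddef.h>

/* Byte offset of the Cb sample covering pixel (x, y) in an NV12 buffer.
 * stride is the row length in bytes, padding included, for both planes. */
static size_t nv12_cb_offset(size_t x, size_t y, size_t stride,
                             size_t height)
{
    size_t base = stride * height;  /* CbCr plane follows the Y plane */

    return base + (y / 2) * stride + (x / 2) * 2;
}
```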

V4L2_PIX_FMT_GREY

A 4x4 image. Each cell is one byte.
Y00	Y01	Y02	Y03
Y10	Y11	Y12	Y13
Y20	Y21	Y22	Y23
Y30	Y31	Y32	Y33

This is a greyscale (black and white) image. It is really a degenerate YCbCr format which simply contains no Cr or Cb data. Y ranges from 16 (darkest) to 235 (lightest).
