Aspect Ratios and other associated matters

Television, or broadcast video, has been around for well over 50 years. It was developed and distributed mostly in analog form, and most of the fundamental agreed standards apply to that form. Digital video was initially developed to serve the analog broadcast marketplace, with the standards unchanged. It is important to remember these basic facts.

 

Broadcast video uses ‘raster scan’, where the picture is made by a series of regular scans called lines and fields. Video is ultimately a spot of light travelling at uniform speed across a cathode ray tube; the changing brightness of the spot traces out a picture. The electronic signal that controls the spot has two important parameters: the voltage and the timing. The voltage controls the brightness; the timing decides where on the screen the spot is located. The vertical position is stepped down sequentially in what we call lines, and the horizontal position is determined by the ‘delay’ after a reference start-of-line marker. If we forget for now all the nuts and bolts that allow the system to work (sync pulses, color subcarrier (CSC) bursts, etc.), then what actually defines the video standard is the number of lines and the duration of the picture content on each line, which together create the visible image. This duration is normally called the Active Line Time.

 

NTSC is defined as having 486 active lines with a nominal active line time of 52.7 microseconds. PAL has 576 active lines with a nominal active line time of 52 microseconds. These specifications are what represent the image with its 4:3 aspect ratio. They in turn formed the specification for image-generating kit such as cameras, and IT HAS NOT CHANGED through the adoption of digital video. Note, though, that the analog video specifications include tolerances (hence the use of ‘nominal’ above), and for convenience we will not always use the middle of the range. There is also some argument about the exact number of active analog lines, but the numbers above are the usually accepted values.

 

Early in the evolution of digital video, international committees met and decided to create formats that would ease manufacturing and format conversion. It was therefore decided to use a common number of digital samples per line of video. It was also a requirement that the time spanned by the digital samples be a little longer than the actual active line time.

 

It was decided to use a sample frequency of 13.5 MHz and to have 720 samples for each line (a span of about 53.3 microseconds). This results in about 712 active samples for NTSC and about 702 for PAL. Remember that these numbers represent a nominal 4:3 image.
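The arithmetic behind these sample counts can be sketched as follows. This is only an illustration using the nominal figures quoted above; the function name is hypothetical, and because the analog timings carry tolerances, the results agree with the quoted counts only approximately.

```python
SAMPLE_RATE_MHZ = 13.5  # sample frequency quoted in the text


def active_samples(active_line_us: float) -> float:
    """Number of samples that fit in an active line of the given duration.

    Microseconds multiplied by MHz gives a plain sample count.
    """
    return active_line_us * SAMPLE_RATE_MHZ


ntsc = active_samples(52.7)  # nominal NTSC active line time
pal = active_samples(52.0)   # nominal PAL active line time

print(f"NTSC: about {ntsc:.2f} samples")
print(f"PAL:  about {pal:.2f} samples")
print(f"720 samples span about {720 / SAMPLE_RATE_MHZ:.2f} microseconds")
```

The nominal NTSC figure lands a little under 712 and the PAL figure on 702, which is consistent with the ‘about’ in the text.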

 

If the digital video is used to produce analog video, then the ‘extra’ samples will be cropped and lost in the blanking. Conversely, any analog source will only produce active picture for the narrower period, and in fact many digital cameras are also designed to produce picture for only this period. If the full 720-sample width of the digital image is used, then it is clearly a little wider than 4:3; it is, if you like, a 4:3 image with some ‘bits on the sides’. Image origination for the digital format must therefore take this into account.

 

Let us therefore look at the specifics of image production for the two digital video formats. Paint and draw applications, and most animation origination, work with ‘square pixels’, as do devices like digital scanners used to produce images. This means that each pixel on the computer screen is as wide as it is tall: an image with equal numbers of pixels in height and width will come out looking square. This is not true of digital video. We have seen that a 4:3 NTSC image is represented by 712x486 pixels, but clearly a 4:3 square pixel image would be 712x534 (or 648x486). The video image thus appears horizontally stretched on a computer screen; it is what we call an anamorphic image. Many applications use an anamorphic ratio (e.g. 0.9), but since these are often approximations I don’t believe they are helpful to a good understanding. On the PAL side, the 4:3 image is represented in video by 702x576 pixels, with a square pixel equivalent being 768x576. The video image thus appears horizontally squashed on a computer screen.
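For readers who want to see where ratios like 0.9 come from, here is a minimal sketch that derives the exact pixel shape from the active image sizes given above. The function name is hypothetical, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction


def pixel_aspect_ratio(active_width, active_height, image_aspect=Fraction(4, 3)):
    """Width of one video pixel relative to its height.

    The displayed image has shape image_aspect, but is stored as
    active_width x active_height samples, so each sample must be this
    much wider (or narrower) than it is tall.
    """
    return image_aspect / Fraction(active_width, active_height)


ntsc_par = pixel_aspect_ratio(712, 486)
pal_par = pixel_aspect_ratio(702, 576)

print(float(ntsc_par))  # a little over 0.91: NTSC pixels are narrow
print(float(pal_par))   # a little over 1.09: PAL pixels are wide

# Square pixel equivalents, keeping either the height or the width fixed.
print(712 * ntsc_par, "x 486")   # 648 x 486
print("712 x", 486 / ntsc_par)   # 712 x 534
print(702 * pal_par, "x 576")    # 768 x 576
```

The NTSC value is close to, but not exactly, the 0.9 that many applications use, which is the approximation problem mentioned above.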

 

The NTSC situation is then complicated somewhat, because the popular DV consumer format uses a cropped 480-line frame size. Since the same analog display equipment is still being used, and indeed the same cameras, it stands to reason that the aspect and anamorphic ratios remain the same. I believe the easiest way to think of this is to imagine a full 4:3 NTSC image at 712x486. If we crop 6 lines from the height, then clearly we need to crop 8 pixels from the width to maintain a 4:3 image. (OK, this is an approximation, but near enough.) We thus arrive at an active digital video image of 704x480. The square pixel equivalent to this would be 704x528 (or 640x480).
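A quick check of how near ‘near enough’ is, as a sketch using only the numbers in the paragraph above (the variable names are my own):

```python
FULL_WIDTH, FULL_HEIGHT = 712, 486  # full 4:3 NTSC active image
DV_HEIGHT = 480                     # DV keeps only 480 lines

# Width that keeps exactly the same shape after cropping the height.
exact_width = FULL_WIDTH * DV_HEIGHT / FULL_HEIGHT
exact_crop = FULL_WIDTH - exact_width

print(f"lines cropped: {FULL_HEIGHT - DV_HEIGHT}")
print(f"exact width:   {exact_width:.2f} (the text uses 704)")
print(f"exact crop:    {exact_crop:.2f} pixels (the text uses 8)")

# Square pixel equivalent of the 480-line image at 4:3.
print(f"square pixel width at 480 lines: {DV_HEIGHT * 4 // 3}")
```

The exact width is within a pixel of 704, so the convenient round figure costs very little.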

 

Lately there has been use of widescreen video with a 16:9 image aspect ratio. This simply applies a further 4:3 anamorphic ratio on top of that already used, since a 16:9 image is 4/3 times as wide as a 4:3 image of the same height. Clearly the square pixel sizes need to be modified pro rata, but otherwise all the numbers remain the same as before.
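The pro rata adjustment can be sketched like this, starting from the 4:3 square pixel sizes already worked out above. The names are hypothetical.

```python
from fractions import Fraction

# How much wider a 16:9 image is than a 4:3 image of the same height.
WIDESCREEN_STRETCH = Fraction(16, 9) / Fraction(4, 3)  # equals 4/3


def widescreen_width(width_4x3):
    """Square pixel width of the 16:9 image with the same height."""
    return width_4x3 * WIDESCREEN_STRETCH


for name, width, height in [("NTSC D1", 648, 486),
                            ("NTSC DV", 640, 480),
                            ("PAL", 768, 576)]:
    print(f"{name}: {float(widescreen_width(width)):.2f} x {height}")
```

The D1 and PAL widths come out as whole numbers (864 and 1024). The DV width is 853 and a third, so it has to be rounded to a whole pixel count in practice.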

 

In all formats, we have two obvious alternative generation procedures:

1)     We can generate a 4:3 (or 16:9) image and then ‘drop it into’ the slightly wider full digital frame, leaving some blank image area at the sides.

2)     We can actually generate an image that is slightly wider than 4:3 (or 16:9) and that will subsequently fill the whole digital frame.
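For the first procedure, the amount of blank area follows directly from the active sample counts given earlier. A minimal sketch, assuming the active picture is centered in the 720-sample frame (the centering and the function name are my assumptions, not something the formats are shown to require here):

```python
def side_padding(active_samples, stored_samples=720):
    """Blank samples to leave at the left and right of the active picture.

    Assumes the picture is centered; any odd sample goes on the right.
    """
    total = stored_samples - active_samples
    left = total // 2
    return left, total - left


print("NTSC D1:", side_padding(712))  # (4, 4)
print("NTSC DV:", side_padding(704))  # (8, 8)
print("PAL:    ", side_padding(702))  # (9, 9)
```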

 

To preserve quality, it is customary to generate the image with more pixels than the video frame needs, so that the conversion to the final digital video frame is always a reduction in size.


 

In summary, some obvious sizes (in pixels) at which to generate square pixel images for digital video are as follows:

 

NTSC D1 (720 x 486) normal
    4:3 square pixels               712 x 534
    Full width square pixels        720 x 534

NTSC D1 (720 x 486) widescreen
    16:9 square pixels              864 x 486
    Full width square pixels        874 x 486

NTSC DV (720 x 480) normal
    4:3 square pixels               704 x 528
    Full width square pixels        720 x 528

NTSC DV (720 x 480) widescreen
    16:9 square pixels              854 x 480
    Full width square pixels        874 x 480

PAL (720 x 576) normal
    4:3 square pixels               768 x 576
    Full width square pixels        788 x 576

PAL (720 x 576) widescreen
    16:9 square pixels              1024 x 576
    Full width square pixels        1050 x 576
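The ‘full width’ figures can be checked by scaling each active square pixel width by the ratio of stored samples (720) to active samples. The sketch below does that for every row. The function name is hypothetical, and since the listed values are whole pixel counts, agreement to within about a pixel is all that should be expected.

```python
STORED_SAMPLES = 720  # samples per line in the digital frame


def full_width(active_square_width, active_samples):
    """Square pixel width covering all 720 samples, not just the active ones."""
    return active_square_width * STORED_SAMPLES / active_samples


# (format, active square pixel width, active samples, listed full width)
rows = [
    ("NTSC D1 normal",      712, 712,  720),
    ("NTSC D1 widescreen",  864, 712,  874),
    ("NTSC DV normal",      704, 704,  720),
    ("NTSC DV widescreen",  854, 704,  874),
    ("PAL normal",          768, 702,  788),
    ("PAL widescreen",     1024, 702, 1050),
]

for name, width, active, listed in rows:
    computed = full_width(width, active)
    print(f"{name:20s} computed {computed:8.2f}   listed {listed}")
```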

 

Some practical points

 

At some stage the square pixel image has to be converted to the anamorphic image actually used in the digital video. This involves resampling the image, a process normally called interpolation, and the quality can vary widely. The artist should determine the best place in the production chain to make the format change with the best quality. For example, Photoshop (if set to best) does a very good job, After Effects is good, but Premiere is mediocre.
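To make ‘interpolation’ concrete, here is a deliberately simple sketch that resamples a single scanline of brightness values using linear interpolation. It is not what any of the applications named above actually do (they are likely to use better filters, and work in two dimensions); the function name and the endpoint-to-endpoint mapping are my own assumptions.

```python
def resample_line(line, new_width):
    """Resample one scanline to new_width samples by linear interpolation.

    The first and last source samples map onto the first and last output
    samples; every output value is a weighted mix of its two nearest
    source samples.
    """
    old_width = len(line)
    if new_width == 1 or old_width == 1:
        return [float(line[0])] * new_width
    scale = (old_width - 1) / (new_width - 1)
    out = []
    for i in range(new_width):
        pos = i * scale                       # position in the source line
        left = min(int(pos), old_width - 1)   # nearest source sample to the left
        right = min(left + 1, old_width - 1)  # and to the right
        frac = pos - left                     # how far we are towards 'right'
        out.append(line[left] * (1 - frac) + line[right] * frac)
    return out


# A 768 pixel wide square pixel PAL line reduced to 702 video samples.
ramp = list(range(768))
video_line = resample_line(ramp, 702)
print(len(video_line), video_line[0], round(video_line[-1], 3))
```

Linear interpolation like this tends to soften fine detail, which is one reason the quality differs so much between applications.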

 

Many NLE applications, like Premiere and Final Cut Pro, have ‘magic number’ recognition: if they see a graphic whose size they think corresponds to a video frame, they will automatically apply a size transformation. Many applications use only an approximation of the requisite ratios, and since this transformation may not meet your required quality, it is much safer to always apply your own transformation before import.