Why Develop New Metrics for Digital Video Systems?
Input Scene Dependencies:
The advent of digital video compression, storage, and
transmission systems has exposed fundamental limitations of
techniques and methodologies that have traditionally been used to
measure video performance. Traditional performance parameters have
relied on the "constancy" of a video system's performance for
different input scenes. Thus, one could inject a test pattern or
test signal (e.g., a static multi-burst), measure some resulting
system attribute (e.g., frequency response), and be relatively
confident that the system would respond similarly for other video
material (e.g., video with motion). A great deal of research has
been performed to relate the traditional analog video performance
parameters (e.g., differential gain, differential phase, short time
waveform distortion, etc.) to perceived changes in video quality.
While the recent advent of video compression, storage, and
transmission systems has not invalidated these traditional
parameters, it has certainly made their connection with perceived
video quality much more tenuous. Digital video systems adapt and
change their behavior depending upon the input scene. Therefore,
testing with input scenes that differ from those actually used
in service can produce erroneous and misleading
results. Variations in subjective performance ratings as large as 3
quality units on a subjective quality scale that runs from 1 to 5
(1=lowest rating, 5=highest rating) have been noted in tests of
commercially available systems. While quality dependencies on the
input scene tend to become much more prevalent at higher
compression ratios, they are also observed at lower compression
ratios. For example, subjective tests of 45-Mb/s
contribution-quality systems (i.e., systems now used by
broadcasters to transmit over long-line digital networks) revealed
one transmission system with multiple tandem codecs whose
subjective performance varied from 2.16 to 4.64 quality
units, depending on the input scene.
A digital video transmission system that works fine for video
teleconferencing might be inadequate for entertainment television.
Specifying the performance of a digital video system as a function
of the video scene coding difficulty yields a much more complete
description of system performance. Because appropriate input
scenes must be selected for testing, algorithms have been
developed for quantifying the expected coding difficulty of an
input scene based on the amount of spatial detail and motion. Other
methods have been proposed for determining the picture-content
failure characteristic for the system under consideration. National
and international standards have been developed that specify
standard video scenes for testing digital video systems. Use of
these standards ensures that users make like-for-like comparisons
when evaluating similar systems from different suppliers.
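To make the idea of scoring a scene by its spatial detail and motion concrete, the following is a minimal sketch only. It is written in the spirit of the spatial and temporal information measures of ITU-T Recommendation P.910 (standard deviation of a Sobel-filtered frame, and standard deviation of the frame-to-frame difference, each maximized over time), and is not necessarily one of the algorithms referred to above. The function names and the representation of a frame as a list of rows of luminance values are assumptions made for illustration.

```python
import math


def _std(values):
    """Population standard deviation of a flat list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def spatial_detail(frame):
    """Std. dev. of the Sobel gradient magnitude over interior pixels."""
    height, width = len(frame), len(frame[0])
    magnitudes = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal Sobel response (right column minus left column).
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            # Vertical Sobel response (bottom row minus top row).
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            magnitudes.append(math.hypot(gx, gy))
    return _std(magnitudes)


def motion(prev_frame, frame):
    """Std. dev. of the pixel-wise difference between successive frames."""
    diffs = [b - a
             for row_a, row_b in zip(prev_frame, frame)
             for a, b in zip(row_a, row_b)]
    return _std(diffs)


def scene_difficulty(frames):
    """Return (max spatial detail, max motion) over a sequence of frames."""
    si = max(spatial_detail(f) for f in frames)
    ti = max((motion(a, b) for a, b in zip(frames, frames[1:])), default=0.0)
    return si, ti


# Hypothetical 8x8 test frames: a flat field, and a vertical edge that moves.
flat = [[128] * 8 for _ in range(8)]
edge = [[0] * 4 + [255] * 4 for _ in range(8)]
shifted_edge = [[0] * 5 + [255] * 3 for _ in range(8)]

print(scene_difficulty([flat, flat]))          # static, featureless scene
print(scene_difficulty([edge, shifted_edge]))  # detail plus motion
```

A static, featureless scene scores zero on both measures, while a scene with edges that move scores above zero on both. Plotting candidate scenes on these two axes is one way to check that a test set spans easy and hard material.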
Digital Transmission System Dependencies:
The operating characteristics of digital transmission
systems (e.g., bit-rate, error rate, dropped packet rate) may
change over time and this can produce quality fluctuations. These
transients may be very difficult to capture unless the performance
of the system is being continuously monitored. Ideally, this
monitoring should be done in-service, since taking the transmission
system out-of-service and injecting known test signals and/or
scenes will change the conditions under which the digital
transmission system is operating, and hence unduly influence the
performance measurement. Two examples that demonstrate this effect
are the statistical multiplexer, a device that multiplexes the
variable bit-rate compressed streams of many individual video
channels into a single constant bit-rate channel, and digital video
transmission over the Internet using non-guaranteed bandwidth. Only continuous,
non-intrusive, in-service performance monitoring can accurately
capture what the viewer is perceiving in these instances. A new
measurement paradigm is thus required.
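The statistical multiplexer example can be illustrated with a toy model. The sketch below is a deliberate simplification, not a description of any real multiplexer: it assumes each channel requests a bit rate and that, when the requests exceed the constant channel rate, every channel is scaled back in proportion. The function name and all bit-rate figures are hypothetical.

```python
def stat_mux_allocate(demands_mbps, channel_mbps):
    """Share a constant-rate channel among sources in proportion to demand.

    If the total demand fits within the channel, every source receives
    the rate it asked for; otherwise all sources are scaled down equally.
    """
    total = sum(demands_mbps)
    if total <= channel_mbps:
        return list(demands_mbps)
    scale = channel_mbps / total
    return [d * scale for d in demands_mbps]


# In service: three busy programs compete for an 18 Mb/s channel.
in_service = stat_mux_allocate([8, 6, 10], 18)

# Out of service: the third program is replaced by an easy-to-code
# test pattern, which frees capacity for the other two channels.
with_test_pattern = stat_mux_allocate([8, 6, 2], 18)

print(in_service)         # first channel is squeezed below its request
print(with_test_pattern)  # first channel now receives its full request
```

In this toy model the first channel receives 6 Mb/s in service but its full 8 Mb/s once the test pattern is injected, so a measurement taken during the out-of-service test would overstate the quality that viewers of that channel normally receive.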
New Digital Video Impairments:
Digital video systems produce fundamentally different
kinds of impairments than analog video systems. Examples of these
include tiling, error blocks, smearing, jerkiness, edge busyness,
and object retention. To fully quantify the performance
characteristics of a digital video system, it is desirable to have
a set of performance parameters, where each parameter is sensitive
to some unique dimension of video quality or impairment type. This
is similar to what was developed for analog impairments (e.g., a
multi-burst test would measure the frequency response, and a
signal-to-noise ratio test would measure the analog noise level).
This discrimination property of performance parameters is useful to
designers trying to optimize certain system attributes over others,
and to network operators wanting to know not only when a system is
failing but where and how it is failing.
Also of interest is how a user weighs the different performance
attributes of a digital video system (e.g., spatial resolution,
temporal resolution, or color reproduction accuracy) when
subjectively rating the quality of the experience. The process of
estimating these subjective quality ratings from objective
performance parameter data is an important new area of work.
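As a rough illustration of what estimating subjective ratings from objective data can look like, the sketch below fits a straight line that maps a single objective parameter to a rating on the 1-to-5 scale and clamps predictions to that scale. This is an assumed, simplified model for illustration only; practical estimators may combine several parameters and need not be linear. The function names and the sample values are hypothetical, not measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b * x for one parameter."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b


def predict_rating(a, b, value):
    """Estimate a subjective rating and clamp it to the 1-to-5 scale."""
    return min(5.0, max(1.0, a + b * value))


# Hypothetical training data: an impairment parameter (0 = none) paired
# with the mean rating that viewers gave to each test clip.
impairment = [0.0, 0.5, 1.0]
mean_rating = [5.0, 3.0, 1.0]

a, b = fit_line(impairment, mean_rating)
print(predict_rating(a, b, 0.25))  # estimate for an unseen clip
print(predict_rating(a, b, 2.0))   # out-of-range input is clamped
```

Clamping keeps the estimate on the same scale that viewers used, which matters when the objective parameter falls outside the range covered by the subjective test.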
The Need for Technology Independence:
The constancy of analog video systems over the past four
decades provided the necessary long-term development cycle to
produce today's accurate analog video test equipment. In contrast,
the rapid evolution of digital video compression, storage, and
transmission technology presents a much more difficult performance
measurement task. To avoid immediate obsolescence, new performance
measurement technology developed for digital video systems must be
technology independent, that is, not dependent upon specific coding
algorithms or transport architectures. One way to achieve
technology independence is to have the test instrument perceive and
measure video impairments like a human being. Fortunately, the
computational resources needed to perform these measurement
operations are becoming available.