
Video Quality Documents

   
   

Lucjan Janowski; Margaret H. Pinson, "Subject Bias: Introducing a Theoretical User Model," Fifth International Workshop on Quality of Multimedia Experience (QoMEX 2014), Singapore, 18-20 September 2014

Abstract & keywords

   

Keywords: mean opinion score; video quality assessment; design of experiments; QoE; subjective ratings

We propose a model for rating behavior based on subject bias and subject error. Evidence for subject bias can be found in freely available subjective experiments. When subject bias is removed from ratings, the sensitivity of statistical comparisons between stimuli usually improves. According to our model, subject biases characterize the subject pool. These between-subject differences are important when analyzing and comparing people. On the other hand, it is advantageous to remove subject bias when analyzing mean opinion score. We conclude that bias acts like a random variable within ratings.
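As an illustration only, the idea of removing subject bias can be sketched as follows. This is a minimal sketch under an assumed additive reading of the abstract (rating = stimulus quality + subject bias + error); the function name `remove_subject_bias`, the data layout, and the estimator are assumptions made here, not necessarily the estimator used in the paper.

```python
from statistics import mean

def remove_subject_bias(ratings):
    """Estimate and remove a per-subject offset from opinion scores.

    ratings: dict mapping subject id -> list of scores, one per stimulus,
             with every subject rating the same stimuli in the same order.
    Returns (mos, bias, adjusted).
    """
    subjects = list(ratings)
    n_stimuli = len(ratings[subjects[0]])

    # Mean opinion score (MOS) of each stimulus across all subjects.
    mos = [mean(ratings[s][j] for s in subjects) for j in range(n_stimuli)]

    # Assumed bias estimate: a subject's average deviation from the MOS.
    bias = {
        s: mean(ratings[s][j] - mos[j] for j in range(n_stimuli))
        for s in subjects
    }

    # Ratings with the estimated bias subtracted.
    adjusted = {
        s: [ratings[s][j] - bias[s] for j in range(n_stimuli)]
        for s in subjects
    }
    return mos, bias, adjusted
```

In this sketch the MOS of each stimulus is unchanged by the adjustment, because the estimated biases sum to zero across subjects; only the spread of ratings around each MOS shrinks.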

   

Margaret H. Pinson; Lucjan Janowski, AGH/NTIA: A Video Quality Subjective Test with Repeated Sequences, NTIA Technical Memo TM-14-505, June 2014

Abstract & keywords

   

Keywords: video quality; subjective testing; subject screening

This report provides full technical details for the video quality subjective test AGH/NTIA. Analyses of this dataset appear in separate publications. The purpose of this document is to provide design details that are beyond the scope of a conference paper or journal article. Subjective experiment AGH/NTIA includes multiple instances of the same stimuli rated three or six times by the same subject. The goal is to provide insights into the suitability of subject screening methods, the impact of source video reuse on subjective data, and the behavior of subjects when repeatedly rating the same stimuli.

   

Margaret Pinson; Naeem Ram, VQEG eLetter: Volume 1, Issue 1, March 2014

Abstract & keywords

   

Keywords: video quality; VQEG

The VQEG eLetter publishes up-to-date technical advances on video quality related topics. The VQEG eLetter also makes available limited distribution publications that contain valuable technical information, such as out-of-print ATIS contributions. This first issue features an anthology of papers where well-known researchers identify different techniques that can be used to solve the same problem, in this case subjective video quality training sessions.

   

Margaret H. Pinson; Marc Sullivan; Andrew Catellier, "A new method for immersive audiovisual subjective testing," Eighth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM 2014), Chandler, AZ, January 30-31, 2014

Abstract & keywords

   

Keywords: video quality; audio quality; speech quality; immersive subjective test

An immersive subjective test method is proposed in which subjects view each source stimulus only once. In order to encourage a subject’s engagement with test content, longer stimuli are used. Distractor questions are used in addition to the traditional MOS scale in order to focus the subject on the intended application. A speech quality experiment is conducted with this method, and the results compared to those obtained with traditional methods. The consistent rank ordering among datasets demonstrates the validity of the immersive method.

   

Andrew A. Catellier; Luke Connors, Web-Enabled Subjective Test (WEST) Research Tools Manual, NTIA Handbook HB-14-501, January 2014

Abstract & keywords

   

Keywords: subjective testing; multimedia quality; mobile; infrastructure

The Web-Enabled Subjective Test software allows researchers to conduct subjective tests on multiple devices with aggregated data collection and reporting. The system leverages modern web technologies and infrastructure to deliver test stimuli to typical mobile devices and personal computers. Devices with a modern web browser may be tested wherever a network connection is available. If resources allow, many subjects may participate in the test at one time. This unique capability allows for diverse and interesting subjective test plans. This manual describes how to install the WEST software on an Ubuntu Linux-based system.

The WEST software is available for download here.

   

M. H. Pinson; N. Staelens; A. Webster, "The history of video quality model validation," 2013 IEEE 15th International Workshop on Multimedia Signal Processing (MMSP), pp. 458-463, Sept. 30-Oct. 2, 2013, doi: 10.1109/MMSP.2013.6659332

Abstract & keywords

   

Keywords: standards; HDTV; Quality assessment; video recording; video codecs; video coding; ITU-T Study Group 12; digital video codecs; model validation techniques; video quality experts group; video quality model validation testing; multimedia communication; streaming media

This paper describes objective video quality validation efforts conducted in the past two decades. Validation efforts to be examined include a validation test performed by the T1A1 committee in the early 1990s; five rounds of validation testing performed by the Video Quality Experts Group; and validation tests performed by ITU-T Study Group 12. Useful products that resulted from those efforts will be identified, including standards, datasets, and model validation techniques.

   

Margaret H Pinson; Marcus Barkowsky; Patrick Le Callet, "Selecting scenes for 2D and 3D subjective video quality tests," EURASIP Journal on Image and Video Processing 2013, 2013:50 doi: 10.1186/1687-5281-2013-50

Abstract & keywords

   

Keywords: video quality; subjective testing; video quality assessment; Stereoscopic 3D; Content selection; Scene selection

This paper presents recommended techniques for choosing video sequences for subjective experiments. Subjective video quality assessment is a well-understood field, yet scene selection is often driven by convenience or content availability. Three-dimensional testing is a newer field that requires new considerations for scene selection. The impact of experiment design on best practices for scene selection will also be considered. A semi-automatic selection process for content sets for subjective experiments will be proposed.

   

M. H. Pinson; C. Schmidmer; L. Janowski; R. Pépion; Q. Huynh-Thu; P. Corriveau; A. Younkin; P. LeCallet; M. Barkowsky; W. Ingram, "Subjective and Objective Evaluation of an Audiovisual Subjective Dataset for Research and Development," Fifth International Workshop on Quality of Multimedia Experience (QoMEX 2013), Klagenfurt am Wörthersee, Austria, July 3-5, 2013

Abstract & keywords

   

Keywords: subjective testing; audiovisual

In 2011, the Video Quality Experts Group (VQEG) ran subjects through the same audiovisual subjective test at six different international laboratories. That small dataset is now publicly available for research and development purposes.

   

Margaret H. Pinson, "The Consumer Digital Video Library [Best of the Web]," IEEE Signal Processing Magazine, vol. 30, no. 4, pp. 172-174, July 2013, doi: 10.1109/MSP.2013.2258265

Abstract & keywords

   

Keywords: Internet; video quality measurement; digital video recording; libraries; video recording; web sites

The difficulty of finding and getting rights to use the high quality video sequences needed for video quality research has hampered research in this area for many years. This article describes the Consumer Digital Video Library (CDVL) website (www.cdvl.org), which attempts to address this obstacle by making high-quality, uncompressed video clips available for use by the education, research, and product development communities.

   

Margaret H. Pinson; Stephen Wolf, Fast Low Bandwidth Model: A Reduced Reference Video Quality Metric, NTIA Technical Memo TM-13-497, June 2013

Abstract & keywords

   

Keywords: models; Video; quality; metrics; features; parameters; objective; subjective; correlation; reduced-reference; spatial information (SI); temporal information (TI); impairments

This memorandum describes the Fast Low Bandwidth model. This summary is intended to help the reader understand the model from an algorithmic standpoint. Some knowledge of prior NTIA objective video quality models is necessary to understand this document. The Fast Low Bandwidth model was designed to be operated in-service using the reduced reference (RR) methodology. The model requires access to the original video at one location, the processed video at another location, and a low bandwidth data link between the two locations. That link is used to communicate RR features between the two locations. The Fast Low Bandwidth model is included in ITU-T Rec. J.249.

   

Joel Dumke, "Visual acuity and task-based video quality in public safety applications," Proc. of SPIE-Image Quality and System Performance X, Electronic Imaging, SPIE Vol. 8653, 865306, Burlingame, CA, 5-7 February 2013 doi: 10.1117/12.2004882

Abstract & keywords

   

Keywords: video quality; public safety; video quality metrics; Visual acuity; Task-based

This paper explores the utility of visual acuity as a video quality metric for public safety applications. An experiment has been conducted to track the relationship between visual acuity and the ability to perform a forced-choice object recognition task with digital video of varying quality. Visual acuity is measured according to the smallest letters reliably recognized on a reduced LogMAR chart.

   

Margaret H. Pinson; Karen Sue Boyd; Jessica Hooker; Kristina Muntean, "How To Choose Video Sequences For Video Quality Assessment," Proceedings of the Seventh International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM-2013), Scottsdale, AZ, January 30-February 1, 2013

Abstract & keywords

   

Keywords: experiment design; MOS; subjective testing; task-based subjective testing; video quality assessment

This paper presents recommended techniques for choosing video sequences for subjective experiments. Subjective video quality assessment is a well-understood field, yet scene selection is often limited by content availability. The Consumer Digital Video Library (www.cdvl.org) is a solution. Task-oriented subjective testing is a newer field than entertainment-oriented testing and may require a different approach to scene selection. We describe three different task-based investigations currently underway: performance requirements for public safety equipment, how quality affects comprehension of sign language over a video link, and how video affects oral comprehension over an audiovisual link. Recommendations for scene selection for two types of testing are given. The impact of experiment design will be considered. An example 1080i 29.97fps video sequence set is presented.

   

Margaret H. Pinson; Lucjan Janowski; Romuald Pépion; Quan Huynh-Thu; Christian Schmidmer; Phillip Corriveau; Audrey Younkin; Patrick Le Callet; Marcus Barkowsky; William Ingram, "The Influence of Subjects and Environment on Audiovisual Subjective Tests: An International Study," IEEE Journal of Selected Topics in Signal Processing, Vol. 6, No. 6, October 2012, pp. 640–651

Abstract & keywords

   

Keywords: audiovisual quality; environment effect; language

Traditionally, audio quality and video quality are evaluated separately in subjective tests. Best practices within the quality assessment community were developed before many modern mobile audiovisual devices and services came into use, such as internet video, smart phones, tablets and connected televisions. These devices and services raise unique questions that require jointly evaluating both the audio and the video within a subjective test. However, audiovisual subjective testing is a relatively under-explored field. In this paper, we address the question of determining the most suitable way to conduct audiovisual subjective testing on a wide range of audiovisual quality. Six laboratories from four countries conducted a systematic study of audiovisual subjective testing. The stimuli and scale were held constant across experiments and labs; only the environment of the subjective test was varied. Some subjective tests were conducted in controlled environments and some in public environments (a cafeteria, patio or hallway). The audiovisual stimuli spanned a wide range of quality. Results show that these audiovisual subjective tests were highly repeatable from one laboratory and environment to the next. The number of subjects was the most important factor. Based on this experiment, 24 or more subjects are recommended for Absolute Category Rating (ACR) tests. In public environments, 35 subjects were required to obtain the same Student’s t-test sensitivity. The second most important variable was individual differences between subjects. Other environmental factors had minimal impact, such as language, country, lighting, background noise, wall color, and monitor calibration. Analyses indicate that Mean Opinion Scores (MOS) are relative rather than absolute. Our analyses show that the results of experiments done in pristine, laboratory environments are highly representative of those devices in actual use, in a typical user environment.

   

Andrew Catellier; Margaret Pinson; William Ingram; Arthur Webster, "Impact of Mobile Devices and Usage Location on Perceived Multimedia Quality," Proceedings of the 4th International Workshop on Quality of Multimedia Experience (QoMEX), pp. 39-44, Yarra Valley, Australia, July 5-7, 2012.

Abstract & keywords

   

Keywords: standards; subjective testing; audiovisual; multimedia; coder/decoders; mobile

We explore the quality impact when audiovisual content is delivered to different mobile devices. Subjects were shown the same sequences on five different mobile devices and a broadcast quality television. Factors influencing quality ratings include video resolution, viewing distance, and monitor size. Analysis shows how subjects’ perception of multimedia quality differs when content is viewed on different mobile devices. In addition, quality ratings from laboratory and simulated living room sessions were statistically equivalent.

   

Margaret H. Pinson; William Ingram; Arthur Webster, "Audiovisual Quality Components: An Analysis," IEEE Signal Processing Magazine, vol.28, no.6, pp.60-67, Nov. 2011 doi: 10.1109/MSP.2011.942470

Abstract & keywords

   

Keywords: video quality; audio quality; subjective testing; audiovisual quality

The perceived quality of an audiovisual sequence is heavily influenced by both the quality of the audio and the quality of the video. The question then arises as to the relative importance of each factor and whether a regression model predicting audiovisual quality can be devised that is generally applicable.

   

Margaret H. Pinson; Stephen Wolf, Batch Video Quality Metric (BVQM) User's Manual, NTIA Handbook HB-11-441d, September 2011

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; automatic measurements; digital video; objective video quality performance; batch video clip processing

This handbook provides a user’s manual for the batch video quality metric (BVQM) tool. BVQM runs under the Windows XP® or Windows 7® operating systems. BVQM performs objective automated quality assessments of processed video clip batches (i.e., as output by a video system under test). BVQM reports video calibration and quality metric results such as: temporal registration, spatial registration, spatial scaling, valid region, gain/level offset, and objective video quality estimates. BVQM operates on original and processed video files only, and has no video capture capability. BVQM compares the original video clip to the processed video clip and reports quality estimates on a scale from zero to one. On this scale, zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level (excursions beyond one are possible for extremely impaired video sequences).

   

Stephen Wolf; Margaret H. Pinson, Video Quality Model for Variable Frame Delay (VQM_VFD), NTIA Technical Memo TM-11-482, September 2011

Abstract & keywords

   

Keywords: measurement; model; Video; quality; parameters; objective; subjective; correlation; calibration; delay; dropped; frames; Full Reference (FR); pausing; skipping; alignment; time; angular; distance; resolution; variable

Time varying delays of the output (or processed) video frames with respect to the input (i.e., the original or reference) video frames present significant challenges for Full Reference (FR) video quality measurement systems. Time alignment errors between the output video sequence and the input video sequence can produce measurement errors that greatly exceed the perceptual impact of these time varying video delays. This document proposes a new video quality model (VQM) that properly accounts for the perceptual impact of variable frame delay (VFD). This new model, called VQM_VFD, also uses perceptual features extracted from spatial-temporal (ST) blocks of a fixed angular extent. This enables VQM_VFD to track subjective quality over a wide range of viewing distances and image sizes. VQM_VFD uses a neural network that achieves 0.9 correlation to subjective quality for subjective datasets at image sizes from Quarter Common Intermediate Format (QCIF) to High Definition TV (HDTV).

   

Gregory Cermak; Margaret H. Pinson; Stephen Wolf, "The Relationship Among Video Quality, Screen Resolution, and Bit Rate," IEEE Transactions on Broadcasting, vol.57, no.2, pp.258-262, June 2011 doi: 10.1109/TBC.2011.2121650

Abstract & keywords

   

Keywords: HDTV; subjective testing; quality; CIF; Bit rate; H.264; QCIF; VGA

How much bandwidth is required for good quality video for a given screen resolution? Data acquired during two Video Quality Experts Group (VQEG) projects allow at least a partial answer to this question. This international subjective testing produced large amounts of mean opinion score (MOS) data for the screen resolutions QCIF, CIF, VGA, and HD; for H.264 and similar modern codecs; and for many bit rates. Those data are assembled in the present report. For each screen resolution, MOS is plotted as a function of bit rate. A plot of all four data sets together shows the bit rate that would be required to achieve a given level of video quality for a given screen resolution. Relations among the four data sets are regular, suggesting that interpolation across screen resolutions might be reasonable. Based on these data, it would be reasonable to choose a bit rate, given a screen resolution; it would not be reasonable to choose a screen resolution given a bit rate.

   

Stephen Wolf, Variable Frame Delay (VFD) Parameters for Video Quality Measurements, NTIA Technical Memo TM-11-475, April 2011

Abstract & keywords

   

Keywords: Video; quality; parameters; objective; subjective; correlation; calibration; dropped; frames; Full Reference (FR); pausing; skipping; alignment; time; variable delay; measurement

Digital video transmission systems consisting of a video encoder, a digital transmission method (e.g., Internet Protocol—IP), and a video decoder can produce pauses in the video presentation, after which the video may continue with or without skipping video frames. Sometimes sections of the original video stream may be missing entirely (skipping without pausing). Time varying delays of the output (or processed) video frames with respect to the input (i.e., the original or reference) video frames present significant challenges for Full Reference (FR) video quality measurement systems. Time alignment errors between the output video sequence and the input video sequence can produce measurement errors that greatly exceed the perceptual impact of these time varying video delays. This document proposes several objective video quality parameters that can be extracted from variable frame delay (VFD) information, demonstrates their correlation to subjective video quality, and shows how they can be utilized in an FR video quality measurement (VQM) system.

   

Margaret H. Pinson; Arthur Webster; William Ingram, Preliminary Investigation into the Impact of Audiovisual Synchronization of Impaired Audiovisual Sequences, NTIA Technical Memo TM-11-474, March 2011

Abstract & keywords

   

Keywords: video quality; synchronization; audio quality; subjective testing; audiovisual quality

The quality perception of an audiovisual sequence is heavily influenced by the quality of the audio, the quality of the video, and the audiovisual time synchronization. The questions then arise: what is the relative importance of each factor, and can a model be devised that is generally applicable? Previous work either examined the relative influences of audio and video quality for synchronized video or investigated the quality impact of synchronization errors on unimpaired video sequences. This experiment is a first attempt to combine all three factors into a single experiment, to judge the complex interactions among individual measurements of audio and video quality and synchronization errors.

   

Margaret H. Pinson; Stephen Wolf; Gregory Cermak, "HDTV Subjective Quality of H.264 vs. MPEG-2, with and without Packet Loss," IEEE Transactions on Broadcasting, vol.56, no.1, pp.86-91, March 2010 doi: 10.1109/TBC.2009.2034511

Abstract & keywords

   

Keywords: HDTV; subjective testing; quality; H.264; MPEG-2; packet loss; transmission errors

The intent of H.264 (MPEG-4 Part 10) was to achieve equivalent quality to previous standards (e.g., MPEG-2) at no more than half the bit-rate. H.264 is commonly felt to have achieved this objective. This document presents results of an HDTV subjective experiment that compared the perceptual quality of H.264 to MPEG-2. The study included both the coding-only impairment case and a coding plus packet loss case, where the packet loss was representative of a well managed network (0.02% random packet loss rate). Subjective testing results partially uphold the commonly held claim that H.264 provides quality similar to MPEG-2 at no more than half the bit rate for the coding-only case. However, the advantage of H.264 diminishes with increasing bit rate and all but disappears when one reaches about 18 Mbps. For the packet loss case, results from the study indicate that H.264 suffers a large decrease in quality whereas MPEG-2 undergoes a much smaller decrease.

   

Marcus Barkowsky; Margaret Pinson; Romuald Pépion; Patrick Le Callet, "Analysis of Freely Available Subjective Dataset for HDTV including Coding and Transmission Distortions," Fifth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM–10), Scottsdale, Arizona, January 13–15, 2010

Abstract & keywords

   

Keywords: HDTV; impairments; VQEG; H.264; MPEG-2; encoding; transmission; decoding; subjective video quality

We present the design, preparation, and analysis of a subjective experiment on typical HDTV sequences and scenarios. This experiment follows the guidelines of ITU and VQEG in order to obtain reproducible results. The careful selection of content and distortions extends over a wide and realistic range of typical transmission scenarios. Detailed statistical analysis provides important insight into the relationship between technical parameters of encoding, transmission and decoding and subjectively perceived video quality.

   

Margaret H. Pinson; Stephen Wolf; Neha Tripathi; Chin Koh, "The Consumer Digital Video Library," Fifth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM–10), Scottsdale, Arizona, January 13–15, 2010

Abstract & keywords

   

Keywords: video quality; video sequences; video clips; Consumer Digital Video Library; CDVL; video processing

The shortage of high quality video sequences has hampered video quality research for 20 years. Every Video Quality Experts Group (VQEG) test has encountered difficulty in obtaining quality source content. For some validation tests, this caused a multiple-year delay. To address this critical need, the newly launched Consumer Digital Video Library (CDVL) website (www.cdvl.org) accepts and shares contributions of high-quality video sequences. The site is designed to assist video researchers in the development and testing of new algorithms for video processing, coding, and both subjective and objective quality measurement.

   

Stephen Wolf, A Full Reference (FR) Method Using Causality Processing for Estimating Variable Video Delays, NTIA Technical Memo TM-10-463, October 2009

Abstract & keywords

   

Keywords: video quality; calibration; delay; causality; dropped frames; Full Reference (FR); pausing; skipping

Digital video transmission systems consisting of a video encoder, a digital transmission method (e.g., Internet Protocol – IP), and a video decoder can produce pauses in the video presentation, after which the video may continue with or without skipping v

   

K. Brunnstrom; D. Hands; F. Speranza; A. Webster, "VQEG validation and ITU standardization of objective perceptual video quality metrics [Standards in a Nutshell]," IEEE Signal Processing Magazine, vol.26, no.3, pp. 96-101, May 2009 doi: 10.1109/MSP.2009.932162

Abstract & keywords

   

Keywords: video quality measurement; multimedia systems; video signal processing; ITU standardization; VQEG validation; objective video quality metrics; perceptual video quality metrics

For industry, the need to access accurate and reliable objective video metrics has become more pressing with the advent of new video applications and services such as mobile broadcasting, Internet video, and Internet Protocol television (IPTV). Industry-class objective quality-measurement models have a wide range of uses, including equipment testing (e.g., codec evaluation), transmission-planning and network-dimensioning tasks, head-end quality assurance, in-service network monitoring, and client-based quality measurement. The Video Quality Experts Group (VQEG) is the primary forum for validation testing of objective perceptual quality models. The work of VQEG has resulted in International Telecommunication Union (ITU) standardization of objective quality models designed for standard-definition television and for multimedia applications. This article reviews VQEG's work, paying particular attention to the group's approach to validation testing.

   

Stephen Wolf; Margaret Pinson, "Reference Algorithm for Computing Peak Signal to Noise Ratio (PSNR) of a Video Sequence with a Constant Delay," ITU-T Contribution COM9-C6-E, Geneva, February 2-6, 2009.

Abstract & keywords

   

Keywords: video quality metrics; Peak Signal to Noise Ratio; PSNR

Peak Signal to Noise Ratio (PSNR) has been used as a benchmark to evaluate new objective perceptual video quality metrics. For example, PSNR has been used as a benchmark for both the Multimedia (MM) and Reduced Reference Television (RRTV) test programs recently completed by the Video Quality Experts Group (VQEG). However, there is not currently an international Recommendation specifying exactly how to perform this critical measurement. Since the calculation of PSNR is highly dependent upon proper calculation of spatial alignment, temporal alignment, gain, and level offset between the processed video sequence and the original video sequence, one must also specify the method of performing these calibration procedures. The past two validation tests (MM and RRTV) performed by VQEG utilized the exhaustive search PSNR algorithm that is the subject of this contribution. Members of VQEG agreed to use this PSNR method as a benchmark for assessing the effectiveness of perceptual video quality metrics after extensive discussions. This PSNR calculation method has the advantage of automatically determining the highest possible PSNR value for a given video sequence over the range of spatial and temporal shifts. Only one temporal shift is allowed for all frames in the entire processed video sequence (i.e., constant delay).
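The exhaustive search idea described above can be sketched roughly as follows. This is a simplified illustration, not the reference algorithm from the contribution: the function name `psnr_exhaustive`, the search ranges, and the frame layout (lists of rows of luma values) are assumptions made here, and the gain and level offset calibration that the abstract mentions is deliberately left out.

```python
import math

def psnr_exhaustive(original, processed, max_shift=1, max_delay=1, peak=255.0):
    """Return (best_psnr, (delay, dy, dx)) over all candidate alignments.

    original, processed: lists of frames; each frame is a list of rows.
    One (delay, dy, dx) triple is applied to the whole sequence, which
    corresponds to the constant-delay assumption.
    """
    height, width = len(original[0]), len(original[0][0])
    best_psnr, best_shift = -math.inf, None

    for delay in range(-max_delay, max_delay + 1):
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                sq_err, count = 0.0, 0
                for t, proc_frame in enumerate(processed):
                    if not 0 <= t + delay < len(original):
                        continue  # no matching original frame
                    orig_frame = original[t + delay]
                    # Use only the central region that stays inside the
                    # frame for every spatial shift being searched.
                    for y in range(max_shift, height - max_shift):
                        for x in range(max_shift, width - max_shift):
                            diff = orig_frame[y + dy][x + dx] - proc_frame[y][x]
                            sq_err += diff * diff
                            count += 1
                if count == 0:
                    continue
                mse = sq_err / count
                psnr = math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
                if psnr > best_psnr:
                    best_psnr, best_shift = psnr, (delay, dy, dx)

    return best_psnr, best_shift
```

Pooling the squared error over the whole sequence before converting to decibels is one design choice among several; averaging per-frame PSNR values would give different numbers.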

   

Stephen Wolf; Margaret Pinson, "Fast Low Bandwidth Video Quality Model (VQM) Description and Reference Code," ITU-T Contribution COM9-C5-E, Geneva, February 2-6, 2009

Abstract & keywords

   

Keywords: digital television; reduced reference; video quality model; VQM; Fast Low Bandwidth VQM

Study Group 9 has been working on finalizing Draft New Recommendation J.redref which specifies Reduced Reference (RR) methods for performing video quality measurements of digital television systems. The U.S. National Telecommunications and Information Administration (NTIA) recently submitted two Video Quality Models (VQMs) to the recent Video Quality Experts Group (VQEG) RR Television (RRTV) evaluation tests. One of the VQMs (i.e., the Fast Low Bandwidth VQM) was in the top performing group for both 525-line and 625-line video systems and also significantly outperformed Peak Signal to Noise Ratio (PSNR), a full reference VQM. This contribution describes the NTIA Fast Low Bandwidth VQM algorithm and provides reference code for its implementation. While NTIA holds multiple U.S. patents on this algorithm, NTIA/ITS has made and is willing to make these freely available to all interested parties for both non-commercial and commercial purposes.

   

Stephen Wolf, "A No Reference (NR) and Reduced Reference (RR) Metric for Detecting Dropped Video Frames," Fourth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM-09), Scottsdale, Arizona, January 15-16, 2009

Abstract & keywords

   

Keywords: video quality; dropped frames; reduced reference; no reference; video quality metrics

Digital video transmission systems consisting of a video encoder, a digital transmission method (e.g., Internet Protocol – IP), and a video decoder can produce pauses in the video presentation that result from dropped or repeated video frames. For example, a common response of a video decoder to dropped IP packets is to momentarily freeze the video by repeating the last good video frame. This paper presents a No Reference (NR) metric and a Reduced Reference (RR) metric for detecting these dropped video frames. These metrics may have application for in-service video quality monitoring.
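To make the frame-freeze behavior concrete, a no-reference check for repeated frames might look like the sketch below. This is not the metric presented in the paper; it is a simplified illustration that flags a frame as a repeat when its mean squared difference from the previous frame falls below a threshold. The function name `find_repeated_frames` and the `threshold` value are assumptions made here.

```python
def find_repeated_frames(frames, threshold=1.0):
    """Return indices of frames that look like repeats of the previous frame.

    frames: list of frames; each frame is a list of rows of luma values.
    threshold: assumed cutoff on the mean squared frame difference.
    """
    repeated = []
    for t in range(1, len(frames)):
        prev, cur = frames[t - 1], frames[t]
        # Sum of squared pixel differences between consecutive frames.
        sq_diff = sum(
            (a - b) ** 2
            for prev_row, cur_row in zip(prev, cur)
            for a, b in zip(prev_row, cur_row)
        )
        n_pixels = sum(len(row) for row in cur)
        if sq_diff / n_pixels < threshold:
            repeated.append(t)
    return repeated
```

A fixed threshold like this would also flag genuinely static scenes, which is one reason a practical metric needs more than a single frame-difference test.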

   

Margaret H. Pinson; Stephen Wolf, In-Service Video Quality Metric (IVQM) User’s Manual, NTIA Handbook HB-09-434b, January 2009

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; automatic measurements; digital video; objective video quality performance; batch video clip processing

This handbook provides a user’s manual for the batch video quality metric (BVQM) tool. BVQM runs under the Windows XP® or Red Hat Linux® operating systems. BVQM performs objective automated quality assessments of processed video clip batches (i.e., as output by a video system under test). BVQM reports video calibration and quality metric results such as: temporal registration, spatial registration, spatial scaling, valid region, gain/level offset, and objective video quality estimates. BVQM operates on original and processed video files only, and has no video capture capability. BVQM compares the original video clip to the processed video clip and reports quality estimates on a scale from zero to one. On this scale, zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level (excursions beyond one are possible for extremely impaired video sequences).

   

Margaret H. Pinson; Stephen Wolf, Batch Video Quality Metric (BVQM) User’s Manual, NTIA Handbook HB-09-441c, January 2009

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; automatic measurements; digital video; objective video quality performance; batch video clip processing

This handbook provides a user’s manual for the batch video quality metric (BVQM) tool. BVQM runs under the Windows XP® or Vista® operating systems. BVQM performs objective automated quality assessments of processed video clip batches (i.e., as output by a video system under test). BVQM reports video calibration and quality metric results such as: temporal registration, spatial registration, spatial scaling, valid region, gain/level offset, and objective video quality estimates. BVQM operates on original and processed video files only, and has no video capture capability. BVQM compares the original video clip to the processed video clip and reports quality estimates on a scale from zero to one. On this scale, zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level (excursions beyond one are possible for extremely impaired video sequences).

   

Margaret H. Pinson; Stephen Wolf, Techniques for Evaluating Objective Video Quality Models Using Overlapping Subjective Data Sets, NTIA Technical Report TR-09-457, November 2008

Abstract & keywords

   

Keywords: performance; Video; quality; objective; subjective; correlation; combining; mapping; multi-media; VQEG

This report presents techniques for evaluating objective video quality models using overlapping subjective data sets. The techniques are demonstrated using data from the Video Quality Experts Group (VQEG) Multi-Media (MM) Phase I experiments. These results also provide a supplemental analysis of the performance achieved by the objective models that were submitted to the MM Phase I experiments. The analysis presented herein uses the subjective scores from the common set of video clips to map all the subjective scores from the 13 or 14 experiments (at a given image resolution) onto a single subjective scale. This mapping greatly increases the available data and thus allows for more powerful analysis techniques to be performed. Resolving power values are presented for each model and image resolution. On a per-clip level, models' responses to stimuli are analyzed with respect to all stimuli, each coding algorithm, coding-only impairments, and transmission error impairments. The models' responses to stimuli are also analyzed on per-system and per-scene levels. Results indicate the amount of improvement possible when averaging over multiple scenes or systems.

   

Stephen Wolf, A No Reference (NR) and Reduced Reference (RR) Metric for Detecting Dropped Video Frames, NTIA Technical Memo TM-09-456, October 2008

Abstract & keywords

   

Keywords: Video; metrics; dropped; frames; No Reference (NR); Reduced Reference (RR)

Digital video transmission systems consisting of a video encoder, a digital transmission method (e.g., Internet Protocol – IP), and a video decoder can produce pauses in the video presentation that result from dropped or repeated video frames. For example, a common response of a video decoder to dropped IP packets is to momentarily freeze the video by repeating the last good video frame. This document presents a No Reference (NR) metric and a Reduced Reference (RR) metric for detecting these dropped video frames. These metrics may have application for in-service video quality monitoring.
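
As a rough illustration of the frame-repeat behavior described above, the following minimal sketch flags frames that are nearly identical to their predecessor. It is not the NR or RR metric from the memo. The function name `detect_repeated_frames`, the mean-absolute-difference measure, the threshold value, and the toy frame data are all assumptions made for this example.

```python
def detect_repeated_frames(frames, threshold=0.5):
    """Return indices of frames that look like repeats of the previous frame.

    frames: list of equally sized lists of luminance values (hypothetical input).
    threshold: assumed cutoff on the mean absolute difference between frames.
    """
    repeated = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        # Mean absolute luminance difference between consecutive frames.
        mad = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mad < threshold:
            repeated.append(i)
    return repeated


# Toy sequence: frames 2 and 3 repeat frame 1, as a decoder might do
# after dropped IP packets.
frames = [
    [10, 20, 30, 40],
    [12, 25, 33, 47],
    [12, 25, 33, 47],
    [12, 25, 33, 47],
    [20, 31, 45, 52],
]
print(detect_repeated_frames(frames))
```

A fixed threshold like this would mistake genuinely still scenes for frozen video, so a practical detector needs more context than this sketch provides.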

   

Margaret H. Pinson; Stephen Wolf, Batch Video Quality Metric (BVQM) User’s Manual, NTIA Handbook HB-08-441b, November 2007

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; automatic measurements; digital video; objective video quality performance; batch video clip processing

This handbook provides a user’s manual for the batch video quality metric (BVQM) tool. BVQM runs under the Windows XP® or Red Hat Linux® operating systems. BVQM performs objective automated quality assessments of processed video clip batches (i.e., as output by a video system under test). BVQM reports video calibration and quality metric results such as: temporal registration, spatial registration, spatial scaling, valid region, gain/level offset, and objective video quality estimates. BVQM operates on original and processed video files only, and has no video capture capability. BVQM compares the original video clip to the processed video clip and reports quality estimates on a scale from zero to one. On this scale, zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level (excursions beyond one are possible for extremely impaired video sequences).

   

Margaret H. Pinson; Stephen Wolf, Reduced Reference Video Calibration Algorithms, NTIA Technical Report TR-08-433b, November 2007

Abstract & keywords

   

Keywords: Video; gain; calibration; spatial scaling; delay; offset; spatial shift; temporal shift

This report describes four Reduced Reference (RR) video calibration algorithms of low computational complexity. RR methods are useful for performing end-to-end in-service video quality measurements since these methods utilize a low bandwidth network connection between the original (source) and processed (destination) ends. The first RR video calibration algorithm computes temporal registration of the processed video stream with respect to the original video stream (i.e., video delay estimation). The second algorithm jointly calculates spatial scaling and spatial shift. The third algorithm calculates luminance gain and level offset of the processed video stream with respect to the original video stream. The fourth algorithm estimates the valid video region of the original or processed video stream (i.e., the portion of the video image that contains actual picture content). All the algorithms utilize only the luminance (Y) image plane of the video signal.
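
To make the third calibration step concrete, the sketch below fits a gain and a level offset by ordinary least squares, assuming the model processed = gain × original + offset on luminance samples that are already spatially and temporally registered. The report's own algorithm works from reduced-reference features and is not specified in the abstract, so the function name `estimate_gain_offset` and the sample values are assumptions for illustration only.

```python
def estimate_gain_offset(original, processed):
    """Least-squares fit of processed ≈ gain * original + offset.

    Both inputs are lists of luminance (Y) samples assumed to be
    already aligned sample for sample.
    """
    n = len(original)
    mean_o = sum(original) / n
    mean_p = sum(processed) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(original, processed))
    var = sum((o - mean_o) ** 2 for o in original)
    gain = cov / var
    offset = mean_p - gain * mean_o
    return gain, offset


# Hypothetical luminance samples and a simulated system with
# gain 0.9 and level offset 5.0.
original = [16.0, 64.0, 128.0, 200.0, 235.0]
processed = [0.9 * y + 5.0 for y in original]
gain, offset = estimate_gain_offset(original, processed)
print(gain, offset)
```

Once estimated, the correction can be removed from the processed video as (processed − offset) / gain before quality is measured.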

   

Margaret H. Pinson; Stephen Wolf; Robert B. Stafford, "Video Performance Requirements for Tactical Video Applications," 2007 IEEE Conference on Technologies for Homeland Security, pp.85-90, 16-17 May 2007 doi: 10.1109/THS.2007.370025

Abstract & keywords

   

Keywords: telecommunication standards; measurement; Degradation; Law enforcement; Manufacturing; Safety devices; Testing; Video compression; Video equipment; Video sharing

The Public Safety Statement of Requirements (PS-SoR) for Communications and Interoperability focuses on the needs of first responders to communicate and share information as authorized, when it is needed, where it is needed, and in a mode or form that allows the practitioners to effectively use it. PS-SoR Volume I defined functional communication and interoperability requirements. Published in September, 2006, PS-SoR Volume II identifies quantitative performance metrics, including minimum video performance requirements for public safety's tactical video applications. The goal was not to identify what is achievable with current technology but rather, looking towards the future, to investigate the minimum level of performance that first responders need in order to effectively use their video equipment. On behalf of the SAFECOM Program and the Office of Law Enforcement Standards, the Institute for Telecommunication Sciences (ITS) conducted subjective video quality testing to estimate the level of video quality that first responders find acceptable for tactical video applications. This subjective testing utilized source video content that is typical of public safety operations in structured subjective viewing experiments with 35 first responders. The evaluations from these first responders, in viewing high quality video (original video) and purposefully degraded video (using video compression and transmission equipment), allowed determination of basic quality thresholds for public safety tactical video applications. These perceptual quality thresholds have been translated into technical parameters for use by video equipment designers, manufacturers, and customers. This paper summarizes those findings. Other testing to evaluate requirements for other public safety applications is underway.

   

Stephen Wolf; Margaret H. Pinson, "Application of the NTIA General Video Quality Metric (VQM) to HDTV Quality Monitoring," Third International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM-07), Scottsdale, Arizona, January 25-26, 2007

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; digital video; objective video quality performance

This paper summarizes results from an experiment whose goal was to assess whether the NTIA General Video Quality Metric (VQM) is an acceptable objective metric for measuring High Definition TV (HDTV) video quality. The HDTV subjective test that was performed to evaluate the NTIA General VQM contained 60 30-second video clips that were rated using the Single Stimulus Continuous Quality Evaluation (SSCQE) method. The 60 clips included twelve 1080i HDTV originals and 48 processed versions of these originals from 16 different video systems. The video systems included 5 different HDTV codecs running at bit rates from 2 to 19 Mbps and broadcast transmission errors (i.e., RF transmission with poor signal-to-noise-ratio). Excellent objective-to-subjective correlation results for this experiment demonstrate the potential application of the NTIA General VQM to HDTV quality monitoring.

   

Margaret H. Pinson; Stephen Wolf, In-Service Video Quality Metric (IVQM) User’s Manual, NTIA Handbook HB-06-434a, July 2006

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; automatic measurements; digital video; end-to-end; in-service; objective video quality performance

The purpose of this handbook is to provide a user’s manual for the in-service video quality metric (IVQM) tool. IVQM performs automated processing of live video signals. This program runs under the Windows XP® operating system on two PCs communicating through an IP connection. IVQM performs image acquisition, temporal registration, other video calibration (spatial registration, spatial scaling, valid region, and gain/level offset), and video quality estimation. IVQM compares the source video sequence to the destination video sequence (i.e., as output by the video system under test). Each program alternates between video capture and video analysis. Every source/destination video sequence pair is processed through three main steps. First, the sequences are buffered onto a hard drive. Second, the sequences are temporally registered. Third, the video quality of the destination video sequence is estimated. Quality estimates are reported on a scale of zero to one, where zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level. Some video sequences may also be used to estimate other calibration values (spatial registration, spatial scaling, valid region estimation, and gain/level offset). The user has control over how often these other calibration values are calculated.

   

Margaret H. Pinson; Stephen Wolf, Reduced Reference Video Calibration Algorithms, NTIA Technical Report TR-06-433a, July 2006

Abstract & keywords

   

Keywords: Video; gain; calibration; spatial scaling; delay; offset; spatial shift; temporal shift

This report describes four Reduced Reference (RR) video calibration algorithms of low computational complexity. RR methods are useful for performing end-to-end in-service video quality measurements since these methods utilize a low bandwidth network connection between the original (source) and processed (destination) ends. The first RR video calibration algorithm computes temporal registration of the processed video stream with respect to the original video stream (i.e., video delay estimation). The second algorithm jointly calculates spatial scaling and spatial shift. The third algorithm calculates luminance gain and level offset of the processed video stream with respect to the original video stream. The fourth algorithm estimates the valid video region of the original or processed video stream (i.e., the portion of the video image that contains actual picture content). All the algorithms utilize only the luminance (Y) image plane of the video signal.

   

Margaret H. Pinson; Stephen Wolf, In-Service Video Quality Metric (IVQM) User’s Manual, NTIA Handbook HB-06-434, December 2005

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; automatic measurements; digital video; end-to-end; in-service; objective video quality performance

The purpose of this handbook is to provide a user's manual for the in-service video quality metric (IVQM) tool. IVQM performs automated processing of live video signals. This program runs under the Windows XP® operating system on two PCs communicating through an IP connection. IVQM performs image acquisition, temporal registration, other video calibration (spatial registration, spatial scaling, valid region, and gain/level offset), and video quality estimation. IVQM compares the source video sequence to the destination video sequence (i.e., as output by the video system under test). Each program alternates between video capture and video analysis. Every source/destination video sequence pair is processed through three main steps. First, the sequences are buffered onto a hard drive. Second, the sequences are temporally registered. Third, the video quality of the destination video sequence is estimated. Quality estimates are reported on a scale of zero to one, where zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level. Some video sequences may also be used to estimate other calibration values (spatial registration, spatial scaling, valid region estimation, and gain/level offset). The user has control over how often these other calibration values are calculated.

   

Mark A. McFarland; Margaret H. Pinson; Stephen Wolf, Batch Video Quality Metric (BVQM) User’s Manual, NTIA Handbook HB-06-441a, December 2005

Abstract & keywords

   

Keywords: video quality; metrics; video calibration; automatic measurements; digital video; objective video quality performance; batch video clip processing

This handbook provides a user’s manual for the batch video quality metric (BVQM) tool. BVQM runs under the Windows XP® operating system. BVQM performs objective automated quality assessments of processed video clip batches (i.e., as output by a video system under test). BVQM reports video calibration and quality metric results such as: temporal registration, spatial registration, spatial scaling, valid region, gain/level offset, and objective video quality estimates. BVQM operates on original and processed video files only, and has no video capture capability. BVQM compares the original video clip to the processed video clip and reports quality estimates on a scale from zero to one. On this scale, zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level (excursions beyond one are possible for extremely impaired video sequences).

   

Margaret H. Pinson; Stephen Wolf, Reduced Reference Video Calibration Algorithms, NTIA Technical Report TR-06-433, October 2005

Abstract & keywords

   

Keywords: Video; gain; calibration; spatial scaling; delay; offset; spatial shift; temporal shift

This report describes four Reduced Reference (RR) video calibration algorithms of low computational complexity. RR methods are useful for performing end-to-end in-service video quality measurements since these methods utilize a low bandwidth network connection between the original (source) and processed (destination) ends. The first RR video calibration algorithm computes temporal registration of the processed video stream with respect to the original video stream (i.e., video delay estimation). The second algorithm jointly calculates spatial scaling and spatial shift. The third algorithm calculates luminance gain and level offset of the processed video stream with respect to the original video stream. The fourth algorithm estimates the valid video region of the original or processed video stream (i.e., the portion of the video image that contains actual picture content). All the algorithms utilize only the luminance (Y) image plane of the video signal.

   

Margaret Pinson; Stephen Wolf, "Low Bandwidth Reduced Reference Video Quality Monitoring System," First International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Scottsdale, Arizona, January 23-25, 2005.

Abstract & keywords

   

Keywords: video quality; objective; subjective; correlation; in-service; reduced reference

This paper presents a new reduced reference (RR) video quality monitoring system that utilizes less than 10 kbits/s of reference information from the source video stream. This new video quality monitoring system utilizes feature extraction techniques similar to those found in the NTIA General Video Quality Model (VQM) that was recently standardized by the American National Standards Institute (ANSI) and the International Telecommunication Union (ITU). Objective to subjective correlation results are presented for 18 subjectively rated data sets that include more than 2500 video clips from a wide range of video scenes and systems. The method is being implemented in a new end-to-end video quality monitoring tool that utilizes the Internet to communicate the low bandwidth features between the source and destination ends.

   

Margaret H. Pinson; Stephen Wolf, Video scaling estimation technique, NTIA Technical Memo TM-05-417, January 2005

Abstract & keywords

   

Keywords: video quality; objective; calibration; random search; spatial scaling

Digital video compression algorithms are being deployed that spatially stretch or shrink the video picture. Although small changes in spatial scaling are not usually noticeable to viewers, objective video quality measurement systems may be adversely impacted if the spatial scaling is not corrected. This report describes an algorithm that can be used to automatically measure the amount of spatial scaling present in a video system. This algorithm obtains satisfactory computational complexity by (1) separating the searches for horizontal & vertical scaling factors, (2) using image profiles rather than full images, and (3) using random rather than exhaustive searching techniques.

   

Margaret Pinson; Stephen Wolf, "A new standardized method for objectively measuring video quality," IEEE Transactions on Broadcasting, vol.50, no.3, pp. 312-322, Sept. 2004 doi: 10.1109/TBC.2004.834028

Abstract & keywords

   

Keywords: video quality; image quality; objective testing; subjective testing

The National Telecommunications and Information Administration (NTIA) General Model for estimating video quality and its associated calibration techniques were independently evaluated by the Video Quality Experts Group (VQEG) in their Phase II Full Reference Television (FR-TV) test. The NTIA General Model was the only video quality estimator that was in the top performing group for both the 525-line and 625-line video tests. As a result, the American National Standards Institute (ANSI) adopted the NTIA General Model and its associated calibration techniques as a North American Standard in 2003. The International Telecommunication Union (ITU) has also included the NTIA General Model as a normative method in two Draft Recommendations. This paper presents a description of the NTIA General Model and its associated calibration techniques. The independent test results from the VQEG FR-TV Phase II tests are summarized, as well as results from eleven other subjective data sets that were used to develop the method.

   

Margaret H. Pinson; Stephen Wolf, The impact of monitor resolution and type on subjective video quality testing, NTIA Technical Memo TM-04-412, March 2004

Abstract & keywords

   

Keywords: video quality; image quality; subjective testing; CIF; ITU-R Recommendation BT.601; monitor resolution

This memorandum compares subjective video quality test results from a professional cathode ray tube (CRT) television monitor with that of a consumer liquid crystal display (LCD) video phone monitor. The CRT monitor supported the full ITU–R Recommendation BT.601 resolution (720 x 486) while the LCD monitor only supported Common Intermediate Format (CIF) resolution (352 x 288). The subjective results from the two tests are very similar, with the only significant difference being that the CIF monitor masks impairments that appear in only one of the two interlaced fields.

   

Michael H. Brill; Jeffrey Lubin; Pierre Costa; Stephen Wolf; John Pearson, "Accuracy and cross-calibration of video quality metrics: new methods from ATIS/T1A1," Signal Processing: Image Communication, Vol. 19, Feb. 2004, pp. 101-107

Abstract & keywords

   

Keywords: telecommunication; video quality; Quality metrics; VQEG; Standards

Video quality metrics (VQMs) have often been evaluated and compared using simple measures of correlation to subjective mean opinion scores from panels of observers. However, this approach does not fully take into account the variability implicit in the observers. We present techniques for determining the statistical resolving power of a VQM, defined as the minimum change in the value of the metric for which subjective test scores show a significant change. Resolving power is taken as a measure of accuracy. These techniques have been applied to the video quality experts group (VQEG) data set and incorporated into the recent Alliance for Telecommunications Industry Solutions (ATIS) Committee T1A1 series of technical reports (TRs), which provide a comprehensive framework for characterizing and validating full-reference VQM. These approved TRs, while not standards, will enable the US telecommunications industry to incorporate VQMs into contracts and tariffs for compressed video distribution. New methods for assessing VQM accuracy and cross-calibrating VQMs are an integral part of the framework. These methods have been applied to two VQMs at this point: peak-signal-to-noise ratio and the version of Sarnoff's just noticeable difference metric (JNDmetrix) tested by VQEG (Rapporteur Q11/12 (VQEG): Final report from the VQEG on the validation of objective models of video quality assessment. June 2000). The framework is readily extensible to additional VQMs.

   

Stephen Wolf, Color correction matrix for digital still and video imaging systems, NTIA Technical Memo TM-04-406, December 2003

Abstract & keywords

   

Keywords: digital; Video; calibration; camera; channel; chart; color; colorspace; component; correction; matrix; non-linear; sRGB

This document discusses a method for correcting inaccurate color output by digital still and video imaging systems. The method uses a known reference image together with a least–squares algorithm to estimate the optimal color channel mixing matrix that must be applied to the output images in order to correct their color inaccuracies. The techniques presented in this document will provide users of digital photography and video equipment with an automated tool for correcting color output. For instance, digital photography users currently may try to correct color distortions in their images by trial and error using photo editing software. However, these correction procedures are time consuming and subjective and do not normally allow for arbitrary mixing of the color channels. The automated color correction matrix computation presented in this document allows each color component in the corrected image (e.g., red) to be calculated as a linear summation of a DC component and all the color components (e.g., red, green, and blue) in the uncorrected image. Methods to correct non–linearities in the color response of digital imaging systems are also discussed.
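
The linear model described above (each corrected component is a DC term plus a weighted sum of the uncorrected red, green, and blue values) amounts to fitting a 3×4 matrix by least squares. The sketch below solves the normal equations for that matrix in plain Python. The patch values, the "true" matrix used to generate reference data, and the function names are hypothetical, and the non-linearity correction mentioned in the memo is not covered.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_color_matrix(measured, reference):
    """Fit a 3x4 matrix M so that reference ≈ M applied to [1, r, g, b]."""
    rows = [[1.0, r, g, b] for r, g, b in measured]
    # Normal equations (A^T A) w = A^T y, solved once per output channel.
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(4)] for i in range(4)]
    matrix = []
    for ch in range(3):
        aty = [sum(row[i] * ref[ch] for row, ref in zip(rows, reference)) for i in range(4)]
        matrix.append(solve(ata, aty))
    return matrix


def correct(matrix, rgb):
    """Apply the fitted matrix to one uncorrected RGB triple."""
    vec = [1.0, *rgb]
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


# Hypothetical camera readings of chart patches (uncorrected RGB).
measured = [(0.2, 0.3, 0.4), (0.8, 0.1, 0.1), (0.1, 0.7, 0.2),
            (0.1, 0.2, 0.9), (0.5, 0.5, 0.5), (0.9, 0.9, 0.8)]
# Assumed ground-truth correction, used only to synthesize reference colors.
true_matrix = [[0.02, 1.10, -0.05, 0.00],
               [-0.01, 0.03, 0.95, 0.02],
               [0.00, -0.02, 0.04, 1.05]]
reference = [correct(true_matrix, rgb) for rgb in measured]

fitted = fit_color_matrix(measured, reference)
for row in fitted:
    print([round(v, 4) for v in row])
```

At least four patches that are not coplanar in RGB space are needed for the fit to be well determined. A real color chart supplies many more, which averages out measurement noise.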

   

Margaret Pinson; Stephen Wolf, "Comparing Subjective Video Quality Testing Methodologies," SPIE Video Communications and Image Processing Conference, Lugano, Switzerland, July 2003. doi: 10.1.1.108.5768

Abstract & keywords

   

Keywords: video quality; image quality; subjective testing; correlation; single stimulus continuous quality evaluation (SSCQE); double stimulus continuous quality scale (DSCQS); double stimulus comparison scale (DSCS); picture quality

International recommendations for subjective video quality assessment (e.g., ITU-R BT.500-11) include specifications for how to perform many different types of subjective tests. Some of these test methods are double stimulus where viewers rate the quality or change in quality between two video streams (reference and impaired). Others are single stimulus where viewers rate the quality of just one video stream (the impaired). Two examples of the former are the double stimulus continuous quality scale (DSCQS) and double stimulus comparison scale (DSCS). An example of the latter is single stimulus continuous quality evaluation (SSCQE). Each subjective test methodology has claimed advantages. For instance, the DSCQS method is claimed to be less sensitive to context (i.e., subjective ratings are less influenced by the severity and ordering of the impairments within the test session). The SSCQE method is claimed to yield more representative quality estimates for quality monitoring applications. This paper considers data from six different subjective video quality experiments, originally performed with SSCQE, DSCQS and DSCS methodologies. A subset of video clips from each of these six experiments were combined and rated in a secondary SSCQE subjective video quality test. We give a method for postprocessing the secondary SSCQE data to produce quality scores that are highly correlated to the original DSCQS and DSCS data. We also provide evidence that human memory effects for time-varying quality estimation seem to be limited to about 15 seconds.

   

Margaret Pinson; Stephen Wolf, "An Objective Method for Combining Multiple Subjective Data Sets," SPIE Video Communications and Image Processing Conference, Lugano, Switzerland, July 2003. doi: 10.1.1.105.967

Abstract & keywords

   

Keywords: video quality; image quality; objective testing; subjective testing; correlation; single stimulus continuous quality evaluation (SSCQE); double stimulus continuous quality scale (DSCQS); comparison

International recommendations for subjective video quality assessment (e.g., ITU-R BT.500-11) include specifications for how to perform many different types of subjective tests. In addition to displaying the video sequences in different ways, subjective tests also have different rating scales, different words associated with these scales, and many other test variables that change from one laboratory to another (e.g., viewer expertise and criticality, cultural differences, physical test environments). Thus, it is very difficult to directly compare or combine results from two or more subjective experiments. The ability to compare and combine results from multiple subjective experiments would greatly benefit developers and users of video technology since standardized subjective data bases could be expanded upon to include new source material and past measurement results could be related to newer measurement results. This paper presents a subjective method and an objective method for combining multiple subjective data sets. The subjective method utilizes a large meta-test with selected video clips from each subjective data set. The objective method utilizes the functional relationships between objective video quality metrics (extracted from the video sequences) and corresponding subjective mean opinion scores (MOSs). The objective mapping algorithm, called the iterated nested least-squares algorithm (INLSA), relates two or more independent data sets that are themselves correlated with some common intermediate variables (i.e., the objective video quality metrics). We demonstrate that the objective method can be used as an effective substitute for the expensive and time consuming subjective meta-test.

   

Stephen D. Voran, An iterated nested least-squares algorithm for fitting multiple data sets, NTIA Technical Memo TM-03-397, October 2002

Abstract & keywords

   

Keywords: audio quality estimation; speech quality estimation; video quality estimation; data set fitting; least-squares fitting; linear regression; meta-analysis

A multiple data set fitting problem often arises in conjunction with the development of objective estimators of perceived audio or video quality. In such development work, we often seek the best linear relationship between a set of objective audio or video quality estimation parameters and a set of subjective audio or video quality scores. In order to find the most robust and reliable relationship, we prefer to perform a least-squares fit using as many audio or video data points as possible. This motivates us to combine scores from different subjective tests. Unfortunately, scores from different subjective tests or data sets can differ in significant ways due to differing test procedures, environments, languages, and other sources. We develop a solution to this multiple data set fitting problem: the iterated nested least-squares (INLS) algorithm. This algorithm iterates between two least-squares steps. One step attempts to homogenize heterogeneous data sets through the use of a single first-order correction for all of the data points in each data set. The other least-squares step solves for the appropriate linear combination of the parameters, across all data sets. We also offer example INLS algorithm results using simulation data and data from telephone-bandwidth speech quality tests. For convenience we have written this memorandum in the language of objective estimation of perceived audio and video quality but the results are completely general and can be used to fit other types of data sets as well.
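
The alternation between the two least-squares steps can be sketched as follows. This is a simplified illustration and not the INLS algorithm as specified in the memo: it uses a single objective parameter instead of a linear combination of several, and it anchors the first data set as a fixed reference (gain 1, offset 0) to rule out a degenerate all-constant solution. That anchoring is an assumption of this example, and the memo may handle it differently. All names and data are hypothetical.

```python
def linfit(x, y):
    """Ordinary least-squares fit y ≈ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def alternating_fit(datasets, iterations=100):
    """datasets: list of (objective_values, subjective_scores).

    datasets[0] is treated as the reference scale (assumption).
    Returns the pooled (slope, intercept) and per-data-set (gain, offset).
    """
    corrections = [(1.0, 0.0) for _ in datasets]
    for _ in range(iterations):
        # Step 1: fit one linear model to all corrected scores, pooled.
        xs, ys = [], []
        for (x, y), (a, b) in zip(datasets, corrections):
            xs.extend(x)
            ys.extend(a * s + b for s in y)
        slope, intercept = linfit(xs, ys)
        # Step 2: refit a first-order correction for each non-reference set
        # so its scores best match the current pooled model.
        for k in range(1, len(datasets)):
            x, y = datasets[k]
            target = [slope * v + intercept for v in x]
            corrections[k] = linfit(y, target)
    return (slope, intercept), corrections


# Synthetic example: underlying quality is 1 + 2x. Data set B reports the
# same quality on a different scale, y = (q - 3) / 0.5.
x_a = [0.0, 1.0, 2.0, 3.0, 4.0]
y_a = [1 + 2 * v for v in x_a]
x_b = [0.5, 1.5, 2.5, 3.5]
y_b = [((1 + 2 * v) - 3) / 0.5 for v in x_b]

(slope, intercept), corrections = alternating_fit([(x_a, y_a), (x_b, y_b)])
print(slope, intercept, corrections[1])
```

With noise-free synthetic data the iteration settles on the underlying line and on the gain and offset that map the second data set onto the reference scale. Real subjective data will leave residual error in both steps.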

   

Stephen Wolf; Margaret Pinson, Video Quality Measurement Techniques, NTIA Technical Report TR-02-392, June 2002

Abstract & keywords

   

Keywords: television; models; Video; quality; metrics; features; parameters; objective; subjective; correlation; reduced-reference; videoconferencing; root cause analysis; spatial information (SI); temporal information (TI); impairments; blocking; blurring; frame dropping; peak-signal-to-noise ratio (PSNR); video calibration; spatial registration; temporal registration; gain; contrast; level offset; brightness

Objective metrics for measuring digital video performance are required by Government and industry for specification of system performance requirements, comparison of competing service offerings, service level agreements, network maintenance, and optimization of the use of limited network resources such as transmission bandwidth. To be accurate, digital video quality measurements must be based on the perceived quality of the actual video being received by the users of the digital video system rather than the measured quality of traditional video test signals (e.g., color bar). This is because the performance of digital video systems is variable and depends upon the dynamic characteristics of both the original video (e.g., spatial detail, motion) and the digital transmission system (e.g., bit rate, error rate). The goal of this report is to provide a complete description of the ITS video quality metric (VQM) algorithms and techniques. The ITS automated objective measurement algorithms provide close approximations to the overall quality impressions, or mean opinion scores, of digital video impairments that have been graded by panels of viewers.

   

Margaret Pinson; Stephen Wolf, Video Quality Measurement User’s Manual, NTIA Handbook HB-02-01, February 2002

Abstract & keywords

   

Keywords:

The purpose of this handbook is to provide a user’s manual for the video quality metric (VQM) tool. The VQM software tool performs automated batch processing of video files. Program VQM runs under the UNIX operating system and uses a control file to specify the exact video quality measurement procedures that are to be performed. All results are emailed to the user. Program VQM compares the video sequence that has been processed by the video system under test to the original video sequence through two main steps. First, program VQM calibrates the processed video sequence to remove systematic differences between the original and processed, such as spatial and temporal shifts. Second, program VQM estimates and reports the perceived quality of the processed video using one of five video quality models. Quality estimates are reported on a scale of zero to one, where zero means that no impairment is visible and one means that the video clip has reached the maximum impairment level.

   

Wael Ashmawi; Roch Guerin; Stephen Wolf; Margaret Pinson, "On the Impact of Policing and Rate Guarantees in Diff-Serv Networks: A Video Streaming Application Perspective," Association for Computing Machinery, Special Interest Group on Data Communications (SIGCOMM 2001), pp. 83-95, August 27-31, 2001, San Diego, CA

Abstract & keywords

   

Keywords: quality of service; objective; subjective; correlation; streaming video quality metrics; Internet2; differentiated services; expedited forwarding

Over the past few years, there have been a number of proposals aimed at introducing different levels of service in the Internet. One of the more recent proposals is the Differentiated Services (Diff-Serv) architecture, and in this paper we explore how the policing actions and associated rate guarantees provided by the Expedited Forwarding (EF) translate into perceived benefits for applications that are the presumed users of such enhancements. Specifically, we focus on video streaming applications that arguably have relatively strong service quality requirements, and which should, therefore, stand to benefit from the availability of some form of enhanced service. Our goal is to gain a better understanding of the relation that exists between application level quality measures and the selection of the network level parameters that govern the delivery of the guarantees that an EF based service would provide. Our investigation, which is experimental in nature, relies on a number of standard streaming video servers and clients that have been modified and instrumented to allow quantification of the perceived quality of the received video stream. Quality assessments are performed using a Video Quality Measurement tool based on the ANSI objective quality standard. Measurements were made over both a local Diff-Serv testbed and across the QBone, a QoS enabled segment of the Internet2 infrastructure. The paper reports and analyzes the results of those measurements.

   

Stephen Wolf; Margaret H. Pinson, "The Relationship Between Performance and Spatial-Temporal Region Size for Reduced-Reference, In-Service Video Quality Monitoring Systems," Proc. of 5th World Multiconference on Systemics, Cybernetics and Informatics (ISAS-SCI 2001), Orlando, FL, July 22-25, 2001.

Abstract & keywords

   

Keywords: objective; subjective; correlation; reduced-reference; in-service; video quality metrics; spatial; temporal; compression; sub-region

This paper presents objective-to-subjective correlation results for a reduced-reference, in-service, video quality monitoring system. This reduced-reference system utilizes quality parameters that are computed by comparing features extracted from spatial-temporal (S-T) regions of the input video stream with identical features extracted from the output video stream. The amount of reduced-reference information that is required to compute the quality parameters is inversely related to the size of the S-T region. Smaller amounts of reference information (i.e., larger S-T regions) are desired since less transmission or storage bandwidth is required for the reference information. However, objective-to-subjective correlation drops off if the S-T region size becomes too large. In this paper we examine the tradeoffs between objective-to-subjective correlation results and S-T region size. Correlation results for S-T region sizes from 8 vertical lines x 8 horizontal pixels x 2 video frames to 128 x 128 x 24 are presented. These results utilized a total of nine subjectively rated data sets that span an extremely wide range of bit rates and compression techniques. Thus, designers of television video systems as well as Internet video streaming systems may use the results.

   

Stephen Wolf; Margaret H. Pinson, "Spatial-Temporal Distortion Metrics for In-Service Quality Monitoring of Any Digital Video System," SPIE International Symposium on Voice, Video, and Data Communications, Boston, MA, September 11-22, 1999.

Abstract & keywords

   

Keywords: objective; subjective; correlation; in-service; video quality metrics; spatial; temporal; compression

Many organizations have focused on developing digital video quality metrics which produce results that accurately emulate subjective responses. However, to be widely applicable a metric must also work over a wide range of quality, and be useful for in-service quality monitoring. The Institute for Telecommunication Sciences (ITS) has developed spatial-temporal distortion metrics that meet all of these requirements. These objective metrics are described in detail and have a number of interesting properties, including utilization of 1) spatial activity filters which emphasize long edges on the order of 1/5 degree while simultaneously performing large amounts of noise suppression, 2) the angular direction of the spatial gradient, 3) spatial-temporal compression factors of at least 384:1 (spatial compression of at least 64:1 and temporal compression of at least 6:1), and 4) simple perceptibility thresholds and spatial-temporal masking functions. Results are presented that compare the objective metric values with mean opinion scores from a wide range of subjective data bases spanning many different scenes, systems, bit-rates, and applications.

   

Margaret H. Pinson; Stephen Wolf, "Medium Bandwidth Techniques for Estimating Temporal Delays Between Input and Output Video Sequences," ANSI T1A1 Contribution T1A1.5/99-205, May 1999

Abstract & keywords

   

Keywords: video delay; temporal alignment; medium bandwidth features

This contribution presents new techniques for estimating temporal delays between input and output video streams from video teleconferencing systems. The discussion in this contribution considers "variable alignment," defined as an alignment process that can assign a unique alignment offset, or delay, to every output video frame in a given output video sequence. To achieve the bandwidth reduction necessary for in-service measurements, the variable alignment techniques presented in this contribution are applied to subsampled video images. We have found that a sub-sampling factor of 128 to 1 produces useful variable alignment results. Regardless of the sub-sampling factor used (including sub-sampling factors of 1), there is always a possibility of misalignment. The probability of misalignment increases as the amount of scene motion decreases and the video distortion present in the output increases. Therefore, it has been necessary to implement artificial intelligence techniques that examine the set of most probable alignments produced by the signal processing portion of the algorithm. All of the techniques presented in this contribution can also be applied to whole field correlation algorithms such as those given in ANSI T1.801.04.

   

Margaret H. Pinson; Stephen Wolf, "Low Bandwidth Techniques for Estimating Temporal Delays Between Input and Output Video Sequences," ANSI T1A1 Contribution T1A1.5/99-204, May 1999.

Abstract & keywords

   

Keywords: video delay; temporal alignment; low bandwidth features

This contribution presents a new technique for estimating temporal delays between input and output video streams from video teleconferencing systems. The discussion in this contribution is limited to "constant alignment," defined as an alignment process that computes the same fixed alignment offset, or delay, for every output video frame in a given output video sequence. Constant alignment can (1) serve as a starting point for variable alignment techniques, where each output frame can have a unique alignment offset, or delay, and (2) be used to align low frame rate output video sequences in preparation for measuring video quality parameters. The constant alignment technique that is presented is applicable for real-time in-service monitoring since it utilizes a set of computationally efficient low bandwidth features that are extracted from the input and output video streams.
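
The idea of constant alignment, a single delay applied to every output frame, can be illustrated with a minimal sketch. This is not the algorithm from the contribution: the per-frame feature values, the function names, and the use of Pearson correlation as the matching criterion are all assumptions made for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def estimate_constant_delay(input_feat, output_feat, max_delay):
    """Return the single delay (in frames) that best aligns the output to the input.

    input_feat and output_feat hold one low bandwidth feature value per frame
    (hypothetical here, e.g. a per-frame energy measure). Every candidate delay
    from 0 to max_delay is scored, and the best-correlated one is returned.
    """
    best_delay, best_corr = 0, float("-inf")
    for d in range(max_delay + 1):
        n = min(len(input_feat), len(output_feat) - d)
        if n < 2:
            break  # not enough overlapping frames to correlate
        c = pearson(input_feat[:n], output_feat[d:d + n])
        if c > best_corr:
            best_delay, best_corr = d, c
    return best_delay


# Hypothetical feature streams: the output is the input delayed by 3 frames.
inp = [0, 1, 4, 2, 7, 3, 9, 5, 1, 6, 2, 8]
out = [0, 0, 0] + inp
delay = estimate_constant_delay(inp, out, max_delay=5)
```

Because only one scalar per frame is compared, the search cost grows with the number of frames and candidate delays rather than with image size, which is the property that makes this style of alignment attractive for in-service use.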

   

Stephen Wolf; Margaret H. Pinson, "In-Service Performance Metrics for MPEG-2 Video Systems," Proc. Made to Measure 98—Measurement Techniques of the Digital Age Technical Seminar, Montreux, Switzerland, November 12-13, 1998, International Academy of Broadcasting (IAB), ITU and Technical University of Braunschweig.

Abstract & keywords

   

Keywords: performance; Video; quality; metrics; features; parameters; objective; subjective; correlation; color; MPEG-2; spatial; temporal; measures; test; instrument

With the advent of new digital video systems that utilize compression to achieve a savings in transmission or storage bandwidth, the quality of the received output video can be dependent not only upon the inherent spatial and temporal information content of the input video but also upon the dynamic variability of the digital communications channel. Therefore, out-of-service quality measurements using video test signals or scenes may not relate at all to the resultant received quality of actual program material. Furthermore, traditional in-service quality measurements made by injecting test signals into the non-visible portion of the video signal (e.g., the vertical interval in the NTSC or PAL video standard) are not applicable to modern digital video systems. Thus, a new method is required to make in-service video quality measurements on actual program material. This paper describes a new test instrument for measuring the quality of a video transmission or storage system where the input and output of the system may be spatially separated, and when there is no a priori knowledge of the input video. The test instrument makes continuous quality measurements by (1) extracting statistics from sequences of processed input and output video frames, (2) communicating these extracted statistics between the input and the output ends using an ancillary-data channel of arbitrary bandwidth, (3) computing individual video quality parameters from the communicated statistics that are indicative of the various perceptual aspects of video quality (e.g., spatial, temporal, color), and (4) calculating a composite video quality metric by combining the individual video quality parameters. The test instrument makes coarser quality measurements (coarser in the sense that the extracted statistics come from larger spatial-temporal regions) when smaller capacity ancillary-data channels are available and finer quality measurements when larger capacity ancillary-data channels are available. 
The design goal for the test instrument is to make the most accurate in-service video quality measurements given the available ancillary-data-channel bandwidth (mobile telephone connections, modem connections over the Public Switched Telephone Network (PSTN), Internet connections, Local Area Network (LAN) connections, satellite connections, cable connections, etc.).

   

Charles Fenimore; John Libert; Stephen Wolf, "Perceptual Effects of Noise in Digital Video Compression," 140th SMPTE Technical Conference, Pasadena, CA, October 28-31, 1998.

Abstract & keywords

   

Keywords: noise; Video; quality; metrics; objective; subjective; correlation; MPEG-2; perception; scene; criticality

We present results of subjective viewer assessment of video quality of MPEG-2 compressed video containing wide-band Gaussian noise. The video test sequences consisted of seven test clips (both classical and new materials) to which noise with a peak-signal-to-noise-ratio (PSNR) of from 28 dB to 47 dB was added. We used software encoding and decoding at five bit-rates ranging from 1.8 Mb/s to 13.9 Mb/s. Our panel of 32 viewers rated the difference between the noisy input and the compression-processed output. For low noise levels, the subjective data suggests that compression at higher bit-rates can actually improve the quality of the output, effectively acting like a low-pass filter. We define an objective and a subjective measure of scene criticality (the difficulty of compressing a clip) and find the two measures correlate for our data. For difficult-to-encode material (high criticality), the data suggest that the effects of compression may be less noticeable at mid-level noise, while for easy-to-encode video (low criticality), the addition of a moderate amount of noise to the input led to lower quality scores. This suggests that either the compression process may have reduced noise impairments or a form of masking may occur in scenes that have high levels of spatial detail.

   

Coleen Jones; D. J. Atkinson, "Development of Opinion-Based Audiovisual Quality Models for Desktop Video-Teleconferencing," 1998 Sixth International Workshop on Quality of Service (IWQoS 98), pp.196-203, Napa, California, May 18-20, 1998. doi: 10.1109/IWQOS.1998.675239

Abstract & keywords

   

Keywords: models; video teleconferencing; Video; quality; objective; subjective; correlation; Audio; audio-visual

This paper discusses the analysis of an audiovisual desktop video-teleconferencing subjective experiment conducted at the Institute for Telecommunication Sciences. Objective models of the individual audio and video quality are presented. Also discussed is an objective model of the audio-visual quality based upon the results of the individual objective audio and video quality models. Finally, a subjective model of the audiovisual quality based upon users’ ratings of the audio and video quality is discussed.

   

Stephen Wolf; Margaret H. Pinson; Arthur A. Webster; Gregory W. Cermak; E. Paterson Tweedy, "Validating Objective Measures of MPEG Video Quality," SMPTE Journal, vol. 107, no. 4, pp. 226-235, New York City, April 1998

Abstract & keywords

   

Keywords: coding; Video; quality; metrics; objective; subjective; correlation; MPEG-2; compression; MPEG; MPEG-1

In 1996, the American National Standards Institute (ANSI) adopted ANSI T1.801.03, which presents a number of new objective video quality metrics for quantifying the effects of digital compression and transmission impairments. The measurements in ANSI T1.801.03 were selected based on an extensive multilaboratory quality assessment study that included video systems from bit rates of 64 kbit/sec to 45 Mbit/sec and video test scenes that spanned a wide range of spatial and temporal coding difficulties. The set of objective video quality measurements effectively accounted for subjective judgments by human viewers. While 25 video systems were tested, this multilaboratory study did not include MPEG video systems, and did not cover any bit rates between 1.6 and 10 Mbit/sec. This paper presents the results from two MPEG studies designed to fill in the bit-rate gap in the previous multilaboratory study. In these studies, we concentrated on bit rates from 1.5 - 8.3 Mbit/sec and examined the performance of MPEG 1 and MPEG 2 codecs (coder-decoders) specifically. The effectiveness of the ANSI standard objective video quality metrics was examined for these bit rates and coding technologies. Our analysis revealed that the objective video quality metrics primarily measure four principal components of video quality: added edges, lost edges, added motion, and lost motion; we found that parameters selected from these principal components can be used as effective predictors of subjective quality ratings for entertainment video systems.

   

Stephen Wolf, "Measuring the End-to-End Performance of Digital Video Systems," IEEE Transactions on Broadcasting, Vol. 43, No. 3, pp. 320-328, September 1997. doi: 10.1109/11.632940

Abstract & keywords

   

Keywords: digital; performance; standards; ANSI; Video; quality; parameters; objective; subjective; correlation

Significant research and development efforts by industry and government laboratories were brought to fruition in 1996 with the approval of American National Standard (ANSI) T1.801.03 entitled "American National Standard for Telecommunications - Digital Transport of One-Way Video Signals - Parameters for Objective Performance Assessment." This standard provides a set of objective parameters that have consistently demonstrated high correlation levels with subjective evaluations of digital video impairments. The parameters are technology-independent and may be used to measure the performance of a wide range of digital video compression, storage, and transmission systems. This paper presents an overview of the ANSI T1.801.03 parameters and summarizes other relevant standards activities and contributions.

   

Stephen Wolf; Margaret Pinson; Arthur Webster; Greg Cermak; E. Paterson Tweedy, "Objective and Subjective Measures of MPEG Video Quality," ANSI T1A1 Contribution T1A1.5/96-121, October 28, 1996.

Abstract & keywords

   

Keywords: coding; Video; quality; metrics; objective; subjective; correlation; MPEG-2; compression; MPEG; MPEG-1

Presents an in-depth analysis and discussion of the results from applying the ANSI T1.801.03-1996 objective video quality metrics to subjectively rated MPEG-1 and MPEG-2 video test scenes. The objective metrics presented in ANSI T1.801.03-1996 (American National Standard for Telecommunications - Digital Transport of One-Way Video Signals - Parameters for Objective Performance Assessment) were able to account for 90% of the subjective information that could be captured considering the level of measurement error present in the subjective and objective data sets. By contrast, peak signal to noise ratio (PSNR), a traditional objective metric, was only able to account for 21% of the subjective information.

   

Abstract & keywords

   

Keywords: coding; digital; Video; gain; level offset; calibration; delay; variable; compression; shift; dynamic

Presents a computerized search method for determining the gain, level offset, active video shift, and video delay of a digital video transmission channel. The method uses digitized input and output NTSC video fields that have been time tagged with SMPTE time code. The primary applications of the method are (1) to produce estimates of channel gain, level offset, and active video shift when data from ANSI T1.801.03-1996 calibration test patterns are not available, and (2) to provide a dynamic method to measure these quantities in conjunction with video delay.

   

D. J. Atkinson, Exploring B-ISDN performance: Selected experiments and results, NTIA Technical Report TR-96-329, April 1996

Abstract & keywords

   

Keywords: Broadband; performance; standards; network; measurement; asynchronous transfer mode; ATM; B-ISDN; emulation; SONET; synchronous optical network

This report describes experiments conducted to explore the user-information transfer performance of the broadband integrated services digital network (B-ISDN), the emerging infrastructure for the global information age. These performance experiments include studying the effect of physical layer transmission performance on asynchronous transfer mode (ATM) cell transfer performance, ATM performance in relationship to network topology, and the impact of B-ISDN performance on video quality. A tool to help study these performance issues, a B-ISDN network emulator, is described, including its validation. The emulator incorporates a novel model for transmission impairments, enabling performance interactions among the B-ISDN protocol layers to be studied based on relevant International Telecommunication Union – Telecommunication Standardization Sector (ITU-T) Recommendations and American National Standards.

   

Stephen Wolf, "Tape Duplication and Traceability Process for ANSI T1.801.02-1996 (Terms and Definitions)," ANSI T1A1 Contribution T1A1.5/95-158, December 14, 1995.

Abstract & keywords

   

Keywords: coding; digital; guidelines; Video; impairments; compression; scenes; examples; tape; traceability; duplication; conferencing; artifacts

Recommends a process to be used by ANSI for duplication and traceability of video tapes for ANSI T1.801.02-1996 (American National Standard for Telecommunications - Digital Transport of Video Teleconferencing/Video Telephony Signals - Performance Terms, Definitions, and Examples).

   

Stephen Wolf, "Tape Duplication and Traceability Process for ANSI T1.801.01-1996 (Test Scenes)," ANSI T1A1 Contribution T1A1.5/95-157, December 14, 1995.

Abstract & keywords

   

Keywords: coding; digital; guidelines; Video; impairments; Testing; compression; scenes; tape; traceability; duplication; conferencing

Recommends a process to be used by ANSI for duplication and traceability of video tapes for ANSI T1.801.01-1996 (American National Standard for Telecommunications - Digital Transport of Video Teleconferencing/Video Telephony Signals - Video Test Scenes for Subjective and Objective Performance Assessment).

   

Stephen Wolf, "Field Vs. Frame Calculations for Visual Channel Delay," ANSI T1A1 Contribution T1A1.5/95-152, December 11, 1995.

Abstract & keywords

   

Keywords: Video; delay; variable; NTSC; field; frame; rate; constant

This contribution recommends that calculations of visual channel delay be based on fields rather than frames for an interlaced video system such as NTSC. Using fields rather than frames will double the resolution of the measurement without significantly increasing the complexity and will overcome certain measurement anomalies that might result from calculations based on frames.
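
The resolution claim can be checked with simple arithmetic. The sketch below assumes the standard NTSC rates (30000/1001 frames per second, two interlaced fields per frame); the variable names are illustrative only.

```python
from fractions import Fraction

# Standard NTSC timing: about 29.97 frames/s, each frame made of two fields.
NTSC_FRAME_RATE = Fraction(30000, 1001)
NTSC_FIELD_RATE = 2 * NTSC_FRAME_RATE

# Smallest delay step that can be resolved when counting frames vs. fields.
frame_step_ms = float(1000 / NTSC_FRAME_RATE)  # about 33.37 ms
field_step_ms = float(1000 / NTSC_FIELD_RATE)  # about 16.68 ms

print(f"frame-based resolution: {frame_step_ms:.2f} ms")
print(f"field-based resolution: {field_step_ms:.2f} ms")
```

A delay measured in fields is therefore quantized in steps half the size of one measured in frames.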

   

Stephen Wolf, "An Update on ITS Video Quality Parameters," ANSI T1A1 Contribution T1A1.5/92-135, July 13, 1995.

Abstract & keywords

   

Keywords: digital; performance; models; Video; quality; metrics; objective; subjective; correlation; spatial; temporal; compression; efficient; computation

Presents a summary of efforts to make the objective video quality metrics more efficient in terms of computation and transmission bandwidth. Objective to subjective correlation results are presented for the subjectively rated data set given in the 1992 International Broadcasting Convention paper.

   

Stephen Wolf, "An Analysis Technique for Detecting Temporal Edge Noise," ANSI T1A1 Contribution T1A1.5/95-105, January 9, 1995.

Abstract & keywords

   

Keywords: digital; performance; noise; Video; quality; objective; impairments; Testing; temporal; scene; edge

Presents an objective analysis technique that can be used for detecting temporal edge noise in an output video sequence that has been digitally compressed. Temporal edge noise is defined in ANSI T1.801.02-1996 as "A form of edge busyness characterized by time-varying sharpness to edges of objects." The technique uses the Fourier transform to perform a comparative analysis of the frequencies present in the time history of the spatial information (SI) features measured from the input and output video (SI features are defined in ANSI T1.801.03-1996). The presence of temporal edge noise in the output video is shown to result in high frequency information being added to the time history of the SI feature.
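
A minimal sketch of the comparison described above, assuming the SI feature has already been measured once per frame for both input and output. The synthetic SI values, the cutoff bin, and the function names are hypothetical; a naive discrete Fourier transform is used to keep the example self-contained.

```python
import cmath
import math


def dft_magnitudes(series):
    """Magnitudes of the DFT bins 0..n/2 of a real-valued time history."""
    n = len(series)
    return [
        abs(sum(series[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def high_freq_energy(series, cutoff_bin):
    """Sum of squared DFT magnitudes at or above cutoff_bin."""
    return sum(m * m for m in dft_magnitudes(series)[cutoff_bin:])


# Hypothetical SI time histories (one value per frame).
n = 32
si_in = [50 + 5 * math.sin(2 * math.pi * t / n) for t in range(n)]
# The output adds a frame-to-frame fluctuation, mimicking time-varying edge sharpness.
si_out = [v + 3 * (-1) ** t for t, v in enumerate(si_in)]

added_energy = high_freq_energy(si_out, cutoff_bin=8) - high_freq_energy(si_in, cutoff_bin=8)
```

In this synthetic case the input has essentially no energy above the cutoff, so `added_energy` is dominated by the alternating component introduced in the output.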

   

Dwight Melcher; Stephen Wolf, "Objective Measures for Detecting Digital Tiling," ANSI T1A1 Contribution T1A1.5/95-104, January 9, 1995.

Abstract & keywords

   

Keywords: digital; performance; Video; quality; objective; impairments; blocking; Testing; spatial; compression; scene; MPEG; tiling; teleconferencing; codecs

Describes an algorithm for (1) quantifying the spatial gradients or edges in an image as a function of angle or orientation, (2) extracting low bandwidth features from these input and corresponding output spatial gradient images, and (3) deriving objective video quality metrics from these low bandwidth features that quantify the amount of tiling (i.e., blocking) and blurring in the output image. Examples are given for images that have been compressed by MPEG and video teleconferencing codecs.

   

Stephen Wolf, "An In-depth Analysis of the P6 Lost Motion Energy Parameter," ANSI T1A1 Contribution T1A1.5/95-103, January 9, 1995.

Abstract & keywords

   

Keywords: performance; models; noise; Video; quality; objective; subjective; correlation; impairments; dropped frames; temporal; compression; scene; teleconferencing; codecs; frame rate; motion

Provides an in-depth analysis of the behavior of the P6 video performance parameter, a parameter that measures the motion energy that is lost when a video scene is transmitted through a digital video compression system. Objective to subjective correlation results are given for the T1A1 subjective video experiment that contained 625 mean opinion scores (25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec). This contribution provides experimental evidence that video teleconferencing users do not significantly downgrade quality ratings for frame rates of at least 10 to 15 frames per second.

   

Abstract & keywords

   

Keywords: system; performance; Video; quality; objective; subjective; Testing

This contribution discusses the advantages and disadvantages of two methods for obtaining a composite rating of Hypothetical Reference Circuit (HRC) performance for an ensemble of source material. The term HRC refers to a specific realization of a video transmission system that may include coders, digital transmission circuits, decoders, and even analog processing of the video signal. In the first method, subjects are asked to separately observe and rate each of several HRC-scene combinations, and the separate ratings are then averaged over subjects and scenes to produce a composite rating for each HRC. In the second method, subjects are asked to rate each HRC based on a single observation period in which all the scenes are presented; the individual subject ratings are then averaged to produce a composite rating for each HRC. This contribution evaluates the two methods in light of accepted industry practice and statistical considerations. It is shown that the first method is far more widely used and offers substantial statistical advantages over the second.

   

Arthur Webster, "Two Criteria for Video Test Scene Selection," ITU-T (Telecommunications Standardization Sector), Temporary Document 35-E, Working Party 2, Study Group 12, Question 22, Geneva, December 2-5, 1994.

Abstract & keywords

   

Keywords: performance; Video; objective; subjective; spatial; temporal; scene; selection; complexity; information

Describes two objective measurements that can be used to quantify the spatial and temporal information content of a video test scene. Test scene coding difficulty (and hence video quality) often depends on the spatial and temporal information complexity of the source video. The metrics presented here can be used to assist in the selection of appropriate video test scenes for subjective and objective performance tests of digital video compression systems.
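
The abstract does not give the formulas. As an assumption, the sketch below follows the spatial information (SI) and temporal information (TI) definitions in ITU-T Recommendation P.910: SI is the maximum over time of the spatial standard deviation of the Sobel-filtered frame, and TI is the maximum over time of the spatial standard deviation of the frame difference. Frames are plain lists of luminance rows, and all names are illustrative.

```python
from statistics import pstdev


def sobel_magnitude(frame):
    """Sobel gradient magnitude at every interior pixel of a 2-D luminance frame."""
    h, w = len(frame), len(frame[0])
    out = []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (frame[r - 1][c + 1] + 2 * frame[r][c + 1] + frame[r + 1][c + 1]) - (
                frame[r - 1][c - 1] + 2 * frame[r][c - 1] + frame[r + 1][c - 1]
            )
            gy = (frame[r + 1][c - 1] + 2 * frame[r + 1][c] + frame[r + 1][c + 1]) - (
                frame[r - 1][c - 1] + 2 * frame[r - 1][c] + frame[r - 1][c + 1]
            )
            out.append((gx * gx + gy * gy) ** 0.5)
    return out


def spatial_information(frames):
    """Max over frames of the std. deviation of the Sobel-filtered frame."""
    return max(pstdev(sobel_magnitude(f)) for f in frames)


def temporal_information(frames):
    """Max over successive frame pairs of the std. deviation of the pixel differences.

    Requires at least two frames.
    """
    return max(
        pstdev([b - a for row_a, row_b in zip(prev, cur) for a, b in zip(row_a, row_b)])
        for prev, cur in zip(frames, frames[1:])
    )


# Tiny hypothetical scene: a flat frame followed by a frame with a vertical edge.
flat = [[10] * 5 for _ in range(4)]
edge = [[0, 0, 0, 255, 255] for _ in range(4)]
si = spatial_information([flat, edge])
ti = temporal_information([flat, edge])
```

A scene with high SI and TI is harder to compress than a static, featureless one, so plotting candidate scenes on these two axes is one way to check that a test set spans a range of coding difficulty.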

   

Coleen Jones; Ned Crow; Stephen Wolf; Arthur Webster, "Analysis of T1A1.5 Subjective and Objective Test Data," ANSI T1A1 Contribution T1A1.5/94-152, October 3, 1994.

Abstract & keywords

   

Keywords: digital; performance; Video; quality; objective; subjective; correlation; compression; conferencing; ANOVA; results; systems

Presents detailed intra-laboratory (within laboratory) and inter-laboratory (between laboratories) analysis of variance (ANOVA) results for the T1A1 subjective video experiment that contained 625 mean opinion scores (25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec). The ANOVA results for the subjective test data were obtained by applying the techniques in ANSI contribution T1A1.5/94-128. A correlation analysis is given of the objective parameters presented in ANSI contributions T1A1.5/93-153, T1A1.5/93-152 (corrections given in T1A1.5/94-110), and T1A1.5/94-102. In addition to the detailed ANOVA results, plots are presented which compare the subjective data from the different laboratories.

   

Coleen Jones; Ned Crow, "Status of NTIA's Analysis of Subjective Data," ANSI T1A1 Contribution T1A1.5/94-133, July 18, 1994.

Abstract & keywords

   

Keywords: performance; analysis; Video; quality; objective; subjective; ANOVA; experiment; statistical

Presents results from the analysis of variance (ANOVA) of the subjective opinion scores for the T1A1 subjective video experiment that contained 625 mean opinion scores (25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec). This contribution includes the analysis of the three teams (green, red, and orange) for two of the three laboratories involved in the testing. The analysis was performed as discussed in contribution T1A1.5/94-128 ("Methods for Analysis of Inter-laboratory Video Performance Standard Subjective Test Data"). The ANOVAs showed that all main effects and interactions were significant for all teams within the two laboratories.

   

Coleen Jones, "The Correlation of Traditional Bandwidth and Signal-to-Noise Ratio Parameters to Subjective Data," ANSI T1A1 Contribution T1A1.5/94-132, July 18, 1994.

Abstract & keywords

   

Keywords: performance; Video; quality; objective; subjective; frequency; bandwidth; SNR; response

This contribution presents objective to subjective correlation results for the traditional analog measurements of bandwidth (i.e., frequency response) and signal-to-noise ratio (SNR) for the T1A1 subjective video experiment that contained 625 mean opinion scores (25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec).

   

Edwin L. Crow, "Methods for Analysis of Inter-laboratory Video Performance Standard Subjective Test Data," ANSI T1A1 Contribution T1A1.5/94-128, March 28, 1994.

Abstract & keywords

   

Keywords: analysis; Video; quality; subjective; ANOVA; statistical; inter-laboratory

Presents in detail the ANOVA methods for analysis of the subjective T1A1 video test data, especially with regard to any systematic effect a laboratory may have on the Mean Opinion Score (MOS) of any given Hypothetical Reference Circuit (HRC). The term HRC refers to a specific realization of a video transmission system that may include coders, digital transmission circuits, decoders, and even analog processing of the video signal. The experimental variables of the T1A1 subjective test plan may be summarized as follows: (1) 3 laboratories, X, Y, Z, (2) 25 HRCs, 1, 2,..., 25, (3) 25 scenes, a, b,..., y, (4) 625 HRC-scene combinations, or test combinations, (5) 30 accepted viewers in each lab, screened from about 36 initial viewers, some of whom may not pass a consistency check, broken into 3 teams of 10 each, (6) 3 sets of videotapes - Red (R), Green (G), Orange (O), each set of 4 tapes to be viewed by a corresponding team in each lab, each set including 10 HRCs (thus overlapping slightly), (7) 4 subteams within each team since each viewing session is limited to at most 3 viewers, (8) 4 sessions for each subteam to view the 4 tapes, each tape (and session) being limited to about 32 minutes, (9) 9 types (1, 2,..., 9) into which the 25 HRCs are classified, 1 to 4 in each type, (10) 5 content categories (A,B,C,D,E) of the 25 scenes, 3 to 6 in each category, (11) 5 possible ratings of test combination scene impairment by viewers on voting forms ranging from Imperceptible to Very Annoying, which will be translated into 5,4,3,2,1 in the data reduction. Test combinations are ordered on the tapes by a restricted randomization, subteams are selected at random from the total viewers available (from a specified type of population), and the four tapes are presented to the corresponding four subteams in random permutation orders.

   

Stephen Wolf; Margaret Pinson, "Corrections and Extensions to T1A1.5/93-152," ANSI T1A1 Contribution T1A1.5/94-110, January 17, 1994.

Abstract & keywords

   

Keywords: coding; performance; measurements; Video; quality; metrics; parameters; objective; delay; dropped; frames; variable; Testing; compression; artifacts; motion

Contribution T1A1.5/93-152 summarized the methods of measurement for objective video quality parameters based on the Sobel-filtered image and the motion difference image that were submitted prior to conducting the T1A1 subjective experiment (this experiment collected 625 mean opinion scores - i.e., 25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec). This contribution presents (1) one minor correction to the recommended value for the fraction above threshold in contribution T1A1.5/93-152, (2) a method for estimating the video delay uncertainty of the automated time alignment algorithm presented in section 3 of contribution T1A1.5/93-152 (non-zero video delay uncertainty may result when dynamic time warping, or variable video delay, is present in the video transmission system, or when there is a substantial number of dropped video frames), (3) a method for using this video delay uncertainty in the computation of the parameters presented in T1A1.5/93-152, and (4) an improved motion spike detector that could be used for computing parameters p10 and p11 in T1A1.5/93-152.

   

Arthur Webster; Coleen Jones, "VTC Hypothetical Reference Circuit Signal to Noise Ratio Measurements," ANSI T1A1 Contribution T1A1.5/94-103, January 17, 1994.

Abstract & keywords

   

Keywords: performance; measurements; noise; Video; quality; objective; conferencing; SNR; specifications

Discusses the method of measurement used to obtain the signal to noise ratio (SNR) values for each of the twenty-five video systems in the T1A1 subjective video experiment. The 25 video systems ranged in bit rate from 64 kb/sec to 45 Mb/sec. The signal to noise ratio (SNR) was measured using a traditional analog test waveform, in particular, the 55 IRE flat field test signal that was on the standard video test scene tape.

   

Coleen Jones, "VTC Hypothetical Reference Circuit Bandwidth Measurements," ANSI T1A1 Contribution T1A1.5/94-102, January 17, 1994.

Abstract & keywords

   

Keywords: performance; measurements; Video; quality; objective; frequency; bandwidth; conferencing; response; specifications; zone; plate

Discusses the method of measurement used to obtain the frequency response or bandwidth values for each of the twenty-five video systems in the T1A1 subjective video experiment. The 25 video systems ranged in bit rate from 64 kb/sec to 45 Mb/sec. The bandwidth was measured using a traditional analog test waveform, in particular, a static zone plate (i.e., a two dimensional swept frequency waveform) that was on the standard video test scene tape.

   

Neal Seitz; Stephen Wolf; Stephen Voran; Randy Bloomfield, "User-Oriented Measures of Telecommunication Quality," IEEE Communications Magazine, vol.32, no.1, pp.56-66, Jan. 1994 doi: 10.1109/35.249792

Abstract & keywords

   

Keywords: performance; standards; ISDN; telecommunication; ATM; Video; quality; metrics; objective; subjective; correlation; data; voice; perception-based

Discusses the standardization of user-oriented, technology-independent measures of telecommunication service quality, including voice, video, and data performance standards. Standards committee work has progressed in three broad phases. In the first phase, participants defined the basic concepts that underlie the user-oriented approach to telecommunication quality assessment. In the second phase, participants developed a set of generic user-oriented quality measures, and applied these generic measures in deriving technology-specific performance parameters and measurement methods for packet-switched networks and integrated services digital networks (ISDNs). In the third phase, participants have developed user-oriented quality measures for video and voice communications. User-perceivable video and voice quality impairments are quantified by objective measures chosen for their correlation with carefully collected and numerically quantified human reactions to the transmitted images and sounds.

   

Arthur Webster, "Methods of Measurement for Two Objective Video Quality Parameters Based on the Fourier Transform," ANSI T1A1 Contribution T1A1.5/93-153, November 8, 1993.

Abstract & keywords

   

Keywords: digital; performance; measurements; Video; quality; metrics; features; parameters; objective; subjective; correlation; in-service; Testing; spatial; compression; Fourier; frequencies; out-of-service

Summarizes detailed methods of measurement for two objective video quality parameters based on the Fourier transform image. This contribution was submitted to ANSI T1A1 prior to conducting the T1A1 subjective experiment (this experiment collected 625 mean opinion scores - i.e., 25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec). The video quality parameters presented here have demonstrated strong correlation to subjective evaluations of the video and can be used for in-service as well as out-of-service tests since low bit-rate features are extracted and compared from the input and corresponding output video images.

   

Stephen Wolf; Margaret Pinson; Coleen Jones; Arthur Webster, "A Summary of Methods of Measurement for Objective Video Quality Parameters Based on the Sobel Filtered Image and the Motion Difference Image," ANSI T1A1 Contribution T1A1.5/93-152, November 8, 1993.

Abstract & keywords

   

Keywords: digital; performance; measurements; Video; quality; metrics; features; parameters; objective; subjective; correlation; in-service; Testing; spatial; temporal; compression; motion; out-of-service; Sobel

Summarizes detailed methods of measurement for objective video quality parameters based on the Sobel-filtered image and the motion difference image. This contribution was submitted to ANSI T1A1 prior to conducting the T1A1 subjective experiment (this experiment collected 625 mean opinion scores - i.e., 25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec). The video quality parameters presented here have demonstrated strong correlation to subjective evaluations of the video and can be used for in-service as well as out-of-service tests since low bit-rate features are extracted and compared from the input and corresponding output video images.

   

Coleen Jones, "Real-Time Measurement System for ITS Video Quality Parameters," ANSI T1A1 Contribution T1A1.5/93-105, August 9, 1993.

Abstract & keywords

   

Keywords: system; digital; performance; measurement; noise; Video; quality; parameters; objective; gain; frequency; spatial; temporal; compression; motion; response; Sobel; real-time; PC-based; image; processing

Discusses a real-time personal computer-based system for characterizing spatial (e.g., blurring) and temporal (e.g., jerky motion) distortions in a compressed digital video system. Comparisons with a high quality non-real-time laboratory version of the same image processing algorithms demonstrate that measurement system gain and frequency response must be calibrated to assure repeatability of objective performance parameter results.

   

Stephen Voran; Stephen Wolf, "An Objective Technique for Assessing Video Impairments," IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, vol.1, pp.161-165, 19-21 May 1993 doi: 10.1109/PACRIM.1993.407197

Abstract & keywords

   

Keywords: digital; performance; models; Video; quality; parameters; objective; subjective; correlation; spatial; temporal; compression; conferencing; assessment; bit; errors

Measurements that quantify perceptual video attributes in both the spatial and temporal domains are extracted from the video that has undergone digital coding and compression. These measurements are then used to compute a single score that quantifies the perceptual impact of the impairments present in the video sequence. This objective score is well-correlated (r=.92) with impairment assessments made by human viewers. A collection of 36 video scenes with a wide range of spatial and temporal complexity was passed through 28 video systems covering the range of compression from 45 Mb/sec to 56 kb/sec, including bit errors introduced into the digital transmission channel.

   

Arthur A. Webster; Coleen T. Jones; Margaret H. Pinson; Stephen D. Voran; Stephen Wolf, "An objective video quality assessment system based on human perception," SPIE: Human vision, visual processing and digital display IV, 1-4 February 1993, San Jose California

Abstract & keywords

   

Keywords: objective video quality assessment; subjective video quality assessment

The Institute for Telecommunication Sciences (ITS) has developed an objective video quality assessment system that emulates human perception. The system returns results that agree closely with quality judgements made by a large panel of viewers. Such a system is valuable because it provides broadcasters, video engineers and standards organizations with the capability for making meaningful video quality evaluations without convening viewer panels. The issue is timely because compressed digital video systems present new quality measurement questions that are largely unanswered. The perception-based system was developed and tested for a broad range of scenes and video technologies. The 36 test scenes contained widely varying amounts of spatial and temporal information. The 27 impairments included digital video compression systems operating at line rates from 56 kbits/sec to 45 Mbits/sec with controlled error rates, NTSC encode/decode cycles, VHS and S-VHS record/play cycles, and VHF transmission. Subjective viewer ratings of the video quality were gathered in the ITS subjective viewing laboratory that conforms to CCIR Recommendation 500-3. Objective measures of video quality were extracted from the digitally sampled video. These objective measurements are designed to quantify the spatial and temporal distortions perceived by the viewer. This paper presents the following: a detailed description of several of the best ITS objective measurements, a perception-based model that predicts subjective ratings from these objective measurements, and a demonstration of the correlation between the model's predictions and viewer panel ratings. A personal computer-based system is being developed that will implement these objective video quality measurements in real time. 
These video quality measures are being considered for inclusion in the Digital Video Teleconferencing Performance Standard by the American National Standards Institute (ANSI) Accredited Standards Committee T1, Working Group T1A1.5.

   

Coleen Jones; Stephen Wolf; Margaret Pinson, "Preliminary Results of One-Way Video Delay Measurement Algorithms," ANSI T1A1 Contribution T1A1.5/92-139, July 13, 1992.

Abstract & keywords

   

Keywords: Video; quality; objective; in-service; delay; alignment; motion; real-time; one-way; measure; metric; parameter

Out-of-service measurements can be used to obtain a baseline video system delay. However, in many digital video systems, video delay is dynamic and dependent upon the scene content. This contribution describes an objective measure of video delay that can be made "in-service" and can thus track dynamic changes in the one-way video delay of a transmission channel. The metric correlates motion in the input and output video scene. A block diagram of a potential, real-time implementation is also given.
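
The idea of correlating input and output motion can be sketched as follows. This is a minimal illustration, not the algorithm from the contribution: it assumes motion is summarized as the mean absolute frame-to-frame difference, and it picks the lag with the highest normalized correlation. The function names, the test series, and the `max_lag` parameter are all hypothetical.

```python
def motion_energy(frames):
    """Mean absolute frame-to-frame difference for each consecutive pair of frames.

    Each frame is a flat list of pixel values (an assumption made for brevity).
    """
    energies = []
    for prev, cur in zip(frames, frames[1:]):
        diffs = [abs(a - b) for a, b in zip(cur, prev)]
        energies.append(sum(diffs) / len(diffs))
    return energies


def estimate_delay(input_motion, output_motion, max_lag):
    """Return the lag (in frames) at which output motion best matches input motion."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Pair input sample i with output sample i + lag.
        x = input_motion[:len(input_motion) - lag]
        y = output_motion[lag:]
        n = min(len(x), len(y))
        x, y = x[:n], y[:n]
        mean_x, mean_y = sum(x) / n, sum(y) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        norm = (sum((a - mean_x) ** 2 for a in x)
                * sum((b - mean_y) ** 2 for b in y)) ** 0.5
        score = cov / norm if norm > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical motion series: the output repeats the input three frames later.
input_motion = [0, 1, 5, 2, 0, 3, 7, 1, 0, 2, 4, 0]
output_motion = [0, 0, 0] + input_motion[:-3]
delay = estimate_delay(input_motion, output_motion, max_lag=5)
print(delay)
```

Because only a per-frame motion value is needed from each end of the channel, a measurement of this kind can in principle run while the channel is in service, which is the property the abstract emphasizes.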

   

Stephen Wolf, "An Automated Technique for Measuring Transmitted Frame Rate (TFR) and Average Frame Rate (AFR)," ANSI T1A1 Contribution T1A1.5/92-138, July 13, 1992.

Abstract & keywords

   

Keywords: spectrum; performance; measurement; Video; quality; metrics; objective; in-service; frequency; Testing; compression; frame; rate; motion; Fourier

The motion difference images of the original and degraded video sequences are used to dynamically measure the Transmitted Frame Rate (TFR) and the Average Frame Rate (AFR) of a digital video system codec (coder-decoder). The TFR measurement algorithm gives the TFR spectrum, so that in cases where the TFR is adaptive, each TFR being used by the codec is obtained. The AFR measurement algorithm gives the average number of frames per second that were transmitted by the codec during a selected time interval. The TFR and AFR measurements presented here can be used for in-service as well as out-of-service testing of digital video systems.
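
A simplified way to picture the AFR measurement is to count how often the sampled output actually changes. The sketch below is an assumption-laden stand-in, not the algorithm in the contribution: it assumes the scene is in constant motion (so every newly transmitted frame differs from the previous one), that repeated frames are identical up to a small `threshold`, and that frames are flat lists of pixel values. The function name and parameters are hypothetical.

```python
def average_frame_rate(output_frames, sample_rate_hz, threshold=0.0):
    """Estimate how many distinct frames per second appear in a sampled output."""
    updates = 0
    for prev, cur in zip(output_frames, output_frames[1:]):
        # Motion difference between successive output samples.
        diff = sum(abs(a - b) for a, b in zip(cur, prev)) / len(cur)
        if diff > threshold:
            updates += 1  # the codec delivered a new frame here
    duration_s = (len(output_frames) - 1) / sample_rate_hz
    return updates / duration_s


# Hypothetical output sampled at 30 Hz from a codec that holds each frame
# for three sample periods.
held = [[10], [10], [10], [20], [20], [20], [30], [30], [30], [40]]
afr = average_frame_rate(held, sample_rate_hz=30)
print(round(afr, 2))
```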

   

Stephen Voran, "The Effect of Multiple Scenes on Objective Video Quality Assessment," ANSI T1A1 Contribution T1A1.5/92-136, July 13, 1992.

Abstract & keywords

   

Keywords: system; digital; Video; quality; metrics; objective; subjective; correlation; compression; scene; averaging; composite; score

Presents objective to subjective correlation results when the scores from multiple video test scenes are averaged. It is shown that the prediction errors of the objective video quality model are reduced, yielding objective measurement results that more closely approximate the subjective viewing assessments.

   

Stephen Voran; Stephen Wolf, "The Development and Evaluation of an Objective Video Quality Assessment System that Emulates Human Viewing Panels," International Broadcasting Convention (IBC), pp.504-508, 3-7 Jul 1992.

Abstract & keywords

   

Keywords: digital; performance; models; Video; quality; parameters; objective; subjective; correlation; spatial; temporal; compression; conferencing; assessment; bit; errors

Discusses the approach used and the research conducted to develop an objective video quality assessment system that emulates human perception. The system returns results that agree closely with quality judgments made by a large panel of viewers for the subjectively rated video data set that was examined. This data set included 36 test scenes with widely varying amounts of spatial and temporal information and 27 impairments including digital video compression systems operating at line rates from 56 kb/sec to 45 Mb/sec with controlled error rates, NTSC encode/decode cycles, VHS and S-VHS record/play cycles, and VHF transmission.

   

Stephen Voran; Stephen Wolf, "Objective Measures of Video Impairment: Analysis of 128 Scenes," ATIS T1 Contribution T1A1.5/92-112, March 25, 1992

Abstract & keywords

   

Keywords: video teleconferencing; Video Telephony; Subjective Quality; Objective Quality

ATIS Committee T1 contribution to standards project “Analog Interface Performance Specifications for Digital Video Teleconferencing/Video Telephony Service and DS3 Television”

   

Arthur Webster; Stephen Wolf, "Spatial and Temporal Information Measures for Video Quality," ANSI T1Q1 Contribution T1Q1.5/92-113, January 22, 1992.

Abstract & keywords

   

Keywords: digital; Video; quality; frequency; spatial; temporal; measures; test; scene; motion; complexity; information; Fourier; distortions

Presents an overview of metrics that can be used to quantify the amount of spatial and temporal information in a video sequence. By applying the metrics to the input and output video, the amount of spatial and temporal information that is lost by the digital video transmission system can be obtained. The spatial metrics are based on the Fourier transform of the image and the temporal metrics are based on the motion difference image.

   

Stephen Voran, "The development of objective video quality measures that emulate human perception," Global Telecommunications Conference, 1991 (GLOBECOM '91. Countdown to the New Millennium. Featuring a Mini-Theme on: Personal Communications Services), pp.1776-1781 vol.3, 2-5 Dec 1991

Abstract & keywords

   

Keywords: digital; performance; models; Video; quality; parameters; objective; subjective; correlation; spatial; temporal; compression; conferencing; motion; information; Sobel; assessment; bit; errors; difference; still; false; edges

Discusses research efforts to derive objective measures of video quality that emulate human perception. The derivation of these metrics involves the following steps: (1) a set of test scenes is selected and distorted, (2) a set of candidate objective measures is extracted, (3) a panel of viewers rates the quality of the same set of test scenes, (4) a simultaneous statistical analysis of the subjective and objective data sets reveals which portion of the objective data is meaningful, and how the objective data should be combined to create an overall metric that emulates human perception. One objective metric that correlates well with subjective quality quantifies the amount of false or extra edges that have been added to the output video. There appear to be some advantages to applying the metric separately to the still and motion portions of the video.
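
To make the "false or extra edges" idea concrete, the sketch below compares Sobel edge energy in an input and an output image and reports the fraction gained in the output. This is one plausible formulation under stated assumptions, not the metric defined in the paper: images are small 2-D lists of gray levels, only interior pixels are filtered, and the names `sobel_magnitude` and `added_edge_fraction` are hypothetical.

```python
def sobel_magnitude(image):
    """Sobel gradient magnitude at the interior pixels of a 2-D list of gray levels."""
    rows, cols = len(image), len(image[0])
    mags = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            # Horizontal gradient: right column minus left column, center weighted.
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            # Vertical gradient: bottom row minus top row, center weighted.
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            row.append((gx * gx + gy * gy) ** 0.5)
        mags.append(row)
    return mags


def added_edge_fraction(input_image, output_image):
    """Edge energy gained by the output, as a fraction of the input edge energy."""
    e_in = sum(sum(row) for row in sobel_magnitude(input_image))
    e_out = sum(sum(row) for row in sobel_magnitude(output_image))
    return max(0.0, e_out - e_in) / e_in


# Hypothetical 4x4 images: the input has one vertical edge; the output adds a
# horizontal step (for example, a block boundary) that was not in the input.
clean = [[0, 0, 10, 10] for _ in range(4)]
blocky = [[0, 0, 10, 10], [0, 0, 10, 10], [5, 5, 15, 15], [5, 5, 15, 15]]
print(added_edge_fraction(clean, clean), added_edge_fraction(clean, blocky) > 0)
```

Clamping at zero means the measure responds only to added edges; lost edges (blurring) would need a separate measure.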

   

Abstract & keywords

   

Keywords: digital; performance; models; Video; quality; parameters; objective; subjective; correlation; spatial; compression; MPEG; motion; information; Sobel; assessment; difference; still; false; edges; VHS

Presents preliminary results on the objective to subjective quality correlation for a set of 5 test scenes passed through three systems (NTSC, VHS, and DS1 MPEG coding). One objective metric that correlates well with subjective quality quantifies the amount of false or extra edges that have been added to the output video. There appear to be some advantages to applying the metric separately to the still and motion portions of the video.

   

Stephen Voran; Stephen Wolf, "Motion-Still Segmentation Algorithm for VTC/VT Objective Quality Assessment," ANSI T1Q1 Contribution T1Q1.5/91-110, January 22, 1991.

Abstract & keywords

   

Keywords: telephony; noise; Video; quality; objective; subjective; resolution; spatial; dynamic; teleconferencing; motion; Sobel; still; static; segmentation; erode; dilate; region; growing; threshold

The ability of the human eye to resolve detail in a video scene is related to how much motion is present at the point of focus and whether or not the eye can track the motion. Thus, stationary portions of the video scene can be resolved in great detail by the eye, while moving portions of the video scene are normally resolved in less detail (provided the eye cannot fully track the motion). Low bit-rate digital video channels (e.g., video teleconferencing) determine how many bits are used for each local area of the video scene. Since the time averaged information content of a still video scene is much less than the time averaged information content of a moving scene, typical video teleconferencing channels can have very different static and dynamic responses. The dynamic response of the channel is also a function of the video scene and can vary on a frame-by-frame basis. Thus, it is desirable to have a general algorithm (applicable to any test waveform or test scene) that can separate the dynamic response from the static response on a frame by frame basis. This contribution describes one such algorithm. The motion-still segmentation algorithm presented here can be applied to any test waveform or test scene in order to separate the moving portions from the still portions of the video scene. A detailed description of the motion-still segmentation algorithm and its theoretical basis is given first. Then, a typical application of the algorithm is presented: measuring the increased spatial blurring of moving objects in an actual video teleconferencing scene.
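
A generic motion-still segmentation can be sketched with a threshold on the frame difference followed by erosion and dilation, operations that also appear among the keywords above. The code is an illustrative assumption, not the algorithm described in the contribution: the 3x3 neighborhood, the threshold value, and all function names are hypothetical.

```python
def motion_mask(prev_frame, cur_frame, threshold):
    """Mark pixels whose gray level changed by more than `threshold`."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(prev_frame, cur_frame)]


def _neighborhood(mask, r, c):
    """Values in the 3x3 neighborhood of (r, c), clipped at the image border."""
    rows, cols = len(mask), len(mask[0])
    return [mask[rr][cc]
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))]


def erode(mask):
    """Keep a pixel only if its whole neighborhood is marked as moving."""
    return [[all(_neighborhood(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def dilate(mask):
    """Mark a pixel if any pixel in its neighborhood is marked as moving."""
    return [[any(_neighborhood(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def segment_motion(prev_frame, cur_frame, threshold):
    """Threshold, then erode and dilate to drop isolated noise pixels."""
    return dilate(erode(motion_mask(prev_frame, cur_frame, threshold)))


# Hypothetical 5x5 frames: a 3x3 object appears, plus one noisy pixel at (0, 4).
prev_frame = [[0] * 5 for _ in range(5)]
cur_frame = [[0] * 5 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        cur_frame[r][c] = 50
cur_frame[0][4] = 50
moving = segment_motion(prev_frame, cur_frame, threshold=10)
still = [[not m for m in row] for row in moving]
```

With the two masks in hand, a spatial measure can be computed separately over the moving and still regions of each frame, which is the use the abstract describes.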

   

Stephen Wolf, Features for automated quality assessment of digitally transmitted video, NTIA Technical Report TR-90-264, June 1990

Abstract & keywords

   

Keywords: ANSI; American National Standards; feature extraction; image processing; video quality; video teleconferencing

This report describes an automated method of video quality assessment based on extraction and classification of features from sampled input and output video. The first subsystem of the automated video quality measurement system is the feature extraction subsystem. Features are extracted from the sampled video that quantify many of the distortions present in modern digital compression and transmission systems. The feature measurements may then be injected into a quality classification subsystem which will determine the overall quality rating of the video. This report discusses the first subsystem of the automated video quality assessment system, namely the feature extraction subsystem. The measurement techniques used to extract a number of useful features are discussed in detail. Results are presented using sampled video teleconferencing data that contained common video compression artifacts.