PART I

Series H Recommendations

LINE TRANSMISSION OF NON-TELEPHONE SIGNALS

LINES USED FOR THE TRANSMISSION OF SIGNALS OTHER THAN TELEPHONE SIGNALS, SUCH AS TELEGRAPH, FACSIMILE, DATA, ETC., SIGNALS

Part I contains two classes of Recommendations: those which define the characteristics of transmission channels (telephone-type, group, supergroup, etc., circuits) used only to transmit signals other than telephone signals, and those which define the characteristics of the signals used in such transmissions.

In this Part, "wideband" is used to qualify the transmission channels, and "wide-spectrum" the signals transmitted, so as to avoid any confusion between the transmission channels and the signals transmitted with regard to the frequency bands involved in transmission over group links, supergroup links, etc.

As far as possible, one should avoid specifying the characteristics of particular channels or signals in defining a new service and refer only to the characteristics of the channels mentioned in Section 1 of this Recommendation Series.

_________________________
Excluding the transmission of sound-programme and television signals, which is the subject of the Series J Recommendations.

Section 6 of this Series is reserved for Recommendations concerning the characteristics of visual telephone systems.

Table 1 indicates the correspondence of Series H Recommendations to Recommendations of other Series.
TABLE 1

Series H Recommendations    Recommendations of other Series
H.12, § 1                   M.1040 (Volume IV)
H.12, § 2                   M.1025 (Volume IV)
H.12, § 3                   M.1020 (Volume IV)
H.13                        See Recommendation O.71 (Volume IV)
H.14, § 2                   M.910 (Volume IV)
H.16                        O.72 (Volume IV)
H.21                        See also the Recommendations M.800 (Volume IV) and R.77 (Volume VII)
H.22                        See also the Recommendation M.810 (Volume IV)
H.23                        Extract of Recommendations R.31 and R.35 (Volume VII)
H.32                        R.43 (Volume VII)
H.41                        T.11 (Volume VII)
H.42                        T.12 (Volume VII)
H.43                        T.10 (Volume VII)
H.51                        V.2 (Volume VIII)

SECTION 1

LINES USED FOR THE TRANSMISSION OF SIGNALS OTHER THAN TELEPHONE SIGNALS, SUCH AS TELEGRAPH, FACSIMILE, DATA, ETC., SIGNALS

1.1 Characteristics of transmission channels used for other than telephone purposes

Recommendation H.11
CHARACTERISTICS OF CIRCUITS IN THE SWITCHED TELEPHONE NETWORK
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.12
CHARACTERISTICS OF TELEPHONE-TYPE LEASED CIRCUITS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.13
CHARACTERISTICS OF AN IMPULSIVE NOISE MEASURING INSTRUMENT FOR TELEPHONE-TYPE CIRCUITS
(The text of this Recommendation can be found in Recommendation O.71 in Fascicle IV.4 of Volume IV of the Red Book, ITU, Geneva, 1985)

Recommendation H.14
CHARACTERISTICS OF GROUP LINKS FOR THE TRANSMISSION OF WIDE-SPECTRUM SIGNALS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.15
CHARACTERISTICS OF SUPERGROUP LINKS FOR THE TRANSMISSION OF WIDE-SPECTRUM SIGNALS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.16
CHARACTERISTICS OF AN IMPULSIVE-NOISE MEASURING INSTRUMENT FOR WIDEBAND DATA TRANSMISSION
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

1.2 Use of telephone-type circuits for voice-frequency telegraphy

Recommendation H.21
COMPOSITION AND TERMINOLOGY OF INTERNATIONAL VOICE-FREQUENCY TELEGRAPH SYSTEMS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.22
TRANSMISSION REQUIREMENTS OF INTERNATIONAL VOICE-FREQUENCY TELEGRAPH LINKS (AT 50, 100 AND 200 BAUDS)
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.23
BASIC CHARACTERISTICS OF TELEGRAPH EQUIPMENTS USED IN INTERNATIONAL VOICE-FREQUENCY TELEGRAPH SYSTEMS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

1.3 Telephone circuits or cables used for various types of telegraph transmission or for simultaneous transmissions

Recommendation H.32
SIMULTANEOUS COMMUNICATION BY TELEPHONY AND TELEGRAPHY ON A TELEPHONE-TYPE CIRCUIT
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.34
SUBDIVISION OF THE FREQUENCY BAND OF A TELEPHONE-TYPE CIRCUIT BETWEEN TELEGRAPHY AND OTHER SERVICES
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

1.4 Telephone-type circuits used for facsimile telegraphy

Recommendation H.41
PHOTOTELEGRAPH TRANSMISSIONS ON TELEPHONE-TYPE CIRCUITS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.42
RANGE OF PHOTOTELEGRAPH TRANSMISSIONS ON A TELEPHONE-TYPE CIRCUIT
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.43
DOCUMENT FACSIMILE TRANSMISSIONS ON LEASED TELEPHONE-TYPE CIRCUITS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

1.5 Characteristics of data signals

Recommendation H.51
POWER LEVELS FOR DATA TRANSMISSION OVER TELEPHONE LINES
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.52
TRANSMISSION OF WIDE-SPECTRUM SIGNALS (DATA, FACSIMILE, ETC.) ON WIDEBAND GROUP LINKS
(The text of this Recommendation can be found in Fascicle III.4 of the Red Book, ITU, Geneva, 1985)

Recommendation H.53
TRANSMISSION OF WIDE-SPECTRUM SIGNALS (DATA, ETC.) OVER WIDEBAND SUPERGROUP LINKS
(The text of this Recommendation can be found in Fascicle III.4 of Volume III of the Red Book, ITU, Geneva, 1985)

SECTION 2

CHARACTERISTICS OF VISUAL TELEPHONE SYSTEMS

Recommendation H.100
VISUAL TELEPHONE SYSTEMS
(former Recommendation H.61, Geneva, 1980; amended at Malaga-Torremolinos, 1984 and at Melbourne, 1988)

1 Definition

The visual telephone service is generally a two-way telecommunication service which uses a switched network of broadband analogue and/or digital circuits to establish connections among subscriber terminals, primarily for the purpose of transmitting live or static pictures.

Special application one-way systems, e.g. surveillance and some information retrieval systems, or a non-switched videoconference service, can be regarded as degenerate cases of the visual telephone service.

The visual telephone service also includes the associated speech.

2 Facilities to be offered

The design of the visual telephone service shall be such as to offer at least the following basic facilities:

a) Transmission of live pictures such as head and shoulders of one person or a small group of persons, with moderate definition.

b) Transmission of the associated speech.

c) Transmission of graphics information such as drawings and documents with high definition (e.g. 625 lines or 525 lines).

d) Video conference service, with or without the use of split-screen techniques.

The above-mentioned services shall, in general, be bi-directional, although uni-directional operation should be possible.
Also, some of the facilities can be omitted, if not required, in order to minimize costs.

Note - At the subscriber terminal, the use of ancillary equipments, e.g. for document reproduction, video tape recordings, etc., shall be possible.

3 System parameters

3.1 Picture standards

3.1.1 The video standards of the subscriber sets shall be compatible with, readily convertible to, or identical to, the local broadcast television standards.

3.1.2 Two classes of picture standards are recommended for the visual telephone system. They are given in Table 1/H.100.

[TABLE 1/H.100: Picture standards (columns: Class, Items); table body not reproduced]

Class a standards are identical to the local broadcast video standards and will, in most cases, give sufficient definition for real-time picture transmission of a group of people (e.g. for conferencing) and of graphics material.

Class b standards give sufficient definition for real-time transmission of a head and shoulder picture of one person or a small group.

For the transmission of graphics information or other still pictures with high definition, a slow-scan technique has to be applied. For instance, a system using 625 or 525 horizontal scanning lines and 5 or fewer pictures per second gives Class a definition within a 1 MHz bandwidth. Further study is required to define slow scanning parameters.

4 Characteristics relating to split-screen techniques for Class a television conference systems

In television conference systems which use split-screen techniques to make more effective use of the picture area, the following features for the terminals and transmitted signals are recommended. Preferred seating arrangements for such systems are given in Annex A.
4.1 Picture format

The transmitted picture should have a 4:3 aspect ratio, split into upper and lower halves corresponding to the groups of seats. Viewed from the camera system, the left-hand group should be in the upper half and the right-hand group in the lower half. The split should occur at the end of lines 166 and 479 for 625-line television systems and at the end of line 142 in Field 1 and line 141 in Field 2 for 525-line television systems, as shown in Figure 1/H.100.

Before display, the receive equipment may discard half lines and first and last lines which are liable to be averaged during standards conversion or vertical aperture correction of mixed signals.

4.2 Identification signal for split-screen system

4.2.1 Analogue video signals

The identification signal for split-screen system should be inserted in the vertical blanking period, because the control is required for each television frame or field. The line where the identification signal is inserted and its signal format are under study.

_________________________
Footnote to the title of § 4: Split-screen techniques for systems using Class b standards require future study.

4.2.2 Digital video signals

An identification signal for split-screen system should be provided. In the case of codecs in Recommendations H.120 and H.130 the format shall be that specified in Recommendation H.130.

4.3 Compatibility with non-split-screen systems

The simplest kind of video telephone terminal is composed of a single camera and other equipments. These terminals may be interconnected with split-screen system terminals. In that case, mechanical masks (if used) for the two split-screen displays (aspect ratio = 4:1.5) need to be removed, or a display with 4:3 aspect ratio needs to be installed additionally.

4.4 Cameras and displays arrangement

The entrance pupils of the TV camera optical system should be as near as possible to the centre of the TV display showing remote conferees, in order to minimize errors in eye contact angle.
Unless means are employed to place these pupils in line with the display, e.g. by use of half-silvered mirrors, the camera system should be sited above the display and central to it. In order to keep the maximum horizontal errors as small as possible, the cameras should preferably be arranged in a cross-fire system, as for example in Figure A-1/H.100, and the camera/display assembly should be sited on the central axis of the terminal. However, in some cases, adoption of a parallel-fire system, also shown in Figure A-1/H.100, is necessary due to a restriction in equipment arrangement.

Whether the two cameras are arranged in cross-fire or parallel-fire is left open to each Administration since the selection does not affect the interconnection of different systems.

4.5 Picture processing methods at transmitting terminals

In order to obtain the correct relationship between the signals from the two cameras for split-screen working, the cameras should be synchronized but the vertical drive pulses should be rephased. The drive to one should be advanced by one quarter of the vertical period while the drive to the other should be retarded by the same amount. This causes a central strip of the target of each camera tube to be used and so minimizes the effects of distortions in the corners of the targets. Figure B-1a)/H.100 illustrates the preferred method. Alternative methods, which are not recommended although they do not give rise to problems of end-to-end compatibility, are compared in Annex B.

4.6 Receiving equipment

The receiving equipment should be capable of working with discontinuities in the received signal that may be caused by switching between non-synchronous video sources.

Note - A split-screen device should be capable of working with a codec with the input and output frequency tolerances as specified in Recommendation H.120.

FIGURE 1/H.100
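The picture geometry described in 4.1 and 4.5 can be illustrated with a short Python sketch. It is only a model of the resulting picture, under stated assumptions: a frame is represented as a plain list of lines, and the central strip is obtained by cropping, whereas the recommended method obtains it in the cameras by rephasing the vertical drive pulses. The function names are hypothetical and do not come from the Recommendation.

```python
def central_strip(frame):
    """Return the central half of the lines of one camera frame.

    This mimics the effect described in 4.5, where only a central strip
    of each camera target contributes to the transmitted picture.
    """
    height = len(frame)
    first = height // 4                      # skip the top quarter
    return frame[first:first + height // 2]  # keep the middle half


def compose_split_screen(left_group_frame, right_group_frame):
    """Stack the two strips as in 4.1: the left-hand group (viewed from
    the camera system) in the upper half, the right-hand group below."""
    if len(left_group_frame) != len(right_group_frame):
        raise ValueError("both camera frames must have the same number of lines")
    return central_strip(left_group_frame) + central_strip(right_group_frame)


if __name__ == "__main__":
    left = ["L%d" % i for i in range(8)]
    right = ["R%d" % i for i in range(8)]
    print(compose_split_screen(left, right))
```

The composed picture has the same number of lines as each camera frame, which is why a split-screen signal can be carried as an ordinary 4:3 picture.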
ANNEX A
(to Recommendation H.100)

Seating arrangements when applying split-screen techniques for Class a systems

Preferred arrangements for video conferences using split-screen techniques are:

A.1 The conference terminal accommodation should be for 6 primary seats in two adjacent groups of 3 as shown in Figure A-1/H.100. Provision for additional seating behind may be made, so long as allowance is made for the central gap between the two halves. For example, 4 additional persons may be seated on a second row as in the Figure.

A.2 The chairman's position should be in the centre of the left-hand group of seats (viewed from the camera) with user controls accessible from both this position and the one on the chairman's left.

Consequently, when split-screen pictures are displayed, stacked as received (i.e. shown as 3 over 3), the chairman's position is standardized as top centre.

The suite of 3 chairs containing the chairman's position should also be regarded as the primary position for occasions when only half of a studio is in use. Such standardization is necessary for connection of 3 studios in conference using time-division multiplex of pairs of TV signals to share a common trunk between two studios.

Figure A-1/H.100

ANNEX B
(to Recommendation H.100)

Picture processing methods in transmitting terminals

Alternative methods of obtaining the split-screen signal which are compatible with the recommended method and which might be useful for experiments and demonstrations are shown in b) and c) of Figure B-1/H.100.

In method b), the two cameras are directed upward and downward to pick up right and left halves of the conferencing room, respectively. Since the peripheral parts of the target and scanning areas are used, geometric and brightness distortions tend to occur.

In method c), vertical deflection currents are biased by the quantity corresponding to ±1/4 of the target height. Vertical deflection bias adjustment is needed every time cameras are exchanged.
In method a), the vertical driving pulses are phase-shifted by ±1/4 V (one quarter of the vertical period, as described in 4.5). The recommended method, a), avoids the problems of methods b) and c).

Figure B-1/H.100

Recommendation H.110
HYPOTHETICAL REFERENCE CONNECTIONS FOR VIDEOCONFERENCING USING PRIMARY DIGITAL GROUP TRANSMISSION
(Malaga-Torremolinos, 1984; amended at Melbourne, 1988)

The CCITT,

considering

(a) that there is growing evidence of a customer demand for a videoconference service;

(b) that circuits to meet this demand can, at present, be provided effectively by digital transmission using the primary digital group;

(c) that switched digital transmission networks known as the Integrated Digital Network (IDN) and Integrated Services Digital Network (ISDN) are under study, but the methods of exploiting these networks for the transmission of primary digital groups will not become clear until the studies have progressed further;

(d) that the existence of different digital hierarchies and different television standards in different countries complicates the problems of defining hypothetical reference connections;

(e) that a hypothetical reference connection may be used as a guide to simplify the problems of connections between countries with different television standards and digital hierarchies,

appreciating

that rapid advances are being made in research and development of video coding and bit-rate reduction techniques which may lead to further Recommendations being proposed for hypothetical reference connections for videoconferencing at bit rates which are multiples or sub-multiples of 384 kbit/s during subsequent study periods, so that this may be considered as the first of an evolving series of Recommendations,

noting

(a) that a hypothetical reference connection is a model in which studies relating to overall performance may be made, thereby allowing comparisons with standards and objectives; on this basis, limits for various impairments can be allocated to the elements of the connection;

(b) that such a model may be used:

- by an Administration to examine the effects on transmission quality of possible changes of impairment allocations in national networks,

- by the CCITT for studying the allocation of impairments to component parts of international networks,

- to test national rules for prima facie compliance with any impairment criteria which may be recommended by the CCITT for national systems;

(c) that hypothetical reference connections are not to be regarded as recommending particular values of impairments allocated to constituent parts of the connection, and they are not intended to be used for the design of transmission systems,

and recognizing

that the planning of the necessary transmission networks for a videoconference service will be facilitated if recommended hypothetical reference connections are available, even if only in a preliminary form without details of all transmission and switching arrangements,

recommends

(1) that the hypothetical reference connection and means for digital transmission illustrated in Figures 1/H.110 and 2/H.110 shall be used as the model for studies of the overall performance of international videoconference connections, both intra-regional and inter-regional, which are provided using minimum numbers of encoding and decoding equipments;

(2) that hypothetical reference connections of a more complex type, as, for example, those illustrated in Figure 3/H.110, being representative of many connections that may be employed in practice, should be studied further.

_________________________
The term "intra-regional" is used here to describe connections within a group of countries which share a common television scanning standard and a common digital hierarchy, and may or may not be in geographical proximity. The term "inter-regional" is used here to describe connections between groups of countries which have different television scanning standards and/or different digital hierarchies.
Note 1 - The hypothetical reference connection shown in Figure 1/H.110 contains the basic transmission elements, but is incomplete because switching has been excluded and the local ends and parts of the national network at each end of the connection have been left unspecified.

Note 2 - Because the arrangements of transmission systems interconnecting regions using different digital hierarchies have not yet been standardized, and because videoconferencing is likely to be a minority service in such transmission systems, it seems prudent to consider videoconference connections both where the primary hierarchical level on the inter-regional link is 1.5 Mbit/s and where it is 2 Mbit/s.

In Figure 2b/H.110, the change between 2048 kbit/s and 1544 kbit/s transmission is placed at the 2048 kbit/s end of the long international network. The long distance part of the connection is thus operated at the lower bit rate.

Where the international network is provided on a system which uses the 2048 kbit/s hierarchy, Figure 2c/H.110 maintains the efficiencies offered by the arrangement shown in Figure 2b/H.110, by making available the six vacated time slots for other use.

Figure 2d/H.110 offers the possibility of improved picture quality compared with Figures 2b/H.110 and 2c/H.110 by making full use of the available 2048 kbit/s for the videoconferencing signal. This arrangement would require a 2048 kbit/s codec compatible with 525-line Video Standards, or the use of an external standards converter. This is for further study.

Note 3 - The lengths which have been assigned to the parts of the connections have been arbitrarily chosen, but have some consistency with existing CCITT and CCIR Recommendations. They are intended to be representative of long international connections, but not the longest possible.
The lengths will likely require revision when studies on the error rates of digital paths have progressed to the stage when the error rates of the paths used in the connections can be predicted.

Note 4 - The propagation delay is one of the main factors to be studied based upon the structures and lengths of the connections in Figures 1/H.110, 2/H.110 and 3/H.110. However, in the absence of subjective test results, the specification of requirements for videoconferencing connections must await further study. This study and particularly operational experience are required to determine the extent to which Recommendation G.114, which applies to telephone connections, relates to videoconferencing connections.

Note 5 - In Figures 1/H.110 and 3/H.110, the codecs may be located anywhere within the international or national networks including the international gateway or the customer's premises.

Note 6 - The extensions beyond the codec shown as A or D in Figures 1/H.110 and 3/H.110 may include wideband analogue or high-speed digital transmission systems on terrestrial bearers. It is not expected that these transmission systems will have any significant influence on the quality of the picture or sound, or on the propagation delay, other than that due to their length.

Note 7 - For inter-regional operation, television standards conversion between 525-line and 625-line video signals may be required. This conversion may be performed by the codecs themselves, or provided by external equipment.

Note 8 - The arrangements shown in Figure 2/H.110 provide for the simplest means of transmission. More complex means are possible and are not precluded.

Note 9 - The hypothetical reference connection shown in Figure 3/H.110 is of a more complex type than the connection shown in Figure 1/H.110, in that it includes codecs in cascade and, possibly, an external Television Standards Converter.
The picture quality attainable with these more complex connections may be degraded with respect to that attainable using the connection illustrated in Figure 1/H.110. This and other aspects of the more complex connection must be studied further.

Figure 1/H.110 and symbols for Figure 1/H.110

Figure 2/H.110 and symbols for Figure 2/H.110

Figure 3/H.110 and symbols for Figure 3/H.110

Recommendation H.120
CODECS FOR VIDEOCONFERENCING USING PRIMARY DIGITAL GROUP TRANSMISSION
(Malaga-Torremolinos, 1984; amended at Melbourne, 1988)

The CCITT,

considering

(a) that there is growing evidence of a customer demand for a videoconference service;

(b) that circuits to meet this demand can, at present, be provided effectively by digital transmission using the primary digital group;

(c) that the existence of different digital hierarchies and different television standards in different parts of the world complicates the problems of specifying coding and transmission standards for international connections;

(d) that the eventual use of switched digital transmission networks should be taken into account,

appreciating

that rapid advances are being made in research and development of video coding and bit-rate reduction techniques which may lead to further Recommendations being proposed for videoconferencing at bit rates which are multiples or submultiples of 384 kbit/s during subsequent study periods so that this Recommendation may be considered as the first of an evolving series of Recommendations,

and noting

that it is a basic objective of the CCITT to recommend a unique solution for international connections as far as possible,

recommends

that the codecs having signal processing and interface characteristics described in §§ 1, 2 and 3 below, should be used for international videoconference connections.

Note - Codecs of types other than those described in this Recommendation are not precluded.
Introduction

Section 1 of this Recommendation specifies the codec developed for operation with the 625-line, 50 field/s television standard and the 2048 kbit/s primary digital group. Its architecture has been chosen to permit variations in the detailed design of certain of the functional elements having the greatest influence on the picture quality. This enables future developments, aimed at improving the performance, to be incorporated without affecting the ability of different coders and decoders to interwork. For this reason, no details are given of such items as motion detectors or spatial and temporal filters. The Recommendation confines itself to the details necessary to enable a decoder correctly to interpret and decode the received signals.

The annexes to § 1, which can be found at the end of this Recommendation, give details of some additional optional features which may be provided to supplement the basic design.

Under the general heading of codecs not requiring separate television standards conversion when used on interregional connections, § 2 describes a version of the codec for 525-line, 60 field/s and 1544 kbit/s operation which also provides automatic television standards conversion when connected to the version of the codec described in § 1 via a re-multiplexing unit (to convert between frame structures defined in §§ 2.1 and 2.3 of Recommendation G.704) at the junction of the 2048 and 1544 kbit/s digital paths. This codec is also suitable for use within regions using the 525-line, 60 field/s television standard and 1544 kbit/s transmission.

Other implementations of § 2 are to be studied, for example:

- a version of the codec for 625-line, 50 field/s and 2048 kbit/s operation capable of interworking with the codec described in § 3;

- a version of the codec for 525-line, 60 field/s and 2048 kbit/s operation capable of interworking with the codec described in § 1.
Section 3 of the Recommendation describes a codec for intra-regional use in 525-line, 60 field/s and 1544 kbit/s regions.

The frame structures associated with the codecs described in this Recommendation are to be found in Recommendation H.130.

As the codecs are complex items using combined intraframe and interframe picture-coding techniques which tend to be known only to specialists, Appendix I is provided giving a brief outline of the principles involved in the codecs of §§ 1 and 2.

1 A codec for 625-line, 50 field/s and 2048 kbit/s transmission for intra-regional use and capable of interworking with the codec of § 2

_________________________
The term "intra-regional" is used here to describe connections within a group of countries which share a common television scanning standard and a common digital hierarchy, and may or may not be in geographical proximity. The term "inter-regional" is used here to describe connections between groups of countries which have different television scanning standards and/or different digital hierarchies.

1.1 Scope

Section 1 defines the essential features of a codec for the digital transmission, at 2048 kbit/s, of signals for videoconference or visual telephone service in accordance with Recommendation H.100. The video input to the coder and output from the decoder is a 625-line, 50 field/s signal, according to the "Class a" standard of Recommendation H.100, or alternatively, the 313-line, 50 field/s signal of the "Class b" standard. Provision is also made for a sound channel and optional data channels. A brief description of the operation of the codec is given in Appendix I.

The Recommendation starts with a brief specification of the codec (§ 1.2) and a description of the video interface. This is followed by details of the source coder (§ 1.4) which provides analogue-to-digital conversion followed by recoding with substantial redundancy reduction in the face-to-face mode.
Paragraph 1.5 deals with the video multiplex coder which inserts instructions and addresses into the digitized video signal to control the decoder so that it correctly interprets the signals received. Paragraph 1.6 describes the transmission coder which arranges the various digital signals (video, sound, data, signalling) into a form compatible with Recommendation G.732 for transmission over 2048 kbit/s digital paths. Paragraph 1.7 describes optional forward error correction facilities.

Provision is made in the digital frame structure for the inclusion of other optional facilities such as a graphics mode, encryption and multipoint conferencing. Details of such facilities as are at present available are given in the annexes to this Recommendation.

1.2 Brief specification

1.2.1 Video input/output

The video input and output are standard 625-line, 50 field/s colour or monochrome television signals. The colour signals are in, or converted to, component form. Colour and monochrome operation are fully compatible.

1.2.2 Digital output/input

The digital output and input are at 2048 kbit/s, compatible with the frame structure of Recommendation G.704.

1.2.3 Sampling frequency

The video sampling frequency and the 2048 kHz network clock are asynchronous.

1.2.4 Coding techniques

Conditional replenishment coding supplemented by adaptive digital filtering, differential PCM and variable-length coding are used to achieve low bit-rate transmission.

1.2.5 Audio channel

An audio channel using 64 kbit/s is included. At present, coding is A-law according to Recommendation G.711, but provision is made for future use of more efficient coding.

1.2.6 Mode of operation

The normal mode of operation is full duplex.

1.2.7 Codec-to-network signalling

An optional channel for codec-to-network signalling is included. This conforms to emerging ideas in CCITT for switching 2-Mbit/s paths in the ISDN.

1.2.8 Data channels

Optional 2 x 64 kbit/s and 1 x 32 kbit/s data channels are available.
These are used for video if not required for data.

1.2.9 Forward error correction

Optional forward error correction is available. This is required only if the long-term error rate of the channel is worse than 1 in 10^6.

1.2.10 Additional facilities

Provision is made in the digital frame structure for the future introduction of encryption, a graphic mode and multipoint facilities.

1.2.11 Propagation delay

When the coder buffer is empty and the decoder buffer full, the coder delay is less than 5 ms and the decoder delay is 130 ± 30 ms at 2 Mbit/s or 160 ± 36 ms when only 1.5 Mbit/s are in use.

1.3 Video interface

The normal video input is a 625-line, 50 field/s signal in accordance with CCIR Recommendation 472. When colour is being transmitted, the input (and output) video signals presented to the analogue/digital convertors (and from the digital/analogue convertors) are in colour-difference component form. The luminance and colour-difference components, E'Y, (E'R - E'Y) and (E'B - E'Y), are as defined in CCIR Report 624. The analogue video input (and output) interface with the codec may be in the form of colour-difference components, colour components (R, G, B) or as a composite colour signal.

The video interface is as recommended in CCIR Recommendation 656.

Optionally, any other video standard which can be converted to give 143 active lines per field may be used.

1.4 Source coder

1.4.1 Luminance component or monochrome

1.4.1.1 Analogue-to-digital conversion

The signal is sampled to produce 256 picture samples per active line (320 samples per complete line). The sampling pattern is orthogonal and line, field and picture repetitive. For the 625-line input, the sampling frequency is 5.0 MHz, locked to the video waveform.

Uniformly quantized PCM with 8 bits/sample is used.

Black level corresponds to level 16 (00010000).

White level corresponds to level 239 (11101111).

PCM code words outside this range are forbidden (the codes being used for other purposes).
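The sampling figures quoted in 1.4.1.1 can be cross-checked with a little arithmetic. The following Python sketch assumes the usual 2:1 interlace of the 625-line, 50 field/s standard (that is, 25 pictures per second, which the text does not state explicitly); the constant and function names are illustrative only.

```python
# Cross-check of the luminance sampling figures quoted in 1.4.1.1.
LINES_PER_PICTURE = 625
FIELDS_PER_SECOND = 50
FIELDS_PER_PICTURE = 2            # assumption: 2:1 interlace
SAMPLES_PER_COMPLETE_LINE = 320
SAMPLES_PER_ACTIVE_LINE = 256

# 625 lines x 25 pictures/s gives the line frequency in Hz.
line_frequency_hz = LINES_PER_PICTURE * FIELDS_PER_SECOND // FIELDS_PER_PICTURE
# 320 samples per complete line, locked to the line frequency.
sampling_frequency_hz = line_frequency_hz * SAMPLES_PER_COMPLETE_LINE

BLACK_LEVEL = 16    # 00010000
WHITE_LEVEL = 239   # 11101111


def is_permitted_pcm_level(level):
    """True if the level lies in the range allowed for picture samples."""
    return BLACK_LEVEL <= level <= WHITE_LEVEL


def to_code_word(level):
    """Return the 8-bit PCM code word for a permitted level."""
    if not is_permitted_pcm_level(level):
        raise ValueError("level outside the permitted range 16 to 239")
    return format(level, "08b")


if __name__ == "__main__":
    print(line_frequency_hz, sampling_frequency_hz)
    print(to_code_word(BLACK_LEVEL), to_code_word(WHITE_LEVEL))
```

Under the interlace assumption, the result agrees with the 5.0 MHz sampling frequency given in the text, and the two code words match the binary values quoted for black and white level.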
For the purposes of prediction and interpolation, the final picture element in each active line (i.e. picture element 255) is set to level 128 in both encoder and decoder.

_________________________
Footnote to the propagation delay figures of 1.2.11: These are typical figures. The delays depend on the detailed implementation used.

In all arithmetic operations, 8-bit arithmetic is used and the bits below the binary point are truncated at each stage of division.

1.4.1.2 Pre- and post-filtering

In addition to conventional anti-aliasing filtering prior to analogue-to-digital conversion, a digital transversal filtering operation is carried out on the 625-line signal to reduce the vertical definition of the picture prior to conditional replenishment coding. As a result of this process, 143 active lines per field are used instead of the 287½ active lines of the 625-line signal, although the effective vertical definition is greater than one-half of that of a normal 625-line display. An interpolation process in the decoder restores the 625-line signal waveform.

1.4.1.3 Conditional replenishment coding

A movement detector identifies clusters of picture elements which are deemed to be moving. The basic feature is a frame memory which stores 2 fields of 143 lines, each line containing 256 addressable points. The memory is updated at the picture rate and differences between the incoming signal and the corresponding stored values are used to determine the moving area in the coder. A similar frame memory must exist at the decoder and be similarly updated under the control of addressing information received from the coder.

It is not necessary to specify the techniques used for movement detection because they do not affect interworking, although they do affect the resultant picture quality.

Detected moving areas are transmitted by differential PCM with a maximum of 16 quantization levels. The first picture element in each moving area is transmitted by PCM. Variable-length coding is used on the DPCM code words.
The first picture element of each cluster and the complete PCM lines, when they are transmitted to provide systematic or forced updating, are coded in accordance with § 1.4.1.1.

1.4.1.3.1 DPCM prediction algorithm

The algorithm used for DPCM prediction is:

X = [Formula Deleted],

where X is the sample being predicted. (See Figure 1/H.120.)

[Figure 1/H.120]

For the purpose of prediction, line and field blanking are assumed to be at level 128 (out of 256).

1.4.1.3.2 Quantization law and variable-length coding

511 input levels are quantized to a maximum of 16 output levels. The quantizer does not assume the use of modulo 256 arithmetic.

The quantization law and associated variable-length codes which are used for both luminance and colour-difference picture elements in moving areas which are not horizontally subsampled are given in Table 1/H.120.

TABLE 1/H.120
Code table for non-horizontally-subsampled moving areas

________________________________________________________________
Input levels    Output levels   Variable-length code    Code No.
________________________________________________________________
-255 to -125        -141        1 0 0 0 0 0 0 0 0 1        17
-124 to  -95        -108        1 0 0 0 0 0 0 0 1          16
 -94 to  -70         -81        1 0 0 0 0 0 0 1            15
 -69 to  -49         -58        1 0 0 0 0 0 1              14
 -48 to  -32         -39        1 0 0 0 0 1                13
 -31 to  -19         -24        1 0 0 0 1                  12
 -18 to   -9         -13        1 0 1                      10
  -8 to   -1          -4        1 1                         9
   0 to    7          +3        0 1                         1
   8 to   17         +12        0 0 1                       2
  18 to   30         +23        0 0 0 1                     3
  31 to   47         +38        0 0 0 0 1                   4
  48 to   68         +57        0 0 0 0 0 1                 5
  69 to   93         +80        0 0 0 0 0 0 1               6
  94 to  123        +107        0 0 0 0 0 0 0 1             7
 124 to  255        +140        0 0 0 0 0 0 0 0 1           8
________________________________________________________________

The end-of-cluster code is 1 0 0 1 and is designated as code number 11.
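The table lookup of 1.4.1.3.2 can be sketched as follows. This is a minimal illustration, assuming the prediction difference is already available as an integer in the range -255 to 255; the names `TABLE_1`, `quantize` and `is_prefix_free` are hypothetical and the code says nothing about how a real codec forms the prediction.

```python
# Illustrative sketch of Table 1/H.120; all names are hypothetical.
from __future__ import annotations

# Each row: (lowest input, highest input, output level, code, code No.)
TABLE_1 = [
    (-255, -125, -141, "1000000001", 17),
    (-124, -95, -108, "100000001", 16),
    (-94, -70, -81, "10000001", 15),
    (-69, -49, -58, "1000001", 14),
    (-48, -32, -39, "100001", 13),
    (-31, -19, -24, "10001", 12),
    (-18, -9, -13, "101", 10),
    (-8, -1, -4, "11", 9),
    (0, 7, 3, "01", 1),
    (8, 17, 12, "001", 2),
    (18, 30, 23, "0001", 3),
    (31, 47, 38, "00001", 4),
    (48, 68, 57, "000001", 5),
    (69, 93, 80, "0000001", 6),
    (94, 123, 107, "00000001", 7),
    (124, 255, 140, "000000001", 8),
]
END_OF_CLUSTER = "1001"  # code No. 11


def quantize(difference: int) -> tuple[int, str]:
    """Return (output level, variable-length code) for a DPCM difference."""
    for low, high, output, code, _number in TABLE_1:
        if low <= difference <= high:
            return output, code
    raise ValueError("prediction difference must lie in -255..255")


def is_prefix_free(codes: list[str]) -> bool:
    """True if no code word is a prefix of another, so decoding is unambiguous."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)
```

For example, a difference of -20 falls in the row -31 to -19, so `quantize(-20)` gives the output level -24 and the code 1 0 0 0 1. Checking `is_prefix_free` over the sixteen codes plus the end-of-cluster code shows why the decoder can separate code words without delimiters.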
The end-of-cluster code is omitted at the end of the last cluster in a line irrespective of whether it is a luminance cluster or a colour-difference cluster.

1.4.1.4 Subsampling

As the buffer fills, horizontal subsampling and field/field subsampling are introduced.

1.4.1.4.1 Horizontal subsampling

Horizontal subsampling is carried out only in moving areas. Normally, in this mode, only even elements are transmitted on even numbered lines and odd elements on odd numbered lines. This gives rise to a line quincunx pattern in moving areas. Omitted elements are interpolated in the decoder by averaging the two horizontally adjacent elements. Interpolated picture elements are placed in the frame stores.

A moving area cluster will always start with a PCM value and finish with a transmitted DPCM picture element, even during subsampling. This means that in some instances, the transmitted cluster needs to be extended by one element in comparison with the moving area declared by the movement detector. At the end of the active line, however, this cannot occur as clusters must not extend into blanking, so cluster shortening by one element can be necessary.

Adaptive element subsampling allows the transmission of normally omitted elements, either to remove interpolation errors or to provide a softer switch to subsampling and thus improve the picture quality. The signalling of the extra elements is achieved by using, on horizontally subsampled lines only, 8 quantizing levels for normally transmitted elements and the remaining 8 levels for the extra elements. Also, a cluster can finish either on a normally transmitted element or an "extra" element.

During horizontally subsampled lines, the quantization law and variable-length code shown in Table 2/H.120 will be used for both luminance and colour-difference samples in moving areas.
TABLE 2/H.120
Quantization law and variable-length code table

____________________________________________________________________________________________
      Quantization                           Variable-length codes
____________________________________________________________________________________________
Input range    Output levels   Normal elements    Code No.   Extra elements         Code No.
____________________________________________________________________________________________
-255 to -41        -50         1 0 0 0 0 0 0 1       15      1 0 0 0 0 0 0 0 0 1       17
 -40 to -24        -31         1 0 0 0 0 1           13      1 0 0 0 0 0 0 0 1         16
 -23 to -11        -16         1 0 1                 10      1 0 0 0 0 0 1             14
 -10 to  -1         -5         1 1                    9      1 0 0 0 1                 12
   0 to  +9         +4         0 1                    1      0 0 0 1                    3
  10 to  22        +15         0 0 1                  2      0 0 0 0 0 1                5
  23 to  39        +30         0 0 0 0 1              4      0 0 0 0 0 0 0 1            7
  40 to 255        +49         0 0 0 0 0 0 1          6      0 0 0 0 0 0 0 0 1          8
____________________________________________________________________________________________

With regard to prediction, if element A is a non-transmitted element in a moving area, it is replaced by As (see Figure 1/H.120); if element D is part of a subsampled moving area, and not transmitted in the current frame, it is replaced by C.

1.4.1.4.2 Field/field subsampling

Either field can be omitted. In the omitted field, interpolation takes place only in those parts of the picture which are estimated to be moving. "Stationary" areas remain unchanged.

The estimated moving areas are formed from an OR function on the moving areas in the past and future fields, as shown in Figure 2/H.120. In the figure, x is a moving element if a OR b OR c OR d are moving.

[Figure 2/H.120]

For the purpose of field interpolation, PCM lines are considered as non-moving and field blanking is assumed to be at a level of 128 out of 256.

In the interpolator for monochrome or luminance signals, the operations [Formula Deleted] are carried out before the combined average is taken.
Thus x = [Formula Deleted]

The interpolated values are placed in the frame store.

1.4.2 Colour-difference components

1.4.2.1 Analogue-to-digital conversion

The signal is sampled to produce 52 samples per active line (64 samples per complete line). The sampling pattern is orthogonal and line-, field- and picture-repetitive. For the 625-line input, the sampling frequency is 1.0 MHz, locked to the video waveform.

The (E'R - E'Y) and (E'B - E'Y) samples are sited so that the centre of the first colour-difference sample on any line is co-sited with the centre of the third luminance sample (addressed as number 2).

The (E'R - E'Y) and (E'B - E'Y) signals are stored and transmitted on alternate lines of the coded picture. The first active line of Field No. 1 contains (E'B - E'Y) and the first active line of Field No. 2 contains (E'R - E'Y). The colour-difference signal not being transmitted during any line is obtained at the decoder by interpolation.

The vertical filtering (see § 1.4.2.2) is arranged so that the effective vertical positions of the colour-difference samples in each of the 286 active lines coincide with those of the corresponding luminance samples.

Uniformly quantized PCM with 8 bits/sample is used.

The (E'R - E'Y) and (E'B - E'Y) signals are quantized using ± 111 steps with zero signal corresponding to level 128. The analogue video signals are amplitude-limited so that the digitized signals do not go outside that range (corresponding to levels 16 to 239). The video levels are set so that a 100/0/75/0 colour bar signal (see CCIR Recommendation 471 for explanation of nomenclature) will occupy levels 17 to 239.

As for the luminance signal, forbidden PCM code words are available for purposes other than transmitting video sample amplitudes.
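The alternate-line arrangement of 1.4.2.1 can be sketched as follows. This is an illustration only: the function names and the "B-Y"/"R-Y" labels are hypothetical, lines are counted from 0 within each field, and 143 active lines per field are assumed as in 1.3 and 1.4.1.3.

```python
# Illustrative sketch; all names are hypothetical, not part of H.120.
from __future__ import annotations

ACTIVE_LINES_PER_FIELD = 143  # from 1.3 and 1.4.1.3


def component_on_line(field: int, line_in_field: int) -> str:
    """Return which colour-difference signal a coded line carries.

    field is 1 or 2; line_in_field counts active lines from 0.
    "B-Y" stands for (E'B - E'Y) and "R-Y" for (E'R - E'Y).
    """
    if field not in (1, 2):
        raise ValueError("field must be 1 or 2")
    if not 0 <= line_in_field < ACTIVE_LINES_PER_FIELD:
        raise ValueError("line outside the active field")
    # Field No. 1 starts with B-Y, Field No. 2 starts with R-Y.
    first, other = ("B-Y", "R-Y") if field == 1 else ("R-Y", "B-Y")
    return first if line_in_field % 2 == 0 else other


def lines_per_component(field: int) -> dict[str, int]:
    """Count how many active lines of each component a field contains."""
    counts = {"B-Y": 0, "R-Y": 0}
    for line in range(ACTIVE_LINES_PER_FIELD):
        counts[component_on_line(field, line)] += 1
    return counts
```

Because each field has an odd number of active lines, the component that starts the field occupies 72 lines and the other 71, which agrees with the line counts given in 1.4.2.2.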
1.4.2.2 Pre- and post-filtering

In addition to conventional anti-aliasing filtering prior to analogue-to-digital conversion, a digital transversal filtering operation is carried out on the 625-line signal to reduce the vertical definition of the picture prior to conditional replenishment coding. As a result of this process, 72 active lines of (E'R - E'Y) and 71 active lines of (E'B - E'Y) are used in Field No. 2 instead of 287½ active lines per field of a 625-line signal. Similarly, Field No. 1 contains 72 active lines of (E'B - E'Y) and 71 lines of (E'R - E'Y). An interpolation process in the decoder restores the 625-line signal waveforms.

1.4.2.3 Conditional replenishment coding

Coloured moving areas are detected, coded and addressed separately from the luminance moving areas, but the same principles are employed.

Detected moving areas are transmitted by differential PCM with a maximum of 16 quantization levels. The first picture element in each moving area is transmitted by PCM. Variable-length coding is used on the DPCM code words. Complete PCM lines are transmitted to provide systematic and forced updating coincident with luminance PCM lines.

1.4.2.3.1 DPCM prediction algorithm

The algorithm used for colour-difference signals is:

x = A (see Figure 1/H.120)

1.4.2.3.2 Quantization law and variable-length coding

As for luminance component (see §§ 1.4.1.3.2 and 1.4.1.4.1).

1.4.2.4 Subsampling

Horizontal subsampling is carried out in exactly the same way as for the luminance signal, including adaptive element subsampling.

Field/field subsampling of the colour-difference signals is also similar to that of the luminance signal. Either field can be omitted and, in the omitted field, interpolation takes place only in those parts of the picture which are estimated to be moving. Stationary areas remain unchanged. The estimated moving areas are formed by an OR function on moving areas in past and future fields in the same manner as for luminance (§ 1.4.1.4.2).
For colour-difference signals, the interpolated value of x is [Formula Deleted] Field 1 or Field 2, respectively.

Both field and horizontal subsampling take place simultaneously with subsampling of the luminance signal and they are signalled to the decoder in the same way.

1.5 Video multiplex coding

1.5.1 Buffer store

The size of the buffer store is defined at the transmitting end only and is 96 kbits. Its delay is approximately equal to the duration of one picture (40 ms). At the receiving end, the buffer must be of at least this length, but in some implementations of the decoder it may be longer.

1.5.2 Video synchronization

The method used for video synchronization permits the retention of the picture structure. The required information is transmitted in the form of line start and field start codes (LST and FST).

1.5.2.1 Line start code

The line start code includes a synchronization word, a line number code and a digit to signal the presence of element subsampling. It has the format:

0 0 0 0 0 0 0 0 | 0 0 0 0 1 0 0 0 | "S" | 3-bit line No. code

"S" is a 1 if horizontal subsampling occurs on the TV line following the line start code. "S" is a "don't care" condition on empty or PCM lines.

The line number code comprises the three least significant digits of the line number, where Line 0 = first active line of Field 1 and Line 144 = first active line of Field 2. Lines numbered 143 and 287 are non-coded lines, used for field synchronization and line number continuity.

1.5.2.2 Field start code

There are two field start codes, FST-1 and FST-2, where the first line of the field following FST-2 is interlaced between the first two lines of the field following FST-1. FST-1 indicates the start of the first field, starting with line number 0. FST-2 indicates the start of the second field, starting with line number 144, as shown in Figure 3/H.120.

[Figure 3/H.120]
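Since the field start code is built from line start codes, the line start code format of 1.5.2.1 can be sketched as a bit string. This is an illustration under stated assumptions: the names are hypothetical, bits are written in the left-to-right order in which the format is printed, and the 3-bit line number code is assumed to be sent most significant digit first, which the text does not state.

```python
# Illustrative sketch; all names are hypothetical, not part of H.120.
LINE_SYNC_WORD = "0000000000001000"  # 16-bit synchronization word from 1.5.2.1
LAST_LINE_NUMBER = 287               # highest line number mentioned in 1.5.2.1


def line_start_code(line_number: int, subsampled: bool) -> str:
    """Build a line start code as a string of '0' and '1' characters.

    subsampled sets the "S" digit; on empty or PCM lines its value is
    a "don't care" condition, so either value is acceptable there.
    """
    if not 0 <= line_number <= LAST_LINE_NUMBER:
        raise ValueError("line number must lie in 0..287")
    s_digit = "1" if subsampled else "0"
    # Three least significant digits of the line number
    # (assumed here to be sent most significant digit first).
    line_code = format(line_number & 0b111, "03b")
    return LINE_SYNC_WORD + s_digit + line_code
```

With these assumptions every line start code is 20 bits long, and lines 0 and 144 both carry the line number code 0 0 0, which is why the field start codes are needed to tell the two fields apart.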
Each field start code comprises a line start code, followed by an 8-bit word, followed by the line start code of the first line of the next field. The field start code is given in Figure 4/H.120.

[Figure 4/H.120]

For FST-1, F = 1 and for FST-2, F = 0. A = 0 for normal operation. If required, A = 1 is used to signal that the buffer state is less than 6 kbits (used in switched multipoint applications). S is the subsampling digit as defined in § 1.5.2.1.

Field subsampling is signalled by two consecutive field start codes of the same number. For example:

[Formula Deleted]