# 行政院國家科學委員會專題研究計畫成果報告 具延遲處理器及訊號點對應器之迴旋碼的進一步研究 Further Study on Convolutional Codes with a Delay Processor and a Signal Mapper 計畫編號: NSC 90-2213-E-002-064 執行期間:90年8月1日至91年7月31日 主持人:林茂昭 國立臺灣大學電信所 #### 一、中文摘要 在[1,2]中提及一種迴旋碼 T,其架構為在低編碼速率迴旋碼 C 後接上一個延遲處理器(delay processor)及訊號點對應器(signal mapper),因為延遲處理器的加入,使得該迴旋碼 T 之限制長度(constraint length)比低編碼速率迴旋碼 C 大很多(即使迴旋碼 C 的限制長度很小),在此情形下,迴旋碼 T 可具有更大的自由距離,其解碼可使用 C 的籬柵(trellis)及解碼過的訊息來解碼,然而,此次佳化解碼(suboptimally decoding) [1,2]會導致大錯誤係數,使得除錯性能降低。再者,在短封包傳輸通訊中,過多的補零動作也會使得編碼速率因而降低。 在此計畫中,我們使用重覆性解碼及 Tail Biting 來克服消除上述次佳化解碼所造成的錯誤係數及編碼速率降低問題,並專注於改良次佳化解碼[1,2]。研究成果顯示我們所設計的籬柵碼在短碼長時比二位元渦輪碼具有更佳除錯性能表現。除此之外,我們還設計此籬柵碼之解碼電路來驗證 Tail Biting T 之優越性。 ## 英文摘要 In [1,2], a convolutional coding scheme implemented by serially which is concatenating a delay processor and a signal mapper to the encoder of a conventional convolutional code C was proposed. Since the introduction of the delay processor, the constraint length of resultant code T is large even if the constraint length of C is short. In this manner, T can achieve large free distance and can be suboptimally decoded by using the trellis of C with the help of some previously decoded information. However, the suboptimum decoding used in [1] and [2] results in a large error coefficient and hence degrades the error performance. Moreover, for short packet transmission, the rate loss resulted from zero-tail method is large. In this project, we use iterative decoding and the tail-biting technique to solve these problems. We will focus on improving the suboptimal decoding used in [1] and [2]. Binary codes constructed from our design has better error performance and lower decoding complexity as compared to turbo codes [4] in case of short codeword length. We also provide circuit design of the suboptimal decoder of the proposed trellis code to verify its advantage. ## 二、計畫的緣由與目的(Goals) The encoding of trellis codes given in [1,2] is shown in Fig.1. It is already known that by a trellis code T which is constructed by using the encoder of a convolutional code C with short constraint length followed by a delay processor and a signal mapper is equivalent to a trellis code with large constraint length. In [1] and [2], a bound on free distance of T is derived and a suboptimum decoding is used to achieve error performance guaranteed by the free distance bound. However, the suboptimum decoding used in [1] and [2] will results in a large error coefficient and dense distance spectrum and hence the associated error performance is very poor at low to moderate signal-to-noise ratio conditions. overcome these problems, in this project we propose an improved suboptimal decoding based on the concept of turbo decoding (iterative decoding). When using the delay processor given in [1,2] and not using the interleaver as used in [1], we encounter severe error propagation. To mitigate this error propagation, we employ various weight factors to reflect the various feedback extrinsic influences of the information among various coding level. Moreover, for short packet transmission, the rate loss resulted from terminating the trellis of T in the all zero state by appending some zeros to a message frame, i.e., the zero-tail method, is large. The rate loss problem resulted from zero-tail method can be avoided by using the tail-biting design [3]. If directly apply the conventional tail-biting design as shown in [3] to T, then the decoding is extremely difficult. We propose an alternative method to design the tail-biting trellis code of T. With this, both the encoding and decoding can be easily implemented. Simulation results show that the trellis code with a delay processor and a signal mapper can achieve performance better than that of binary turbo code at moderate to high signal-to-noise ratios (or moderate to low bit error rate) in case of short decoding delay. ## 三、研究方法與成果(Methods and Results) #### **Tail-Biting Trellis Codes** For trellis codes with a delay processor, i.e., T, conventional zero-tail encoded trellis code will result in a lot of fractional rate loss for short packet transmission since the number of memory bits in the encoder of $T, \nu' = \nu + \sum_{j=1}^{m} \tau_{j}$ , is very large for typical values used in [2]. Hence tail-biting trellis code is highly desired for short packet transmission. The trellis of tail-biting trellis code is different from that of zero-tail trellis code. For zero-tail code, the trellis starts from the zero state and ends in the zero state while for tail-biting code, the trellis can start from any possible state and end in the starting state. To obtain the associated tail-biting code of T, we can use the linear feed forward shift register of the encoder of C for which the generator matrix is $G_{\epsilon}(D)Q(D)$ , where $G_{\epsilon}(D)$ and Q(D) are transfer function matrices of the convolutional code C and the delay processor, respectively. With the zero-tail encoding, the shift register is initialized with zeros. To achieve a tail-biting code T in a conventional way[3], the shift register is initialized with the last $v' = v + \sum_{j=1}^{m} \tau_{j}$ message bits. This ensures that the encoder of C will start from and end in the same state [3]. However, with this tail-biting method, it is difficult to decode T efficiently. ## 1. Encoding We now propose to design the tail-biting trellis code T' in a way which is suitable for the suboptimal decoding that will be described later. Let the packet size of message bits be $L \times \lambda \times r$ , i.e., $L \times \lambda$ message symbols, where L and $\lambda$ are positive integers. To obtain T', we can initialize the linear feed forward shift register of the encoder of C with last $\nu$ message bits. We then feed the whole message frame to the encoder of C as usual and arrange the resultant code bits $v_j(t)$ , j = 1,...,m and $t = 0,1,...,L\lambda - 1$ , in an m×L $\lambda$ matrix. Let the entry at the jth row and tth column of the $m \times L \lambda$ matrix be $s_i(t-1)$ . The relation between $v_i(t)$ and $s_i(t)$ is as follows $v_j(t) = s_j((t + \sum_{i=j}^m \lambda_i) \mod L\lambda)$ $$= s_{j}((t + \tau_{j}) \mod L\lambda) \quad \text{for} \quad 1 \le j \le m$$ Then $s(0),...,s(L\lambda-1)$ are fed to the signal mapper to yield the desired codeword $z(0),...,z(L\lambda-1)$ of T". If we define the state of T'' as the contents of the memory elements of the convolutional encoder of C and the delay processor, then the encoder of T" will start from and end in the same state and hence the code T'' is indeed a tail-biting trellis code. We prove the following theorem. **Theorem 1.** If L and $\lambda$ are large enough and T is noncatastrophic, then the minimum distance $d_{\min}$ of the associated tail-biting code T' is the same as the free distance of Tupper and bounded by $d_{\min}^{u} = \min_{\substack{N \in C, V(0) \neq 0 \\ N \in C, V(0) \neq 0}} \sum_{j=1}^{m} \sum_{t=0}^{\infty} V_{j}(t) \delta_{j}.$ #### 2. Decoding The proposed decoder consists of two modules. One is the decoder of C and the other is a demapper. The decoder of C employs the tail-biting trellis of C where the "time axis" has the topology of a circle rather than that of an interval. With this circular trellis, we can run the sliding window Log-BCJR algorithm [5] continuously without any boundary effect as the trellis level or decoding time increases. The decoder of C will yield extrinsic value $L_{\alpha}^{(I)}(v_{\alpha}(t))$ of code bits of C after a window size of $\lambda$ time units (symbols) rather than a block size of $L\lambda$ time units (symbols). Note that these code bits are also used as labeling bits $s_{i}(t)$ of the signal mapper and hence these extrinsic values can be used as a priori values $L_e^{(I)}(s_i(t))$ of these labeling bits. The demapper uses these a priori values and the received symbol y(t) which is the error-corrupted form of z(t) as input and produces the a posteriori value and extrinsic value of labeling bits $s_i(t)$ as described in the following. Let $||y-z||^2$ represent the squared Euclidean distance between symbol y and symbol z. If y or z is a binary m-tuple, then a bit "0" must be replaced by "-1" and a bit "1" remains to be "1". For j = $1, 2, \dots, m \text{ and } i = 0, 1, \text{ define}$ $\operatorname{set} B'_{i} = \{w(s_{i}(t),...,s_{i}(t) =$ $i, s_{i+1}(t), ..., s_m(t)$ | $s_q(t) \in \{0,1\}$ for $q \neq j\}$ and function $$\max_{b \in B} (f(b)) = \ln \{ \sum_{j=1}^{J} \exp(f(b_j)) \} \approx$$ $$\max_{b \in B} (f(b)) + \delta(f(b_1), f(b_2), \dots f(b_J)),$$ where f is a function of b and $B = \{b_j \text{ for } 1 \le j \le J\}$ and $\delta(f(b_1), f(b_2), ..., f(b_J))$ is a correction term that can be computed recursively by using a simple lookup table [6]. If $L_e^{(f)}(s_j(t))$ , j=1,2,...,m are independent, then the a posteriori value $L^{(o)}(s_j(t))$ of labeling bit $s_j(t)$ at the demapper output can be calculated according to $$L^{(n)}(s_{j}(t)) \approx \max_{x \in B_{j}^{I}} \left\{ -\frac{\|y(t) - w(x)\|^{2}}{No} \right\} + \frac{1}{2} \sum_{i=1}^{m} \zeta_{i,j} \left[ 2s_{i}(t) - 1 \right] L_{c}^{(I)}(s_{i}(t)) \right\} - \max_{x \in B_{j}^{I}} \left\{ -\frac{\|y(t) - w(x)\|^{2}}{No} \right\} + \frac{1}{2} \sum_{i=1}^{m} \zeta_{i,j} \left[ 2s_{i}(t) - 1 \right] L_{c}^{(I)}(s_{i}(t)) \right\}$$ where $\zeta_{i,j} = 1$ for all i and j. If the a priori information of $s_i(t)$ is not available, then $L_e^{(I)}(s_i(t))$ equals zero. The independence of $L_r^{(l)}(s_i(t))$ , j=1,2,...,m may be achieved by using a large interleaver between the decoder of C and the demapper. However, we do not use such an interleaver since doing so will increase the decoding delay. Without the interleaver, severe propagation may occur. We can alleviate this problem by using suitable weighting factor $\zeta_{i,j}$ which reflects the amount of influence of a priori value $L_e^{(I)}(s_i(t))$ on the a posteriori value $L^{(o)}(s_i(t))$ . These weight factors can be represented by an mxm matrix $\Xi$ for which the entry at the *i*th row and jth column is $\zeta_{i,j}$ . The matrix of weighting factors, $\Xi$ , is dependent on the choices of signal mapper and in this chapter is designed by intuition in a trial-and-error way. The extrinsic values of these labeling obtained $L_{e}^{(O)}(s_{\perp}(t)) = L_{e}^{(o)}(s_{\perp}(t)) - L_{e}^{(I)}(s_{\perp}(t))$ . Then these values $L_e^{(O)}(s_i(t))$ are also used as intrinsic values for code bits of C and are fed to the sliding-window Log-BCJR decoder of C for further processing. #### 3. Performance Evaluation In this section, we use specific examples to verify the error performance by simulation. In our simulation, six iterations are used unless the number of iterations is mentioned. **Example 1**: Let r = 2, m = 4 and $\Omega = \{1,0\}^4$ . Let the 4-state (v = 2) generator matrix $G_c^1$ of C, the labeling $K_1$ and $\Xi_1$ be respectively given by $$G_c^1 = \begin{pmatrix} 3 & 0 & 1 & 3 \\ 0 & 3 & 3 & 2 \end{pmatrix} , K_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$ $$\Xi_1 = \begin{pmatrix} 1.00 & 0.83 & 0.83 & 0.83 \\ 0.77 & 1.00 & 0.83 & 0.77 \\ 0.77 & 0.50 & 1.0 & 0.71 \end{pmatrix}$$ It can be checked that $\{\delta_1, \delta_2, \delta_3, \delta_4\}$ = $\{1,2,3,4\}$ for $K_1$ . Consider cases (a) and (b) for which the parameters are given in Table 1. For both cases (a) and (b), the minimum distance $d_{\min}^{I}$ (for sufficiently large $\lambda$ and L) is 13. Simulation results of binary turbo code are shown in Fig.2 and Fig.3. We consider a random interleaver. 4-state component codes and Log-BCJR (Log-MAP) algorithm [5]. Fig.2 and Fig.3 show that the error performances of Examples 1(a) and 1(b) are better than those of binary turbo codes at moderate to high SNR in the AWGN or Rayleigh fading channels based on the same short codeword length. The inferior error turbo binary performance of codes considered here may be due to that both the "interleaver gain" and minimum distance are small. Note that for binary turbo codes, the Log-MAP algorithm must run twice for each iteration in the iterative decoding. contrast, for the proposed code, the Log-MAP algorithm runs only once for each iteration. #### 4. Circuit Design We complete a circuit design of the suboptimal decoder for a tail-biting code T which is composed of concatenating a rate 2/4 convolutional code followed by a 4-level delay processor and a signal mapper. We use L=16 and $\lambda=8$ respectively. The code length is 512. The design is suitable for implementation on the Altera FLEX10K200 FPGA chips. Performance measurement indicates the tail-biting code T is indeed superior to the binary turbo codes of comparable lengths. #### 四、結論與討論(Concluding Remarks) We study trellis codes with which constraint length have constructed from the scheme proposed in [1, 2]. We find better bounds on free distances than those derived in [1, 2]. We propose a decoding method based on the concept of iterative decoding to take advantage of the larger free distance bounds. We consider the tail-biting structure of this trellis code for short packet transmission. Simulations show that in case of short codeword length, the proposed tail-biting trellis coding can perform better than either the turbo coding or the conventional trellis coding. We also provide circuit design to verify advantage of the proposed code. #### 五、參考資料(References) - G. Hellstern, "Coded Modulation with Feedback Decoding Trellis Codes," in *Proc. IEEE Int. Conf. on Communications* (Geneva, Switzerland, May 1993), pp.1071-1075. - [2] J.Y. Wang and M.C. Lin, "On constructing trellis codes with large free distances and low decoding complexities," *IEEE Trans. Commun.*, vol.45, no.9, pp.1017-1020, Sept. 1997. - [3] H. H. Ma and J. K.Wolf, "On tail-biting convolutional codes," IEEE Trans. Commun., vol. COM-34, no. 2,pp. 104-111.Feb. 1986. - [4] C. Berrou and A. Glavieux, "Near optimum error correcting coding and decoding: Turbo-codes," *IEEE Trans. Commun.*, vol. 44, no. 10, pp. 1261-1271, Oct. 1996. - [5] P. Robertson, E. Villebrun and P. Hoeher, "A comparison of optimal and suboptimal MAP decoding algorithms operating in the log domain," in *Proc. ICC* '95,pp. 1009-1013. - [6] S. Le Goff, A. Glavieux and C. Berrou, "Turbo-codesand high spectral efficient modulation," in *Proc. ICC'94*, pp.1064-1070, 1004 ## 六、圖表 (Figures and Tables) Figure 1: The encoder of T which uses a delay processor and a signal mapper. Figure 2: BER of the proposed codes, binary turbo codes (TC), and the 256-state conventional convolutional code (CC) in the AWGN channel. Figure 3: BER of the proposed codes, binary turbo codes (TC), and the 256-state conventional convolutional code (CC) in the Rayleigh fading channel. | Ex: | $(\tau_1,,\tau_m)$ | λ,L | Codeword length | |-----|----------------------------------|-------|-----------------| | la | $(9\lambda,5\lambda,2\lambda,0)$ | 8,16 | 512 code bits | | 1b | $(9\lambda,5\lambda,2\lambda,0)$ | 16,16 | 1024 code bits | Table 1: parameters of the proposed tail-biting trellis codes