2D DCT compression in the switched-current technique

Abstract. The article presents a methodology for designing an analogue DCT compression processor using methods and strategies borrowed from digital circuit design: the row strategy, a standard digital router and automatic synthesis of the architecture from its description in the VHDL-AMS language. The correctness of the layout has been verified with post-layout simulations in which an example image is compressed using the discrete cosine transform. The processing quality has been compared with other solutions available in the literature by calculating the PSNR and Accuracy coefficients for the processed image. The article also shows how the PSNR coefficient changes with the level of applied compression.

Streszczenie. The article presents a methodology for designing an analogue DCT compression processor using methods and strategies of digital circuit design: the row strategy, a standard digital router and methods of automatic synthesis of the architecture from its description in the VHDL-AMS language. The correctness of the layout has been verified with post-layout simulations of processing an example image in a compression task using the discrete cosine transform. The processing quality has been compared with other solutions available in the literature by calculating the PSNR and Accuracy coefficients for the processed image. The article also presents the changes of the PSNR coefficient depending on the level of applied compression. (Kompresja dwuwymiarowa DCT w technice przełączanych prądów: two-dimensional DCT compression in the switched-current technique).

Keywords: DCT compression, switched-current technique, layout synthesis, layout design automation, image processing, data compression.

Słowa kluczowe: DCT compression, switched-current technique, layout synthesis, layout design automation, image processing, data compression.

doi:10.12915/pe.2014.09.26

Introduction

The trend towards miniaturisation of electronic devices and reduced power consumption drives the search for ever newer solutions [1]. Digital circuits, despite far-reaching automation of the design process and unquestionable accuracy of data processing, do not offer satisfactory power consumption or chip area. Moreover, dedicated analogue solutions can perform calculations up to several hundred times faster than programmable digital circuits. These properties inspire the development of methods for synthesising analogue circuit layouts [2]. Still, the choice of available tools is insufficient. Analogue circuits therefore remain unpopular for two reasons: on the one hand, they are difficult to design, owing to the lack of tools and the need for specialised knowledge; on the other, there are concerns about the accuracy of data processing in an analogue circuit. The authors address both issues.

The article presents a methodology for designing a switched-current (SI) analogue circuit that mirrors the existing digital methodology: it relies on standard CMOS technology, the row strategy [3], standard digital routing and automatic synthesis of the architecture from its description in an HDL (Hardware Description Language). The generated SI structure has been tested in a real-image compression task using the 2D DCT. Owing to its very low power consumption, such a solution can be used, among others, in portable data-acquisition devices that transmit compressed information to a larger collecting system. Section 2 presents a strategy for designing an SI circuit at the layout stage using automation methods analogous to those used for digital circuits. The described approach shows that designing an analogue circuit is no longer as time-consuming, difficult and uncertain as is commonly believed, and that the methodology currently used for digital circuits can be successfully adapted to analogue solutions. Section 3 shows the result of applying the analogue circuit to a real-image compression task. Parameters for assessing the quality of its operation are presented in Table 1. The accuracy of data processing is compared with the bit resolution of a digital equivalent.

Switched-current DCT architecture

This section presents a strategy for designing an analogue circuit that is analogous to the digital one. The approach is based on the switched-current (SI) technique because it can be implemented in standard CMOS technology. The authors point out several advantages of the technique, especially the possibility of parameterising the physical properties of the circuit [4] and of automating the design process. Additionally, SI circuits are characterised by very low power consumption [5]. As an example, the calculation of a 2D Discrete Cosine Transform (DCT) has been analysed. This transform is commonly used in image-processing tasks, including basic data compression standards such as JPEG and MPEG [6]. A typical 2D DCT circuit consists of three basic blocks: two calculating blocks, each realising a one-dimensional transform by performing the signal additions and multiplications required by the cosine transform equation, and a memory block for partial results, as shown in Fig. 1.

![Fig. 1. Conceptual diagram of a circuit for calculating a 2D DCT](image)
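The three-block structure described above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes an orthonormal DCT-II (the article does not state the normalisation), and the function names `dct_matrix` and `dct2_separable` are hypothetical.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Return an n x n DCT-II matrix (orthonormal scaling is an assumption)."""
    k = np.arange(n).reshape(-1, 1)  # output (frequency) index
    i = np.arange(n).reshape(1, -1)  # input (sample) index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC row has a different scale factor
    return c

def dct2_separable(block: np.ndarray) -> np.ndarray:
    """2D DCT of a square block as two 1D passes with a memory in between."""
    c = dct_matrix(block.shape[0])
    rows = block @ c.T      # first calculating block: 1D DCT of every row
    memory = rows.copy()    # memory block: holds the partial results
    return c @ memory       # second calculating block: 1D DCT of every column
```

For a constant 4x4 block only the DC coefficient is non-zero, which is a quick sanity check of the two-pass structure.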

The number of memory cells used depends on the size of the transform. Such an architecture has been presented, among others, in [7]. Signal multiplications are performed by multi-output current mirrors, and additions are realised in circuit nodes according to Kirchhoff's current law. In the implementation presented in this article, the memory is a cell with a balanced structure and a delay element [8]. The authors propose that the layouts of the current mirrors and the memory cell follow the rules of the row strategy [3] used in digital circuit design, i.e. the cells have a common height and varying length, which allows them to be placed in a row. To synthesise the DCT calculating blocks and the memory block, the authors used the proprietary SI-Studio tool [9], which generates the layout of an ASIC from its description in the VHDL-AMS language. This is analogous to the method used for digital circuits, whose architecture is synthesised from a VHDL or Verilog description. The synthesis flow for an analogue 2D DCT processor in the switched-current technique is shown in Fig. 2.
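In terms of currents, a one-dimensional N-point transform built from mirrors and summing nodes can be written as a weighted sum. The expression below is a sketch under assumptions: the symbols are introduced here for illustration, and the orthonormal DCT-II scaling is assumed because the article does not give the normalisation.

$$
I_{\mathrm{out},k} = \sum_{n=0}^{N-1} a_{k,n}\, I_{\mathrm{in},n}, \qquad
a_{k,n} = c_k \cos\!\left(\frac{(2n+1)k\pi}{2N}\right), \qquad
c_0 = \sqrt{\tfrac{1}{N}}, \quad c_k = \sqrt{\tfrac{2}{N}} \ \ (k \ge 1)
$$

Here each weight $a_{k,n}$ would correspond to the gain of one mirror output, and the sum over $n$ to the addition of currents in a node. For $N = 4$ every input current needs four weighted copies, which would be consistent with the use of four-output mirrors.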

Besides the preliminary architecture description (1), the user can provide information about technological rules (2), the height of the standard cells in a row (3), and the power consumption, operating speed and chip area parameters (4) that guide the synthesis process. The SI-Studio environment also allows other parameters to be defined, such as layer names, technological rule parameters and the names of the transistor models in a given technology. This approach makes the design process technology-independent. The layouts of the memory cells and mirrors (5) synthesised with the tool are described in the AMPLE language (6), with the cell height as one of the code parameters. The architecture is implemented from the generated AMPLE description using the IC Station tool from Mentor Graphics. The final stages of implementing the SI architecture are the placement of cells in rows (7) and their routing (8). The authors used a standard digital router available in the Mentor Graphics environment. Fig. 3 presents automatically generated example layouts of DCT circuits in different technologies and for different transform sizes.
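The idea behind row placement, where cells share a height and differ only in length, can be shown with a simple greedy sketch. This is only an illustration of the row strategy: it is not the algorithm used by SI-Studio or by the Mentor Graphics tools, and the `Cell` type, the `place_in_rows` function and the dimensions in the example are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    name: str
    width: float  # length along the row; the height is common to all cells

def place_in_rows(cells, row_width):
    """Greedy left-to-right placement; returns rows of (name, x_offset)."""
    rows, current, x = [], [], 0.0
    for cell in cells:
        if cell.width > row_width:
            raise ValueError(f"cell {cell.name} is wider than a row")
        # Start a new row when the cell no longer fits in the current one.
        if current and x + cell.width > row_width:
            rows.append(current)
            current, x = [], 0.0
        current.append((cell.name, x))
        x += cell.width
    if current:
        rows.append(current)
    return rows
```

With four cells of widths 4, 5, 3 and 6 and a row width of 10, the sketch fills two rows of two cells each. A real placer would also optimise wire length, which is ignored here.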

The remainder of the article presents test results for the DCT4x4 circuit designed in the TSMC 0.18 µm technology (Fig. 3a). It consists of 900 transistors, with 16 four-output current mirrors and a block of 16 memory cells for partial results. The area of the circuit is 0.06 mm².

Data processing

The synthesised layout was tested using the environment described in [11]. The testbench uses three hardware description languages: VHDL, VHDL-AMS and SPICE. It accepts a graphics file in the PPM format as input and presents the results as images together with the PSNR and Accuracy parameters calculated for the given compression level. The calculations were run in Mentor Graphics Questa-ADMS. The testbench can divide the task among any number of calculation units and automatically collects the results into one graphics file. The designed circuit has been tested in a real-image compression task. Fig. 4b presents the original input image. It is compared with output images obtained from post-layout simulations and reconstructed with the ideal inverse cosine transform (IDCT) for different compression levels selected with the 'zig-zag' algorithm [10]: the first uses only the DC coefficient, with all AC coefficients discarded (Fig. 4c); the second uses the DC coefficient and the first AC coefficient, AC1 (Fig. 4d); the third uses the DC coefficient and five AC coefficients, AC1 to AC5 (Fig. 4e).

Fig. 2. The synthesis process of an analogue switched-current architecture

Fig. 3. Results of synthesising example layouts: a) DCT4x4 in 0.18 µm TSMC technology; b) calculating blocks of DCT4x4 in 90 nm TSMC; c) DCT8x8 in 0.18 µm TSMC; d) calculating blocks of DCT8x8 in 0.35 µm AMS. The images have been scaled for this article.

Fig. 4. a) zig-zag sequence; b) original image; compressed: c) with coefficient [DC], d) with coefficients [DC, AC1], e) with coefficients [DC, AC1, AC2, … AC5]
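The compression levels listed above amount to keeping the first few coefficients of each block in zig-zag order and discarding the rest. The sketch below assumes the JPEG-style scan direction, which may differ from the sequence drawn in Fig. 4a; the function names `zigzag_indices` and `keep_first` are hypothetical.

```python
import numpy as np

def zigzag_indices(n):
    """(row, col) pairs of an n x n block in JPEG-style zig-zag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals; odd diagonals run top-right to
        # bottom-left, even diagonals run bottom-left to top-right.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def keep_first(coeffs, count):
    """Zero all but the first `count` coefficients in zig-zag order."""
    out = np.zeros_like(coeffs)
    for r, c in zigzag_indices(coeffs.shape[0])[:count]:
        out[r, c] = coeffs[r, c]
    return out
```

Under this assumption, `count=1` corresponds to [DC], `count=2` to [DC, AC1] and `count=6` to [DC, AC1, AC2, … AC5].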

The PSNR coefficient has been calculated from the image reconstructed with the inverse cosine transform. The mean of the squared differences between the input signals and the reconstructed signals gives the Mean Square Error (MSE). Taking the maximum pixel value as 8 µA, the PSNR for the whole picture was obtained. For the presented 4x4 DCT circuit, the PSNR is 32.9 dB for compression with the DC coefficient only, 33.6 dB for [DC, AC1], 37.2 dB for [DC, AC1, AC2, … AC5] and 45.7 dB without compression.

Next, an accuracy parameter was calculated following the method shown in [12]. First, error values were obtained by subtracting the ideal output values from the simulated output values for the whole picture. The errors were then quantised so that they could be expressed in bits: the output range, from 0 to 8 µA, was divided into 2^7, 2^8 and 2^9 levels. With the errors rounded to the nearest integer, the accuracy corresponding to 7, 8 and 9 bits was obtained as the maximum value over all 2^14 blocks, and then as an average for the whole picture. The results for these three resolutions of the digital equivalent are given in Table 1.

It is worth noting that the values of the PSNR and Accuracy coefficients depend strongly on the input signals. Analysing both benchmarks for a real image therefore gives more meaningful results than the examples often found in the literature, which calculate the coefficients for single data matrices. The PSNR also depends strongly on the level of compression: Fig. 5 shows its dependence on the cut-off point of the AC components in the image. Finally, the presented analogue circuit consumes 4.1 mW, which makes it competitive with commonly used digital solutions, especially since the analogue architecture was synthesised automatically.
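The PSNR calculation and the expression of errors in quantisation steps can be sketched as follows. The 8 µA peak value comes from the text; the function names are hypothetical, and `error_in_lsb` is only one possible reading of the quantisation step, not the method of [12].

```python
import numpy as np

FULL_SCALE = 8e-6  # peak pixel value of 8 uA, as stated in the text

def psnr(reference, reconstructed, peak=FULL_SCALE):
    """PSNR in dB from the mean squared error between two images."""
    reference = np.asarray(reference, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def error_in_lsb(ideal, simulated, bits, full_scale=FULL_SCALE):
    """Absolute errors rounded to whole steps of a `bits`-bit output range."""
    lsb = full_scale / 2 ** bits  # size of one quantisation level
    diff = np.asarray(simulated, dtype=float) - np.asarray(ideal, dtype=float)
    return np.rint(np.abs(diff) / lsb)
```

For example, a uniform error of 0.08 µA on every pixel gives a PSNR of about 40 dB, because the ratio of the peak value to the error is 100.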

Fig. 5. PSNR vs. AC cut-off point: schematic-level and post-layout simulations

Summary

The work presents a method for synthesising discrete-time analogue SI circuits from a description in the VHDL-AMS language. The approach is inspired by the methodology used for designing digital circuits and shows that it can be adapted to analogue solutions. As an example of the synthesis, the authors present the design of a circuit for calculating a discrete cosine transform. The properties of the circuit are examined in post-layout simulations of processing a real image. The calculated quality parameters provide a reliable basis for evaluating the usefulness of the presented approach. They also show that the presented solution is competitive with commonly used digital circuits.

Table 1. Comparison of the parameters of the described circuit with implementations from the literature

<table>
<thead>
<tr>
<th>Design</th>
<th>[13]</th>
<th>[14]</th>
<th>[7]</th>
<th>[12]</th>
<th>Current work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology</td>
<td>3µm</td>
<td>0.35µm</td>
<td>0.8µm</td>
<td>0.18µm</td>
<td>0.18µm</td>
</tr>
<tr>
<td>Transform size</td>
<td>8</td>
<td>8</td>
<td>4</td>
<td>4 and 8</td>
<td>4</td>
</tr>
<tr>
<td>Area [mm^2]</td>
<td>0.4</td>
<td>1.1</td>
<td>1</td>
<td>0.007/0.03</td>
<td>0.06</td>
</tr>
<tr>
<td>Power consumption [mW]</td>
<td>2</td>
<td>5.4</td>
<td>-</td>
<td>0.5/2.5</td>
<td>4.1</td>
</tr>
<tr>
<td>Accuracy [bits]</td>
<td>6-7</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>PSNR [dB]</td>
<td>-</td>
<td>31.4</td>
<td>38.6</td>
<td>47/30</td>
<td>32.9 for [DC]</td>
</tr>
<tr>
<td>Transform speed</td>
<td>&lt; 1</td>
<td>43</td>
<td>&gt; 0.25</td>
<td>&lt; 2.5</td>
<td>&gt; 0.1</td>
</tr>
</tbody>
</table>

REFERENCES


Authors: dr inż. Szymon Szczęsny, E-mail: szymon.szczesny@put.poznan.pl; dr inż. Marek Kropidłowski, prof. dr hab. inż. Andrzej Handkiewicz, E-mail: Andrzej.Handkiewicz@put.poznan.pl; mgr inż. Michał Melosik, E-mail: michal.melosik@put.poznan.pl; dr inż. Paweł Śniatała, E-mail: pawel.sniatala@put.poznan.pl; Politechnika Poznańska, Wydział Informatyki, ul. Piotrowo 3A, 60-965 Poznań.