RESEARCH ARTICLE

OPEN ACCESS

### **High-Performance Wallace Tree Multiplier**

K.K. Senthil Kumar\*, S. Yuvaraj \*\*, R. Seshasayanan\*\*\*

\*(Department of Electronics and Communication Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Chennai, Tamil Nadu, India

Email: senthilkumar.k.k.ece@psvpec.in)

\*\* (Department of Electronics and Instrumentation Engineering, Meenakshi College of Engineering, Chennai, Tamil Nadu, India

Email: yuvarajjs@gmail.com)

\*\*\* (Department of Electronics and Communication Engineering, Meenakshi College of Engineering, Chennai, Tamil Nadu, India

Email: se\_sha\_sa@yahoo.com)

### Abstract:

The multiplier is a crucial block in most digital applications that perform multiplication. With the advancement of VLSI design, achieving high speed and low power dissipation has become a major concern for circuit designers. Because the multiplier unit consumes a large amount of power and strongly influences the speed of the circuit, optimizing it improves the performance of the whole circuit. Multiplication is implemented in hardware using shift and add operations, so the use of an efficient multiplexer circuit leads to improved multiplier operation. In this paper, a reduced complexity Wallace tree multiplier circuit is proposed that uses an efficient, improved multiplexer based adder. The circuits are verified using the Xilinx ISE 9.2i tool and simulated in Altera ModelSim 6.5b. The proposed Wallace tree structure offers a decrement of approximately 70% in power dissipation, approximately 86% in power delay product and 60% in area. The proposed multiplier is suitable for applications such as DSP structures, FIR filters, ALUs and other low power, high speed multiplication operations.

*Keywords*: Multiplexer, high speed, low power, multiplier, VLSI, Wallace tree.

### I. INTRODUCTION

With the increase in integration scale, more and more advanced and compact signal processing systems need to be realized on a VLSI chip. These processing applications consume a hefty amount of power and require good computation capacity. Along with performance and area, power dissipation has become a major concern in the design of integrated circuits. Two main factors have led to this growth of low power systems. Firstly, increased integration has raised processing capacity, which causes large currents to flow and heats up the chip. Secondly, battery life in portable electronic devices is limited, so prolonged operation of these devices can only be obtained through low power design.

High speed, efficient addition of multiple operands is an essential operation in any computational unit. The speed and power efficiency of multiplier circuits are of critical importance to the overall performance of microprocessors. Multiplier circuits are an essential part of an arithmetic logic unit or of a digital signal processor system performing filtering and convolution. The binary multiplication of integers or fixed-point numbers results in partial products that must be added to produce the final product. The addition of these partial products dominates the latency and power consumption of the multiplier.

### International Journal of Computer Techniques --- Volume 7, Issue 01, February, 2020

It is known that multiplication plays a fundamental role in most signal processing algorithms. The system's performance is generally determined by the performance of the multiplier, because the multiplier is generally the slowest element in the system. Furthermore, it is generally the most area consuming. Hence, optimizing the speed and area of the multiplier is a major design issue. All these multipliers use full adders and hence can be optimized using modified multiplexer based full adders. In this paper, the authors propose a compact Wallace tree multiplier structure to meet the present-day needs of low power and high speed applications. The proposed multiplier uses an improved adder designed using multiplexer logic in a reconfigurable FPGA architecture. The proposed adder offers lower delay and area than the conventional techniques discussed in the literature.

The manuscript is organized as follows: Section II discusses the conventional approaches. Section III presents the proposed Wallace tree structures. The results and discussion are compiled in Section IV. Section V concludes the manuscript.

### **II. WALLACE TREE MULTIPLIER**

The important design considerations for any chip designer are power consumption, delay and area. The speed of a circuit depends on the delay of its multiplier; therefore, a lot of research has been done to increase multiplier speed so that the delay of the overall circuit can be reduced. The Wallace tree is a high speed and area efficient multiplier and is therefore of great importance in high speed applications [1]. It is an easy and efficient hardware method that multiplies integers using the column compression technique. The Wallace tree is fast because, instead of the linear dependency of an array multiplier, its total delay is proportional to the logarithm of the operand word length.

The operation of the Wallace tree multiplier involves three steps: formation of partial products, grouping of these partial products, and addition using adders. A lot of research has been done to improve the performance of the Wallace tree [2-8]. In [2], the authors propose parallel prefix adders instead of conventional half and full adders in the Wallace tree multiplier, which reduces delay, but the area and power dissipation constraints are not examined. In [3], Booth encoding with a compressor approach is used to reduce area and latency, as shown in Fig.1.
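The first two of the three steps can be illustrated with a minimal Python sketch. It is an illustrative software model only, not the hardware description used in this paper; the function names `partial_products` and `column_value` are hypothetical. Each partial product bit is the AND of one multiplicand bit and one multiplier bit, and bits of equal weight are grouped into the same column.

```python
def partial_products(x: int, y: int, n: int) -> list[list[int]]:
    """Step 1 and 2: form the bits x_i AND y_j and group them by weight i + j."""
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            bit = ((x >> i) & 1) & ((y >> j) & 1)
            columns[i + j].append(bit)
    return columns


def column_value(columns: list[list[int]]) -> int:
    """Weighted sum of all column bits; this must equal the product x * y."""
    return sum(sum(col) << k for k, col in enumerate(columns))


cols = partial_products(13, 11, 4)
print([len(c) for c in cols])  # column heights: [1, 2, 3, 4, 3, 2, 1, 0]
print(column_value(cols))      # 143
```

The column heights show why the third step matters: the middle columns hold the most bits, and the adders of the tree exist to compress those columns down to two rows before the final addition.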



Fig.1. Booth encoding with compressor 4:2

Further, in [4], XOR-XNOR based 3:2, 4:2 and 5:2 compressors are used in place of half and full adders in the second stage of the Wallace tree algorithm, leading to an increase in speed. Although [3] and [4] improve the speed of the multiplier, the area is not reduced considerably, and the use of 4:2 and 5:2 compressors increases the complexity, resulting in complex routing, as shown in Fig.2.



Fig.2. Booth encoding with compressor 3:2

Improvement is also achieved by estimating power with a probabilistic gate level power estimator in each stage [5], or by rearranging the partial products so that switching activity is reduced. This offers a significant power reduction, but area and speed remain unaltered. Besides this, the improvement depends on the transition activity of the inputs. In [6], a full adder using a 4:1 multiplexer is used in the reduction phase, leading to power reduction. In [7], a full adder using 2:1 multiplexers is used, also reducing power. This technique of implementing the full adder has led to power reduction, but the critical path delay is more than that of [8]. Of all the previous literature studied, [8] offers the best performance in terms of area, power and speed. Fig.3 shows the conventional multiplexer based adder used in [8].



Fig.3. Conventional adder based on multiplexer

### **III. PROPOSED WORK**

As discussed in the previous section, the Wallace Tree multiplier (WTM) offers the best speed compared with other multiplier circuits, and hence various techniques to improve its structure have been proposed [2-8]. In this paper, a modified Wallace tree structure is proposed that offers better performance than the existing approaches. The proposed structure is designed using the reduced complexity algorithm [9-10] and a modified adder subcircuit to process the intermediate addition of bits. Fig.4 shows the proposed adder, designed using 2:1 multiplexers. The Fig.3 displays the 2:1 pass transistor based mux. The use of pass transistor logic leads to a considerable decrease in the number of transistors and hence in area. It also has the advantage of the least static leakage. Because there are very few Vdd to ground connections during switching, the short circuit power is also the least. The modified expressions for the sum and carry of the full adder circuit are given in Equation (1) and Equation (2):

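$$
\begin{aligned}
S &= A \oplus B \oplus C = (A \oplus B) \oplus C \\
  &= (\overline{A}\,\overline{B} + AB)\,C + (\overline{A}B + A\overline{B})\,\overline{C} \\
  &= \overline{A}\,\overline{B}C + ABC + \overline{A}B\overline{C} + A\overline{B}\,\overline{C} \\
  &= A(\overline{B}\,\overline{C} + BC) + \overline{A}(\overline{B}C + B\overline{C}) \\
  &= \overline{(B \oplus C)}\,A + (B \oplus C)\,\overline{A}
\end{aligned}
\tag{1}
$$

$$
\begin{aligned}
Carry &= AB + BC + CA = C(B + A) + AB \\
      &= BC + AC(B + \overline{B}) + AB(C + \overline{C}) \\
      &= BC + AB\overline{C} + A\overline{B}C \\
      &= B\,\overline{(B \oplus C)} + A\,(B \oplus C)
\end{aligned}
\tag{2}
$$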

The working of the circuit can be explained as follows and is verified from the truth table in Table.1. If $B = C$ (both 0 or both 1), then $Sum = A$ and $Carry = B$. If $B \ne C$ (that is, $\overline{B} = C$), then $Sum = \overline{A}$ and $Carry = A$, in agreement with Equation (2).
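The behaviour of the adder can be checked exhaustively with a short Python sketch. This is a behavioural model only, under the assumption that the select line of both 2:1 multiplexers is B XOR C, as the final forms of Equation (1) and Equation (2) suggest; the helper names `mux2` and `mux_full_adder` are hypothetical, and the transistor level (pass transistor) implementation is not modelled.

```python
from itertools import product


def mux2(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: returns d0 when sel == 0 and d1 when sel == 1."""
    return d1 if sel else d0


def mux_full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Full adder built from two 2:1 multiplexers sharing the select B XOR C."""
    sel = b ^ c
    s = mux2(sel, a, a ^ 1)   # Sum   = A when B == C, NOT A otherwise
    carry = mux2(sel, b, a)   # Carry = B when B == C, A otherwise
    return s, carry


# Compare against ordinary binary addition for all eight input combinations.
for a, b, c in product((0, 1), repeat=3):
    s, carry = mux_full_adder(a, b, c)
    assert 2 * carry + s == a + b + c
    print(a, b, c, s, carry)
```

Because the same select signal drives both multiplexers, the sum and carry outputs share the B XOR C logic, which is consistent with the reduction in transistor count described above.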

In the proposed Wallace tree structure, this modified adder is used along with the reduced complexity algorithm for the Wallace tree multiplier [9-10]. Unlike the conventional Wallace multiplier, in which both full adders and half adders are used to process three and two bits respectively, it uses only full adders, except where a half adder is needed to keep the number of stages equal to that of the conventional Wallace algorithm. The performance of the multiplier is not affected by eliminating half adders, as they do not compress the number of partial product bits: two bits added give two bits at the output (Sum and Carry), as shown in Fig.4.

Table.1. Truth Table of Proposed Adder

| A | B | C | Sum | Carry |
|---|---|---|-----|-------|
| 0 | 0 | 0 | 0   | 0     |
| 0 | 0 | 1 | 1   | 0     |
| 0 | 1 | 0 | 1   | 0     |
| 0 | 1 | 1 | 0   | 1     |
| 1 | 0 | 0 | 1   | 0     |
| 1 | 0 | 1 | 0   | 1     |
| 1 | 1 | 0 | 0   | 1     |
| 1 | 1 | 1 | 1   | 1     |




Fig.4. Proposed Adder in Wallace Tree Multiplier

Fig.5 shows 4X4 bit multiplication using the reduced complexity algorithm. It can be seen that only S3 and C2 are processed as two bits, so that the number of stages does not exceed that of the conventional approach. The intermediate additions are performed using the proposed adder. The proposed 8X8 Wallace tree multiplier structure is shown in Fig.6.

Fig.5. 4X4 bit multiplication using reduced complexity algorithm. (Reduction diagram: the partial products X0Y0 to X3Y3 are reduced through intermediate sum (S) and carry (C) bits, with a half adder (H.A) on S3 and C2, to give the product bits P7 to P0.)

Here N = 4 = r0. A half adder (H.A) is used so that r2 contains only two rows, and therefore the number of stages does not increase.
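The row counts r0, r1, r2 can be reproduced with a small Python sketch. It assumes the row-count recurrence of the reduced complexity scheme in [9], in which every group of three rows is compressed to two and leftover rows are passed on unchanged; the function name `row_sequence` is hypothetical.

```python
def row_sequence(n: int) -> list[int]:
    """Rows remaining after each reduction stage, starting from r0 = n.

    Assumed recurrence: r_next = 2 * (r // 3) + (r % 3), i.e. each full group
    of three rows becomes two rows and the remaining rows are passed through.
    """
    rows = [n]
    while rows[-1] > 2:
        r = rows[-1]
        rows.append(2 * (r // 3) + r % 3)
    return rows


print(row_sequence(4))  # [4, 3, 2]
print(row_sequence(8))  # [8, 6, 4, 3, 2]
```

Under this assumption a 4X4 multiplier needs two reduction stages and an 8X8 multiplier needs four, and a half adder is only required where a column would otherwise exceed the row target of its stage.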



Fig.5. Proposed 4X4 bit Wallace Tree Multiplier algorithm.



Fig.6. Proposed 8X8 bit Wallace Tree Multiplier algorithm.

Fig.6 shows 8X8 bit multiplication using the reduced complexity algorithm. Only $X_0Y_3$, $X_7Y_2$, $X_7Y_5$ and $X_7Y_7$ are processed as two bits, which is important so that the number of stages does not exceed that of the conventional WTM. The intermediate additions are performed using the proposed adder, in contrast to the conventional Parallel Adder type and Counter based Wallace tree multipliers. Fig.7 shows the RTL view of the proposed 8X8 bit Wallace tree multiplier, and Fig.8 shows its inner RTL view.
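The reduction described above can be modelled in software with the following Python sketch. It is a simplified, column-wise model and not the authors' RTL: the rule for placing half adders (use one only when a column with two leftover bits would otherwise exceed the row target of the stage) and the row-target recurrence taken from [9] are assumptions, and all function names are hypothetical. The final carry-propagate addition of the last two rows is modelled by ordinary integer addition.

```python
def full_adder(a, b, c):
    """3:2 compression: three bits of equal weight become a sum and a carry."""
    total = a + b + c
    return total & 1, total >> 1


def half_adder(a, b):
    """Two bits in, two bits out (sum and carry); no compression."""
    total = a + b
    return total & 1, total >> 1


def next_rows(r):
    """Assumed row target for the next stage (recurrence from [9])."""
    return 2 * (r // 3) + r % 3


def wallace_multiply(x, y, n):
    """Multiply two n-bit unsigned integers.

    Returns (product, stages, full_adders, half_adders).
    """
    width = 2 * n
    cols = [[] for _ in range(width)]
    for i in range(n):                        # partial product formation
        for j in range(n):
            cols[i + j].append((x >> i) & (y >> j) & 1)
    rows, stages, fa_count, ha_count = n, 0, 0, 0
    while max(len(c) for c in cols) > 2:      # reduce until two rows remain
        rows = next_rows(rows)
        stages += 1
        new_cols = [[] for _ in range(width)]
        for k in range(width):                # least significant column first
            bits, carries = cols[k], []
            while len(bits) >= 3:             # full adders on groups of three
                s, c = full_adder(bits.pop(), bits.pop(), bits.pop())
                new_cols[k].append(s)
                carries.append(c)
                fa_count += 1
            # Half adder only if passing two bits through would exceed the target.
            if len(bits) == 2 and len(new_cols[k]) + 2 > rows:
                s, c = half_adder(bits.pop(), bits.pop())
                new_cols[k].append(s)
                carries.append(c)
                ha_count += 1
            new_cols[k].extend(bits)          # leftover bits pass through
            if k + 1 < width:                 # a carry out of the top column is
                new_cols[k + 1].extend(carries)  # always 0 (product < 2**(2n))
        cols = new_cols
    product = sum(sum(col) << k for k, col in enumerate(cols))
    return product, stages, fa_count, ha_count


print(wallace_multiply(13, 11, 4))
print(wallace_multiply(200, 123, 8))
```

The adder counts returned by this model depend on the assumed placement rule and need not match the gate counts of the synthesized design; the sketch is intended only to show how full adders compress columns while half adders are used sparingly.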



Fig.7. Proposed 8X8 bit RTL Wallace Tree Multiplier algorithm.



Fig.8. Proposed 8X8 bit inner RTL Wallace Tree Multiplier algorithm.

### **IV. RESULTS AND DISCUSSIONS**

The proposed Wallace Tree Multiplier is implemented on the reconfigurable FPGA target XC3S500E-5-FT256. Synthesis and simulation of all the circuits were performed in Xilinx 9.2i and ModelSim-Altera 6.5b, respectively. The proposed Wallace Tree Multiplier is compared with the conventional Wallace Tree Multiplier versions of [8] in Table.2. Fig.9 (adder type WTM), Fig.10 (parallel adder type WTM) and Fig.11 (counter based WTM) show the results for the conventional and proposed adders. From the output waveforms it is observed that the circuits perform the correct functionality. For the sake of comparison with the conventional circuits of [8], the same inputs were given to both the full adders and the counter.



Fig.9. Proposed 8X8 bit Full adder based WTM output waveform.



Fig.10. Proposed 8X8 bit Parallel adder based WTM output waveform.




Fig.11. Proposed 8X8 bit counter based WTM output waveform.

Table.2. Comparative Result of WTM

| Synthesis Summary (Device: XC3S500E-5-FT256) | Proposed 8X8 WTM | PA based 8X8 WTM | Counter based 8X8 WTM | Total BELs |
|----------------------------------------------|------------------|------------------|-----------------------|------------|
| Number of Slices                             | 45               | 65 (1%)          | 74 (1%)               | 4656       |
| Number of 4 input LUTs                       | 82               | 115 (1%)         | 129 (1%)              | 9312       |
| Number of bonded IOBs                        | 32 (16%)         | 32 (16%)         | 32 (16%)              | 190        |
| Number of Ins                                | 16               | 16               | 16                    |            |
| Number of IOs                                | 32               | 32               | 32                    |            |
| Maximum combinational path delay (ns)        | 22.630           | 26.324           | 26.948                |            |

The proposed 8X8 Wallace Tree Multiplier uses only 45 slices, fewer than the conventional Parallel Adder type (65) and counter based (74) Wallace Tree Multipliers. The number of look-up tables (LUTs) for the proposed WTM is 82, also fewer than for the conventional WTMs (115 and 129). The maximum combinational path delay of the proposed 8X8 WTM is 22.630 ns, compared with 26.324 ns and 26.948 ns for the conventional WTMs.
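The relative figures implied by Table.2 can be recomputed with a few lines of Python. The sketch uses only the values listed in the table; the helper name `ratio_and_reduction` is hypothetical. For each metric it reports the proposed value as a percentage of the conventional value, together with the corresponding percentage reduction.

```python
# Values copied from Table.2: (proposed, PA based, counter based).
TABLE2 = {
    "Number of Slices": (45, 65, 74),
    "Number of 4 input LUTs": (82, 115, 129),
    "Maximum combinational path delay (ns)": (22.630, 26.324, 26.948),
}


def ratio_and_reduction(proposed: float, conventional: float) -> tuple[float, float]:
    """Return (proposed as % of conventional, % reduction relative to conventional)."""
    ratio = 100.0 * proposed / conventional
    return ratio, 100.0 - ratio


for metric, (proposed, pa_based, counter_based) in TABLE2.items():
    for label, conventional in (("PA based", pa_based), ("Counter based", counter_based)):
        ratio, reduction = ratio_and_reduction(proposed, conventional)
        print(f"{metric} vs {label}: {ratio:.2f}% of conventional, "
              f"{reduction:.2f}% reduction")
```

For example, 45 slices against 65 gives about 69.23% of the conventional slice count, which corresponds to a reduction of about 30.77%.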



Fig.12. Number of Slices between the Proposed and Conventional 8X8 bit WTM.

Fig.12 compares the number of slices of the proposed Wallace Tree Multiplier with that of the conventional 8X8 bit Wallace Tree Multipliers.



Fig.13. Number of 4 input LUTs between the Proposed and Conventional 8X8 bit WTM.

Fig.13 compares the number of 4 input LUTs of the proposed Wallace Tree Multiplier with that of the conventional 8X8 bit Wallace Tree Multipliers.



Fig.14. Maximum Combinational path delay (ns) between the Proposed and Conventional 8X8 bit WTM.

Fig.14 compares the maximum combinational path delay (ns) of the proposed Wallace Tree Multiplier with that of the conventional 8X8 bit Wallace Tree Multipliers.

### **V. CONCLUSION**

In this paper, the Wallace tree multiplier has been investigated, and modified Wallace tree multiplier circuits were proposed, synthesized and simulated using Xilinx 9.2i and ModelSim-Altera 6.5b, respectively. Initially, we simulated the existing adder type and the Wallace Tree multiplier proposed in [8]. Then a new, improved adder type that uses 2:1 multiplexers was proposed. Its number of slices, number of LUTs and maximum combinational path delay (ns) are lower than those of the existing adder type; hence the area is minimized and the delay is also reduced.

According to Table.2, the proposed Wallace structure uses 69.23% of the slices of the parallel adder type Wallace structure (45 against 65, a reduction of 30.77%) and 60.81% of the slices of the counter based Wallace Tree Multiplier (45 against 74, a reduction of 39.19%). Its number of LUTs is 71.30% and 63.57% of that of the two conventional Wallace Tree multipliers, respectively (reductions of 28.70% and 36.43%). Its maximum combinational path delay is 85.97% and 83.98% of that of the two conventional Wallace Tree multipliers, respectively (reductions of 14.03% and 16.02%). The number of slices, the number of LUTs and the maximum combinational path delay of the proposed Wallace multiplier are therefore all reduced, while the number of IOBs is unchanged at 32. With the increasing demand for greater speed with less battery usage and minimum area, the proposed structures will prove beneficial in applications that perform mathematical operations using multipliers. Some of the applications in which they can be widely used are ALUs and DSP structures.

### REFERENCES

[1] C.S. Wallace, "A Suggestion for A Fast Multiplier", *IEEE Transaction on Electronic Computers*, Vol. 13, No. 1, pp. 14-17, 1964.

[2] S. Rajaram and K. Vanithamani, "Improvement of Wallace Multipliers using Parallel Prefix Adders", *Proceedings of IEEE International Conference on Signal Processing, Communication, Computing and Networking Technologies*, pp. 781-784, 2011.

[3] M.J. Rao and S. Dubey, "A High Speed and Area Efficient Booth Recoded Wallace Tree Multiplier for Fast Arithmetic Circuits", *Proceedings of Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics*, pp. 220-223, 2012.

[4] S. Karthick, S. Karthika and S. Valannathy, "Design and Analysis of Low Power Compressors", *International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering*, Vol. 1, No. 6, pp. 487-493, 2012.

[5] Saeeid Tahmasbi Oskuii, Per Gunnar Kjeldsberg and Oscar Gustafsson, "Power Optimized Partial Product Reduction Interconnect Ordering in Parallel Multipliers", *Proceedings of Nordic Circuits and Systems Conference*, pp. 1-6, 2007.

[6] S. Murugeswari and S.K. Mohideen, "Design of Area Efficient and Low Power Multipliers using Multiplexer based Full Adder", *Proceedings of 2nd International Conference on Current Trends in Engineering and Technology*, pp. 388-392, 2014.

[7] Yingtao Jiang, Abdulkarim Al-Sheraidah, Yuke Wang, Edwin Sha and Jin-Gyun Chung, "A Novel Multiplexer-based Low-Power Full Adder", *IEEE Transactions on Circuits and Systems*, Vol. 51, No. 7, pp. 345-348, 2004.

[8] Kokila Bharti Jaiswal, Nitish Kumar, Pavithra Seshadri and G. Laxminarayan, "Low Power Wallace Tree Multiplier using Modified Full Adder", *Proceedings of 3rd International Conference on Signal Processing, Communication and Networking*, pp. 1-4, 2015.

[9] R.S. Waters and E.E. Swartzlander, "A Reduced Complexity Wallace Multiplier Reduction", *IEEE Transactions on Computers*, Vol. 59, No. 8, pp. 1134-1137, 2010.

[10] Sandeep Kakde, Shahebaj Khan, Pravin Dakhole and Shailendra Badwaik, "Design of Area and Power Aware Reduced Complexity Wallace Tree Multiplier", *Proceedings of International Conference on Pervasive Computing*, pp. 1-6, 2015.

[11] N. Sureka; R. Porselvi; K. Kumuthapriya, "An efficient high speed Wallace tree multiplier", International IEEE Conference on Information Communication and Embedded Systems (ICICES), INSPEC Accession Number: 13485232, 2013.

[12] Shahzad Asif and Yinan Kong, "Low-Area Wallace Multiplier", Journal of Hindawi VLSI Design, Volume 2014, Article ID 343960, 6 pages, 2014.

[13] Ramanathan P., Kowsalya P. & Anitha P., "Modified low power Wallace Tree multiplier using higher order compressors", International Journal of Electronics Letters, Vol. 5, no. 2, pp. 177-188, 2017.

[14] Sa'ed Abed, Yasser Khalil, Mahdi Modhaffar and Imtiaz Ahmad, "High-performance low-power approximate Wallace tree multiplier", *International Journal of Circuit Theory and Applications*, Wiley Online Library, 2018.
