International Journal of Computer Techniques Volume 12 Issue 4 | Transforming Handwriting Recognition: Comparative Analysis of CNN-BiLSTM and ViT-Transformer Encoder Architectures

Transforming Handwriting Recognition: Comparative Analysis of CNN-BiLSTM and ViT-Transformer Encoder Architectures

Authors:
Manmohan Jatav, M. Tech Research Scholar (manmohanj8268@gmail.com)
Shivank Soni, Assistant Professor (shivanksoni@gmail.com)
Department of Computer Science & Engineering, Oriental Institute of Science & Technology, Bhopal, MP, India

Journal: International Journal of Computer Techniques – Volume 12 Issue 4
Publication Date: July – August 2025
ISSN: 2394-2231
URL: https://ijctjournal.org/

Abstract

This study compares conventional CNN-BiLSTM handwriting recognition models with a proposed ViT-LM framework that combines a Vision Transformer (ViT) with Transformer encoders. Trained with CTC loss and synthetic data augmentation derived from the IAM dataset, the ViT-LM system outperformed the CNN-BiLSTM baselines, reaching a character error rate (CER) of 2.1% and a word error rate (WER) of 5.4% and setting a new benchmark for offline handwritten text transcription.
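For readers unfamiliar with the two reported metrics, the following minimal Python sketch shows how CER and WER are conventionally computed: the Levenshtein (edit) distance between the reference transcription and the model output, divided by the reference length in characters or words. It illustrates the standard definitions only and is not the authors' evaluation code; the function names `edit_distance`, `cer`, and `wer` are hypothetical, and details such as case handling or punctuation normalization are assumptions that the paper may treat differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits / reference words (whitespace split)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

With these definitions, a single wrong character in the eleven-character reference "hello world" gives a CER of 1/11 (about 9.1%) but a WER of 1/2 (50%), which is why the WER is typically the larger of the two figures, as it is in the results above.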

Keywords

Handwriting Recognition, Vision Transformer (ViT), Transformer Encoder, CNN-BiLSTM, CTC Loss, Deep Learning, Character Error Rate (CER), Word Error Rate (WER)

Conclusion

ViT-LM offers a scalable, context-aware architecture for handwritten text recognition that outperforms the CNN-BiLSTM baselines. It also generalizes better, even when only part of the IAM dataset is available. Future work includes GPU-based optimization, LM fine-tuning, and expanded ablation analysis. ViT-Transformer methods are a promising direction for adaptable and robust handwriting recognition systems.
