
Bridging the Gap: Leveraging Transfer Learning for Low-Resource NLP Tasks

International Journal of Computer Techniques – Volume 10 Issue 5, Oct 2023 | ISSN 2394-2231

Praveen Kumar Myakala, Prudhvi Naayini
University of Colorado Boulder, Boulder, CO 80309, USA
Email: Praveen.Myakala@colorado.edu
Independent Researcher, Texas, USA
Email: Naayini.Prudhvi@gmail.com

Abstract

Natural Language Processing (NLP) has witnessed transformative progress with the advent of Large Language Models (LLMs) such as BERT, GPT-3, and T5. However, these gains accrue predominantly to high-resource settings with abundant datasets and computational infrastructure, creating an accessibility gap for low-resource languages such as Telugu and for domain-specific tasks where such resources are scarce. Transfer learning offers a viable pathway to bridge this gap, using techniques such as multilingual pretraining, parameter-efficient fine-tuning (e.g., Adapters and LoRA), and few-shot learning to optimize performance in constrained environments. This paper examines state-of-the-art methodologies for applying transfer learning in low-resource NLP scenarios, highlighting the challenges posed by linguistic diversity, data scarcity, and computational constraints. The study proposes novel strategies for addressing these challenges, including synthetic data generation, lightweight architectures, and bias-aware training frameworks. Special emphasis is placed on democratizing NLP for underrepresented languages such as Telugu, ensuring ethical and equitable development across linguistic and domain boundaries.
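To make the parameter-efficient fine-tuning techniques named above concrete, the sketch below illustrates one way LoRA might be applied to a multilingual encoder for a small Telugu text-classification task. This is an illustrative example rather than code from the paper: it assumes the Hugging Face transformers and peft libraries, the xlm-roberta-base checkpoint, and a hypothetical three-class labeled dataset.

```python
# Minimal LoRA fine-tuning sketch (illustrative only, not from the paper).
# Assumes: pip install transformers peft, plus a small labeled Telugu dataset.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base = "xlm-roberta-base"  # multilingual encoder whose pretraining covers Telugu
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)

# LoRA: freeze the pretrained weights and learn small low-rank updates
# to the attention projection matrices instead of all model parameters.
lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the low-rank update matrices
    lora_alpha=16,                      # scaling factor for the updates
    lora_dropout=0.1,
    target_modules=["query", "value"],  # attention projections in XLM-R layers
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()      # typically well under 1% of all weights

# The wrapped model can now be trained with a standard Trainer or PyTorch loop
# on the small Telugu dataset; only the LoRA adapter weights need to be saved.
```

Because only the low-rank adapter matrices are trained, the artifact that must be stored and shared is a few megabytes rather than the full multi-gigabyte model, which is what makes this style of adaptation attractive in the constrained settings discussed above.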

Keywords

Transfer Learning · Low-Resource Natural Language Processing · Multilingual Models · Few-Shot Learning · Fine-Tuning Techniques

References

  1. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
  2. Tom B. Brown, Benjamin Mann, Nick Ryder, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 2020.
  3. Colin Raffel et al. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 2020.
  4. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.
  5. Pratik Joshi, Sebastin Santy, et al. The state and fate of linguistic diversity and inclusion in the NLP world. arXiv preprint arXiv:2004.09095, 2020.
  6. Robert Östling and Jörg Tiedemann. Under-resourced languages: A review of the computational literature. Computational Linguistics, 2017.
  7. Ilias Chalkidis et al. Legal-BERT: The muppets straight out of law school. arXiv preprint arXiv:2010.02559, 2020.
  8. Jinhyuk Lee et al. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240, 2020.
  9. Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, and Thomas Wolf. Transfer learning in natural language processing. Proceedings of the 2019 Annual Meeting of the Association for Computational Linguistics (Tutorial Abstracts), pages 15–18, 2019.
  10. Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359, 2010.
  11. Telmo Pires et al. How multilingual is multilingual BERT? arXiv preprint arXiv:1906.01502, 2019.
  12. Mikel Artetxe and Holger Schwenk. Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond. Transactions of the Association for Computational Linguistics, 7:597–610, 2019.
  13. Tianyu Gao, Adam Fisch, and Danqi Chen. Making pre-trained language models better few-shot learners. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), pages 3816–3830, 2021.
  14. Edward Hu, Yelong Shen, Phillip Wallis, et al. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685, 2021.
  15. Alexis Conneau et al. Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116, 2020.
  16. Emily M. Bender, Timnit Gebru, et al. On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 610–623, 2021.
  17. Su Lin Blodgett et al. Language (technology) is power: A critical survey of “bias” in NLP. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5454–5476, 2020.
  18. Neil Houlsby et al. Parameter-efficient transfer learning for NLP. arXiv preprint arXiv:1902.00751, 2019.
  19. Jonas Pfeiffer et al. AdapterHub: A framework for adapting transformers. arXiv preprint arXiv:2007.07779, 2020.
  20. Timo Schick and Hinrich Schütze. Exploiting cloze questions for few-shot text classification and natural language inference. arXiv preprint arXiv:2001.07676, 2021.
  21. Yashwanth Reddy Regatte, Rama Rohit Reddy Gangula, and Radhika Mamidi. Dataset creation and evaluation of aspect-based sentiment analysis in Telugu, a low-resource language. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5017–5024, 2020.
  22. Victor Sanh et al. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.
  23. Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
  24. Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, and Luke Zettlemoyer. Multilingual denoising pre-training for neural machine translation. Transactions of the Association for Computational Linguistics, 8:726–742, 2020.
  25. Alexis Conneau and Guillaume Lample. Cross-lingual language model pretraining. Advances in Neural Information Processing Systems, 2019.
  26. Francisco Guzman et al. The FLORES evaluation benchmarks for low-resource and multilingual machine translation. arXiv preprint arXiv:1902.01382, 2019.
  27. Rico Sennrich, Barry Haddow, and Alexandra Birch. Improving neural machine translation models with monolingual data. arXiv preprint arXiv:1511.06709, 2016.
  28. Junjie Hu et al. XTREME: A massively multilingual multi-task benchmark for evaluating cross-lingual generalization. arXiv preprint arXiv:2003.11080, 2020.
  29. Wilhelmina Nekoto et al. Participatory research for low-resource machine translation: A case study in African languages. Findings of EMNLP, 2020.
  30. Ren Li et al. Data augmentation approaches in natural language processing: A survey. AI Open, 2:168–180, 2021.
  31. Xinyi Wang, Hieu Pham, Zihang Dai, and Graham Neubig. SwitchOut: An efficient data augmentation algorithm for neural machine translation. arXiv preprint arXiv:1808.07512, 2018.
  32. David I. Adelani et al. A thousand languages and still not enough. arXiv preprint arXiv:2112.03497, 2022.
  33. Zhijing Zhang et al. Fairness-aware learning in natural language processing: A survey. arXiv preprint arXiv:2202.06539, 2022.
  34. Flavien Prost et al. Debiasing embeddings for reduced gender bias in text classification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 6410–6417, 2019.
  35. Emma Strubell et al. Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pages 3645–3650, 2019.

Online Resources:

  • arXiv: Open-access archive of research preprints across numerous fields, including NLP; most of the preprints cited above are hosted there.
  • GitHub: Code-hosting platform widely used to share NLP models, datasets, and training scripts for collaborative development.
  • TensorFlow: Open-source machine learning framework for building, training, and deploying NLP models.

About the Authors

Praveen Kumar Myakala
University of Colorado Boulder, Boulder, CO 80309, USA
Email: Praveen.Myakala@colorado.edu

Prudhvi Naayini
Independent Researcher, Texas, USA
Email: Naayini.Prudhvi@gmail.com
