Designing the Neural Model for POS Tag Classification and Prediction of Words from Ancient Stone Inscription Script

Document Type: Primary Research Paper

Authors

1 Junior Research Fellow, Department of Computer Science and Engineering, College of Engineering, Guindy Campus, Anna University, Chennai, Tamil Nadu, India

2 Associate Professor, Department of Computer Science and Engineering, College of Engineering, Guindy Campus, Anna University, Chennai, Tamil Nadu, India

Abstract

Part-of-Speech (POS) tagging labels each word in a corpus with its grammatical category
and is an essential step in text analysis. Because Tamil words are heavily inflected and
compounded, classifying words and predicting their POS tags remains difficult, and
automated tools are scarce compared with those available for English. Although some
tools exist for modern Tamil, comparable statistical methods and techniques are lacking
for Ancient Tamil, such as the text of stone inscriptions, where words are long and
written joined together without being split into morphemes or lemmas. Software package
support is also limited, because the words of Ancient Tamil script differ from those of
modern Tamil. The proposed work addresses the complexity of classifying ancient words
by designing a neural model for POS tag classification and prediction of words from an
11th-century stone inscription script. A Bi-LSTM model with a word-embedding layer is
trained for POS tagging based on patterns generated with regular expressions. The model
classifies words into tags and predicts the tags of words in any new script, which
involves assigning a syntactic tag to each word. The proposed model achieves 88.88%
accuracy when compared against existing work in this area.
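
The regular-expression labeling step mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the suffix patterns, the transliterated example words, the tag names, and the helper names (`tag_word`, `tag_sentence`, `build_vocab`, `encode`) are all hypothetical and chosen only for illustration. The sketch shows how pattern rules could assign tags to unsegmented words and how the tagged words could be turned into integer IDs of the kind an embedding layer in front of a Bi-LSTM would typically consume.

```python
import re

# ASSUMPTION: illustrative suffix rules over romanized words. These are not
# the patterns used in the paper; the first matching rule wins.
TAG_PATTERNS = [
    (re.compile(r"^\d+$"), "NUM"),
    (re.compile(r".+(kiraan|kiraal)$"), "VERB"),
    (re.compile(r".+(kal|ukku)$"), "NOUN"),
]
DEFAULT_TAG = "UNK"  # hypothetical fallback tag for words no rule matches


def tag_word(word):
    """Return the tag of the first pattern that matches the word."""
    for pattern, tag in TAG_PATTERNS:
        if pattern.match(word):
            return tag
    return DEFAULT_TAG


def tag_sentence(words):
    """Pair every word in a sentence with its rule-based tag."""
    return [(word, tag_word(word)) for word in words]


def build_vocab(tagged_sentences):
    """Map words and tags to integer IDs in first-seen order.

    ID 0 is reserved for padding; word ID 1 is reserved for unseen words.
    """
    word_to_id = {"<PAD>": 0, "<UNK>": 1}
    tag_to_id = {"<PAD>": 0}
    for sentence in tagged_sentences:
        for word, tag in sentence:
            word_to_id.setdefault(word, len(word_to_id))
            tag_to_id.setdefault(tag, len(tag_to_id))
    return word_to_id, tag_to_id


def encode(words, word_to_id, max_len):
    """Convert words to IDs, truncating or right-padding to max_len."""
    ids = [word_to_id.get(word, word_to_id["<UNK>"]) for word in words]
    ids = ids[:max_len]
    return ids + [word_to_id["<PAD>"]] * (max_len - len(ids))


if __name__ == "__main__":
    # Hypothetical romanized sentence, used only to exercise the helpers.
    sentence = ["koyilukku", "varukiraan", "12"]
    tagged = tag_sentence(sentence)
    word_to_id, tag_to_id = build_vocab([tagged])
    print(tagged)
    print(encode(sentence, word_to_id, max_len=5))
```

In a setup like this, the padded ID sequences and their tag IDs would serve as the inputs and targets for training the tagger; the rule set itself would need to be derived from the inscription corpus.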

Keywords