Manifestations of idiolect in the lexis of electronic mail

Gintarė Žalkauskaitė


This study aims to identify features of idiolect in electronic mail and to describe the word forms that account for differences among individual writers of electronic letters. The data come from a corpus of 65,000 words consisting of electronic letters written in Lithuanian by six subjects between 2008 and 2010. The WordSmith Tools software was used to generate frequency lists for six subcorpora, one per subject. The analysis of the frequency data revealed certain peculiarities of the subcorpora. For example, the shortened form da (30 occurrences) of the adverb dar (‘still, yet, even more’) was found in only one subcorpus. A list of 139 word forms significantly overused in a particular subcorpus was compiled. The words attributed to idiolect belong to different word classes, which shows that features of idiolect cannot be limited to one word class. The analysis shows that most of these words express modality and stance: for example, some authors more often use the modal verb galėti (‘can’), while others prefer reikėti (‘need, have to’). Other words are chosen from among possible synonyms and variants (e.g. shortened vs. full word forms). There are also groups of nonstandard word forms used by some authors, and time reference words (e.g. one of the authors uses more concrete time references than the others). The study has thus established some lexical evidence of idiolect features in electronic mail. The results might be applied to authorship attribution in forensic linguistics.
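The comparison of per-author frequency lists can be illustrated with a minimal Python sketch. It does not reproduce the WordSmith Tools procedure, and the abstract does not name the statistic used to decide that a word form is "significantly overused". The log-likelihood (G2) measure, the 3.84 cut-off (the chi-square critical value for p < 0.05 at one degree of freedom), the function names and the toy texts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def frequency_list(text):
    """Count lower-cased word forms (no lemmatization)."""
    return Counter(re.findall(r"\w+", text.lower()))


def log_likelihood(a, b, c, d):
    """G2 for a form seen a times in c target tokens and b times in d reference tokens."""
    e1 = c * (a + b) / (c + d)  # expected count in the target subcorpus
    e2 = d * (a + b) / (c + d)  # expected count in the reference
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def overused_forms(subcorpora, author, threshold=3.84):
    """Word forms relatively more frequent for `author` than for all other authors combined."""
    target = subcorpora[author]
    reference = Counter()
    for name, freq in subcorpora.items():
        if name != author:
            reference.update(freq)
    c, d = sum(target.values()), sum(reference.values())
    if c == 0 or d == 0:
        return []
    hits = []
    for word, a in target.items():
        b = reference[word]
        g2 = log_likelihood(a, b, c, d)
        # Keep only overuse (higher relative frequency), not underuse.
        if a / c > b / d and g2 >= threshold:
            hits.append((word, a, b, round(g2, 2)))
    return sorted(hits, key=lambda item: -item[3])


# Invented toy texts, NOT data from the study: author A writes "da", B and C write "dar".
toy_texts = {
    "A": "da " * 30 + "ir " * 970,
    "B": "dar " * 30 + "ir " * 970,
    "C": "dar " * 30 + "ir " * 970,
}
subcorpora = {name: frequency_list(text) for name, text in toy_texts.items()}
overused_a = overused_forms(subcorpora, "A")
print(overused_a)
```

In this toy setup only the shortened form da is flagged for author A, because the conjunction ir occurs at the same relative frequency in every subcorpus. With real letters, the threshold and a minimum-frequency filter would need to be chosen with care, since low-frequency forms produce unstable scores.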

