Bengali/Bangla Phrase Set

A Phrase Set for Bengali/Bangla Text Entry Evaluations Based on Actual Text Messages

Ahmed Sabbir Arif & Sarah Fardeen

January 13, 2016
Last update: February 16, 2016

Abstract — User studies evaluating text entry techniques usually require participants to transcribe phrases. Yet at present, there is no dataset available for Bengali text entry research that includes phrases entered on mobile devices. This forces researchers to collect phrases from various sources, compromising the external validity of the work. In this paper, we present a set of Bengali phrases composed by real users on actual mobile devices. Through an analysis of the dataset, we show that it contains phrases with varying lengths, symbols, and numbers.

For further details, see our paper below.

Ahmed Sabbir Arif and Sarah Fardeen. 2016. A phrase set for Bengali text entry evaluations based on actual text messages. In Proceedings of the 34th Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA '16). ACM, New York, NY, USA, 2992-2998. DOI: http://dx.doi.org/10.1145/2851581.2892394

Downloads

The complete dataset (ZIP)

All phrases, 266 phrases (TXT)
Phrases by length, 266 phrases; Unicode marks/signs are not counted as individual characters (TXT)
Phrases by length, 266 phrases; Unicode marks/signs are counted as individual characters (TXT)
Phrases without numbers, 252 phrases (TXT)
Phrases without punctuation, 122 phrases (TXT)
Phrases without numbers and punctuation, 111 phrases (TXT)
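
The two "by length" files differ only in whether Unicode marks/signs count toward a phrase's length. The sketch below shows one plausible way to compute both lengths. It assumes that "marks/signs" corresponds to the Unicode general categories Mn, Mc, and Me (for example, dependent vowel signs and the hasanta); the exact rule used to build the files may differ. The function names and the sample phrase are illustrative and are not taken from the dataset.

```python
import unicodedata


def length_with_marks(phrase: str) -> int:
    """Count every code point, including vowel signs and other marks."""
    # NFC normalization keeps composed vowel signs as single code points.
    return len(unicodedata.normalize("NFC", phrase))


def length_without_marks(phrase: str) -> int:
    """Count code points, skipping Unicode marks (categories Mn, Mc, Me).

    Assumption: "marks/signs" means general category M*; the dataset's
    own counting rule may be defined differently.
    """
    normalized = unicodedata.normalize("NFC", phrase)
    return sum(
        1 for ch in normalized
        if not unicodedata.category(ch).startswith("M")
    )


# Illustrative phrase (not from the dataset): "ami bhalo achhi".
sample = "\u0986\u09ae\u09bf \u09ad\u09be\u09b2\u09cb \u0986\u099b\u09bf"
print(length_with_marks(sample), length_without_marks(sample))
```

Spaces are counted in both cases here; a study that excludes them would need to filter them separately.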

Letter (consonant) frequency (TXT)
Word frequency, the top 249 (TXT)
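
A word-frequency list like the one above can be recomputed from the phrase file. The following is a minimal sketch, assuming words are separated by whitespace and that punctuation (Unicode categories P*, which include the Bengali danda) is stripped before counting. The function name and the sample phrases are hypothetical, and the tokenization used for the published list may differ.

```python
from collections import Counter
import unicodedata


def word_frequencies(phrases):
    """Return a Counter of whitespace-separated words across all phrases.

    Assumption: punctuation (Unicode categories P*) is removed from each
    token; the published frequency list may tokenize differently.
    """
    counts = Counter()
    for phrase in phrases:
        for token in unicodedata.normalize("NFC", phrase).split():
            word = "".join(
                ch for ch in token
                if not unicodedata.category(ch).startswith("P")
            )
            if word:
                counts[word] += 1
    return counts


# Illustrative phrases only; these are not taken from the dataset.
sample_phrases = ["আমি আসছি।", "তুমি কি আসছ?", "আমি বাড়ি যাচ্ছি"]
for word, count in word_frequencies(sample_phrases).most_common(3):
    print(word, count)
```

To run this on the real data, read the phrase file with `open(path, encoding="utf-8")` and pass its lines to `word_frequencies`.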

External Links

Download Unicode and ANSI Bengali fonts

See our paper on performance metrics for Bengali text entry research.

Sayan Sarcar, Ahmed Sabbir Arif, and Ali Mazalek. 2015. Metrics for Bengali text entry research. In CHI 2015 Workshop on Text Entry on the Edge (April 18, 2015). Seoul, South Korea, 4 pages.

Use WebTEM to record performance metrics for Bengali text entry techniques on any device.

Ahmed Sabbir Arif and Ali Mazalek. 2016. WebTEM: A Web application to record text entry metrics. In Proceedings of the 2016 ACM International Conference on Interactive Surfaces and Spaces (ISS '16). ACM, New York, NY, USA, 415-420. DOI: http://dx.doi.org/10.1145/2992154.2996791