Introduction¶
This package contains a variety of Python modules for Myanmar text processing, such as syllabification, romanization, encoding conversion, and NRC validation. Only Python 3 is currently supported.
Installation¶
The package is distributed on PyPI and can be installed with pip:
pip install python-myanmar
Stable release¶
To install Python Myanmar, run this command in your terminal:
$ pip install python-myanmar
This is the preferred method to install Python Myanmar, as it will always install the most recent stable release.
If you don’t have pip installed, the Python installation guide can walk you through the process.
From sources¶
The sources for Python Myanmar can be downloaded from the GitHub repo.
You can either clone the public repository:
$ git clone https://github.com/trhura/python-myanmar.git
Or download the tarball:
$ curl -OL https://github.com/trhura/python-myanmar/tarball/master
Once you have a copy of the source, you can install it with:
$ python setup.py install
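After either installation route, a quick check that the package can be imported might look like the sketch below. The helper name `is_installed` is hypothetical; the import name `myanmar` is taken from the module paths used elsewhere in this documentation.

```python
import importlib.util


def is_installed(module_name: str = "myanmar") -> bool:
    """Return True if a top-level module with this name can be found.

    Hypothetical helper for illustration; it only locates the module
    and does not import it.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("myanmar importable:", is_installed())
```

If this prints `False`, the package was most likely installed into a different Python environment than the one running the check.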
Syllabification¶
Morphological and phonemic syllable breaking for Burmese text. Syllable breaking of Zawgyi-encoded text is not reliable, so convert such text to Unicode before processing.
myanmar.language.MorphoSyllableBreak(text, encoding)
Return an iterable of morphological / visual syllables in text.
>>> from myanmar.encodings import UnicodeEncoding
>>> slb = list(MorphoSyllableBreak("အကြွေးပေး", UnicodeEncoding()))
>>> list(s['syllable'] for s in slb)
['အ', 'ကြွေး', 'ပေး']
>>> slb[2]
{'syllable': 'ပေး', 'consonant': 'ပ', 'eVowel': 'ေ', 'visarga': 'း'}
myanmar.language.PhonemicSyllableBreak(text, encoding)
Return an iterable of phonemic syllables in text.
>>> from myanmar.encodings import UnicodeEncoding
>>> slb = list(PhonemicSyllableBreak("သီးပင်အိုင်", UnicodeEncoding()))
>>> list(s['syllable'] for s in slb)
['သီး', 'ပင်', 'အိုင်']
>>> slb[0]
{'syllable': 'သီး', 'consonant': 'သ', 'iVowel': 'ီ', 'visarga': 'း'}
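To give a feel for what syllable breaking of Unicode text involves, here is a minimal regex-based sketch using only the standard library. It is not this package's implementation: it follows the commonly used rule that a syllable starts at a consonant that is neither stacked under a virama nor followed by asat or virama. The names `SYLLABLE_START` and `naive_syllables` are hypothetical, and the sketch makes no attempt to handle digits, punctuation, independent vowels, or Zawgyi text.

```python
import re
from typing import List

CONSONANT = "\u1000-\u1021"  # Myanmar consonants, U+1000 (က) to U+1021 (အ)
ASAT = "\u103a"              # asat: marks a syllable-final consonant
VIRAMA = "\u1039"            # virama: stacks the following consonant

# A consonant opens a new syllable unless it is stacked under a virama
# or is itself followed by asat or virama.
SYLLABLE_START = re.compile(
    "(?<!" + VIRAMA + ")[" + CONSONANT + "](?![" + ASAT + VIRAMA + "])"
)


def naive_syllables(text: str) -> List[str]:
    """Split Unicode-encoded Burmese text at naive syllable boundaries."""
    starts = [m.start() for m in SYLLABLE_START.finditer(text)]
    # Keep any leading text that does not begin with a syllable start.
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends) if s < e]


if __name__ == "__main__":
    print(naive_syllables("သီးပင်အိုင်"))
```

Unlike the package functions, this returns plain strings rather than dictionaries describing each syllable's components.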
Encodings¶
Convert text between various Myanmar encodings. It currently supports Wininnwa, Zawgyi, and Unicode. Performance-wise, it does not do as well as regex-based converters.
Transliteration¶
Transliterate Burmese text into Latin characters. Currently, romanization based on the BGN_PCGN, MLCTS, and IPA systems is available.
Phonenumbers¶
Validation and normalization for Myanmar phone numbers. Based on the mm_phonenumber module from Melomap.
myanmar.phonenumber.get_landline_operator(phonenumber)
Get operator type for a given landline number.
>>> get_landline_operator('+95674601234')
'MyanmarAPN'
>>> get_landline_operator('9524261234')
'MyanmarSpeedNet'
>>> get_landline_operator('14681234')
'VoIPMyanmarGroup'
myanmar.phonenumber.get_phone_operator(phonenumber)
Get operator type for a given phone number.
>>> get_phone_operator('+959262624625')
<Operator.Mpt: 'MPT'>
>>> get_phone_operator('09970000234')
<Operator.Ooredoo: 'Ooredoo'>
>>> get_phone_operator('123456789')
<Operator.Unknown: 'Unknown'>
myanmar.phonenumber.is_valid_phonenumber(phonenumber)
Check whether a given phone number is a valid Myanmar number.
>>> is_valid_phonenumber('09420028187')
True
>>> is_valid_phonenumber('+959420028187')
True
>>> is_valid_phonenumber(9420028187)
False
>>> is_valid_phonenumber(94200281870)
False
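As an illustration of what normalization of a mobile number might involve, the following standard-library sketch rewrites common spellings into the `+959...` form. It is not the package's implementation: the helper name `normalize_mobile` is hypothetical, and the accepted length (a `9` followed by 7 to 9 digits) is an assumption.

```python
import re


def normalize_mobile(number: str) -> str:
    """Hypothetical helper: rewrite a Myanmar mobile number as +959XXXXXXX."""
    if not isinstance(number, str):
        raise TypeError("phone number must be a string")
    # Drop common separators: whitespace, hyphens, parentheses, dots.
    digits = re.sub(r"[\s\-().]", "", number)
    # Optional prefix (+95, 95 or a leading 0), then 9 plus 7 to 9 digits.
    # The length range is an assumption, not taken from the package.
    match = re.fullmatch(r"(?:\+?95|0)?(9\d{7,9})", digits)
    if match is None:
        raise ValueError(f"not a recognisable Myanmar mobile number: {number!r}")
    return "+95" + match.group(1)


if __name__ == "__main__":
    print(normalize_mobile("09420028187"))
```

Raising an exception for unrecognised input is a design choice of this sketch; a library function could equally return `None` or the input unchanged.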
Myanmar NRC¶
Validation and normalization for Myanmar NRC (National Registration Card) numbers.
Credits¶
Development Lead¶
- Thura Hlaing <trhura at gmail.com>
Contributors¶
- Set Kyar Wa Lar <setkyar16 at gmail.com>
- Aye Chan Mon <polestar.mon20 at gmail.com>
- Soe Zayar <soezayar019@gmail.com>
- Kyaw Naing Win <kyawnaingwinknw@gmail.com>