A modern Pythonic implementation of the popular Bengali phonetic-typing software Avro Phonetic.
avro.py provides a fully fledged, batteries-included text parser that can parse, reverse, and even convert English Roman script into its phonetic Bengali equivalent (Unicode). At its core, it implements an extensively modified version of the Avro Phonetic Dictionary Search Library by Mehdi Hasan Khan.
Note
Update: Python 3.8 reached its end of life (EOL) in October 2024, so to keep
this project up to date, the minimum required version is now Python 3.9.
It is strongly suggested that you migrate your project for better
compatibility.
This package is inspired by Rifat Nabi's jsAvroPhonetic library and derives from Kaustav Das Modak's pyAvroPhonetic.
This package requires Python 3.9 or higher to be used inside your development environment.
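If your own project wants to fail early on an unsupported interpreter, a guard along the following lines may help. This is only a sketch: `MIN_PYTHON` and `python_is_supported` are hypothetical names used for illustration and are not part of avro.py.

```python
import sys

# Minimum interpreter version stated in this README.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    raise RuntimeError("Python 3.9 or higher is required.")
```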
# Install / upgrade.
$ pip install avro.py
avnie is a newly developed CLI tool that uses avro.py under the hood. You can install it using:
# Install / upgrade avnie.
$ pip install avnie
This small tour guide will describe how you can use avro.py back and forth to operate (cutlery!) on Bengali text. You can also check the examples directory to see this whole snippet in action, along with other use cases.
Let's assume I want to convert some English text, "ami banglay gan gai.", to Bengali. We can use this snippet as starter code and then extend it as boilerplate for the operations that follow:
# Import the package.
import avro
# Our dummy text.
dummy = 'ami banglay gan gai.'
# Parsing the text.
avro_output = avro.parse(dummy)
print(avro_output) # Output: আমি বাংলায় গান গাই।
Alternatively, I can also do it in Bijoy Keyboard format:
bijoy_output = avro.parse(dummy, bijoy=True) # Output: Avwg evsjvh় Mvb MvB।
Or, we can take the previous avro_output and convert it to Bijoy if we want to, like this:
bijoy_text = avro.to_bijoy(avro_output) # Output: Avwg evsjvh় Mvb MvB।
Conversely, we can take the Bijoy text we just got and convert it back to Unicode Bengali:
unicode_text = avro.to_unicode(bijoy_text) # Output: আমি বাংলায় গান গাই।
Finally, we can reverse the Bengali text back to the original text we passed as input in the first place:
reversed_text = avro.reverse(unicode_text) # Output: ami banglay gan gai.
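The calls above each work on a single string. When you want per-line results, for instance to pair every input line with its conversion, a small wrapper can be sketched as follows. `convert_lines` and its `converter` parameter are hypothetical names for this example only; the default falls back to `avro.parse` as used above.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def convert_lines(
    lines: Iterable[str],
    converter: Optional[Callable[[str], str]] = None,
) -> List[Tuple[str, str]]:
    """Pair each non-blank line with its converted form."""
    if converter is None:
        # Imported lazily so the helper can also be used with another converter.
        import avro

        converter = avro.parse

    pairs = []
    for line in lines:
        text = line.rstrip("\n")
        if not text.strip():
            # Skip blank lines instead of converting them.
            continue
        pairs.append((text, converter(text)))
    return pairs
```

Passing `converter=avro.to_bijoy` or `converter=avro.reverse` should reuse the same loop for the other operations shown in this tour.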
Since version 2024.12.5, the package supports async/await syntax for all the functions.
Note
Unless you have a very specific use, the asynchronous functions only provide slight performance improvements and are not necessary for most use cases, so their usage is optional.
Here's a reiteration of the previous example using the new syntax:
# Imports.
import asyncio
import avro
# Main coroutine.
async def main() -> None:
    # Our dummy text.
    dummy = 'ami banglay gan gai.'
    avro_output = await avro.parse_async(dummy)
    print(avro_output) # Output: আমি বাংলায় গান গাই।
    bijoy_output = await avro.parse_async(dummy, bijoy=True)
    print(bijoy_output) # Output: Avwg evsjvh় Mvb MvB।
    bijoy_text = await avro.to_bijoy_async(avro_output)
    print(bijoy_text) # Output: Avwg evsjvh় Mvb MvB।
    unicode_text = await avro.to_unicode_async(bijoy_text)
    print(unicode_text) # Output: আমি বাংলায় গান গাই।
    reversed_text = await avro.reverse_async(unicode_text)
    print(reversed_text) # Output: ami banglay gan gai.
# Running the event loop.
asyncio.run(main())
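The example above awaits each call one after another. If you have several independent texts, they could instead be scheduled together with `asyncio.gather`. The sketch below assumes nothing beyond `avro.parse_async` shown above; `parse_many` and its `parser` parameter are hypothetical names, and whether this gives a measurable speed-up depends on your workload.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Sequence


async def parse_many(
    texts: Sequence[str],
    parser: Optional[Callable[[str], Awaitable[str]]] = None,
) -> List[str]:
    """Convert several texts concurrently, keeping the input order."""
    if parser is None:
        # Imported lazily so another coroutine function can be supplied instead.
        import avro

        parser = avro.parse_async

    # gather() returns results in the same order as the awaitables passed in.
    results = await asyncio.gather(*(parser(text) for text in texts))
    return list(results)
```

From synchronous code, this can be driven with `asyncio.run(parse_many([...]))`, the same way `main()` is run above.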
Fork -> Do your changes -> Send a Pull Request, it's that easy!
Additional Developer Notes
This project uses the uv package manager by Astral. To set up and update the environment automatically, you can run the following commands:
# (Optional) Install recommended Python version: (also sets up the virtual environment)
$ uv python install && uv venv
$ source .venv/bin/activate
# Install the project:
$ uv sync --all-extras --dev
# Build the project:
$ uv build --verbose
In order to run the tests, you can use the following command:
# Run unit tests:
$ uv run pytest .
If you come across any kind of bug or wanna request a feature, please let us know by opening an issue here. We do need more ideas to keep the project alive and running, don't we? :P
- Mehdi Hasan Khan for originally developing and maintaining Avro Phonetic.
- Rifat Nabi for porting it to JavaScript.
- Sarim Khan for writing ibus-avro which helped to clarify my concepts further.
- Kaustav Das Modak for porting Rifat Nabi's JavaScript iteration to Python 2.
- Md Enzam Hossain for helping Kaustav Das Modak understand the ins and outs of the Avro dictionary and the way it works.
Licensed under the MIT License.