General Information

This website hosts a text corpus of the modern Tatar language containing over 500 million word occurrences (more than 620 million tokens).
The corpus represents the modern written Tatar language in electronic form.
It contains about 5 million distinct word forms.
This collection of Tatar texts is intended for anyone interested in the structure, current state, and prospects of the Tatar language.
The Corpus of Written Tatar is indispensable for anyone who wants to study Tatar using the methods of corpus linguistics.

This project receives no financial support from any scientific foundation or organization.
All work on the Corpus of Written Tatar is done by the project participants in their spare time.

Project news