General Information

This website hosts a text corpus of the modern Tatar language containing over 500 million word occurrences (more than 620 million tokens).
The corpus represents the modern written Tatar language in electronic form.
It contains about 5 million distinct word forms.
This collection of Tatar texts is intended for anyone interested in the structure, present condition, and prospects of the Tatar language.
The Corpus of Written Tatar is indispensable for anyone who wants to study Tatar using the methods of corpus linguistics.

This project receives no financial support from any research foundation or organization.
All work on the Corpus of Written Tatar is done by the project participants in their spare time.

Due to the quarantine, part of the equipment hosting the Corpus was turned off, so the main search form is temporarily unavailable.
You can use the alternative search form, based on the NoSketch Engine system, from the left menu or by following the link here.

Project news