Polish historical corpus

Revision as of 15:16, 23 May 2012 by Jsbien (Talk | contribs)

The corpus uses data made available by the PSNC Digital Libraries Team, namely a set of full-text versions of selected Polish historical documents from four digital libraries in Poland. The texts were prepared within the IMPACT project and used as so-called ground truth (GT) for the evaluation and training of OCR programs.

The Poliqarp search engine provides access to two versions of the IMPACT Polish GT corpus: the so-called one-dimensional and two-dimensional versions. Together with some dictionaries of Polish, they are available on the Poliqarp server at http://poliqarp.wbl.klf.uw.edu.pl/.
