Technical background to Slavistik-Portal
and related services in FID Slavistics (SBB, UdL, 30.6.22)
Introduction
- Eastern Europe Department
- Specialized Information Service (FID) Slawistik
- Slavistik-Portal
- Parallel Searching
- Related Services
Parallel search
- Parallel search (metasearch) with Pazpar2 (the client is JavaScript-based and very flexible)
- Search example (live)
- Architecture, list of databases
- Local databases indexed with SOLR (built by data conversion, harvesting, filtering)
- Linking SOLR with Pazpar2 (P2), passing on ranking and hit highlighting
- Integration of event tracking (for user statistics)
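The core idea of parallel search, merging hit lists from several databases into one ranked list, can be sketched as follows. This is only an illustration and not Pazpar2's actual algorithm or configuration: the record fields (`title`, `year`, `score`), the target names, and the merge rule (normalized title plus year, scores summed) are all assumptions.

```python
from collections import defaultdict


def merge_key(record):
    """Hypothetical merge key: alphanumeric characters of the casefolded title, plus year."""
    title = "".join(ch for ch in record["title"].casefold() if ch.isalnum())
    return (title, record.get("year", ""))


def merge_results(hit_lists):
    """Cluster records from all targets by merge key and rank clusters by summed score."""
    clusters = defaultdict(lambda: {"records": [], "score": 0.0})
    for target, hits in hit_lists.items():
        for record in hits:
            cluster = clusters[merge_key(record)]
            # Remember which target delivered the record.
            cluster["records"].append({**record, "target": target})
            cluster["score"] += record.get("score", 0.0)
    return sorted(clusters.values(), key=lambda c: c["score"], reverse=True)


# Invented sample data: one record is delivered by both targets.
hit_lists = {
    "local_solr": [
        {"title": "Slavische Philologie", "year": "1999", "score": 2.0},
        {"title": "Russkaja grammatika", "year": "1980", "score": 2.5},
    ],
    "remote_catalogue": [
        {"title": "Slavische  Philologie.", "year": "1999", "score": 1.0},
    ],
}
merged = merge_results(hit_lists)
```

With this sample data the two "Slavische Philologie" records fall into one cluster, which then ranks first because its scores are summed.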
Apache SOLR
- Problem of Slavic characters and languages [inflection] [language mixing]
- Configuration of SOLR for Slavic languages [conf1] [conf2]
- Metadata preparation: APIs and Gateways, data import, harvesting [\metadata_processing]
- Indexing from XML sources with Posting Tool
- Full text indexing (Page to XML record)
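The "page to XML record" step can be illustrated with a minimal sketch that turns the pages of one digitized volume into a SOLR XML update message (`<add><doc><field name="...">`). The field names (`id`, `book_id`, `page`, `text`) and the identifier scheme are assumptions for illustration; the real schema of the portal is not shown here.

```python
import xml.etree.ElementTree as ET


def pages_to_solr_xml(book_id, pages):
    """Build one <doc> per page inside a single <add> update message."""
    add = ET.Element("add")
    for number, text in enumerate(pages, start=1):
        doc = ET.SubElement(add, "doc")
        fields = (
            ("id", f"{book_id}_p{number}"),  # hypothetical ID scheme
            ("book_id", book_id),
            ("page", str(number)),
            ("text", text),
        )
        for name, value in fields:
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


solr_xml = pages_to_solr_xml("vol001", ["Text of page one", "Text of page two"])
```

The resulting string could be saved to a file and posted to the XML update handler of a SOLR core; one record per page keeps hits addressable at page level.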
Useful Tools
- Language Detection [url]
- Soundex-Keywords for Slavonic Languages [url]
- Widget technology [url]
- Tools for (Slavic) Computational Linguistics (Parallel Corpora, OCS Paradigm)
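To illustrate what Soundex-style keywords can do for Slavic names, here is a minimal sketch that maps Cyrillic and Latin spellings to a common phonetic key. It is not the tool linked above: the transliteration table covers only Russian letters and the key is a simplified variant of classic Soundex, both chosen as assumptions for this example.

```python
import unicodedata

# Assumed Russian-only transliteration table (hypothetical, for illustration).
CYR2LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "c", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "ju", "я": "ja",
}

# Consonant classes of classic Soundex.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def to_latin(word):
    """Lowercase, transliterate Cyrillic, and strip diacritics (č -> c)."""
    translit = "".join(CYR2LAT.get(ch, ch) for ch in word.lower())
    decomposed = unicodedata.normalize("NFD", translit)
    # Letters without a Unicode decomposition (e.g. ł) are dropped here.
    return "".join(ch for ch in decomposed if ch.isascii() and ch.isalpha())


def phonetic_key(word, length=4):
    """First letter plus consonant class digits, padded with zeros."""
    latin = to_latin(word)
    if not latin:
        return ""
    key = latin[0].upper()
    previous = CODES.get(latin[0], "")
    for ch in latin[1:]:
        code = CODES.get(ch, "")
        if code and code != previous:
            key += code
        if ch not in "hw":  # h and w do not separate equal codes
            previous = code
    return (key + "0" * length)[:length]


keys = [phonetic_key(w) for w in ("Чехов", "Čechov", "Chekhov")]
```

In this sketch the Cyrillic form and both Latin transliterations of the same surname receive the same key, so a search for one spelling can also retrieve the others.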
Multilingual Dictionaries of Slavic Languages
- Data acquisition, proofreading
- Data mining with Regular Expressions [Github]
- Presentation Layer and API [MSD] and [API]
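Data mining with regular expressions can be sketched on a single dictionary line. The entry layout assumed here (headword, grammatical label, dash, semicolon-separated glosses) and the label set are invented for illustration; the real source dictionaries will need their own patterns.

```python
import re

# Hypothetical entry layout: "<headword> <label>. - <gloss>; <gloss>"
ENTRY = re.compile(
    r"^(?P<headword>[^\W\d_]+(?:[ -][^\W\d_]+)*)\s+"  # one or more words
    r"(?P<gram>(?:m|f|n|adj|adv|v)\.)\s*"              # assumed label set
    r"[-–]\s*"                                         # hyphen or en dash
    r"(?P<glosses>.+)$"
)


def parse_entry(line):
    """Return a dict for a matching line, or None if the line does not fit."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    glosses = [g.strip() for g in match.group("glosses").split(";") if g.strip()]
    return {
        "headword": match.group("headword"),
        "gram": match.group("gram"),
        "glosses": glosses,
    }


entry = parse_entry("дом m. - house; home")
```

Lines that do not match return `None`, which makes it easy to collect them for manual proofreading instead of silently losing them.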
OCS full-text corpus of Church Slavonic prints
- Digitization of Slavonic sources from the 16th to the 19th century [url]
- Training of data models for Transkribus
- Processing the data, export, and finally
- Presentation of data and connecting to NLP-Toolset [url]
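The processing step after export can be sketched as reading the recognized text lines from a PAGE XML file, the page format that Transkribus can export. The sample document below is invented, and the namespace URI in it is an assumption about the export; the function therefore matches element names without relying on a particular namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def page_lines(xml_text):
    """Collect the line-level transcription of every TextLine in document order."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Only direct TextEquiv children, so word-level text is not duplicated.
        for child in elem:
            if local_name(child.tag) != "TextEquiv":
                continue
            for uni in child:
                if local_name(uni.tag) == "Unicode" and uni.text:
                    lines.append(uni.text.strip())
    return lines


# Invented sample; the namespace URI is an assumption.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page_0001.jpg">
    <TextRegion id="r1">
      <TextLine id="l1"><TextEquiv><Unicode>въ началѣ бѣ слово</Unicode></TextEquiv></TextLine>
      <TextLine id="l2"><TextEquiv><Unicode>и слово бѣ къ богу</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""
lines = page_lines(SAMPLE)
```

The resulting list of lines is a convenient intermediate form: it can be joined into page text for indexing or passed on line by line to further NLP tools.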