Izgradnja referentnog korpusa savremenog srpskog jezika
The construction of reference corpus of contemporary Serbian
Author
Utvić, Miloš
Mentor
Krstev, Cvetana
Committee members
Ćorić, Božo
Vitas, Duško
Pavlović-Lažetić, Gordana
Polovina, Vesna
Abstract
U prvom delu rada se razmatraju opšta pitanja koja se odnose na definiciju,
istorijat, parametre i klasifikaciju korpusa, kao i na korpusnu lingvistiku kao metodologiju
istraživanja jezika. Posebna pažnja je posvećena pitanjima reprezentativnosti
i balansiranosti korpusa kao uzorka jezika. Takođe je detaljno razmotren
i uticaj Interneta, odnosno veba, na kritičko preispitivanje definicije korpusa. Kao
parametri korpusa, posebno su analizirani nosač, domen i namena, obim (veličina),
period, izvor/medijum, anotacija i višejezičnost. Na osnovu tih parametara su opisane
moguće klasifikacije korpusa i posebno su izdvojeni nacionalni korpusi kao opšti,
referentni korpusi koji pretenduju da reprezentuju jezik jedne zemlje. Detaljno su
analizirani nacionalni korpusi slovenskih jezika. Poseban odeljak je posvećen istorijatu
srpske korpusne lingvistike. Na kraju prvog dela rada su navedeni ciljevi
rada: razmatranje mogućnosti izgradnje opšteg korpusa srpskog jezika koji bi bio
elektronski, dinamički, sinhroni, balansiran, anotiran (morfološki, strukturno, bibliografski),
kao i mogućnosti izgradnje pratećih višejezičnih paralelnih korpusa u
kojima je srpski izvorni ili ciljni jezik...
This thesis considers the methods and tools for constructing a corpus of
contemporary Serbian as a reference language resource.
The thesis consists of three parts.
The first part of the thesis considers general questions related to the definition,
history, parameters and classification of corpora, as well as to corpus linguistics
as a methodology in language research. Special attention is paid to questions
regarding the representativeness and balance of a corpus as a language sample. The
effect of the Internet/Web on the critical review of the definition of a corpus is
also considered in detail. Corpus parameters (storage medium, domain/purpose, size,
time span, mode of communication, annotation and multilinguality) are analysed in
particular. Possible classifications of corpora based on these parameters are
described, with emphasis on national corpora as general reference corpora that are
intended to represent the national language of a country. National corpora of Slavic
languages are analysed in detail. A special section is dedicated to the history of
Serbian corpus linguistics. The goals of the thesis are listed at the end of the
first part: considering the possibilities for constructing a general, electronic,
dynamic, synchronic, balanced, morphosyntactically and bibliographically annotated
corpus, as well as the possibilities for constructing multilingual parallel corpora
with Serbian as the source or target language...