What is lexical clustering?
Virtually any text shows recurring word patterns which may not be obvious at first sight. These patterns can be identified with the ‘lexical cluster’ function incorporated in Corpus Presenter (choose the option Analyse lexical clusters in the Search menu, shortcut: Ctrl-U). The program combs through a text, breaks it down into segments of between 1 and 8 consecutive words and lists these segments with their frequencies. Such a breakdown is useful when analysing the style of an author or when examining typical collocations in a language or variety. The following illustrates the principle with chunks of three words at a time.
I think this would be an interesting project.
I think this
think this would
this would be
would be an
be an interesting
an interesting project
The current function offers many options. For instance, you can view clusters alphabetically or by frequency, display the context in which each occurrence was found, and go to the location in a text where a particular cluster occurs. Results can be saved in a user-specified manner as a text or a database.
Analysing clusters of lexical items in texts
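The three-word breakdown shown above can be sketched in a few lines of Python. This is only an illustration of the principle, not the way Corpus Presenter itself is implemented: the function names (tokenize, clusters, cluster_frequencies, by_frequency, alphabetically) are hypothetical, and the tokenisation rule (runs of word characters, punctuation dropped, case preserved) is an assumption made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into word tokens, dropping punctuation (assumed rule)."""
    return re.findall(r"\w+(?:'\w+)*", text)


def clusters(tokens, size):
    """Return every run of `size` consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + size])
            for i in range(len(tokens) - size + 1)]


def cluster_frequencies(text, min_size=1, max_size=8):
    """Count all clusters of min_size to max_size words in the text."""
    tokens = tokenize(text)
    counts = Counter()
    for size in range(min_size, max_size + 1):
        counts.update(clusters(tokens, size))
    return counts


def by_frequency(counts):
    """List (cluster, count) pairs, most frequent first, ties alphabetical."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def alphabetically(counts):
    """List (cluster, count) pairs in alphabetical order of the cluster."""
    return sorted(counts.items())


# Reproduce the three-word chunks of the example sentence.
sentence = "I think this would be an interesting project."
for cluster in clusters(tokenize(sentence), 3):
    print(cluster)
```

Calling cluster_frequencies on a longer text and passing the result to by_frequency or alphabetically gives the two views of the cluster list mentioned here. Showing the context of each occurrence would additionally require keeping track of token positions, which this sketch leaves out.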