2. The Dictionary of Italian Collocations. LREC 2010 - Stefania Spina - The Dictionary of Italian Collocations. Part of the APRIL project (Personalised web environment for language learning): NLP resources as a support for the lexical competence of students of Italian within a Virtual Learning Environment (VLE).
3. Presentation outline: background and motivation; reference corpus; methodology; dictionary compilation; integration within the VLE.
4. Background. Collocations have different syntactic and semantic profiles, but share prototypical features: semantic non-compositionality; non-substitutability of components by semantically similar words; non-insertion of external items. They form a continuum rather than definite categories.
5. Continuum. Semantic non-compositionality: tagliare la corda 'run away' vs. aprire la porta 'open the door'. Non-substitutability: camera oscura 'dark room' (* stanza oscura) vs. {fare|porre|rivolgere|formulare} una domanda 'ask a question'. Insertion of external items: fare una lunga calda riposante doccia 'take a long, hot, restful shower' vs. sistema *molto operativo 'operating system'.
6. Motivation: collocations in SLA. Improving learners' fluency. Non-native speakers and L2 vocabulary: first single words, then more extended chunks; a trend to overuse the creative combination of isolated words (Sinclair's open choice principle). Examples from Italian learner corpora: 'preoccupata per il corso che mi mette nelle difficoltà' (Russia), cf. mettere in difficoltà 'cause problems'; 'e poi alla fine ho fatto questa decisione' (Vietnam), cf. prendere una decisione 'make a decision'.
7. DICI. Collocations require specific pedagogical attention. Dictionary of Italian Collocations (DICI): it is corpus-based; it is a learner-oriented tool: a list of the most common Italian collocations, classified on a frequency basis; it is also based on statistical methodologies (dispersion in the different textual genres represented in the corpus).
8. Reference corpus. Perugia corpus: POS-tagged, lemmatized.
9. POS filtering. Analysis of an existing list of collocations: 150 different POS sequences; 10 most productive POS sequences.
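The counting step behind this slide can be sketched roughly as follows: tally the POS sequence of each known collocation and keep the most frequent patterns. This is only an illustration; the tag names, the function names and the three sample entries are hypothetical and are not taken from the list analysed in the talk.

```python
from collections import Counter

# Hypothetical sample: each collocation is a list of (word, POS tag) pairs.
known_collocations = [
    [("prendere", "VERB"), ("una", "DET"), ("decisione", "NOUN")],
    [("fare", "VERB"), ("una", "DET"), ("domanda", "NOUN")],
    [("camera", "NOUN"), ("oscura", "ADJ")],
]


def pos_sequence_counts(collocations):
    """Count how many collocations instantiate each POS sequence."""
    return Counter(tuple(tag for _, tag in c) for c in collocations)


def most_productive(collocations, k=10):
    """Return the k POS sequences that cover the most collocations."""
    counts = pos_sequence_counts(collocations)
    return [seq for seq, _ in counts.most_common(k)]


# Most productive pattern in the toy sample.
top = most_productive(known_collocations, k=1)
```

On the real list, `len(pos_sequence_counts(...))` would correspond to the 150 distinct sequences and `most_productive(..., k=10)` to the 10 retained ones.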
10. Experimental methodology: 4 steps. (1) extraction of candidate collocations from the corpus; (2) filtering of the candidate collocations: frequency and dispersion; (3) compilation of the dictionary; (4) integration of the dictionary with the online learning. 6 POS sequences.
11. Collocations extraction. 12-million-word sample, 4 sections. Extraction via IMS Corpus Workbench; removing all the candidates with frequency = 1: 41643 collocations. Two more filters: dispersion; manual (non-collocations).
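The frequency filter on this slide (dropping every candidate that occurs only once) might look like the sketch below. It does not call the IMS Corpus Workbench; it assumes the matches have already been exported as lemma tuples, and the function name and sample data are hypothetical.

```python
from collections import Counter


def filter_hapaxes(candidates):
    """Keep candidate collocations (lemma tuples) that occur more than once."""
    counts = Counter(candidates)
    return {cand: n for cand, n in counts.items() if n > 1}


# Hypothetical extraction output: one lemma tuple per corpus match.
matches = [
    ("prendere", "decisione"),
    ("prendere", "decisione"),
    ("aggrottare", "fronte"),
]

# Only the candidate seen twice survives the filter.
kept = filter_hapaxes(matches)
```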
12. Dispersion. Examples: aggrottare la fronte 'to frown' (fiction); vincere le elezioni 'to win the elections' (press); dare una definizione 'to give a definition' (academic prose). Juilland's D value (Juilland - Chang-Rodriguez, 1964). D value combined with frequency = usage. Usage value threshold of 2: 2047 candidate collocations. Manual selection. Final result: list of 1553 word combinations = dictionary entries.
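As an illustration of the dispersion measure, the sketch below uses the commonly cited form of Juilland's D for n equally sized corpus sections, D = 1 - V / sqrt(n - 1), where V is the coefficient of variation of the per-section frequencies, and takes usage as U = D x F with F the total frequency. Whether the talk uses exactly this variant (for instance, population rather than sample standard deviation) is an assumption, and the function names are hypothetical.

```python
import math


def juilland_d(section_freqs):
    """Juilland's D for frequencies observed in equally sized sections."""
    n = len(section_freqs)
    if n < 2:
        raise ValueError("at least two sections are required")
    mean = sum(section_freqs) / n
    if mean == 0:
        return 0.0  # item never occurs: treat as not dispersed
    # Population standard deviation (an assumption of this sketch).
    variance = sum((f - mean) ** 2 for f in section_freqs) / n
    coeff_var = math.sqrt(variance) / mean
    return 1 - coeff_var / math.sqrt(n - 1)


def usage(section_freqs):
    """Usage value: dispersion multiplied by total frequency."""
    return juilland_d(section_freqs) * sum(section_freqs)
```

A candidate spread evenly over the four sections, such as `[5, 5, 5, 5]`, gets D = 1 and a usage equal to its frequency, while one confined to a single genre, such as `[8, 0, 0, 0]`, gets D close to 0 and is penalised accordingly.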
14. Compilation of the Dictionary. Lexical database enriched with two kinds of data: (a) visible to the learner (client output): definition, examples, part-of-speech, syntactic context of occurrence of collocations; (b) to be processed by other applications (server): internal syntactic configuration for automatic recognition.
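One way to picture the two kinds of data in an entry is the record below, which separates the learner-facing fields from the server-side one. The class, field names and sample values are hypothetical; the slides do not describe the actual database schema.

```python
from dataclasses import dataclass, field


@dataclass
class CollocationEntry:
    # Visible to the learner (client output).
    collocation: str
    definition: str
    examples: list = field(default_factory=list)
    part_of_speech: str = ""
    syntactic_context: str = ""
    # Processed by other applications (server) for automatic recognition.
    internal_configuration: str = ""

    def learner_view(self):
        """Return only the fields meant for the learner."""
        return {
            "collocation": self.collocation,
            "definition": self.definition,
            "examples": list(self.examples),
            "part_of_speech": self.part_of_speech,
            "syntactic_context": self.syntactic_context,
        }


# Hypothetical entry; the configuration string is an invented notation.
entry = CollocationEntry(
    collocation="prendere una decisione",
    definition="make a decision",
    part_of_speech="verb + noun",
    internal_configuration="prendere DET decisione",
)
```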
15. DB integration in the VLE. Virtual Learning Environment: web application specifically devoted to language learning. LELE (Linguistically-Enhanced Learning Environment): provide language learners with additional NLP resources, in order to improve their linguistic competence; receptive and productive learning activities concerning the recognition and the active use of collocations.
16. LELE features: to automatically recognize and highlight multi-word units in written Italian texts; to show additional linguistic information about the selected collocations; to generate collocation tests for collocational competence assessment of second or foreign language learners.
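The first feature, recognizing and highlighting multi-word units, could be approximated as in the sketch below. It assumes the text is already tokenized and lemmatized and matches only contiguous lemma sequences, preferring the longest match; this is a simplification, not a description of how LELE works, and all names are hypothetical.

```python
def highlight(tokens, collocations):
    """Bracket known collocations in a lemmatized text.

    tokens: list of (word_form, lemma) pairs.
    collocations: set of lemma tuples from the dictionary.
    """
    max_len = max((len(c) for c in collocations), default=0)
    out = []
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest possible window first.
        for n in range(max_len, 1, -1):
            lemmas = tuple(lemma for _, lemma in tokens[i:i + n])
            if len(lemmas) == n and lemmas in collocations:
                match_len = n
                break
        if match_len:
            words = " ".join(w for w, _ in tokens[i:i + match_len])
            out.append(f"[{words}]")
            i += match_len
        else:
            out.append(tokens[i][0])
            i += 1
    return " ".join(out)


# Hypothetical lemmatized sentence and one dictionary entry.
sentence = [("Ha", "avere"), ("preso", "prendere"),
            ("una", "uno"), ("decisione", "decisione")]
dictionary = {("prendere", "uno", "decisione")}
marked = highlight(sentence, dictionary)
```

Matching on lemmas rather than word forms lets the inflected form "preso" be recognized as part of prendere una decisione.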
17. LELE scheme. [Diagram not recovered; surviving label: server]
20. Conclusions. Next step: same methodology applied to the whole corpus, for all the 10 selected POS sequences. Further research: refine statistical measures; assign collocations to different levels of competence; other tools (productive tasks).
21. Stefania Spina, stefania.spina@unistrapg.it, http://april.unistrapg.it