This document describes FACTORS, a system designed to monitor the characteristics of a subject domain through about 100 required information aspects called factors. Factors can be qualitative, such as social tension, or quantitative, such as the number of unemployed. Qualitative factors take values on a fixed scale from very small to very large, while quantitative factors take a number plus a unit. The system searches texts using patterns: patterns for qualitative factors pair the factor with a value, while patterns for quantitative factors identify only the factor, and the number is then looked up in the text. Patterns are built from ontology concepts and user-marked text fragments; the ontology supports pattern generalization, synonym accumulation, and information about units of measurement.
Text Pattern Formation For Information Extraction
1. Lidia M. Pivovarova, Saint-Petersburg State University. Ph.D. advisor: Prof. Valery Sh. Rubashkin. NLDB 2008.
2. FACTORS: a system designed to monitor the underlying characteristics of a subject domain
3. General System Description [diagram; labels: The Ontology; Texts; Lemmatization, part-of-speech tagging, semantic mark-up; Morphological analyzer; Semantic analyzer; Situation State; Search Patterns]
4. The Factors
Factors are the required information aspects; there are about 100 of them.
- qualitative, e.g. social tension, investment attractiveness, level of sovereignty, human rights activity
- quantitative, e.g. the number of unemployed, the average salary, the inflation level, the amount of imports
5. Numerical values
- Qualitative factors: very small, small, less than average, average, more than average, large, very large.
- Quantitative factors: a number + <unit>, e.g. the average salary -> monetary unit (ruble, $, etc.); the number of unemployed -> no units.
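The two kinds of values above could be represented along the following lines. This is only a minimal sketch, not the FACTORS implementation: the class names `Level`, `Quantity`, and `FactorValue` are hypothetical, and the numbers in the examples are made up for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Union


class Level(IntEnum):
    """Seven-step ordinal scale for qualitative factors, as listed on the slide."""
    VERY_SMALL = 1
    SMALL = 2
    LESS_THAN_AVERAGE = 3
    AVERAGE = 4
    MORE_THAN_AVERAGE = 5
    LARGE = 6
    VERY_LARGE = 7


@dataclass(frozen=True)
class Quantity:
    """A number plus an optional unit (None for unitless counts)."""
    number: float
    unit: Optional[str] = None


@dataclass(frozen=True)
class FactorValue:
    """A factor name paired with either a scale level or a quantity."""
    factor: str
    value: Union[Level, Quantity]


# Illustrative values only; the figures are invented for this sketch.
tension = FactorValue("social tension", Level.LARGE)
salary = FactorValue("average salary", Quantity(25000, "ruble"))
unemployed = FactorValue("number of unemployed", Quantity(120000))
```

Using an ordered enum keeps the qualitative scale comparable (`Level.LARGE > Level.AVERAGE`), while a missing unit is modelled explicitly as `None` rather than as an empty string.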
6. The Patterns
- Qualitative factors -> "factor + numerical value" patterns, e.g. Social tension <-- spontaneous meeting (large)
- Quantitative factors -> "only factor" patterns, e.g. The number of unemployed <-- become unemployed
Search algorithm:
1) find a pattern;
2) find a number + unit;
3) if none is found, find words such as large, small, increase, decrease, etc.
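The three-step search for a quantitative factor might look roughly like this. It is a simplified sketch under stated assumptions: patterns are plain substrings, the text is not lemmatized, and the `PATTERNS` table, the unit list, and the `extract` function are all hypothetical names introduced here, not part of the system described.

```python
import re
from typing import Optional

# Hypothetical pattern table: trigger phrase -> factor name (from the slide's example).
PATTERNS = {"become unemployed": "the number of unemployed"}

# A number with an optional unit; the unit list is an assumption for this sketch.
NUMBER_UNIT = re.compile(r"(\d+(?:\.\d+)?)\s*(rubles?|euros?|\$)?")

# Fallback cue words named on the slide.
CUE_WORD = re.compile(r"\b(large|small|increase|decrease)")


def extract(sentence: str) -> Optional[dict]:
    """Apply the three search steps to one sentence."""
    lowered = sentence.lower()

    # Step 1: find a pattern; without one the sentence is not relevant.
    factor = next((f for phrase, f in PATTERNS.items() if phrase in lowered), None)
    if factor is None:
        return None

    # Step 2: find a number and, if present, its unit.
    found = NUMBER_UNIT.search(lowered)
    if found:
        return {"factor": factor, "number": float(found.group(1)), "unit": found.group(2)}

    # Step 3: no number, so fall back to a qualitative cue word.
    cue = CUE_WORD.search(lowered)
    return {"factor": factor, "cue": cue.group(1) if cue else None}
```

For example, `extract("About 3000 workers become unemployed every month.")` yields the number 3000.0 with no unit, while a sentence with the same pattern but no number falls through to the cue-word step.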
7. Pattern Formation Process
A pattern is a set of words and ontology concepts.
The ontology provides:
- pattern generalization
- synonym accumulation
- information about units
Pattern formation: the user marks a relevant fragment in a text or chooses a concept from the ontology.
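One way to picture a pattern that mixes plain words with ontology concepts is sketched below. The toy ontology (`protest`, `meeting`, `rally`, `strike`) and the names `Concept`, `lexicon`, and `matches` are assumptions made for illustration; the slides do not specify how the ontology or the matching is actually implemented, and this sketch handles single-word lemmas only.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Concept:
    """A node in a toy ontology: a name, its synonyms, and narrower concepts."""
    name: str
    synonyms: Set[str] = field(default_factory=set)
    children: List["Concept"] = field(default_factory=list)

    def lexicon(self) -> Set[str]:
        """All words expressing this concept or any concept below it."""
        words = {self.name} | self.synonyms
        for child in self.children:
            words |= child.lexicon()
        return words


PatternElement = Union[str, Concept]


def matches(pattern: List[PatternElement], lemmas: Set[str]) -> bool:
    """True if every pattern element is expressed by some lemma in the fragment."""
    for element in pattern:
        allowed = element.lexicon() if isinstance(element, Concept) else {element}
        if not (allowed & lemmas):
            return False
    return True


# Hypothetical ontology fragment and a pattern built from a word plus a concept.
meeting = Concept("meeting", synonyms={"rally"})
strike = Concept("strike")
protest = Concept("protest", children=[meeting, strike])
pattern = ["spontaneous", protest]
```

Because the pattern refers to the concept `protest` rather than to the word "meeting", it also covers a fragment containing "spontaneous" and "rally" (a synonym) or "strike" (a narrower concept), which is the kind of generalization and synonym accumulation the slide attributes to the ontology.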
8. Example
"As is known, the European Union strictly demanded that Lithuania close both generating units of the Ignalina nuclear power station. It also promised to remit 3 billion euros for this purpose."
Factors: the EU pressure on Lithuania; the financial aid of the EU to Lithuania.