SlideShare feed for slideshows by user andresmasegosa

Varying parameter in classification based on imprecise probabilities
We present a first exploratory study of the variation of the parameter s of the imprecise Dirichlet model when it is used to build classification trees. The tree-building method relies on uncertainty measures over closed and convex sets of probability distributions, also known as credal sets. We use the imprecise Dirichlet model to obtain a credal set from a sample, and the set of probabilities obtained depends on s. We show that, depending on the characteristics of the dataset, the results can be improved by varying the value of s.
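
As a rough illustration of the role of s, here is a minimal Python sketch (hypothetical helper names, not code from the presentation) that computes the lower and upper class probabilities of the IDM credal set from a vector of class counts and approximates the maximum-entropy distribution inside that set, the quantity the credal split criterion builds on; larger s widens the intervals.

```python
import numpy as np

def idm_credal_intervals(counts, s=1.0):
    """Lower/upper class probabilities of the IDM credal set.

    counts : class frequencies n_i observed in a node.
    s      : IDM hyper-parameter controlling the amount of imprecision.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    return counts / (total + s), (counts + s) / (total + s)

def idm_max_entropy(counts, s=1.0, steps=10000):
    """Approximate the maximum-entropy distribution inside the IDM credal
    set by greedily spreading the mass s over the least frequent classes
    (water-filling), and return that distribution and its entropy."""
    counts = np.asarray(counts, dtype=float)
    extra = np.zeros_like(counts)
    remaining, delta = s, s / steps
    while remaining > 1e-12:
        i = np.argmin(counts + extra)   # class with smallest adjusted count
        give = min(delta, remaining)
        extra[i] += give
        remaining -= give
    p = (counts + extra) / (counts.sum() + s)
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return p, entropy

# Example: idm_credal_intervals([8, 2], s=1) -> ([0.727, 0.182], [0.818, 0.273])
```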

An Importance Sampling Approach to Integrate Expert Knowledge When Learning Bayesian Networks From Data
Introducing expert knowledge when learning Bayesian networks from data is known to be an excellent way to boost the performance of automatic learning methods, especially when data are scarce. Previous Bayesian approaches to this problem introduce the expert knowledge by modifying the prior probability distributions. In this study, we propose a new methodology based on Monte Carlo simulation that starts with non-informative priors and asks for knowledge from the expert a posteriori, when the simulation ends. We also explore a new importance sampling method for the Monte Carlo simulation and define new non-informative priors for the structure of the network. All these approaches are experimentally validated on five standard Bayesian networks. Read more: http://link.springer.com/chapter/10.1007%2F978-3-642-14049-5_70
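
The specific proposal distributions and priors of the paper are not reproduced here; the following sketch (Python, assumed generic form) only illustrates the self-normalized importance-sampling estimator such a method builds on: features of Bayesian-network structures, e.g. the presence of a given edge, are averaged under the posterior using samples drawn from a simpler proposal distribution.

```python
import numpy as np

def importance_sampling_estimate(draw_proposal, log_target, log_proposal,
                                 feature, n_samples=5000, seed=0):
    """Self-normalized importance-sampling estimate of E_target[feature].

    draw_proposal : callable(rng) returning one sample (e.g. a candidate structure)
    log_target    : unnormalized log posterior of a sample
    log_proposal  : log density of the proposal at a sample
    feature       : function mapping a sample to a number
                    (e.g. 1.0 if a given edge is present, else 0.0)
    """
    rng = np.random.default_rng(seed)
    samples = [draw_proposal(rng) for _ in range(n_samples)]
    log_w = np.array([log_target(x) - log_proposal(x) for x in samples])
    w = np.exp(log_w - log_w.max())      # subtract the max for numerical stability
    w /= w.sum()                         # self-normalize the weights
    values = np.array([feature(x) for x in samples])
    return float((w * values).sum())
```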

Bagging Decision Trees on Data Sets with Classification Noise
In many real applications of supervised classification, the data sets used to learn the models contain classification noise (some instances have a wrong class label), mainly due to deficiencies in the data capture process. Bagging ensembles of decision trees are considered among the best-performing supervised classification models in these situations. In this paper, we propose a Bagging ensemble of credal decision trees, which are based on imprecise probabilities, via the imprecise Dirichlet model, and on information-based uncertainty measures, via the maximum-entropy function. Our method can be applied to data sets with continuous variables and missing data. In an experimental study, we show that Bagging credal decision trees outperforms more complex Bagging approaches on data sets with classification noise. Furthermore, using a bias-variance decomposition of the error, we justify this performance by showing that our approach achieves a stronger and more robust reduction of the variance component.
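
A minimal sketch of the Bagging loop around any base tree learner is shown below; scikit-learn's standard decision tree stands in for the credal decision tree, which is not part of scikit-learn, so the noise-robustness discussed above would come from swapping in the credal base learner, not from the loop itself.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_predict(X_train, y_train, X_test, n_trees=100, seed=0):
    """Train n_trees trees on bootstrap resamples and return the
    majority-vote prediction for X_test (assumes integer class labels)."""
    rng = np.random.default_rng(seed)
    n = len(y_train)
    votes = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)      # bootstrap sample with replacement
        tree = DecisionTreeClassifier()       # stand-in for a credal decision tree
        tree.fit(X_train[idx], y_train[idx])
        votes.append(tree.predict(X_test))
    votes = np.array(votes, dtype=int)
    # majority vote across the ensemble for each test instance
    return np.array([np.bincount(col).argmax() for col in votes.T])
```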

Classification with decision trees from a nonparametric predictive inference perspective
An application of nonparametric predictive inference for multinomial data (NPI) to classification tasks is presented. The model is applied to an established procedure for building classification trees using imprecise probabilities and uncertainty measures, so far used only with the imprecise Dirichlet model (IDM), which is defined through a parameter expressing prior knowledge. The accuracy of this classification procedure depends significantly on the value of that parameter when the IDM is applied. A detailed study involving 40 data sets shows that the procedure using the NPI model (which has no parameter dependence) obtains a better trade-off between accuracy and tree size than the procedure with the IDM, whatever the choice of parameter. A bias-variance study of the errors shows that the procedure with the NPI model has lower variance than the one with the IDM, implying less over-fitting.

Locally Averaged Bayesian Dirichlet Metrics
The marginal likelihood of the data computed with Bayesian score metrics is at the core of score+search methods for learning Bayesian networks from data. However, common formulations of these Bayesian scores depend on free parameters that are hard to assess. Recent theoretical and experimental work has also shown that the commonly employed BDeu score is strongly biased by the particular value of its free parameter, known as the equivalent sample size, and that the optimal choice of this parameter depends on the underlying distribution. Because of this sensitivity, a wrong choice of the parameter leads to inferred models that do not properly represent the distribution generating the data, even with large sample sizes. To overcome this issue, we introduce an approach that marginalizes this free parameter with a simple averaging method. As experimentally shown, this approach robustly performs as well as an optimal selection of the parameter while avoiding wrong settings of this widely applied Bayesian score.
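
The sketch below (assumed form; the paper's exact averaging scheme may differ) computes the standard log BDeu local score of one variable given a parent set from a q-by-r table of counts, and then averages the resulting marginal likelihoods over a grid of equivalent sample sizes instead of committing to a single value.

```python
import numpy as np
from scipy.special import gammaln

def bdeu_local_score(counts, ess):
    """Log BDeu local score for one variable and one parent set.

    counts : array of shape (q, r) with N_jk, the number of cases where the
             parents take their j-th configuration and the variable its k-th value.
    ess    : equivalent sample size of the BDeu prior.
    """
    counts = np.asarray(counts, dtype=float)
    q, r = counts.shape
    a_jk = ess / (q * r)                 # Dirichlet hyper-parameter per cell
    a_j = ess / q                        # per parent configuration
    n_j = counts.sum(axis=1)
    score = (gammaln(a_j) - gammaln(a_j + n_j)).sum()
    score += (gammaln(a_jk + counts) - gammaln(a_jk)).sum()
    return score

def averaged_bdeu_local_score(counts, ess_grid=(0.1, 1, 2, 5, 10, 20, 50)):
    """Log of the average marginal likelihood over a grid of equivalent
    sample sizes, computed stably in log space."""
    scores = np.array([bdeu_local_score(counts, e) for e in ess_grid])
    m = scores.max()
    return m + np.log(np.exp(scores - m).mean())
```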

Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cell Lymphoma Classification

An interactive approach for cleaning noisy observations in Bayesian networks with the help of an expert
When using Bayesian networks in real applications, the empirical evidence or observations we employ for making inferences are often corrupted and contain noise: sensor failures, outliers, human errors, etc. Although many methods have been proposed in the literature for data cleaning (i.e., detecting and correcting noisy data values), all of these methods perform the task automatically. In this paper we argue that, if available, expert knowledge should be used for this task, and we propose two methods that explicitly interact with an expert to detect and correct noisy observations.

Learning classifiers from discretized expression quantitative trait loci
Expression quantitative trait loci are used as a tool to identify genetic causes of natural variation in gene expression. Only in a few cases is the expression of a gene controlled by a variant on a single marker. There is a plethora of different complexity levels of interaction effects within markers, within genes and between markers and genes. This complexity challenges biostatisticians and bioinformaticians every day and makes findings hard to obtain. As a way to simplify the analysis and better control confounders, we tried a new approach for association analysis between genotypes and expression data. We sought to understand whether discretization of expression data can be useful in genome-transcriptome association analyses. By discretizing the dependent variable, algorithms for learning classifiers from data, as well as for performing block selection, were used to help understand the relationship between the expression of a gene and genetic markers. We present the results of a first set of studies in which we used this approach to detect new possible causes of expression variation of DRB5, a gene playing an important role within the immune system. A supplementary website including a link to the software implementing the method can be found at http://bios.ugr.es/classDRB5.

Split Criterions for Variable Selection Using Decision Trees
In the field of attribute mining, several feature selection methods have recently appeared which indicate that sets of decision trees learnt from a data set can be a useful tool for selecting variables that are relevant and informative with respect to a main class variable. In this study, we claim that a new split criterion for building decision trees outperforms classic split criteria for variable selection purposes. We present an experimental study on a wide and diverse set of databases, using a single decision tree built with each split criterion to select variables for the naive Bayes classifier.
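
As an illustration of the overall pipeline (with scikit-learn's entropy criterion standing in for the split criteria compared in the study, and a Gaussian naive Bayes as the final classifier), the sketch below builds one decision tree, keeps the variables the tree actually splits on, and trains naive Bayes on that reduced set.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

def tree_based_selection_nb(X_train, y_train, X_test):
    """Select the variables used by a single decision tree and train a
    naive Bayes classifier on that reduced feature set."""
    tree = DecisionTreeClassifier(criterion="entropy")      # stand-in split criterion
    tree.fit(X_train, y_train)
    selected = np.where(tree.feature_importances_ > 0)[0]   # variables the tree split on
    nb = GaussianNB().fit(X_train[:, selected], y_train)
    return nb.predict(X_test[:, selected]), selected
```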

A Semi-naive Bayes Classifier with Grouping of Cases
In this work, we present a semi-naive Bayes classifier that searches for dependent attributes using different filter approaches. To prevent the number of cases of the compound attributes from becoming too large, a grouping procedure is applied each time two variables are merged. This procedure tries to group two or more cases of the new variable into a single value. In an empirical study, we show that this approach robustly outperforms the naive Bayes classifier and reaches the performance of Pazzani's semi-naive Bayes [1] without the high cost of a wrapper search.
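
The sketch below only illustrates, under assumed simplifications, the two mechanical ingredients mentioned above: merging two categorical attributes into a compound attribute (the Cartesian product of their values) and grouping values of that compound attribute. Here low-frequency values are simply collapsed, whereas the paper's grouping procedure is guided by its filter criteria.

```python
from collections import Counter

def merge_attributes(col_a, col_b):
    """Join two categorical columns into one compound attribute whose
    values are pairs of the original values."""
    return [f"{a}|{b}" for a, b in zip(col_a, col_b)]

def group_rare_cases(column, min_count=5, other="OTHER"):
    """Collapse values of the compound attribute seen fewer than min_count
    times into a single grouped value (simplified stand-in for the paper's
    grouping procedure)."""
    counts = Counter(column)
    return [v if counts[v] >= min_count else other for v in column]
```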

Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Measures
In this article, we present a method for combining classification trees obtained by a simple procedure based on the imprecise Dirichlet model (IDM) and uncertainty measures on closed and convex sets of probability distributions, also known as credal sets. Our combination method has two main characteristics: it obtains a high percentage of correct classifications with a small number of classification trees, and it can be parallelized for application to very large databases.

Interactive Learning of Bayesian Networks
Using domain/expert knowledge when learning Bayesian networks from data has been considered a promising idea since the very beginning of the field. However, in most of the previously proposed approaches, human experts do not play an active role in the learning process: once their knowledge is elicited, they do not participate any further. The interactive approach for integrating domain/expert knowledge that we propose in this work aims to be more efficient and effective. In contrast to previous approaches, our method interacts actively with the expert in order to guide the search-based learning process. It relies on identifying the edges of the graph structure that are most unreliable given the information present in the learning data. Another contribution of our approach is the integration of domain/expert knowledge at different stages of the learning process of a Bayesian network: while learning the skeleton and when directing the edges of the directed acyclic graph structure.

A Bayesian approach to estimate probabilities in classification trees
Classification or decision trees are among the most effective methods for supervised classification. In this work, we present a Bayesian approach to induce classification trees based on a Bayesian score splitting criterion, together with a new Bayesian method to estimate the probability of class membership based on Bayesian model averaging over the rules of the previously induced tree. In an experimental evaluation, we show that our approach matches the performance of Quinlan's C4.5, one of the best-known decision tree inducers, in terms of predictive accuracy, and clearly outperforms it in terms of the quality of its class probability estimates.
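
The Bayesian model averaging over tree rules is not reproduced here; the sketch below only contrasts the raw relative-frequency estimate at a leaf with a Dirichlet-smoothed posterior-mean estimate, which is the kind of class probability estimate that such Bayesian approaches improve.

```python
import numpy as np

def leaf_class_probabilities(counts, alpha=1.0):
    """Class probability estimates at a tree leaf.

    counts : per-class counts of the training instances reaching the leaf.
    alpha  : Dirichlet prior weight per class (alpha=0 recovers the raw
             relative frequencies; alpha=1 is Laplace smoothing).
    """
    counts = np.asarray(counts, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha * len(counts))

# Example: leaf_class_probabilities([9, 0, 1]) avoids assigning probability zero
```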

A Bayesian Random Split to Build Ensembles of Classification Trees
Random forest models [1] consist of ensembles of randomized decision trees and are among the best-performing classification models. With this idea in mind, we introduce a random split operator based on a Bayesian approach for building a random forest. The suitability of this split method for constructing ensembles of classification trees is justified with a bias-variance decomposition of the error. The new split operator does not depend on a parameter K as its random forest counterpart does, and it performs better with a smaller number of trees. http://link.springer.com/chapter/10.1007%2F978-3-642-02906-6_41
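
One way to realize the idea of a Bayesian random split is sketched below (assumed form, not necessarily the paper's exact operator): instead of always splitting on the highest-scoring attribute, or on a uniformly random subset of K attributes as random forests do, the split attribute is sampled with probability proportional to the exponential of its Bayesian score.

```python
import numpy as np

def sample_split_attribute(log_scores, rng=None):
    """Sample the index of the attribute to split on, with probability
    proportional to exp(log Bayesian score) of each candidate attribute."""
    rng = np.random.default_rng(rng)
    log_scores = np.asarray(log_scores, dtype=float)
    p = np.exp(log_scores - log_scores.max())   # stable softmax over scores
    p /= p.sum()
    return int(rng.choice(len(p), p=p))
```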

Random forest models [1] consist of an ensemble of randomized decision trees. It is one of the best performing classification models. With this idea in mind, in this section we introduced a random split operator based on a Bayesian approach for building a random forest. The convenience of this split method for constructing ensembles of classification trees is justified with an error bias-variance decomposition analysis. This new split operator does not clearly depend on a parameter K as its random forest’s counterpart, and performs better with a lower number of trees. http://link.springer.com/chapter/10.1007%2F978-3-642-02906-6_41]]>
Tue, 24 Nov 2015 19:23:09 GMT /slideshow/a-bayesian-random-split-to-build-ensembles-of-classification-trees/55477671 andresmasegosa@slideshare.net(andresmasegosa) A Bayesian Random Split to Build Ensembles of Classification Trees andresmasegosa Random forest models [1] consist of an ensemble of randomized decision trees. It is one of the best performing classification models. With this idea in mind, in this section we introduced a random split operator based on a Bayesian approach for building a random forest. The convenience of this split method for constructing ensembles of classification trees is justified with an error bias-variance decomposition analysis. This new split operator does not clearly depend on a parameter K as its random forest’s counterpart, and performs better with a lower number of trees. http://link.springer.com/chapter/10.1007%2F978-3-642-02906-6_41 <img style="border:1px solid #C3E6D8;float:right;" alt="" src="https://cdn.slidesharecdn.com/ss_thumbnails/ecsqaru2009-brs-151124192309-lva1-app6892-thumbnail.jpg?width=120&amp;height=120&amp;fit=bounds" /><br> Random forest models [1] consist of an ensemble of randomized decision trees. It is one of the best performing classification models. With this idea in mind, in this section we introduced a random split operator based on a Bayesian approach for building a random forest. The convenience of this split method for constructing ensembles of classification trees is justified with an error bias-variance decomposition analysis. This new split operator does not clearly depend on a parameter K as its random forest’s counterpart, and performs better with a lower number of trees. http://link.springer.com/chapter/10.1007%2F978-3-642-02906-6_41
A Bayesian Random Split to Build Ensembles of Classification Trees from NTNU
]]>
259 8 https://cdn.slidesharecdn.com/ss_thumbnails/ecsqaru2009-brs-151124192309-lva1-app6892-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 0
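The split operator itself is not detailed in this abstract, so the following Python sketch only illustrates the general idea under assumed details (discrete attributes, a symmetric Dirichlet prior); the helper names dirichlet_log_score and bayesian_random_split are hypothetical, not taken from the paper. The point of contrast with a random forest is that the split attribute is sampled with probability proportional to a Bayesian score of the partition it induces, rather than being the best of a random subset of K attributes.

import numpy as np
from collections import Counter
from math import lgamma

def dirichlet_log_score(y, alpha=1.0):
    # Log marginal likelihood of the class counts in a node under a symmetric Dirichlet prior.
    counts = Counter(y)
    k = max(len(counts), 1)
    n = len(y)
    score = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts.values():
        score += lgamma(alpha + c) - lgamma(alpha)
    return score

def bayesian_random_split(X, y, rng):
    # Sample one split attribute with probability proportional to exp(Bayesian score of its partition).
    scores = []
    for j in range(X.shape[1]):
        s = sum(dirichlet_log_score(y[X[:, j] == v]) for v in np.unique(X[:, j]))
        scores.append(s)
    scores = np.asarray(scores)
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return rng.choice(X.shape[1], p=probs)

For example, bayesian_random_split(X, y, np.random.default_rng(0)) would be called once per node while growing each tree of the ensemble, so no attribute-subset size K has to be fixed in advance.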
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Datasets with Classification Noise /slideshow/an-experimental-study-about-simple-decision-trees-for-bagging-ensemble-on-datasets-with-classification-noise/55477551 ecsqaru2009-bagging-151124192017-lva1-app6891
Decision trees are simple structures used in supervised classification learning. Their classification results can be notably improved using ensemble methods such as Bagging, Boosting or Randomization, which are widely used in the literature. Bagging outperforms Boosting and Randomization in situations with classification noise. In this paper, we present an experimental study of different simple decision tree methods for bagging ensembles in supervised classification, showing that simple credal decision trees (based on imprecise probabilities and uncertainty measures) outperform classical decision tree methods in this type of procedure when applied to datasets with classification noise.]]>

Decision trees are simple structures used in supervised classification learning. Their classification results can be notably improved using ensemble methods such as Bagging, Boosting or Randomization, which are widely used in the literature. Bagging outperforms Boosting and Randomization in situations with classification noise. In this paper, we present an experimental study of different simple decision tree methods for bagging ensembles in supervised classification, showing that simple credal decision trees (based on imprecise probabilities and uncertainty measures) outperform classical decision tree methods in this type of procedure when applied to datasets with classification noise (see the sketch after this entry).]]>
Tue, 24 Nov 2015 19:20:17 GMT /slideshow/an-experimental-study-about-simple-decision-trees-for-bagging-ensemble-on-datasets-with-classification-noise/55477551 andresmasegosa@slideshare.net(andresmasegosa) An Experimental Study about Simple Decision Trees for Bagging Ensemble on Datasets with Classification Noise andresmasegosa
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Datasets with Classification Noise from NTNU
]]>
402 5 https://cdn.slidesharecdn.com/ss_thumbnails/ecsqaru2009-bagging-151124192017-lva1-app6891-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 0
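As a hedged illustration of the "simple credal decision tree" ingredient, the sketch below shows one way to score a node with the Imprecise Dirichlet Model: the extra mass s is spread over the least frequent classes so as to (approximately) maximise the entropy over the credal set, and the attribute that most reduces this upper entropy would be chosen as the split inside each bagged tree. The function name and the greedy approximation are illustrative, not taken from the paper.

import numpy as np

def upper_entropy_idm(counts, s=1.0, steps=1000):
    # Approximate the maximum entropy over the IDM credal set for the given class counts
    # by greedily assigning the extra mass s, in small increments, to the smallest count.
    counts = np.asarray(counts, dtype=float).copy()
    for _ in range(steps):
        counts[np.argmin(counts)] += s / steps
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

For instance, upper_entropy_idm([8, 2], s=1.0) is slightly larger than the plain entropy of the counts [8, 2], because the extra mass is credited to the minority class; bagging then simply trains many such trees on bootstrap samples and aggregates their votes.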
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classification: Some Improvements in Preprocessing and Variable Elimination /slideshow/selective-gaussian-nave-bayes-model-for-diffuse-largebcell-lymphoma-classification-some-improvements-in-preprocessing-and-variable-elimination/55477495 ecsqaru05-con-ugr-151124191844-lva1-app6891
In this work, we present two significant improvements for feature selection with wrapper methods: the first consists of a proper preordering of the feature set, and the second is the application of an irrelevant-feature elimination method in which the irrelevance condition is stated with respect to the feature subset already selected by the wrapper method. We validate these approaches on the Diffuse Large B-Cell Lymphoma subtype classification problem and show that the two changes bring an important improvement in both the computational cost and the classification accuracy of these wrapper methods in this domain.]]>

In this work, we present two significant improvements for feature selection with wrapper methods: the first consists of a proper preordering of the feature set, and the second is the application of an irrelevant-feature elimination method in which the irrelevance condition is stated with respect to the feature subset already selected by the wrapper method. We validate these approaches on the Diffuse Large B-Cell Lymphoma subtype classification problem and show that the two changes bring an important improvement in both the computational cost and the classification accuracy of these wrapper methods in this domain (see the sketch after this entry).]]>
Tue, 24 Nov 2015 19:18:44 GMT /slideshow/selective-gaussian-nave-bayes-model-for-diffuse-largebcell-lymphoma-classification-some-improvements-in-preprocessing-and-variable-elimination/55477495 andresmasegosa@slideshare.net(andresmasegosa) Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classification: Some Improvements in Preprocessing and Variable Elimination andresmasegosa
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classification: Some Improvements in Preprocessing and Variable Elimination from NTNU
]]>
323 6 https://cdn.slidesharecdn.com/ss_thumbnails/ecsqaru05-con-ugr-151124191844-lva1-app6891-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 0
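The abstract does not give the exact preordering score or irrelevance test, so the following is only a minimal sketch under assumptions: features are preordered by a univariate score supplied by the caller, a Gaussian naive Bayes wrapper is evaluated by cross-validation, and a naive stand-in (cross-validated gain over the current best) plays the role of the irrelevance condition relative to the partially selected subset. The names selective_wrapper and cv_accuracy are illustrative.

import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def cv_accuracy(X, y, subset):
    # Wrapper evaluation: cross-validated accuracy of the model on the candidate subset.
    return cross_val_score(GaussianNB(), X[:, subset], y, cv=5).mean()

def selective_wrapper(X, y, preorder_scores, eps=0.0):
    order = list(np.argsort(preorder_scores)[::-1])   # preordering of the feature set
    selected, best = [], 0.0
    while order:
        f = order.pop(0)
        acc = cv_accuracy(X, y, selected + [f])
        if acc > best:
            selected, best = selected + [f], acc
            # Irrelevant-feature elimination, conditioned on the subset selected so far.
            order = [g for g in order if cv_accuracy(X, y, selected + [g]) - best > eps]
    return selected, best

The preordering means promising features are tried first, and the elimination step shrinks the candidate list after every acceptance; the paper's actual irrelevance condition may well differ from the stand-in used here.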
Evaluating query-independent object features for relevancy prediction /andresmasegosa/evaluating-queryindependent-object-features-for-relevancy-prediction ecir07-slides-151124191650-lva1-app6891
This paper presents a series of experiments investigating the effectiveness of query-independent features extracted from retrieved objects to predict relevancy. Features were grouped into a set of conceptual categories and individually evaluated based on click-through data collected in a laboratory-based user study. The results showed that while textual and visual features were useful for relevancy prediction in a topic-independent condition, a range of features could be effective when topic knowledge was available. We also revisited the original study from the perspective of the significant features identified by our experiments.]]>

This paper presents a series of experiments investigating the effectiveness of query-independent features extracted from retrieved objects to predict relevancy. Features were grouped into a set of conceptual categories and individually evaluated based on click-through data collected in a laboratory-based user study (see the sketch after this entry). The results showed that while textual and visual features were useful for relevancy prediction in a topic-independent condition, a range of features could be effective when topic knowledge was available. We also revisited the original study from the perspective of the significant features identified by our experiments.]]>
Tue, 24 Nov 2015 19:16:50 GMT /andresmasegosa/evaluating-queryindependent-object-features-for-relevancy-prediction andresmasegosa@slideshare.net(andresmasegosa) Evaluating query-independent object features for relevancy prediction andresmasegosa
Evaluating query-independent object features for relevancy prediction from NTNU
]]>
213 4 https://cdn.slidesharecdn.com/ss_thumbnails/ecir07-slides-151124191650-lva1-app6891-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 0
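As a small, hedged illustration of the per-feature evaluation described above (the paper's exact measure is not stated here; ROC AUC against click-derived relevance labels is just one plausible choice, and evaluate_features is a hypothetical name):

import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_features(X, relevance, categories):
    # Score each query-independent feature individually against the relevance labels
    # and group the results by conceptual category.
    results = {}
    for j, cat in enumerate(categories):
        results.setdefault(cat, []).append((j, roc_auc_score(relevance, X[:, j])))
    return results

Here categories is a list with one category label per feature column; a feature whose AUC stays close to 0.5 would be read as uninformative for relevancy prediction.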
Effects of Highly Agreed Documents in Relevancy Prediction /slideshow/effects-of-highly-agreed-documents-in-relevancy-prediction/55477318 andres-poster-sigir07-151124191429-lva1-app6892
Finding significant contextual features is a challenging task in the development of interactive information retrieval (IR) systems. This paper investigated a simple method to facilitate this task by looking at aggregated relevance judgements of retrieved documents. Our study suggested that the agreement on relevance judgements can indicate the effectiveness of retrieved documents as a source of significant features. The effect of highly agreed documents has practical implications for the design of adaptive search models in interactive IR systems.]]>

Finding significant contextual features is a challenging task in the development of interactive information retrieval (IR) systems. This paper investigated a simple method to facilitate this task by looking at aggregated relevance judgements of retrieved documents. Our study suggested that the agreement on relevance judgements can indicate the effectiveness of retrieved documents as a source of significant features. The effect of highly agreed documents has practical implications for the design of adaptive search models in interactive IR systems (see the sketch after this entry).]]>
Tue, 24 Nov 2015 19:14:29 GMT /slideshow/effects-of-highly-agreed-documents-in-relevancy-prediction/55477318 andresmasegosa@slideshare.net(andresmasegosa) Effects of Highly Agreed Documents in Relevancy Prediction andresmasegosa
Effects of Highly Agreed Documents in Relevancy Prediction from NTNU
]]>
116 4 https://cdn.slidesharecdn.com/ss_thumbnails/andres-poster-sigir07-151124191429-lva1-app6892-thumbnail.jpg?width=120&height=120&fit=bounds document Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 0
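A minimal sketch of the aggregation idea, assuming each document carries relevance judgements from several assessors (agreement and highly_agreed are illustrative names, not the paper's):

from collections import Counter

def agreement(judgements):
    # Proportion of assessors who gave the majority relevance label to this document.
    counts = Counter(judgements)
    return max(counts.values()) / len(judgements)

def highly_agreed(doc_judgements, threshold=0.8):
    # Keep only documents whose judgements agree strongly; these would then serve as
    # the source from which significant contextual features are mined.
    return [doc for doc, js in doc_judgements.items() if agreement(js) >= threshold]

For example, agreement(['relevant', 'relevant', 'not relevant']) is about 0.67, so with threshold=0.8 that document would be excluded from the highly agreed set.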
Conference poster 6 /slideshow/conference-poster-6-40239854/40239854 conferenceposter6-141014043809-conversion-gate02
UAI Poster]]>

UAI Poster]]>
Tue, 14 Oct 2014 04:38:09 GMT /slideshow/conference-poster-6-40239854/40239854 andresmasegosa@slideshare.net(andresmasegosa) Conference poster 6 andresmasegosa
Conference poster 6 from NTNU
]]>
126 1 https://cdn.slidesharecdn.com/ss_thumbnails/conferenceposter6-141014043809-conversion-gate02-thumbnail.jpg?width=120&height=120&fit=bounds document Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 0
I am a research fellow at NTNU (Norway) with broad interests in data mining and machine learning using probabilistic graphical models. Lately, my research has focused on scalable machine learning methods for solving real use cases in the financial (BCC group) and automotive (Daimler group) industries. I am a coauthor of more than 50 scientific papers in journals and international conferences, covering applied areas such as bioinformatics, information retrieval, crime prediction and sales force allocation. I have been a speaker at dozens of international conferences related to machine learning research over the last fifteen years. Recently, our team has been invited to present the AMIDST... andresmasegosa.wix.com/homepage