SlideShare feed for slideshows by user asriverwang (Ju-Chiang Wang)

Automatic Set List Identification and Song Segmentation of Full-Length Concert Videos @ISMIR2014 (published Thu, 06 Nov 2014) /slideshow/automatic-set-list-identification-and-song-segmentation-of-fulllength-concert-videos-ismir2014/41221915
Recently, a large number of full-length concert videos have become available on video-sharing websites such as YouTube. As each video generally contains multiple songs, natural questions arise: "What is the set list?" and "When does each song begin and end?" Indeed, many full concert videos on YouTube contain song lists and timecodes contributed by uploaders and viewers. However, newly uploaded content and videos of lesser-known artists typically lack this metadata. Manually labeling such metadata would be labor-intensive, and thus an automated solution is desirable. In this paper, we define a novel research problem, automatic set list segmentation of full concert videos, which calls for techniques in music information retrieval (MIR) such as audio fingerprinting, cover song identification, musical event detection, music alignment, and structural segmentation. Moreover, we propose a greedy approach that sequentially identifies songs from a database of studio versions and simultaneously estimates their probable boundaries in the concert. We conduct preliminary evaluations on a collection of 20 full concerts and 1,152 studio tracks. Our results demonstrate the effectiveness of the proposed greedy algorithm.
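As a rough illustration of the kind of greedy, left-to-right pass the abstract describes, the sketch below scans the concert from its start, scores every not-yet-identified studio track against the current position with a placeholder similarity function (e.g., a cover-song or alignment score), and uses the studio track's own length as a crude boundary estimate. The function names, the Track attributes, and the acceptance threshold are assumptions for illustration, not the authors' exact algorithm.

```python
from dataclasses import dataclass

@dataclass
class Track:
    id: str
    duration: float  # length of the studio version, in seconds

def greedy_set_list(concert_audio, concert_len, studio_tracks, similarity,
                    threshold=0.5, hop=5.0):
    """Greedy left-to-right pass: identify one song at a time and move on.

    `similarity(concert_audio, start_sec, track)` is a placeholder for any
    live-vs-studio matching score (cover-song ID, alignment, ...).
    """
    set_list = []                        # (track_id, start_sec, end_sec)
    position, remaining = 0.0, list(studio_tracks)
    while position < concert_len and remaining:
        # Score every unidentified studio track at the current position.
        scored = [(similarity(concert_audio, position, t), t) for t in remaining]
        best_score, best = max(scored, key=lambda st: st[0])
        if best_score < threshold:       # no confident match: slide forward
            position += hop
            continue
        end = min(position + best.duration, concert_len)  # crude boundary
        set_list.append((best.id, position, end))
        remaining.remove(best)           # each studio track is used at most once
        position = end
    return set_list
```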

Exploring the Relationship Between Multi-Modal Emotion Semantics of Music (published Wed, 18 Sep 2013) /slideshow/exploring-the-relationship-between-multimodal-emotion-semantics-of-music/26316423
Computational modeling of music emotion has been addressed primarily by two approaches: the categorical approach, which sorts emotions into mood classes, and the dimensional approach, which regards emotions as numerical values over a few dimensions such as valence and activation. Although they represent two extremes (discrete versus continuous), the two approaches share a unified goal of understanding the emotion semantics of music. This paper presents the first computational model that unifies the two semantic modalities under a probabilistic framework, which makes it possible to explore the relationship between them in a computational way. With the proposed framework, mood labels can be mapped into the emotion space in an unsupervised and content-based manner, without any ground-truth annotations for training the semantic mapping. Such a function can be applied to automatically generate a semantically structured tag cloud in the emotion space. To demonstrate the effectiveness of the proposed framework, we qualitatively evaluate the mood tag clouds generated from two emotion-annotated corpora, and quantitatively evaluate the accuracy of the categorical-dimensional mapping by comparing the results with those created by psychologists, including the mapping proposed by Whissell & Plutchik and the one defined in the Affective Norms for English Words (ANEW).
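To make the idea of an unsupervised, content-based tag-to-emotion-space mapping concrete, one very crude scheme (a stand-in for the paper's probabilistic mapping, not a description of it) is to place each mood tag at the average valence-arousal position predicted for the clips that carry it. All names and data layouts below are assumptions for illustration.

```python
import numpy as np

def place_tags_in_va_space(clip_va, clip_tags):
    """
    clip_va:   dict clip_id -> (valence, arousal) predicted by some
               content-based model (e.g., an AEG-style regressor)
    clip_tags: dict clip_id -> iterable of mood tags
    Returns tag -> mean (valence, arousal): a crude unsupervised placement
    that can then be rendered as a tag cloud in the emotion plane.
    """
    sums, counts = {}, {}
    for clip, va in clip_va.items():
        for tag in clip_tags.get(clip, ()):
            sums[tag] = sums.get(tag, np.zeros(2)) + np.asarray(va, dtype=float)
            counts[tag] = counts.get(tag, 0) + 1
    return {tag: sums[tag] / counts[tag] for tag in sums}
```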

Personalized Music Emotion Recognition via Model Adaptation (published Wed, 18 Sep 2013) /slideshow/apsipa-asc-2012/26314732
In music information retrieval (MIR) research, developing a computational model that comprehends the affective content of music signals and using such a model to organize music collections has been an essential topic. Emotion perception in music is by nature subjective. Consequently, building a general emotion recognition system that performs equally well for every user could be insufficient. It would be more desirable for one's personal computer or device to be able to understand his or her perception of music emotion. In our previous work, we developed the acoustic emotion Gaussians (AEG) model, which can learn the broad emotion perception of music from general users. Such a general music emotion model, called the background AEG model in this paper, can recognize the perceived emotion of unseen music from a general point of view. In this paper, we go one step further and realize personalized music emotion modeling by adapting the background AEG model with a limited number of emotion annotations provided by a target user in an online and dynamic fashion. A novel maximum a posteriori (MAP)-based algorithm is proposed to achieve this in a probabilistic framework. We carry out quantitative evaluations on a well-known emotion-annotated corpus, MER60, to validate the effectiveness of the proposed method for personalized music emotion recognition.
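The sketch below illustrates the general flavor of MAP adaptation with a relevance factor, applied to a single Gaussian mean in the valence-arousal plane: with few user annotations the estimate stays near the background model, and it moves toward the user's data as annotations accumulate. This is a generic, textbook-style update under assumed names and values, not the exact AEG-specific formulation.

```python
import numpy as np

def map_adapt_mean(background_mu, user_annotations, relevance=8.0):
    """
    Shift a background Gaussian mean toward a user's own annotations.
    With n user annotations and relevance factor r, the posterior mean is
        mu_new = (n * user_mean + r * background_mu) / (n + r),
    so a user with few annotations stays close to the background model
    and gradually overrides it as annotations accumulate.
    """
    X = np.atleast_2d(np.asarray(user_annotations, dtype=float))
    n = X.shape[0]
    return (n * X.mean(axis=0) + relevance * background_mu) / (n + relevance)

# Example: background centre of a mood component vs. one user's ratings
# (all numbers are made up for illustration).
bg = np.array([0.2, 0.5])                       # valence, arousal
user = [(0.6, 0.7), (0.5, 0.8), (0.7, 0.6)]
print(map_adapt_mean(bg, user))                 # pulled toward the user
```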

ACM Multimedia 2012 Grand Challenge: Music Video Generation (published Wed, 18 Sep 2013) /slideshow/acm-mulgrand-challenge/26314573
These slides present a novel content-based system that uses the perceived emotion of multimedia content as a bridge to connect music and video. Specifically, we propose a novel machine learning framework, called Acousticvisual Emotion Gaussians (AVEG), to jointly learn the tripartite relationship among music, video, and emotion from an emotion-annotated corpus of music videos. For a music piece (or a video sequence), the AVEG model is applied to predict its emotion distribution in a stochastic emotion space from the corresponding low-level acoustic (or visual, respectively) features. Finally, music and video are matched by measuring the similarity between the two corresponding emotion distributions, based on a distance measure such as KL divergence.
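Since the matching step relies on a distance between two emotion distributions, a small worked example helps. If each distribution is summarized as a single 2-D Gaussian over valence and arousal (a simplification; AVEG's distributions may be richer), the KL divergence has a closed form, and a symmetrized version can serve as the matching score. The means and covariances below are made up for illustration.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL(N0 || N1) for two multivariate Gaussians, in closed form."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (
        np.trace(cov1_inv @ cov0)
        + diff @ cov1_inv @ diff
        - k
        + np.log(np.linalg.det(cov1) / np.linalg.det(cov0))
    )

# Hypothetical emotion distributions in the valence-arousal plane.
music_mu, music_cov = np.array([0.4, 0.6]), np.array([[0.05, 0.01], [0.01, 0.08]])
video_mu, video_cov = np.array([0.3, 0.5]), np.array([[0.06, 0.00], [0.00, 0.07]])

# Symmetrized KL as a simple matching score (smaller = better match).
score = (gaussian_kl(music_mu, music_cov, video_mu, video_cov)
         + gaussian_kl(video_mu, video_cov, music_mu, music_cov))
print(score)
```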

The Acoustic Emotion Gaussians Model for Emotion-based Music Annotation and Retrieval (published Wed, 18 Sep 2013) /slideshow/acm-multimedia-2012-the-acoustic-emotion/26314410
One of the most exciting but challenging endeavors in music research is to develop a computational model that comprehends the affective content of music signals and organizes a music collection according to emotion. In this paper, we propose a novel acoustic emotion Gaussians (AEG) model that defines a proper generative process of emotion perception in music. As a generative model, AEG permits easy and straightforward interpretation of the model learning process. To bridge the acoustic feature space and the music emotion space, a set of latent feature classes, learned from data, is introduced to perform the end-to-end semantic mapping between the two spaces. Based on the space of latent feature classes, the AEG model is applicable to both automatic music emotion annotation and emotion-based music retrieval. To gain insight into the AEG model, we also provide illustrations of the model learning process. A comprehensive performance study is conducted to demonstrate the superior accuracy of AEG over its predecessors, using two emotion-annotated music corpora, MER60 and MTurk. Our results show that the AEG model outperforms the state-of-the-art methods in automatic music emotion annotation. Moreover, for the first time, a quantitative evaluation of emotion-based music retrieval is reported.
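To give a feel for how a set of latent feature classes can bridge the two spaces, the sketch below combines per-class valence-arousal Gaussians into one predicted distribution, weighted by how strongly each latent class is activated by a clip's acoustic features, and moment-matches the mixture to a single Gaussian. Variable names, dimensions, and the example numbers are assumptions for illustration; consult the paper for the actual AEG formulation.

```python
import numpy as np

def predict_emotion_distribution(class_posteriors, va_means, va_covs):
    """
    class_posteriors: (K,) activations p(k | audio clip), nonnegative
    va_means:         (K, 2) mean valence/arousal of each latent class
    va_covs:          (K, 2, 2) covariance of each latent class
    Returns the overall mean and covariance of the weighted mixture.
    """
    w = class_posteriors / class_posteriors.sum()
    mean = (w[:, None] * va_means).sum(axis=0)
    # Law of total covariance: weighted within-class covariances
    # plus the spread of the class means around the overall mean.
    diffs = va_means - mean
    cov = (w[:, None, None]
           * (va_covs + diffs[:, :, None] * diffs[:, None, :])).sum(axis=0)
    return mean, cov

# Example with three hypothetical latent classes.
post = np.array([0.6, 0.3, 0.1])
means = np.array([[0.7, 0.8], [0.2, 0.4], [-0.3, 0.1]])
covs = np.stack([0.05 * np.eye(2)] * 3)
print(predict_emotion_distribution(post, means, covs))
```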

Ju-Chiang Wang received the Ph.D. degree from the Department of Electrical Engineering, National Taiwan University, in 2013. His research interests mainly encompass Multimedia Information Retrieval (MIR), Machine Learning, and Music/Speech Signal Processing. In addition to being a computer science researcher, he is enthusiastic about playing and creating music in a local rock band.