SlideShare feed for Slideshows by User: jonnydaenen / Sat, 20 Feb 2021 19:12:52 GMT
TEDxUHasseltSalon 2017 - Activate Your Data /slideshow/tedxuhasseltsalon-2017-activate-your-data/243136408 tedxuhasseltsalon2017jonnydaenenfinal-210220191252
Data science has become a fast-growing field in today’s economy. With deep learning, AI and big data popping up in almost every area of the economy, the amount of data that has to be processed is ever-increasing. With his PhD in the field of databases, software engineer Jonny Daenen knows the importance of data science like no other. It has led managers to improve their decision-making and initiated the growth of many companies worldwide. But how can it impact innovation? Activate your data!
Sat, 20 Feb 2021 19:12:52 GMT jonnydaenen@slideshare.net(jonnydaenen)
TEDxUHasseltSalon 2017 - Activate Your Data from Jonny Daenen
Selligent Marketing Cloud /slideshow/selligent-marketing-cloud/242900804 selligent-210217115250
Overview of big-data-related activities in the marketing domain at Selligent. Presented at the Big Data Symposium, Hasselt University, May 2018.
Wed, 17 Feb 2021 11:52:49 GMT jonnydaenen@slideshare.net(jonnydaenen)
Selligent Marketing Cloud from Jonny Daenen
Meer Big Data, Minder Storende Reclame /slideshow/progra-meer-jonnydaenen/242900671 progra-meerjonnydaenen-210217115017
Big Data & Marketing, presented at Programming workshop for teachers at Hasselt University on 2018-04-24.
Wed, 17 Feb 2021 11:50:17 GMT jonnydaenen@slideshare.net(jonnydaenen)
Meer Big Data, Minder Storende Reclame from Jonny Daenen
Building real-time data analytics on Google Cloud /slideshow/building-realtime-data-analytics-on-google-cloud/240009237 selligentdatasciencemeetup03-20202-201211150831
Presented at the 25th Data Science Leuven meetup on 2020/03/11. Jonny Daenen explains the steps the team took at Selligent to create a multi-tenant real-time data pipeline. He discusses the challenges the team encountered as well as the tools they used, and demonstrates the benefits of using Google Cloud Platform to remove operational hurdles when moving data pipelines to production.
Fri, 11 Dec 2020 15:08:31 GMT jonnydaenen@slideshare.net(jonnydaenen)
Building real-time data analytics on Google Cloud from Jonny Daenen
PXL Data Engineering Workshop By Selligent /slideshow/pxl-data-engineering-workshop-by-selligent/240008953 selligentpxl20201209-201211145734
On 2020-12-09, Laurens Vijnck and Jonny Daenen gave a workshop at PXL. During this session, we collectively provisioned a streaming ingestion pipeline in mere minutes. The technology stack included Pub/Sub, Dataflow, and BigQuery. Afterwards, students had the opportunity to run interactive queries on their own real-time data to answer a series of business questions. These questions were borrowed from real-life cases that we encountered at Selligent Marketing Cloud. Google Colab (free Jupyter notebooks) and Google Data Studio proved to be excellent tools for facilitating these kinds of interactive sessions.
Fri, 11 Dec 2020 14:57:34 GMT jonnydaenen@slideshare.net(jonnydaenen)
PXL Data Engineering Workshop By Selligent from Jonny Daenen
Parallel Evaluation of Multi-Semi-Joins /slideshow/parallel-evaluation-of-multisemijoins/65746623 presentationvldb-160906170211
Presentation given at VLDB 2016: 42nd International Conference on Very Large Data Bases.
Paper: http://dx.doi.org/10.14778/2977797.2977800
ArXiv: https://arxiv.org/abs/1605.05219
Poster: https://zenodo.org/record/61653 (doi 10.5281/zenodo.61653)
Gumbo Software: https://github.com/JonnyDaenen/Gumbo
Abstract: While services such as Amazon AWS make computing power abundantly available, adding more computing nodes can incur high costs in, for instance, pay-as-you-go plans while not always significantly improving the net running time (aka wall-clock time) of queries. In this work, we provide algorithms for parallel evaluation of SGF queries in MapReduce that optimize total time, while retaining low net time. Not only can SGF queries specify all semi-join reducers, but also more expressive queries involving disjunction and negation. Since SGF queries can be seen as Boolean combinations of (potentially nested) semi-joins, we introduce a novel multi-semi-join (MSJ) MapReduce operator that enables the evaluation of a set of semi-joins in one job. We use this operator to obtain parallel query plans for SGF queries that outvalue sequential plans w.r.t. net time and provide additional optimizations aimed at minimizing total time without severely affecting net time. Even though the latter optimizations are NP-hard, we present effective greedy algorithms. Our experiments, conducted using our own implementation Gumbo on top of Hadoop, confirm the usefulness of parallel query plans, and the effectiveness and scalability of our optimizations, all with a significant improvement over Pig and Hive.
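The idea of evaluating a set of semi-joins in one job can be illustrated with a toy, in-memory simulation of the map and reduce phases. This is only a sketch under stated assumptions, not the Gumbo implementation or the MSJ operator as defined in the paper: the names `map_phase`, `reduce_phase`, `multi_semi_join`, and the example relations `R`, `S`, `T` are all hypothetical, and the optimizations the abstract mentions are not modelled.

```python
from collections import defaultdict

def map_phase(guard, semi_joins):
    """Emit keyed messages for every semi-join in a single pass (hypothetical helper)."""
    for name, (key_of, cond_keys) in semi_joins.items():
        for row in guard:
            # Guard tuples are sent to the key they would join on.
            yield (name, key_of(row)), ("guard", row)
        for key in cond_keys:
            # Conditional tuples only need to signal that the key exists.
            yield (name, key), ("cond", None)

def reduce_phase(messages):
    """Group messages by key and keep guard tuples whose key has a match."""
    groups = defaultdict(list)
    for key, value in messages:
        groups[key].append(value)
    for (name, _join_key), values in groups.items():
        if any(tag == "cond" for tag, _ in values):
            for tag, row in values:
                if tag == "guard":
                    yield name, row

def multi_semi_join(guard, semi_joins):
    """Return, per semi-join name, the set of guard tuples that have a match."""
    result = {name: set() for name in semi_joins}
    for name, row in reduce_phase(map_phase(guard, semi_joins)):
        result[name].add(row)
    return result

# Assumed example data: guard relation R(x, y), conditional relations S(x) and T(y).
R = [(1, "a"), (2, "b"), (3, "c")]
matches = multi_semi_join(R, {
    "S": (lambda r: r[0], [1, 2]),   # R semi-join S on the first column
    "T": (lambda r: r[1], ["b"]),    # R semi-join T on the second column
})

# A Boolean combination with negation: tuples of R matched by S but not by T.
answer = matches["S"] - matches["T"]
print(sorted(answer))
```

Both semi-joins are answered by one map pass and one reduce pass over the same guard relation; the Boolean combination (here, conjunction with a negation) is then applied to the per-semi-join results.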
Tue, 06 Sep 2016 17:02:11 GMT jonnydaenen@slideshare.net(jonnydaenen)
Parallel Evaluation of Multi-Semi-Joins from Jonny Daenen
As a computer scientist specialized in databases & Big Data, I am interested in understanding the inner workings of both existing and new technologies. I enjoy studying and applying these technologies, and making them accessible to a broad audience.