Big Data Day LA 2016/ Data Science Track - Backstage to a Data Driven Culture... (Data Con LA)
When you're the first data professional at an organization, there are technical, process, and qualitative considerations for analytics and data science (A/DS) to address. This talk is an overview of strategy, infrastructure, and tools for creating your first A/DS stack. At this stage, the problems you are able to solve relate to organization, operations, data engineering, business intelligence, and communication. Creating the optimal A/DS stack can seamlessly pave the way to big data and to integrating the newest technologies in the future. Please share your stories and experience with us as well. Outline of the talk (sections are intended to be interactive and to draw feedback from the audience):
1. So you're the first Data Scientist
2. Setting Their Expectations
3. Lay of the Land - Data requirements and organizational survey
4. Setting Your Expectations
5. Infrastructure - Your Stack Options
6. Resources: Get Help, Get a Team
7. Discussion
Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi... (Data Con LA)
This document discusses how Redis can be used for analytics at high speeds. It provides examples of how Redis data structures and operations allow for real-time bidding, recommendations, and time-series analytics. Redis on Flash is presented as a cost-effective way to achieve high performance by using flash as an extension of RAM. Redis modules are introduced as a way to extend Redis capabilities with features like full-text search, graphs, and SQL.
Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel... (Data Con LA)
The goal of this talk is to lay out a framework for which algorithms work best in which situations, and why. Drawing on the results of hundreds of crowd-sourced predictive modeling contests, this talk shows examples of how the structure of a dataset informs the choice of algorithm. As an illustration of these concepts, ZestFinance's work with China's retail giant, JD.com, is used to describe how the right algorithms were applied to the right datasets to turn shopping data into credit data, creating credit scores from scratch.
Big Data Day LA 2016 Keynote - Andy Feng/ Yahoo (Data Con LA)
Big Data Day LA 2016 Keynote: Andy Feng, VP of Architecture at Yahoo, talks about Hadoop and big data innovation.
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Iterative Spark Developmen... (Data Con LA)
This presentation will explore how Bloomberg uses Spark, with its formidable computational model for distributed, high-performance analytics, to take this process to the next level, and look into one of the innovative practices the team is currently developing to increase efficiency: the introduction of a logical signature for datasets.
Big Data Day LA 2016 Keynote - Reynold Xin/ Databricks (Data Con LA)
This document discusses scaling big data using Apache Spark. It provides an overview of Spark's philosophy of providing a unified engine to support end-to-end applications using high-level APIs. It outlines some of the new features in Apache Spark 2.0, including improvements to structured APIs, structured streaming, and new deep learning and graph processing libraries. It also discusses initiatives by Databricks to grow the Spark community through massive open online courses and a free community edition of the Databricks platform.
Big Data Day LA 2016/ Use Case Driven track - Reliable Media Reporting in an ... (Data Con LA)
OnPrem Solution Partners worked with NBCU to profile in-house data to determine data quality, and recommend process and quality improvements. We present our process for data import, improvements we want to make, and lessons learned regarding various tools used, including MariaDB, ElasticSearch, Cassandra, and others.
Big Data Day LA 2016/ NoSQL track - Introduction to Graph Databases, Oren Gol... (Data Con LA)
Organizations in many sectors have adopted graph databases, including IoT, health care, financial services, telecommunications, and government. This talk, based on our research and implementation of a graph database at Sanguine, a startup based in LA, dives into a few use cases and equips attendees with everything they need to start using a graph database.
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow... (Data Con LA)
This document discusses Apache NiFi and stream processing. It provides an overview of NiFi's key concepts of managing data flow, data provenance, and securing data. NiFi allows users to visually build data flows with drag-and-drop processors. It offers features such as guaranteed delivery, data buffering, prioritized queuing, and data provenance. NiFi is based on Flow-Based Programming and is used to reliably transfer data between systems, enrich and prepare data, and deliver data to analytic platforms.
Big Data Day LA 2016/ NoSQL track - Privacy vs. Security in a Big Data World,... (Data Con LA)
Tamara Dull discusses privacy and security in a big data world. She asks whether privacy vs. security is the right discussion, noting that they are two sides of the same coin. Big data has changed the discussion by making more data available from more sources. To address privacy and security concerns, she suggests building privacy and security into products by design, and notes that individuals have a role to play in their communities and with friends and family.