Presentation for Transparency Camp EU
How to open up data when it's not open?
A quick overview of basic and advanced tools & technologies for scraping data.
This document outlines tools and techniques for web scraping, covering both manual and automated methods. It notes that scraping requires creativity, problem solving, and adapting techniques to each individual problem. The methods presented include manual copy-paste, regular expressions, HTML parsing, and scraping software. It also addresses ethical and legal considerations around scraping, and recommends using APIs when they are available rather than hitting a site with brute-force requests that can amount to a denial of service. The goal is to explore the "fine art of scraping" rather than provide set solutions.
2. Despite its fame, scraping is
a highly creative process.
It requires…
- Problem solving
- Lateral thinking
- Planning for efficiency
- Prediction and planning for possible outcomes and uses
- Methodological choices affect the whole outcome
Credits: Toolbox icons created by Ralf Schmitzer from the Noun Project
3. There cannot be a course on web scraping teaching you ‘so…
We can dive into the fine art of scraping by learning tools
and techniques. Then each individual personality can
adapt, mix and mash this knowledge according to each
individual problem
4. “process of automatically
collecting information
from the Web”
~ Wikipedia on “Web Scraping”
6. Scraping methods:
- Manual copy-paste
- Text grepping / RegEx
- HTML parsing (CSS, XPath…)
- Web Scraping Software
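To make the difference between text grepping and HTML parsing concrete, here is a minimal sketch in Python. It is not taken from the slides: the HTML fragment, the `doc` class name and the function names are all made up for illustration, and only the standard library is used.

```python
import re
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this first.
SAMPLE_HTML = """
<ul>
  <li><a class="doc" href="/files/budget-2014.pdf">Budget 2014</a></li>
  <li><a class="doc" href="/files/budget-2015.pdf">Budget 2015</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""


def grep_pdf_links(html):
    """Text grepping: pull out every href value that ends in .pdf."""
    return re.findall(r'href="([^"]+\.pdf)"', html)


class DocLinkParser(HTMLParser):
    """HTML parsing: collect (href, link text) for <a class="doc"> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the matching <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "doc":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_doc_links(html):
    """Run the parser over the markup and return the collected links."""
    parser = DocLinkParser()
    parser.feed(html)
    return parser.links
```

The regex is quicker to write but only sees characters, so it breaks as soon as the markup changes shape (single quotes, extra attributes). The parser follows the document structure, which usually makes it the more robust choice once a page gets complicated.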
10. XPath Parsing in
Google Sheets
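In Google Sheets, XPath queries are typically run through the `IMPORTXML(url, xpath_query)` function; the slide text does not preserve the exact formula that was shown. The same idea can be sketched in Python with the standard library, assuming the markup is well-formed (ElementTree supports only a subset of XPath and cannot parse messy real-world HTML). The table fragment and the `select_text` helper below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment; ElementTree only parses valid XML/XHTML.
SAMPLE_XHTML = """
<table>
  <tr><td class="name">Alice</td><td class="amount">100</td></tr>
  <tr><td class="name">Bob</td><td class="amount">250</td></tr>
</table>
"""


def select_text(markup, xpath):
    """Return the stripped text of every element matched by the XPath query."""
    root = ET.fromstring(markup.strip())
    return [(el.text or "").strip() for el in root.findall(xpath)]


# Select cells by attribute, much like an XPath query in a spreadsheet cell.
names = select_text(SAMPLE_XHTML, ".//td[@class='name']")
amounts = [int(a) for a in select_text(SAMPLE_XHTML, ".//td[@class='amount']")]
```

For pages that are not valid XML, a tolerant HTML parser would be needed before any XPath query can run.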
11. SearchLink
“SearchLink is a System Service for OS X which
handles searching multiple sources and automatically
generating Markdown links for text”
http://brettterpstra.com/projects/searchlink/
18. Consider…
- Brute force attacks and DoS (prefer APIs, when available)
- Your use of the data (copyright, privacy…)
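One way to avoid hammering a site is to honour its robots.txt and pause between requests. The sketch below is an illustration under stated assumptions, not something from the slides: the robots.txt content, the `example-scraper` agent name, the URLs and the `fetch` callable are all hypothetical, and no network request is made.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real run would fetch it from the target site.
ROBOTS_TXT = """
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a rule set without touching the network."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def polite_fetch_plan(urls, rules, agent="example-scraper", default_delay=1.0):
    """Return (allowed_urls, delay_in_seconds) without sending any request."""
    allowed = [u for u in urls if rules.can_fetch(agent, u)]
    delay = rules.crawl_delay(agent) or default_delay
    return allowed, delay


def crawl(urls, rules, fetch, agent="example-scraper"):
    """Call fetch(url) for each allowed URL, sleeping between requests."""
    allowed, delay = polite_fetch_plan(urls, rules, agent)
    results = []
    for i, url in enumerate(allowed):
        if i:  # no need to wait before the first request
            time.sleep(delay)
        results.append(fetch(url))
    return results
```

Throttling only addresses server load. Whether the collected data may be reused (copyright, privacy) is a separate question that code cannot answer.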