SIGIR 2017 talk
Tokyo, Japan
A significant proportion of search queries originate from real-world
information needs or tasks [13]. To improve the search experience of
end users, it is important to have accurate representations of tasks.
As a result, a significant amount of research has been devoted to
extracting proper representations of tasks, both to enable search
systems to help users complete their tasks and to support better query
suggestions [9], better recommendations [41], satisfaction prediction [36],
and improved task-based personalization [24, 38]. Most existing
task extraction methodologies focus on representing tasks as flat
structures. However, tasks often have multiple subtasks associated
with them, and a more naturalistic representation of tasks would be
a hierarchy in which each task can be composed of multiple (sub)tasks.
To this end, we propose an efficient Bayesian nonparametric model for
extracting hierarchies of such tasks and subtasks. We evaluate our
method on real-world query log data through both quantitative and
crowdsourced experiments, and highlight the importance of considering
task/subtask hierarchies.
12. Extracting Search Tasks: Prior Work
Problems:
Link query to on-going task = long chains
impure tasks
Rely on large corpus of pre-tagged queries
Do not aggregate across users
Tasks are not necessarily flat-structures
complex tasks decompose into sub-tasks
19. Build upon Bayesian Rose Trees
Each node of the tree corresponds to a task
Each task represented by a set of queries
Goal: Find the tree structure T that maximizes the likelihood p(Q | T)
Mixture over partitions of data points:
$$p(Q \mid T) = \sum_{\phi(T) \in \mathrm{Part}(T)} p(\phi(T)) \, p(Q \mid \phi(T))$$
Number of partitions consistent with T can be exponentially large
Approximate using dynamic programming:
$$p(Q_T \mid T) = \pi_T \, f(Q_T) + (1 - \pi_T) \prod_{T_i \in \mathrm{ch}(T)} p(\mathrm{leaves}(T_i) \mid T_i)$$
f(Q_T): likelihood of the queries under T belonging to the same task
Hierarchical Task Extraction
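The recursive form of the tree likelihood, p(Q_T | T) = π_T f(Q_T) + (1 − π_T) ∏ p(leaves(T_i) | T_i), can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the talk's implementation: the `Node` class, the Dirichlet-multinomial choice for f, the `alpha` and `gamma` values, and the mixing weight π_T = 1 − (1 − γ)^(n_T − 1) (the form used in the Bayesian Rose Trees formulation) are all assumptions made for the example.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: leaves hold queries, internal nodes hold children."""
    queries: list = field(default_factory=list)
    children: list = field(default_factory=list)


def leaves(node):
    """Collect all queries stored under a node."""
    if not node.children:
        return list(node.queries)
    collected = []
    for child in node.children:
        collected.extend(leaves(child))
    return collected


def log_f(queries, vocab_size, alpha=0.5):
    """Log marginal likelihood of the queries forming ONE task.

    Assumption: a Dirichlet-multinomial over query terms with a
    symmetric prior `alpha`; the talk does not fix this choice here.
    """
    counts = Counter(term for q in queries for term in q.split())
    total = sum(counts.values())
    result = math.lgamma(vocab_size * alpha) - math.lgamma(total + vocab_size * alpha)
    for c in counts.values():
        result += math.lgamma(c + alpha) - math.lgamma(alpha)
    return result


def log_marginal(node, vocab_size, gamma=0.5):
    """log p(Q_T | T) via the recursion pi_T f(Q_T) + (1 - pi_T) prod_i p(leaves(T_i) | T_i)."""
    lf = log_f(leaves(node), vocab_size)
    if not node.children:
        return lf  # a leaf is a single task, so pi_T = 1
    assert len(node.children) >= 2 and 0.0 < gamma < 1.0
    pi = 1.0 - (1.0 - gamma) ** (len(node.children) - 1)  # assumed BRT-style weight
    a = math.log(pi) + lf
    b = math.log1p(-pi) + sum(log_marginal(c, vocab_size, gamma) for c in node.children)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))  # stable log-sum-exp


# Toy comparison: a tree that groups related queries vs. one that mixes them.
VOCAB_SIZE = 6
travel = Node(queries=["cheap flights tokyo", "tokyo flights"])
coding = Node(queries=["python list sort", "sort list python"])
coherent_tree = Node(children=[travel, coding])

mixed_a = Node(queries=["cheap flights tokyo", "python list sort"])
mixed_b = Node(queries=["tokyo flights", "sort list python"])
mixed_tree = Node(children=[mixed_a, mixed_b])

print(log_marginal(coherent_tree, VOCAB_SIZE), log_marginal(mixed_tree, VOCAB_SIZE))
```

Because each subtree's likelihood is computed once and reused by its parent, the sum over exponentially many partitions never has to be enumerated explicitly. On this toy data the tree that keeps related queries together receives the higher score.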
25. Experiment 1: Search task identification
Experiment 2: Crowd-sourced evaluation of hierarchy
Experiment 3: Term prediction application
Baselines:
1. Bestlink-SVM
2. QC-WCC/QC-HTC
3. LDA-Hawkes
4. LDA-TW
5. Jones hierarchy
6. BHCD: Bayesian Hierarchical Community Detection
7. Bayesian agglomerative clustering
Experimental Evaluation
Task extraction baselines
Hierarchical model baselines
26. Pairwise precision/recall:
LDA-TW performs worst
Too strong assumptions on queries belonging to
same task
Gains over QC-HTC/WCC
Query affinities can better reflect semantic
relationships
Experimental Evaluation I
[Search Task Identification]
Flattened version of hierarchy is useful too!
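For readers unfamiliar with the metric, pairwise precision and recall can be computed as follows. This is a generic sketch of the standard pairwise definition, assuming each query carries one predicted and one gold task label; the function name and the toy labels are illustrative and not taken from the talk.

```python
from itertools import combinations


def pairwise_precision_recall(predicted, gold):
    """Pairwise precision/recall over all query pairs.

    precision = pairs placed together in both / pairs placed together by the system
    recall    = pairs placed together in both / pairs placed together in the gold labels
    """
    assert len(predicted) == len(gold)
    same_pred = same_gold = same_both = 0
    for i, j in combinations(range(len(gold)), 2):
        in_pred = predicted[i] == predicted[j]
        in_gold = gold[i] == gold[j]
        same_pred += in_pred
        same_gold += in_gold
        same_both += in_pred and in_gold
    precision = same_both / same_pred if same_pred else 0.0
    recall = same_both / same_gold if same_gold else 0.0
    return precision, recall


# Five queries: the gold labels form tasks {0,1,2} and {3,4};
# the system output moves query 2 into the second task.
gold_tasks = [0, 0, 0, 1, 1]
predicted_tasks = [0, 0, 1, 1, 1]
print(pairwise_precision_recall(predicted_tasks, gold_tasks))  # (0.5, 0.5)
```

To score a hierarchy with this metric, the tree first has to be flattened into a single task label per query, for example by cutting it at a chosen level.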
28. Indirect evaluation based on term
prediction
1. Construct hierarchy
2. Map to correct node in the hierarchy
3. Leverage node queries for term prediction
Assumption: identifying good tasks should
help in predicting future queries
Intersection of TREC Session track & AOL
log data
Experimental Evaluation III
[Term Prediction]
Outperforms flat-task extraction techniques as well as hierarchical baselines
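The three term-prediction steps (construct the hierarchy, map the query to a node, use that node's queries to predict terms) can be illustrated with a small sketch. Everything here is an assumption made for the example: the hierarchy is reduced to a dictionary of named nodes, the mapping step uses Jaccard overlap of terms, and prediction ranks the node's unseen terms by frequency. The talk does not specify these choices.

```python
from collections import Counter


def node_term_counts(queries):
    """Term frequencies over all queries attached to a node."""
    return Counter(term for q in queries for term in q.lower().split())


def map_to_node(query, nodes):
    """Step 2 (assumed heuristic): pick the node with the highest Jaccard term overlap."""
    query_terms = set(query.lower().split())

    def jaccard(name):
        terms = set(node_term_counts(nodes[name]))
        union = query_terms | terms
        return len(query_terms & terms) / len(union) if union else 0.0

    return max(sorted(nodes), key=jaccard)


def predict_terms(query, nodes, k=2):
    """Step 3: rank terms from the mapped node that the query has not used yet."""
    node = map_to_node(query, nodes)
    seen = set(query.lower().split())
    candidates = [(t, c) for t, c in node_term_counts(nodes[node]).items() if t not in seen]
    candidates.sort(key=lambda tc: (-tc[1], tc[0]))  # by frequency, then alphabetically
    return [t for t, _ in candidates[:k]]


# Step 1 stand-in: a hand-built set of task nodes (hypothetical data).
task_nodes = {
    "travel": ["cheap flights tokyo", "tokyo hotels cheap", "tokyo flights deals"],
    "coding": ["python sort list", "python list comprehension"],
}
print(predict_terms("flights tokyo", task_nodes))  # ['cheap', 'deals']
```

The sketch reflects the stated assumption behind this evaluation: if the extracted tasks are good, the queries grouped under the mapped node should share vocabulary with the user's future queries.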