7. From Queries to Dialogues
Q1: how is the weather in Chicago
Q2: how is it this weekend
Q3: find me hotels
Q4: which one of these is the cheapest
Q5: which one of these has at least 4 stars
Q6: find me directions from the Chicago airport to number one
User's dialogue with Cortana. Task: finding a hotel in Chicago.
8. From Queries to Dialogues
Q1: find me a pharmacy nearby
Q2: which of these is highly rated
Q3: show more information about number 2
Q4: how long will it take me to get there
Q5: Thanks
User's dialogue with Cortana. Task: finding a pharmacy.
9. Main Research Question
How can we automatically predict user satisfaction with search dialogues
on intelligent assistants using click, touch, and voice interactions?
13. How to define user satisfaction with search dialogues?
14. User: show restaurants near me
Cortana: Here are ten restaurants near you
User: show the best ones
Cortana: Here are ten restaurants near you that have good reviews
User: show directions to the second one
Cortana: Getting you directions to the Mayuri Indian Cuisine
No clicks ???
15. User: show restaurants near me
Cortana: Here are ten restaurants near you (SAT?)
User: show the best ones
Cortana: Here are ten restaurants near you that have good reviews (SAT?)
User: show directions to the second one
Cortana: Getting you directions to the Mayuri Indian Cuisine (SAT?)
Overall SAT?
16. User Frustration
Q1: what's the weather like in San Francisco
Q2: what's the weather like in Mountain View
Q3: can you find me a hotel close to Mountain View
Q4: can you show me the cheapest ones
Q5: show me the third one
Q6: show me the directions from SFO to this hotel
Q6: show me the directions from SFO to this hotel
Q7: go back to first hotel (misrecognition)
Q8: show me hotels in Mountain View
Q9: show me cheap hotels in Mountain View
Q10: show me more about the third one
Dialog with Intelligent Assistant. Task: planning a weekend.
Restart search
A user is satisfied
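20. Tracking User Interaction
Time attributed to an element is its time on screen weighted by the share of ViewPortHeight it occupies:
3 seconds at 33% of ViewPort = 1s
6 seconds at 66% of ViewPort = 4s
2 seconds at 20% of ViewPort = 0.4s
1s + 4s + 0.4s = 5.4s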
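The arithmetic on this slide (3 seconds at 33%, 6 seconds at 66%, 2 seconds at 20%, totalling 5.4s) can be reproduced with a few lines of Python. This is a minimal sketch: the weighting rule is inferred from the slide's numbers, the function name `attributed_time` is hypothetical, and the 33% and 66% figures are treated as exact thirds.

```python
def attributed_time(intervals):
    """Sum (seconds on screen * fraction of viewport occupied).

    The rule is inferred from the slide's arithmetic; it is an
    assumption, not a definition taken from the authors.
    """
    return sum(seconds * fraction for seconds, fraction in intervals)


# (seconds on screen, fraction of viewport height occupied)
intervals = [(3, 1 / 3), (6, 2 / 3), (2, 0.2)]

total = attributed_time(intervals)
print(f"attributed time: {total:.1f}s")
```

Each tuple is one stretch of time during which the element occupied a fixed share of the screen, so a partially visible answer contributes proportionally less viewing time.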
22. Tracking User Interaction: Touch Signals
Number of swipes
Number of up-swipes
Number of down-swipes
Total distance swiped (pixels)
Number of swipes normalized by time
Total distance divided by number of swipes
Total swiped distance divided by time
Number of swipe direction changes
Duration (seconds) for which a SERP answer is shown on screen (even partially)
Fraction of visible pixels belonging to SERP answer
Attributed time (seconds) to viewing a particular element (answer) on SERP
Attributed time (seconds) per unit height (pixels) associated with a particular element on SERP
Attributed time (milliseconds) per unit area (square pixels) associated with a particular element on SERP
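The swipe-based signals in this list could be computed from a log of vertical swipes roughly as follows. This is a sketch under stated assumptions: the `Swipe` record, its field names, the sign convention for "up", and the feature names are all hypothetical and are not taken from the original system.

```python
from dataclasses import dataclass


@dataclass
class Swipe:
    # Hypothetical event record; field names are illustrative only.
    start_y: float  # pixels
    end_y: float    # pixels


def swipe_features(swipes, session_seconds):
    """Compute the swipe-count and swipe-distance features from the list."""
    deltas = [s.end_y - s.start_y for s in swipes]
    n = len(swipes)
    total = sum(abs(d) for d in deltas)
    # Assumption: a finger moving toward smaller y counts as an up-swipe.
    up = sum(1 for d in deltas if d < 0)
    down = sum(1 for d in deltas if d > 0)
    # A direction change is a sign flip between consecutive non-zero swipes.
    signs = [d > 0 for d in deltas if d != 0]
    changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return {
        "num_swipes": n,
        "num_up_swipes": up,
        "num_down_swipes": down,
        "total_distance_px": total,
        "swipes_per_second": n / session_seconds if session_seconds else 0.0,
        "distance_per_swipe": total / n if n else 0.0,
        "distance_per_second": total / session_seconds if session_seconds else 0.0,
        "direction_changes": changes,
    }


swipes = [Swipe(800, 300), Swipe(700, 200), Swipe(200, 600)]
features = swipe_features(swipes, session_seconds=10)
print(features)
```

The viewport-based features (visible fraction, attributed time per unit height or area) need per-element geometry and are left out of this sketch.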
24. Quality of Interaction Model
Method              Accuracy (%)      Average F1 (%)
Baseline            70.62             61.38
Interaction Model   80.81* (14.43)    79.08* (28.83)
* Statistically significant improvement (p < 0.05)
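The parenthesized values appear to be relative improvements over the baseline, in percent. That reading is an inference from the numbers and is not stated on the slide; the short check below, with a hypothetical helper `relative_gain`, shows that it agrees with the reported figures to within rounding.

```python
def relative_gain(baseline, model):
    """Relative improvement of model over baseline, in percent."""
    return (model - baseline) / baseline * 100


accuracy_gain = relative_gain(70.62, 80.81)
f1_gain = relative_gain(61.38, 79.08)

print(f"accuracy gain: {accuracy_gain:.2f}%")
print(f"F1 gain: {f1_gain:.2f}%")
```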
34. From Queries to Dialogues: Sequential Interaction
User: show restaurants near me
Cortana: Here are ten restaurants near you
User: show the best ones
Cortana: Here are ten restaurants near you that have good reviews
User: show directions to the second one
Cortana: Getting you directions to the Mayuri Indian Cuisine
40. We define user satisfaction with personal assistants in a generalized
form: it is an aggregation of satisfaction with the tasks in a dialogue,
not satisfaction with each query in the dialogue taken separately.
We showed that features derived from voice and, especially, touch
interactions add a significant gain in accuracy over the baseline.
We proposed a novel, dynamic approach to restore the user reward function.
Thank you!
Questions?
Editor's Notes
#19: We utilize acoustic features to characterize the voice
interaction in search dialogues. More specifically, we use the
phonetic similarity between consecutive requests to identify
patterns of repetition. The Metaphone representation [39] is a way
of indexing words by their pronunciation, which allows us to
represent words by how they are pronounced rather than how they
are written.
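The idea of comparing consecutive requests by pronunciation can be illustrated in a few lines. This sketch does not implement Metaphone: it swaps in a crude hand-rolled key (`crude_phonetic_key`, a hypothetical helper) and scores similarity with `difflib.SequenceMatcher` from the Python standard library. It is meant only to show the shape of the computation, not the authors' feature extraction.

```python
import re
from difflib import SequenceMatcher


def crude_phonetic_key(text):
    """Very rough pronunciation key. NOT Metaphone, only a stand-in."""
    keys = []
    for word in re.findall(r"[a-z]+", text.lower()):
        # Map a few spelling patterns to a single sound.
        for src, dst in (("ph", "f"), ("ck", "k"), ("gh", ""), ("wh", "w")):
            word = word.replace(src, dst)
        if not word:
            continue
        # Keep the first letter, drop later vowels, collapse repeats.
        word = word[0] + re.sub(r"[aeiouy]", "", word[1:])
        word = re.sub(r"(.)\1+", r"\1", word)
        keys.append(word)
    return " ".join(keys)


def phonetic_similarity(a, b):
    """Similarity in [0, 1] between the crude keys of two requests."""
    return SequenceMatcher(None, crude_phonetic_key(a), crude_phonetic_key(b)).ratio()


requests = [
    "show me the directions from SFO to this hotel",
    "show me the directions from SFO to this hotel",
    "show me hotels in Mountain View",
]

# Similarity of each request to the one before it; a value near 1
# suggests the user repeated the previous request.
similarities = [phonetic_similarity(a, b) for a, b in zip(requests, requests[1:])]
print(similarities)
```

A real pipeline would replace `crude_phonetic_key` with an actual Metaphone encoder; the comparison of consecutive requests stays the same.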