Human-Computer Interaction Academic Digest [1.10]
cs.HC (Human-Computer Interaction), 10 papers in total
【1】 Visual Attention Prediction Improves Performance of Autonomous Drone Racing Agents
Link: https://arxiv.org/abs/2201.02569
Comments: 12 pages, 6 figures
Abstract: Humans race drones faster than neural networks trained for end-to-end
autonomous flight. This may be related to the ability of human pilots to select
task-relevant visual information effectively. This work investigates whether
neural networks capable of imitating human eye gaze behavior and attention can
improve neural network performance for the challenging task of vision-based
autonomous drone racing. We hypothesize that gaze-based attention prediction
can be an efficient mechanism for visual information selection and decision
making in a simulator-based drone racing task. We test this hypothesis using
eye gaze and flight trajectory data from 18 human drone pilots to train a
visual attention prediction model. We then use this visual attention prediction
model to train an end-to-end controller for vision-based autonomous drone
racing using imitation learning. We compare the drone racing performance of the
attention-prediction controller to that of controllers using raw image inputs and image-based
abstractions (i.e., feature tracks). Our results show that attention-prediction
based controllers outperform the baselines and are able to complete a
challenging race track consistently with up to 88% success rate. Furthermore,
visual attention-prediction and feature-track based models showed better
generalization performance than image-based models when evaluated on hold-out
reference trajectories. Our results demonstrate that human visual attention
prediction improves the performance of autonomous vision-based drone racing
agents and provides an essential step towards vision-based, fast, and agile
autonomous flight that can eventually reach and even exceed human performance.
【2】 Project IRL: Playful Co-Located Interactions with Mobile Augmented Reality
Link: https://arxiv.org/abs/2201.02558
Abstract: We present Project IRL (In Real Life), a suite of five mobile apps we created
to explore novel ways of supporting in-person social interactions with
augmented reality. In recent years, the tone of public discourse surrounding
digital technology has become increasingly critical, and technology's influence
on the way people relate to each other has been blamed for making people feel
"alone together," diverting their attention from truly engaging with one
another when they interact in person. Motivated by this challenge, we focus on
an under-explored design space: playful co-located interactions. We evaluated
the apps through a deployment study that involved interviews and participant
observations with 101 people. We synthesized the results into a series of
design guidelines that focus on four themes: (1) device arrangement (e.g., are
people sharing one phone, or does each person have their own?), (2) enablers
(e.g., should the activity focus on an object, body part, or pet?), (3)
affordances of modifying reality (i.e., features of the technology that enhance
its potential to encourage various aspects of social interaction), and (4)
co-located play (i.e., using technology to make in-person play engaging and
inviting). We conclude by presenting our design guidelines for future work on
embodied social AR.
【3】 In Situ Data Summaries for Flexible Feature Analysis in Large-Scale Multiphase Flow Simulations
Link: https://arxiv.org/abs/2201.02557
Abstract: The study of multiphase flow is essential for understanding the complex
interactions of various materials. In particular, when designing chemical
reactors such as fluidized bed reactors (FBR), a detailed understanding of the
hydrodynamics is critical for optimizing reactor performance and stability. An
FBR allows experts to conduct different types of chemical reactions involving
multiphase materials, especially interaction between gas and solids. During
such complex chemical processes, the formation of void regions in the reactor,
generally termed bubbles, is an important phenomenon. The study of these bubbles
has deep implications for predicting the reactor's overall efficiency. However,
physical experiments needed to understand bubble dynamics are costly and
non-trivial. Therefore, to study such chemical processes and bubble dynamics, a
state-of-the-art massively parallel computational fluid dynamics discrete
element model (CFD-DEM), MFIX-Exa, is being developed for simulating multiphase
flows. Despite the proven accuracy of MFIX-Exa in modeling bubbling phenomena,
the very large size of the output data makes traditional post hoc analysis
prohibitive in terms of both storage and I/O time. To address these issues
and allow the application scientists to explore the bubble dynamics in an
efficient and timely manner, we have developed an end-to-end visual analytics
pipeline that enables in situ detection of bubbles using statistical
techniques, followed by a flexible and interactive visual exploration of bubble
dynamics in the post hoc analysis phase. Positive feedback from the experts has
indicated the efficacy of the proposed approach for exploring bubble dynamics
in very-large-scale multiphase flow simulations.
【4】 Developing Assistive Technology to Support Reminiscence Therapy: A User-Centered Study to Identify Caregivers' Needs
Link: https://arxiv.org/abs/2201.02418
Comments: 27 pages, 2 figures, Manuscript submitted to the Special Issue on Advances in Human-Centred Dementia Technology of the International Journal of Human-Computer Studies
Abstract: Reminiscence therapy is an inexpensive non-pharmacological therapy commonly
used due to its therapeutic value for people with dementia (PwD), as it can be used to promote
independence, positive moods and behavior, and improve their quality of life.
Caregivers are one of the main pillars in the adoption of digital technologies
for reminiscence therapy, as they are responsible for its administration.
Despite their comprehensive understanding of the needs and difficulties
associated with the therapy, their perspective has not been fully taken into
account in the development of existing technological solutions. To inform the
design of technological solutions within dementia care, we followed a
user-centered design approach through worldwide surveys, follow-up
semi-structured interviews, and focus groups. Seven hundred and seven informal
and 52 formal caregivers participated in our study. Our findings show that
technological solutions must provide mechanisms to carry out the therapy in a
simple way, reducing the amount of work for caregivers when preparing and
conducting therapy sessions. They should also diversify and personalize the
current session (and following ones) based on both the biographical information
of the PwD and their emotional reactions. This is particularly important since
the PwD often become agitated, aggressive or angry, and caregivers might not
know how to properly deal with this situation (in particular, the informal
ones). Additionally, formal caregivers need an easy way to manage information
of the different PwD they take care of, and consult the history of sessions
performed (in particular, to identify images that triggered negative emotional
reactions, and consult any notes taken about them). As a result, we present a
list of validated functional requirements gathered for the PwD and both formal
and informal caregivers, as well as the corresponding expected primary and
secondary outcomes.
【5】 Unwinding Rotations Improves User Comfort with Immersive Telepresence Robots
Link: https://arxiv.org/abs/2201.02392
Comments: Accepted for publication in HRI (Int. Conf. on Human-Robot Interaction) 2022
Abstract: We propose unwinding the rotations experienced by the user of an immersive
telepresence robot to improve comfort and reduce VR sickness of the user. By
immersive telepresence we refer to a situation where a 360° camera on
top of a mobile robot is streaming video and audio into a head-mounted display
worn by a remote user possibly far away. Thus, it enables the user to be
present at the robot's location, look around by turning the head and
communicate with people near the robot. By unwinding the rotations of the
camera frame, the user's viewpoint is not changed when the robot rotates. The
user can change her viewpoint only by physically rotating in her local setting;
as visual rotation without the corresponding vestibular stimulation is a major
source of VR sickness, physical rotation by the user is expected to reduce VR
sickness. We implemented unwinding the rotations for a simulated robot
traversing a virtual environment and ran a user study (N=34) comparing
unwinding rotations to the user's viewpoint turning when the robot turns. Our
results show that users found unwound rotations preferable and more
comfortable, and that unwinding reduced their level of VR sickness. We also present
further results about the users' path integration capabilities, viewing
directions, and subjective observations of the robot's speed and distances to
simulated people and objects.
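The unwinding idea described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified planar (yaw-only) model, and the function names `wrap_angle` and `view_yaw` are hypothetical. With unwinding enabled, the camera frame is counter-rotated by the robot's yaw, so the direction the user sees depends only on the user's own head rotation.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def view_yaw(robot_yaw, head_yaw, unwind):
    """World-frame yaw of the direction shown to the user.

    robot_yaw: robot heading in the world frame (radians).
    head_yaw:  user's head rotation in their local setting (radians).
    unwind:    if True, counter-rotate the camera frame by the robot's yaw
               (assumed yaw-only model, for illustration).
    """
    # Compensation applied to the camera frame: cancel the robot's rotation
    # when unwinding, otherwise leave the camera rigidly attached to the robot.
    compensation = -robot_yaw if unwind else 0.0
    camera_yaw = robot_yaw + compensation
    return wrap_angle(camera_yaw + head_yaw)


if __name__ == "__main__":
    robot_turn = math.pi / 2  # robot turns 90 degrees
    head = 0.1                # user keeps their head almost still
    print(view_yaw(robot_turn, head, unwind=True))   # view follows the head only
    print(view_yaw(robot_turn, head, unwind=False))  # view also turns with the robot
```

In the coupled case the robot's turn produces visual rotation without matching vestibular input, which the abstract names as a major source of VR sickness. In the unwound case the view changes only when the user physically turns.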
【6】 SaL-Lightning Dataset: Search and Eye Gaze Behavior, Resource Interactions and Knowledge Gain during Web Search
Link: https://arxiv.org/abs/2201.02339
Comments: To be published at the 2022 ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR '22)
Abstract: The emerging research field Search as Learning (SAL) investigates how the Web
facilitates learning through modern information retrieval systems. SAL research
requires significant amounts of data that capture both search behavior of users
and their acquired knowledge in order to obtain conclusive insights or train
supervised machine learning models. However, the creation of such datasets is
costly and requires interdisciplinary efforts in order to design studies and
capture a wide range of features. In this paper, we address this issue and
introduce an extensive dataset based on a user study, in which 114
participants were asked to learn about the formation of lightning and thunder.
Participants' knowledge states were measured before and after Web search
through multiple-choice questionnaires and essay-based free recall tasks. To
enable future research in SAL-related tasks we recorded a plethora of features
and person-related attributes. Besides the screen recordings, visited Web
pages, and detailed browsing histories, a large number of behavioral features
and resource features were monitored. We underline the usefulness of the
dataset by describing three already published use cases.
【7】 A Taxonomy of Social VR Design
Link: https://arxiv.org/abs/2201.02253
Abstract: Social VR has experienced tremendous growth in the commercial space recently
as an emerging technology for rich interactions themed around leisure, work,
and relationship building. As a result, the state of social VR application
design has become rapidly obfuscated, which complicates identification of
design trends and uncommon features that could inform future design, and
hinders inclusion of new voices in this design space. To help address this
problem, we present a taxonomy of social VR application design choices as
informed by 44 commercial and prototypical applications. Our taxonomy was
informed by multiple discovery strategies including literature review, search
of VR-themed subreddits, and autobiographical landscape research. The taxonomy
elucidates various features across three design areas: the self, interaction,
and the environment.
【8】 Multi-modal data fusion of Voice and EMG data for Robotic Control
Link: https://arxiv.org/abs/2201.02237
Abstract: Wearable electronic equipment is constantly evolving and is increasing the
integration of humans with technology. Available in various forms, these
flexible and bendable devices can sense and measure physiological and
muscular changes in the human body and may use those signals for machine
control. The MYO gesture band, one such device, captures electromyography (EMG)
data using myoelectric signals and translates them to be used as input signals
through some predefined gestures. Use of this device in a multi-modal
environment will not only increase the possible types of work that can be
accomplished with the help of such a device, but it will also help in improving
the accuracy of the tasks performed. This paper addresses the fusion of input
modalities such as speech and myoelectric signals captured through a microphone
and MYO band, respectively, to control a robotic arm. Experimental results
obtained as well as their accuracies for performance analysis are also
presented.
【9】 PIEEG: Turn a Raspberry Pi into a Brain-Computer-Interface to measure biosignals
Link: https://arxiv.org/abs/2201.02228
Abstract: This paper presents an inexpensive, high-precision, but at the same time,
easy-to-maintain PIEEG board that converts a Raspberry Pi into a brain-computer
interface. This shield allows measuring and processing eight real-time EEG
(electroencephalography) signals. We used the most popular programming
languages (C, C++, and Python) to read the signals recorded by the device. The
process of reading EEG signals was demonstrated as completely and clearly as
possible. This device can easily be used by machine learning enthusiasts to
create projects for controlling robots and mechanical limbs using the power of
thought. We will post use cases on GitHub
(https://github.com/Ildaron/EEGwithRaspberryPI) for controlling a robotic
machine, unmanned aerial vehicle, and more just using the power of thought.
【10】 Predicting Trust Using Automated Assessment of Multivariate Interactional Synchrony
Link: https://arxiv.org/abs/2201.02223
Abstract: Diverse disciplines are interested in how the coordination of interacting
agents' movements, emotions, and physiology over time impacts social behavior.
Here, we describe a new multivariate procedure for automating the investigation
of this kind of behaviorally-relevant "interactional synchrony", and introduce
a novel interactional synchrony measure based on features of dynamic time
warping (DTW) paths. We demonstrate that our DTW path-based measure of
interactional synchrony between facial action units of two people interacting
freely in a natural social interaction can be used to predict how much trust
they will display in a subsequent Trust Game. We also show that our approach
outperforms univariate head movement models, models that consider participants'
facial action units independently, and models that use previously proposed
synchrony or similarity measures. The insights of this work can be applied to
any research question that aims to quantify the temporal coordination of
multiple signals over time, but have immediate applications in psychology,
medicine, and robotics.
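To make the DTW-path idea in this abstract concrete, here is a minimal sketch of classic dynamic time warping over two multivariate signals, followed by one simple path-based feature. The abstract does not specify the paper's synchrony measure, so `mean_diagonal_deviation` is a hypothetical stand-in chosen for illustration, and the Euclidean local distance is likewise an assumption.

```python
import numpy as np


def dtw_path(x, y):
    """Classic DTW between two sequences shaped (time, features).

    Returns the accumulated alignment cost and the warping path as a list
    of (i, j) index pairs. Uses Euclidean local distance (an assumption).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as a single-feature series
    if y.ndim == 1:
        y = y[:, None]
    n, m = len(x), len(y)

    # acc[i, j] = minimal cost of aligning x[:i] with y[:j]
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(x[i - 1] - y[j - 1])
            acc[i, j] = local + min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        )
    path.reverse()
    return float(acc[n, m]), path


def mean_diagonal_deviation(path):
    """Hypothetical path feature: average |i - j| along the warping path.

    Zero means the two signals align without any temporal shifting.
    """
    return float(np.mean([abs(i - j) for i, j in path]))


if __name__ == "__main__":
    person_a = [0, 0, 1, 2, 3]
    person_b = [0, 1, 2, 3, 3]  # same shape as person_a, one step earlier
    cost, path = dtw_path(person_a, person_b)
    print(cost, path, mean_diagonal_deviation(path))
```

A feature like this could feed a downstream predictor. The point of the sketch is only that the shape of the warping path, and not just the alignment cost, carries information about how two signals lead and lag each other.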