Teaching robots social autonomy from in situ human guidance
Abstract
Striking the right balance between robot autonomy and human control is a core challenge in social robotics, in both technical and ethical terms. On the one hand, extended robot autonomy offers the potential for increased human productivity and for the off-loading of physical and cognitive tasks. On the other hand, making the most of human technical and social expertise, as well as maintaining accountability, is highly desirable. This is particularly relevant in domains such as medical therapy and education, where social robots hold substantial promise, but where there is a high cost to poorly performing autonomous systems, compounded by ethical concerns. We present a field study in which we evaluate SPARC (supervised progressively autonomous robot competencies), an innovative approach addressing this challenge whereby a robot progressively learns appropriate autonomous behavior from in situ human demonstrations and guidance. Using online machine learning techniques, we demonstrate that the robot could effectively acquire legible and congruent social policies in a high-dimensional child-tutoring situation, needing only a limited number of demonstrations, while preserving human supervision whenever desirable. By exploiting human expertise, our technique enables rapid learning of autonomous social and domain-specific policies in complex and nondeterministic environments. Finally, we underline the generic properties of SPARC and discuss how this paradigm is relevant to a broad range of difficult human-robot interaction scenarios.
Supplementary Material
Summary
Fig. S1. Steps of the study.
Table S1. Post hoc comparison of timing of actions for the supervised condition.
Table S2. Post hoc comparison of timing of actions for the autonomous condition.
Table S3. Exposure to learning units.
Table S4. Game duration.
Resources
File (aat1186_sm.pdf)
Information & Authors
Information
Published In

Science Robotics
Volume 4 | Issue 35
October 2019
Copyright
Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
This is an article distributed under the terms of the Science Journals Default License.
Submission history
Received: 24 February 2019
Accepted: 16 September 2019
Acknowledgments
This work was supported by the EU FP7 DREAM project (grant no. 611391), the EU H2020 Marie Skłodowska-Curie Actions project DoRoThy (grant no. 657227), and the EU H2020 L2TOR project (grant no. 688014). Author contributions: E.S., S.L., P.E.B., and T.B. designed the study. E.S. implemented the technical components based on S.L.’s work. E.S. and M.B. ran the study. M.B. taught the robot. All the authors contributed actively to the writing. Competing interests: The authors declare that they have no competing interests. Data and materials availability: Sources, preprocessed data, script required to generate the graphs, and JASP file for the statistical analysis can be found at https://zenodo.org/record/3386613. All other data needed to evaluate the conclusions in the paper are present in the paper or the Supplementary Materials.