Educational Technology & Society 5 (4) 2002
ISSN 1436-4522

Theories for Deep Change in Affect-sensitive Cognitive Machines: A Constructivist Model

Barry Kort and Rob Reilly
M.I.T. Media Laboratory
bkort@media.mit.edu
reilly@media.mit.edu

 

ABSTRACT

There is an interplay between emotions and learning, but this interaction is far more complex than previous learning theories have articulated. This article proffers a novel model by which to regard the interplay between emotions and learning and discusses the larger practical aim of crafting computer-based models that will recognize a learner’s affective state and respond appropriately to it so that learning will proceed at an optimal pace.

Keywords: Constructivism, cognition, emotions, learning


1. Introduction

 

Do emotions contribute to intelligence, and if so, what are the implications for the development of a technology of affective computing?

- Robert Provine, What Questions Are On Psychologist’s Minds Today?

 

In regard to the integration of affect into human-computer interactions, the emerging discipline of Affective Computing has begun to address a variety of technical, methodological, and research issues, such as machine recognition of affective states of the user, synthesis of affective states of cartoon avatars or embodied agents, and applications incorporating social-emotional intelligence. In order for Affective Computing to become a discipline, it should be supported by:

  • a comprehensive model, which captures the relationship(s) among the Material Economy, the Information Economy and our proposed Emotion Economy,
  • a novel model that supports model-based reasoning, and,
  • an innovative learning cycle model that integrates/accounts for affect.

 

2.  The Interlinked Economies Model

It seems apparent that a development in one technology creates flow and fluctuation in another technology—for example, an improved means of communication may decrease the pressure to improve transportation. Looking further at almost any major 20th century technological realm, such as medicine, transportation, communications, or energy, it is clear that knowledge, information and ideas made it possible to create entire industries that dramatically changed the Material Economy—so a development in technology causes flow and fluctuation in other realms.

For example, we invested heavily in railroads as a result of the invention of the steam engine. We invested heavily in telecommunications as a result of the invention of the telegraph, telephone and radio. Further advances in mobile power sources gave us the automobile and flying machines. From the Industrial Revolution to the Information Age, commercial economies have become increasingly dependent upon and driven by knowledge and information.

We believe that our model (Figure 1) can explain the interaction (the flow and fluctuation among the Economies) necessary to frame a dialogue leading to new insights and innovations that incorporate theories of affect into the field of human-computer interaction (HCI).

 


Figure 1. Three Economy Model

 

The first economy is the Material Economy (Figure 1). We are all aware of this economy, as it is the most familiar. It involves the flow of goods and services and is mediated by money. Everyone has a reasonable appreciation of how the Material Economy operates even without having taken a course in basic Economics.

A newer economy, which arose in the second half of the 20th century, is one that we refer to as the Information-Attention Economy. It was spawned by the advent of information theory, the advent of information technologies and by mass media. This economy is concerned with the flow of information between producers and consumers. It is partly a commercial economy (e.g., newspapers, magazines, books). Other information is traded as part of the gift economy. So the Information-Attention Economy is both a commercial economy and a gift economy.

As the amount of information increases to a point where its manageability becomes an issue—there is too much information to attend to—another element of the Information-Attention Economy appears. This is the system’s ability to ‘pay attention’ to the flow of Information that is in flux. The Information-Attention Economy also has a quantity aspect. Just as the Material Economy can be measured in dollars and cents, the Information-Attention Economy can be measured in ‘hits’ and bits.

We refer to our third economy, which is much less visible and much harder to measure, as the Emotion-Learning-Spiritual Economy. The centerpiece of this ‘Economy’ is the theory of emotions and learning, which we present in more detail later in this paper.

Suffice it to say for now that we tend to learn from sources of information that we bother to pay attention to. The reason that we ‘pay attention’ is that they nurture our interest, which for our purposes is the act of learning. Associated with learning, as we will see in our models, are positive emotions and negative emotions. When the process of learning is not working well, we experience feelings such as confusion, despair, or frustration. And when learning is working well, we can experience curiosity, fascination, and intrigue. Some especially desirable emotions are enthusiasm, delight and amazement. So this brings us to the high end of the emotional spectrum, where the highest emotions are perhaps awe, wonder, and enlightenment: the eureka moment, the epiphany or revelation, where everything becomes clear. This is the essence of the Emotion-Learning-Spiritual Economy.

How do these three economies relate to each other? Are they independent and disconnected - is the Material Economy unrelated to the Information-Attention Economy and is that unrelated to the Emotion-Learning-Spiritual Economy? Or are they connected somehow so that flow and fluctuation in one of the three economies will induce fluctuation and flow in one or more of the other two economies?  Just as James Clerk Maxwell showed how electricity and magnetism are coupled, we believe there is a similar coupling among the three economies that needs to be understood and explored.

We want to look ahead. Just as we have well-established economic theory that undergirds the Material Economy and a well-established information theory that underpins the Information-Attention Economy, we need to craft a similar theory for the Emotion-Learning-Spiritual Economy and couple these theories together.

 

3. Science and Storymaking

The education establishment, including most of its research community, remains committed to the educational philosophy of the late nineteenth and early twentieth centuries, and so far none of those who challenge these hallowed traditions has been able to loosen the hold the educational establishment has on how children are taught.

- Seymour Papert, The Children’s Machine

 

To understand the need for a novel model, let us first examine the current educational model. The current model, as shown in Figure 2, begins with ‘data,’ which is a collection of answers to questions that the learner has not yet seen fit to ask or needed to ask.  Such data becomes ‘information’ when it answers a question that the learner cares to ask.  For the most part, a teacher, who must somehow motivate the student to care enough to seek the answers found in the data, supplies these questions.  Studying is like ‘panning for gold’ where the answers are the ‘nuggets’ buried in a ton of otherwise uninteresting gravel.  Once we have our ‘nuggets of information’ how do we organize them into a ‘body of knowledge’? We may think of ‘information’ as the pieces of an unassembled jigsaw puzzle, whereas ‘knowledge’ is the assembled jigsaw puzzle. That is, the question-answer pairs are organized into a coherent structure, in the logical and natural order in which new questions arise as soon as old ones are answered.

 


Figure 2. Old Model: Supports Rule-based Learning

 

The assembled ‘jigsaw puzzle of knowledge’ reveals a previously hidden picture—a ‘big picture,’ if you will. Or to put it another way, the assembled ‘jigsaw puzzle of knowledge’ is a tapestry into which is woven many otherwise hidden and previously unrevealed stories.

The novel model shown below in Figure 3 goes beyond the current model shown in Figure 2. The focus of attention shifts to the construction of ‘knowledge’ and to the extraction of meaningful ‘insights’ from the ‘big picture.’ When ‘knowledge’ is coupled with a personal or cultural value system, ‘wisdom’ emerges.  In other words, wisdom allows us to harness the power of knowledge for beneficial purposes.

 


Figure 3. New Model: Supports Model-based Reasoning

 

‘Wisdom’ affords us the possibility of extracting the stories woven into the tapestry of knowledge. So from ‘wisdom’ we craft the bardic arts of story making and story telling. The ancients crafted myths and legends. These were the prototypical stories of their cultures, which were intended to impart ‘wisdom.’ A story is thus an anecdote drawn from the culture. A well-crafted anecdote or story has value both as an amusement and as a source of insight into the world from which it is drawn.  And the plural of ‘anecdote’ is data—a collection of anecdotal stories or evidence.  This observation closes the loop in Figure 3.

Figure 3 suggests a novel model that, on a fundamental level, supports an improved educational pedagogy. This will serve as a foundation for the next part of our model—how a learner’s affective state should be incorporated into the overall model.

 

4. Models of Emotions and Learning

The extent to which emotional upsets can interfere with mental life is no news to teachers. Students who are anxious, angry, or depressed don’t learn; people who are caught in these states do not take in information efficiently or deal with it well.

- Daniel Goleman, Emotional Intelligence

 

In an attempt to re-engineer the current state of educational pedagogy, educators should first look to expert teachers, who are adept at recognizing the emotional state of learners and who, based upon their observations, take some action that scaffolds learning in a positive manner. But what do these expert teachers see and how do they decide upon a course of action? How do students who have strayed from learning return to a productive path, such as the one that Csikszentmihalyi [1990] refers to as the “zone of flow”? This notion that a student’s affective (emotional) state impacts learning and that appropriate intervention based upon that affective state would facilitate learning is the concept that we propose to explore in-depth.

In support of this point, note that skilled humans can assess emotional signals with varying degrees of precision. Researchers are beginning to make progress in giving computers similar abilities to recognize affective expressions [Picard, 2000; Scheirer, et al., 1999], facial expressions [Bartlett, 1999; Cohn, et al., 1999; Donato, 1999; DeSilva, 1997; Ekman, 1997; Essa, 1997], and gestural expression [Chen, et al., 1998; Huang, 1998]. Although computers currently perform as well as people only in highly restricted domains, we believe that:

  • accurately identifying a learner’s cognitive-emotive state is a critical observation that will enable teachers to provide learners with an efficient and pleasurable learning experience, and,
  • unobtrusive highly accurate technology will be developed to accurately assess actions in less restricted domains (see e.g., Kapoor, et al., 2001).

Our own preliminary pilot studies with elementary school children suggest that a human observer can assess the affective state of a student with reasonable reliability based on observation of facial expressions, gross body language, and the content and tone of speech.  If the human observer is also acting in the role of coach or mentor, these assessments can be confirmed or refined by direct conversation (e.g., simply asking the student if she is confused or frustrated before offering to provide coaching or hints). Moreover, successful learning is frequently marked by an unmistakable elation, often jointly celebrated with “high fives.”  In some cases, the “Aha!” moment is so dramatic, it verges on the epiphanetic. One of the great joys for an educator is to bring a student to such a moment of triumph. But how can computers acquire the same level of proficiency as gifted coaches, mentors, and teachers?

Our first step is to offer a model of a learning cycle, which integrates affect. Figure 4 suggests six possible emotion axes that may arise in the course of learning. Figures 5a and 5b interweave the emotion axes shown in Figure 4 with the cognitive dynamics of the learning process. In Figure 5, the positive valence (more pleasurable) emotions are on the right; the negative valence (more unpleasant) emotions are on the left.  The vertical axis is what we call the Learning Axis, and symbolizes the construction of knowledge upward, and the discarding of misconceptions downward. 

 


Figure 4. Emotion sets possibly relevant to learning

 


Figure 5a. Four Quadrant model relating phases of learning to emotions in Figure 4

 

Students ideally begin in Quadrant I or II:  they might be curious or fascinated about a new topic of interest (Quadrant I) or they might be puzzled and motivated to reduce confusion (Quadrant II).  In either case, they are in the top half of the space if their focus is on constructing or testing knowledge.  Movement happens in this space as learning proceeds.  For example, when solving a puzzle in The Incredible Machine, a student gets a bright idea how to implement a solution and then builds its simulation. If she runs the simulation and it fails, she sees that her idea has some part that doesn’t work—that needs to be diagnosed and reconstructed.  At this point the student may move down into the lower half of the diagram (Quadrant III) into the ‘dark teatime of the soul’ while discarding misconceptions and unproductive ideas.  As she consolidates her knowledge—what works and what does not—with awareness of a sense of making progress, she advances to Quadrant IV.  Getting another fresh idea propels the student back into the upper half of the space (Quadrant I).  Thus, a typical learning experience involves a range of emotions, cycling the student around the four quadrant cognitive-emotive space as they learn.

If one visualizes a version of Figure 5a (and Figure 5b) for each axis in Figure 4, then at any given instant, the student might be in multiple Quadrants with respect to different axes.  They might be in Quadrant II with respect to feeling frustrated and simultaneously in Quadrant I with respect to interest level.  It is important to recognize that a range of emotions occurs naturally in a real learning process, and it is not simply the case that the positive emotions are the good ones.
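As one way to make the four-quadrant reading concrete, the following minimal Python sketch classifies a learner's position on each emotion axis separately. It assumes, beyond what the text specifies, that a position on each axis can be summarized by two signed numbers (a valence and a learning direction) and that a value of exactly zero counts as the positive side. The class `AxisState`, the axis names `interest` and `frustration`, and the numeric values are hypothetical illustrations, not measured data.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AxisState:
    """Hypothetical position of a learner on one emotion axis."""
    valence: float   # > 0: more pleasurable (right), < 0: more unpleasant (left)
    learning: float  # > 0: constructing knowledge (up), < 0: discarding misconceptions (down)


def quadrant(state: AxisState) -> str:
    """Map a position to Quadrant I-IV of the four-quadrant model.

    Assumption: zero is treated as belonging to the positive side.
    """
    upper = state.learning >= 0
    right = state.valence >= 0
    if upper and right:
        return "I"    # positive affect, constructive learning
    if upper:
        return "II"   # negative affect, constructive learning
    if not right:
        return "III"  # negative affect, un-learning
    return "IV"       # positive affect, un-learning


def quadrants_by_axis(states: Dict[str, AxisState]) -> Dict[str, str]:
    """Classify each emotion axis independently."""
    return {axis: quadrant(state) for axis, state in states.items()}


# A student who is interested yet frustrated at the same instant.
student = {
    "interest": AxisState(valence=0.6, learning=0.4),
    "frustration": AxisState(valence=-0.5, learning=0.3),
}
print(quadrants_by_axis(student))  # {'interest': 'I', 'frustration': 'II'}
```

Keeping one state per axis, rather than a single global state, is what allows the same student to appear in Quadrant I on one axis and Quadrant II on another.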

We do not aim to keep the student in Quadrant I, but rather to help him see that the cyclic nature is natural in learning science, mathematics, engineering or technology (SMET), and that landing in the negative half is an inevitable part of the cycle.  Our aim is to help students to keep orbiting the loop, teaching them to propel themselves, especially after a setback.

A third axis (not shown) can be envisioned as extending out of the plane of the page - the cumulative knowledge axis.  If one visualizes the above dynamics of moving from Quadrant I to II to III to IV as an orbit, then, when this third dimension is added, one obtains an excelsior spiral. In Quadrant I, anticipation and expectation are high, as the learner builds ideas and concepts and tries them out.  Emotional mood decays over time either from boredom or from disappointment.  In Quadrant II, the rate of construction of working knowledge diminishes, and negative emotions emerge as progress wanes. In Quadrant III, as the negative affect runs its course, the learner discards misconceptions and ideas that didn't pan out.  In Quadrant IV, the learner recovers hopefulness and positive attitude as the knowledge set is now cleared of unworkable and unproductive concepts, and the cycle begins anew.  In building a complete and correct mental model associated with a learning opportunity, the learner may experience multiple cycles until completion of the learning exercise. Note that the orbit doesn't close on itself, but gradually spirals around the cumulative knowledge axis.
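The spiral can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the orbit is idealized as a unit circle traversed counter-clockwise (Quadrant I to II to III to IV), and cumulative knowledge is assumed to grow linearly by a hypothetical `gain_per_cycle` per orbit. Neither assumption is given by the model itself, which says only that the orbit gradually spirals around the cumulative knowledge axis.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (valence, learning, cumulative knowledge)


def helix_points(cycles: int,
                 steps_per_cycle: int = 4,
                 gain_per_cycle: float = 1.0) -> List[Point]:
    """Sample an idealized helical path through the four quadrants.

    Assumptions (not from the model): circular orbit of radius 1,
    linear knowledge gain, and a start in the middle of Quadrant I.
    """
    points = []
    for k in range(cycles * steps_per_cycle + 1):
        # The pi/4 offset starts the learner inside Quadrant I.
        theta = math.pi / 4 + 2 * math.pi * k / steps_per_cycle
        valence = math.cos(theta)    # horizontal axis: affect
        learning = math.sin(theta)   # vertical axis: construct vs. discard
        knowledge = gain_per_cycle * k / steps_per_cycle  # third axis
        points.append((valence, learning, knowledge))
    return points


for valence, learning, knowledge in helix_points(cycles=1):
    print(f"valence={valence:+.2f} learning={learning:+.2f} knowledge={knowledge:.2f}")
```

After one full cycle the learner is back at the same valence and learning coordinates but at a higher knowledge value, which is the sense in which the orbit does not close on itself.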

 


Figure 5b. Circular and helical flow of emotion in Four Quadrant model

 

We are in the process of performing empirical research on this model. We have conducted several pilot research projects, which appear to confirm the model. (Note: Interested readers can find more about this work in our reference list.)

 

5. Conclusion

Our models are inspired by theory often used to describe complex dynamic interactions in engineering systems.  As such, they are not intended to explain how learning works, but rather to provide a framework for thinking and posing questions about the role of emotions in learning.  As with any metaphor, the model has its limits.  The model does not encompass all aspects of the complex interaction between emotions and learning, but begins to describe some of the key phenomena that need to be considered in metacognition.

These models go beyond previous research studies not just in the range of emotions addressed, but also in an attempt to formalize an analytical model that describes the dynamics of a learner’s emotional states, and does so in a language that supports metacognitive analysis.

 

6. Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 0087768. Any opinions, findings, or conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

 

7. References

Bartlett, M., Hager, J. C., Ekman, P., & Sejnowski, T. (1999). Measuring Facial Expression by Computer Image Analysis. Psychophysiology, 36, 253-263.

Bransford, J., Brown, A. L., & Cocking, R. (1999). How People Learn: Brain, Mind, Experience, and School, Washington, DC: National Academy Press.

Chen, L. S., Huang, T. S., Miyasato, T., & Nakatsu, R. (1998). Multimodal Human Emotion/Expression Recognition. Paper presented at the 3rd International Conference on Automated Face and Gesture Recognition, April 14-16, 1998, Nara, Japan.

Cohn, J. F., Zlochower, A. J., Lien, J., & Kanade, T. (1999). Automated Face Analysis by Feature Point Tracking has High Concurrent Validity with Manual FACS Coding. Psychophysiology, 36, 35-43.

Csikszentmihalyi, M. (1990). Flow: The Psychology of Optimal Experience, New York: Harper-Row.

Damasio, A. R. (1994). Descartes’ Error: Emotion, Reason and the Human Brain, New York: G.P. Putnam’s Sons.

Del Soldato, T. (1994). Motivation in Tutoring Systems, Tech. Rep. CSRP 303, School of Cognitive and Computing Science, The University of Sussex, UK.

DeSilva, L. C., Miyasato, T., & Nakatsu, R. (1997). Facial emotion recognition using multi-modal information. Paper presented at the IEEE International Conference on Information, Communication and Signal Processing, October 15-18, 1997, Singapore.

De Vincente, A., & Pain, H. (1999). Motivation Self-Report in ITS. In Lajoie, S. P. and Vivet, M. (Eds.) Proceedings of the Ninth World Conference on Artificial Intelligence in Education, Amsterdam: IOS Press, 651-653.

Donato, G., Bartlett, M. S., Hager, J. C., Ekman, P., & Sejnowski, T. J. (1999). Classifying facial actions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21, 974-989.

Ekman, P. (1997). Facial Action Coding System, Palo Alto: Consulting Psychologists Press.

Essa, I., & Pentland, A. (1997). Coding, analysis, interpretation and recognition of facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 757-763.

Goleman, D. (1995). Emotional Intelligence, New York: Bantam Books.

Haro, A., Essa, I., & Flickner, M. (2000). Detecting and Tracking Eyes by Using their Physiological Properties, Dynamics and Appearance. Paper presented at the IEEE Computer Vision and Pattern Recognition, June 13-15, 2000, Hilton Head, South Carolina, USA.

Huang, T. S., Chen, L. S., & Tao, H. (1998). Bimodal Emotion Recognition by Man and Machine. Paper presented at the ATR Workshop on Virtual Communication Environments, April 14-16, 1998, Nara, Japan.

Kapoor, A., Mota, S., & Picard, R. (2001). Towards a Learning Companion that Recognizes Affect. Paper presented at the AAAI Fall Symposium, November 2-4, 2001, North Falmouth, Massachusetts.

Klein, J. (1999). Computer Response to User Frustration, Master’s thesis, MIT Media Lab.

Matsubara, Y., & Nagamachi, M. (1996). Motivation systems and motivation models for intelligent tutoring. Paper presented at the Third International Conference in Intelligent Tutoring Systems, June, Montreal, Canada.

Olson, R. K., & Wise, B. (1987). Computer Speech in Reading Instruction, In D Reinking (Ed.) Computers and Reading: Issues in Theory and Practice, New York: Teachers College Press, 156-177.

Papert, S. (1993). The Children’s Machine: Rethinking School in the Age of the Computer, Basic Books: New York.

Piaget, J. (1952). The Origins of Intelligence in Children, M. Cook translator, New York: International Universities Press.

Picard, R. W. (2000). Toward Computers that Recognize and Respond to User Emotions. IBM Systems Journal, 39 (3-4), 705.

Picard, R. W. (1997). Affective Computing, Cambridge, MA: MIT Press.

Provine, R. (1998). What Questions Are On Psychologist’s Minds Today? Available on-line (August 1, 2001),
http://www.edge.org/3rd_culture/myers/index.html

Rich, C., Waters, R. C., Strohecker, C., Schabes, Y., Fremen, W. T., Torance, M. C., Golding, A., & Roth, M. (1994). A Prototype Interactive Environment for Collaboration and Learning, Technical Report TR-94-06,
http://www.merl.com/projects/emp/index.html

Scheirer, J., Fernandez, R., & Picard, R. W. (1999). Expression Glasses: A Wearable Device for Facial Expression Recognition, Paper presented at the ACM SIGCHI Conference on Human Factors in Computing Systems, May 15-20, 1999, Pittsburgh, Pennsylvania, USA.

Yacoob, Y., & Davis, L. (1996). Recognizing human facial expressions from long image sequences using optical flow. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18, 636-642.

Yingli, T., Kanade, T., & Cohn, J. F. (2001). Recognizing Action Units for Facial Expression Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23 (2), 97-115.




Copyright message

Copyright by the International Forum of Educational Technology & Society (IFETS). The authors and the forum jointly retain the copyright of the articles. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear the full citation on the first page. Copyrights for components of this work owned by others than IFETS must be honoured. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from the authors of the articles you wish to copy or kinshuk@massey.ac.nz.