perceptual context classifier

The classifier takes two inputs: the sensor data from the camera and microphone, and the label stream from the user or software agents. Its goal is to extract meaningful features from the sensor data and use them to detect the events that the user has labeled. The classifier is based on work by Clarkson [3,2]. The system overview is as follows:

1. Extract basic features from the sensors at approximately 5 Hz. We calculate all spatial moments up to order 2 from the images and 10 equally spaced frequency coefficients from 50 Hz to 8000 Hz from the audio, along with measurements of auditory volume and the amount of speech detected in the environment.
2. These features are collected continually as the user goes through his/her day of activities. Together, they are used to build a World Model by training a Hidden Markov Model (HMM) on them. The resulting World Model is a rough description of the dynamics of the user's sensory surroundings.
3. Next, as the user labels various events and contexts around him/her with the equivalent of a clicker trainer (i.e., impulse labels that do not specify duration), Event Models are built by training additional HMMs on the feature sequences surrounding each impulse label.
4. After the training phase, the resulting Event Models are compared with the World Model to recognize these events.
An event detector triggers when $L(\mathit{Event\ Model} \mid \mathit{Observations\ at\ t}) > L(\mathit{World\ Model} \mid \mathit{Observations\ at\ t})$, where $L(\cdot)$ denotes the log-likelihood function. Equivalently, we can define an activation function for each classifier as $A(t) = L(\mathit{Event\ Model} \mid \mathit{Observations\ at\ t}) - L(\mathit{World\ Model} \mid \mathit{Observations\ at\ t})$, so that the detector triggers when $A(t) > 0$.
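
The comparison between an Event Model and the World Model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the system: the observations at t are taken to be a short window of feature vectors, each HMM is assumed to have diagonal-covariance Gaussian emissions, and the function names (`hmm_log_likelihood`, `activation`) and all model parameters are hypothetical, hand-set values rather than trained ones.

```python
import math

NEG_INF = -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def gaussian_log_pdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
        for xi, mu, v in zip(x, mean, var)
    )


def hmm_log_likelihood(obs, log_start, log_trans, means, variances):
    """Forward algorithm in log space: log P(obs | model)."""
    n = len(log_start)
    alpha = [
        log_start[s] + gaussian_log_pdf(obs[0], means[s], variances[s])
        for s in range(n)
    ]
    for x in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + log_trans[r][s] for r in range(n)])
            + gaussian_log_pdf(x, means[s], variances[s])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def activation(obs, event_model, world_model):
    """A(t) = L(Event Model | obs) - L(World Model | obs); positive means trigger."""
    return hmm_log_likelihood(obs, *event_model) - hmm_log_likelihood(obs, *world_model)


# Toy, hand-set models over a 1-D feature (assumed values, not trained).
# Each model is (log_start, log_trans, means, variances).
event_model = (
    [0.0, NEG_INF],                                  # always start in state 0
    [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]],  # left-to-right transitions
    [[0.0], [5.0]],                                  # feature jumps from ~0 to ~5
    [[1.0], [1.0]],
)
world_model = ([0.0], [[0.0]], [[0.0]], [[1.0]])      # single background state

event_window = [[0.1], [0.0], [4.9], [5.2]]          # looks like the labeled event
background_window = [[0.2], [-0.3], [0.1], [0.4]]    # ordinary background

print(activation(event_window, event_model, world_model) > 0)       # detector fires
print(activation(background_window, event_model, world_model) > 0)  # detector stays off
```

In this sketch the World Model acts as a background hypothesis: an Event Model triggers only when it explains the current window better than the general model of the user's surroundings does, which avoids choosing an absolute likelihood threshold per event.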

Results were obtained for events such as the following:

Entering/Leaving the office
Entering/Leaving a large common area
Entering/Leaving the kitchen
Walking down the stairs
Taking the elevator
Participating in a conversation

With these types of events, after spending 2 hours building the World Model, 1 hour labeling events, and 2 hours testing, we obtained the results for detecting and rejecting event occurrences shown in Figure 1.

**Figure 1:** Classifier testing results.

For additional results on this classifier system, please refer to http://www.media.mit.edu/~clarkson/autodiary/index.html.

Rich's local hive hacking account
2000-02-01