Wednesday, October 3, 2012

Motion Input in Windows 8



Not long ago, researchers used complex infrared cameras or stereo-vision techniques to extract depth information from digital images. These methods were computationally very expensive and hard to calibrate. Microsoft came to the rescue by presenting the Kinect to the scientific community, which made 3D reconstruction much easier and faster. Though the RGB camera of the MS Kinect is not very good, the depth sensor works reasonably well.

Microsoft is now planning to use motion-based inputs as a first-order interface. Human motions are usually faster than traditional mouse clicks or keystrokes. There is research in computer vision and in psychology comparing the traditional mouse with a cursor moved by eye gaze, and the results show that the eye-gaze cursor is faster. If Microsoft can detect eye gaze and use that information for cursor tracking, eye typing, or selecting something, it will be faster than the traditional methods. Human gestures and postures also carry very important information in communication. If these features are detected correctly, they too can be used as inputs to the system, making input both more accurate and faster.

Although the current detection and tracking methods are not very accurate or fast, it will be nice to see how Windows can overcome these challenges. There may be some issues in the first release, but in my opinion it will be a great step forward toward an intelligent interface.

Monday, October 1, 2012

Paper Blogs 04



Reference Paper
Towards Real-Time Affect Detection Based on Sample Entropy Analysis of Expressive Gesture
Donald Glowinski and Maurizio Mancini
(Digital Object Identifier 10.1007/978-3-642-24600-5_56)

Overview of the Paper
During human-human communication, body movements carry a vital part of the affective information in nonverbal communication. Different affect detection systems have been developed based on movement direction, kinematics, and arm extension. In this paper, the authors propose a real-time affect detection system based on the Sample Entropy (SampEn) method.

Most of the methods developed to analyze behavior dynamics fail to handle two main properties of human movement: non-linearity and non-stationarity. In this paper the authors try to handle both. Their model is based on Camurri et al.'s [2] framework of expressive gesture analysis, in which gesture analysis is performed in three steps:

      1. Low-level physical measures
      2. Overall gesture features
      3. High-level information

In this paper the authors focus on the first two parts of the above framework.
The Microsoft Kinect is used to capture the RGB image. Three body parts are considered to form a bounding triangle: the head, the left hand, and the right hand. To extract the dynamic features of this triangle, two indices are used: the Smoothness Index (SmI) and the Symmetry Index (SyI).
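The paper's exact formulas for these indices are not reproduced in this post; purely as an illustration, a Symmetry Index over the head/hands triangle could be sketched as follows (the function name, normalization, and equal weighting of the two components are my own assumptions, not the authors'):

```python
def symmetry_index(head, left_hand, right_hand):
    """Hypothetical Symmetry Index (SyI) of the head/hands triangle:
    1.0 when the hands mirror each other about the head, lower as the
    posture becomes asymmetric. Each point is an (x, y) tuple."""
    dl = head[0] - left_hand[0]    # horizontal span, left hand to head
    dr = right_hand[0] - head[0]   # horizontal span, head to right hand
    # horizontal symmetry: equal spans on both sides of the head
    sx = 1.0 - abs(dl - dr) / max(abs(dl) + abs(dr), 1e-9)
    # vertical symmetry: both hands at the same height
    sy = 1.0 - abs(left_hand[1] - right_hand[1]) / max(
        abs(left_hand[1]) + abs(right_hand[1]), 1e-9)
    return (sx + sy) / 2.0
```

A perfectly mirrored pose (hands at (-1, 1) and (1, 1) with the head at the origin) gives SyI = 1.0; raising or extending one hand lowers it.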

The Smoothness Index is computed from the curvature and velocity of the movements of the left and right sides; the overall Smoothness Index is then calculated by averaging the two values. The Symmetry Index is calculated from the horizontal and vertical symmetry of the bounding triangle. The authors derive a dynamic updating formula for SyI and SmI, and from these values the SampEn value is calculated. If the hand movements are symmetric and smooth over several frames, the SampEn value will be zero; if the movements are asymmetric or not smooth, the SampEn value will be greater than zero.
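Sample Entropy itself is a standard measure of time-series regularity. A minimal sketch of how SampEn could be computed over a window of SmI or SyI values (the parameter choices m = 2 and r = 0.2·σ are common defaults, not necessarily the authors'):

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) of a 1-D series: -ln(A/B), where B counts pairs of
    length-m templates matching within tolerance r (Chebyshev distance)
    and A counts the same for length m + 1. Low values mean a regular,
    predictable signal; higher values mean irregularity."""
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)                 # tolerance scaled by signal spread

    def count_pairs(length):
        # all overlapping templates of the given length
        templates = np.array([x[i:i + length] for i in range(len(x) - length)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance to every later template
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(d <= tol))
        return count

    a, b = count_pairs(m + 1), count_pairs(m)
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")
```

A steady signal (constant SmI over the window) yields a SampEn near zero, while an erratic one yields a clearly positive value, matching the paper's smooth-versus-abrupt distinction.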

Evaluation and Validity of the Paper

One sample output is available for download here. In the following figure, as the movements are smooth between t1 and t2, SampEn(SmI) is zero. But as there is some abrupt movement between t2 and t3, SampEn(SmI) increases. Similarly, as the hands' symmetry changes between t4 and t5, SampEn(SyI) increases. And as the symmetry does not change between t5 and t6, the SampEn(SyI) value becomes zero.

 

Improvement Scopes

In my opinion, future work should include incorporating these features to detect high-level information in human-human communication. A next extension should also include other 3D analyses, such as forward and backward movement and distance from the camera.

 

Further Reading

One of the interesting articles cited by this paper is “Communicating Expressiveness and Affect in Multimodal Interactive Systems” by A. Camurri, G. Volpe, G. De Poli, and M. Leman [2] (Digital Object Identifier: 10.1109/MMUL.2005.2). In the cited article, the authors present the framework for expressive gesture analysis.

[1] D. Glowinski and M. Mancini, “Towards real-time affect detection based on sample entropy analysis of expressive gesture,” Affective Computing and Intelligent Interaction, pp. 527–537, 2011. 

[2] A. Camurri, G. Volpe, G. De Poli, and M. Leman, “Communicating expressiveness and affect in multimodal interactive systems,” IEEE MultiMedia, vol. 12, no. 1, pp. 43–53, 2005.

Wednesday, September 26, 2012

Programming on a Mobile Device



In my opinion, programming on a mobile phone is a cool idea, but in reality it will have very limited usage. In fact, I don't find any important use for this feature, except having a cool app on my mobile device.
 
One of the main problems with programming on a mobile phone is that most mobile devices have very small screens. Most smartphones now have an on-screen touch keyboard, which further reduces the usable screen area. It will be very hard for programmers to write anything using such a screen and keyboard. The next issue is compiling the code. Compilers nowadays are really heavy; they require huge system resources. If the compiler takes most of the resources of the mobile device, the device will become very slow. On the other hand, if the mobile OS allocates very limited resources to the compiler, the compiler will only be usable for developing small, DOS-calculator-style applications.
The next problem is power. Devoting most of the device's resources to compilation consumes energy, which will be hard for mobile devices to sustain.
This cool app will only be helpful for mobile application developers, who will get the benefit of programming on the live platform rather than using a mobile phone simulator.

Tuesday, September 25, 2012

Paper Blogs 03



Reference Paper
Shared Understanding and Synchrony Emergence
Synchrony as an Indice of the Exchange of Meaning between Dialog Partners
Ken Prepin and Catherine Pelachaud


Overview of the Paper

In face-to-face dialog, synchrony is claimed by psychologists to be one of the most crucial parameters: humans perceive the quality of an interaction from verbal and non-verbal synchrony. An artificial agent should therefore be able to synchronize with its human counterpart to give the human a feeling of natural interaction. In this paper the authors present a dynamic model of verbal and non-verbal communication. In test simulations, they show that if the partners in a dyad understand each other, synchrony emerges; otherwise, synchrony is disrupted.

They design an interaction model between two agents, agent1 and agent2. Each agent's state is represented by a variable S. The speech produced by each agent is represented by Vact, and the speech heard by each agent, the perceived signal, is represented by Vper. The dyadic communication model is shown in Figure 1. Here, the two 'level of understanding' parameters are denoted u and u'.

Figure 1

Again, an agent's internal state is reflected by its non-verbal behavior: the non-verbal behavior of the agent is a function of its internal state. After incorporating this non-verbal response, the model looks like Figure 2. Here each agent shows some non-verbal act, denoted NVact.

Figure 2
Moreover, humans are sensitive to perceived behavior and synchrony, so the model must be modified to include the perceived non-verbal behavior of the other agent. After perceiving the other agent's verbal (Vper) and non-verbal (NVper) acts, the agent's internal state changes. Figure 3 presents the updated model.

Figure 3

These agents have internal dynamics which control their behavior, and they must also be influenced by the other's behavior. The model is expressed by a pair of coupled equations given in the paper. After substituting for the internal states S, the resulting equations show that each agent is influenced not only by the state of the other, but also by its own state.
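The equations themselves appear as figures in the original paper and are not reproduced here. Purely to illustrate the structure described (own dynamics plus the partner's perceived signals weighted by understanding), a discrete update might look like this; the linear form and the names step and alpha are my own assumptions, not the authors' model:

```python
def step(s_self, v_per, nv_per, u, alpha=0.9):
    """One hypothetical discrete update of an agent's internal state S:
    the state follows its own dynamics (alpha * s_self) and is pushed
    by the partner's perceived verbal (v_per) and non-verbal (nv_per)
    signals, weighted by the level of understanding u."""
    return alpha * s_self + u * (v_per + nv_per)
```

With u = 0 (no understanding) the partner's behavior has no influence and each agent follows only its own dynamics; with u > 0 the two states become coupled.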



Evaluation

To simulate the model, they use the neural network simulator Leto/Prometheus, which updates the whole network at each time step. An agent's internal state is represented by a relaxation oscillator; at each step, a neuron feeds the oscillators of both agents. The relaxation oscillator's value increases linearly and drops rapidly when it reaches the threshold value of 0.95.
  
They then run the simulation for 5000 time steps. They consider the signals synchronized if the phase shift becomes near zero before time step 3000 and remains consistent afterwards. A synchronization result is shown in Figure 4.

Figure 4

They also try different values of the model parameters and rerun the simulation. They found that when the agents' understanding does not differ by more than 15%, the agents eventually synchronize, no matter what their initial phase shift was; if the understanding differs by more than 15%, they desynchronize.
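The paper's simulation runs on relaxation oscillators inside Leto/Prometheus, which I cannot reproduce here; the same qualitative behavior can be sketched with two coupled Kuramoto-style phase oscillators, where the coupling strength stands in for mutual understanding u and a frequency mismatch stands in for a difference in understanding. All parameter values below are my own choices, not the paper's:

```python
import numpy as np

def residual_phase_shift(u, w1=1.0, w2=1.0, k=0.5, steps=5000, dt=0.01):
    """Simulate two coupled phase oscillators and return the mean phase
    misalignment over the final 1000 steps (0 when phase-locked)."""
    t1, t2 = 0.0, 2.0                        # distinct initial phases
    mis = []
    for _ in range(steps):
        d = t2 - t1
        t1 += dt * (w1 + k * u * np.sin(d))  # each oscillator is pulled
        t2 += dt * (w2 - k * u * np.sin(d))  # toward the other's phase
        mis.append(abs(np.sin((t2 - t1) / 2)))  # 0 iff phases align
    return float(np.mean(mis[-1000:]))
```

With full understanding (u = 1) the oscillators lock regardless of the initial phase shift; with u = 0, or with a large enough frequency mismatch between the agents, they stay desynchronized, which qualitatively mirrors the 15% threshold the paper reports.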

Validity of the Paper

In this paper, the authors show two main results: desynchronization caused by misunderstanding happens very rapidly, and synchrony is evidence of good interaction. They use a strong and useful analysis method, time-lag analysis.

Improvement Scopes

I think the main challenge faced by researchers in this field is human unpredictability. It will be nice to see how this model works in human-agent interaction in the future. Moreover, the main challenge in implementing this model remains: understanding the verbal and non-verbal behavior of a human subject.

Further Reading

One of the interesting articles cited by this paper is “Nonverbal synchrony and rapport” by Marianne LaFrance (Digital Object Identifier: 10.2307/3033875). In the cited article, the author shows that posture sharing and rapport are positively correlated. She also hypothesizes that posture sharing may be an influential factor in establishing rapport.

References
[1] K. Prepin and C. Pelachaud, “Shared understanding and synchrony emergence: Synchrony as an indice of the exchange of meaning between dialog partners,” in Proceedings of ICAART 2011, International Conference on Agents and Artificial Intelligence, Rome, Italy, vol. 2, pp. 25–34.

[2] M. LaFrance, “Nonverbal synchrony and rapport: Analysis by the cross-lag panel technique,” Social Psychology Quarterly, vol. 42, no. 1, pp. 66–70, 1979.