Wednesday, September 26, 2012

Programming on a Mobile Device



In my opinion, programming on a mobile phone is a cool idea, but in practice it will see very limited use. In fact, I don’t see any important use for this feature beyond having a cool app on my mobile device.
 
One of the main problems with programming on a mobile phone is that most mobile devices have very small screens. Most smartphones now use an on-screen touch keyboard, which further reduces the usable screen area, so it will be very hard for programmers to write anything with this screen and keyboard. The next issue is compiling the code. Compilers nowadays are heavy and demand substantial system resources. If the compiler takes most of the resources of the mobile device, the device will become very slow; on the other hand, if the mobile OS allocates very limited resources to the compiler, the compiler will only be usable for developing small, DOS-style calculator applications.
The next problem is power. Using most of the device’s resources for compilation consumes power, which is hard for mobile devices to sustain.
This kind of app will mainly be helpful for mobile application developers, who would get the benefit of programming on the live platform rather than using a mobile phone simulator.

Tuesday, September 25, 2012

Paper Blogs 03



Reference Paper
SHARED UNDERSTANDING AND SYNCHRONY EMERGENCE
Synchrony as an Indice of the Exchange of Meaning between Dialog Partners
Ken Prepin and Catherine Pelachaud


Overview of the Paper

In face-to-face dialog, synchrony is claimed by psychologists to be one of the most crucial parameters. Humans perceive the quality of an interaction from verbal and non-verbal synchrony, so an artificial agent should be able to synchronize with its human counterpart to give the human a feeling of natural interaction. In this paper, the authors present a dynamic model of verbal and non-verbal communication. In test simulations, they show that if the partners in a dyad understand each other, synchrony emerges; otherwise, synchrony is disrupted.

They design an interaction model between two agents, agent1 and agent2. Each agent’s state is represented by the variable S, the speech an agent produces by Vact, and the speech it hears (the perceived signal) by Vper. The dyadic communication model is shown in Figure 1, where the two ‘level of understanding’ parameters are denoted u and u’.

[Figure 1: dyadic verbal communication model]

An agent’s internal state is also reflected in its non-verbal behavior: the non-verbal behavior of the agent is a function of its internal state. After incorporating this non-verbal response, the model looks like Figure 2, where each agent produces a non-verbal act, denoted NVact.

[Figure 2: model extended with non-verbal behavior]
Moreover, humans are sensitive to perceived behavior and synchrony, so the model needs a further modification to include the perceived non-verbal behavior of the other agent. After perceiving the other agent’s verbal (Vper) and non-verbal (NVper) acts, an agent’s internal state changes. Figure 3 presents the updated model.

[Figure 3: model with perceived verbal and non-verbal signals]

These agents have internal dynamics that control their behavior, and they must also be influenced by the other’s behavior, which the paper expresses as a pair of coupled state-update equations. Replacing the internal states S then yields equations from which it is observed that each agent is influenced not only by the state of the other, but also by its own state.
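The equations themselves appear to have been embedded as images in the original post and did not survive. Purely as a sketch, a coupled-update form consistent with the description above (my notation, not necessarily the paper’s exact equations) would be:

```latex
% A reconstruction from the prose, not the paper's exact equations:
% each agent's next state depends on its own dynamics f and on the
% perceived verbal and non-verbal signals, scaled by understanding u, u'.
\begin{align}
  S_1(t+1) &= f\bigl(S_1(t)\bigr) + u  \,\bigl(V^{per}_1(t) + NV^{per}_1(t)\bigr) \\
  S_2(t+1) &= f\bigl(S_2(t)\bigr) + u' \,\bigl(V^{per}_2(t) + NV^{per}_2(t)\bigr)
\end{align}
```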



Evaluation

To simulate the model, they use the neural-network simulator Leto/Prometheus, which updates the whole network at each time step. An agent’s internal state is represented by a relaxation oscillator: at each step a neuron feeds the oscillators of both agents, and each oscillator’s value increases linearly, then decreases rapidly once it reaches the threshold value 0.95.
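As a rough illustration of this setup, here is a minimal Python sketch, not the authors’ Leto/Prometheus network; the rates and coupling constants are made-up values:

```python
# Two coupled relaxation oscillators: each rises linearly, resets after
# crossing the 0.95 threshold, and is nudged by its partner's state
# scaled by an "understanding" parameter (illustrative values only).
import numpy as np

def simulate(u12=0.9, u21=0.9, steps=5000, rate=0.01, coupling=0.005):
    s1, s2 = 0.0, 0.5            # internal states with an initial phase shift
    trace1, trace2 = [], []
    for _ in range(steps):
        # linear rise plus the influence of the other agent's state
        s1 += rate + coupling * u21 * s2
        s2 += rate + coupling * u12 * s1
        if s1 >= 0.95: s1 = 0.0  # rapid decrease at the threshold
        if s2 >= 0.95: s2 = 0.0
        trace1.append(s1)
        trace2.append(s2)
    return np.array(trace1), np.array(trace2)

t1, t2 = simulate()
```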
  
They then run the simulation for 5000 time steps and consider the signals synchronized if the phase shift becomes near zero before time step 3000 and remains consistent afterwards. A synchronization result is shown in Figure 4.

[Figure 4: example of emerging synchronization]
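Continuing the sketch above, one hypothetical way to apply their synchrony criterion is to compare the oscillators’ reset times (the exact tolerance values here are my assumptions):

```python
# Hypothetical check of the criterion: phase shift near zero before
# time step 3000 and roughly constant afterwards.
def reset_times(trace):
    # indices where the oscillator drops back toward zero
    return np.where(np.diff(trace) < -0.5)[0]

r1, r2 = reset_times(t1), reset_times(t2)
n = min(len(r1), len(r2))
shift = r1[:n] - r2[:n]                 # per-cycle phase shift in steps
early = shift[r1[:n] < 3000]
late = shift[r1[:n] >= 3000]
synchronized = (early.size > 0 and late.size > 0
                and abs(early[-1]) < 5 and np.std(late) < 5)
```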

They also experiment with different values of the model and simulation parameters. From their simulations, they find that when the agents’ levels of understanding differ by no more than 15%, the agents eventually synchronize, no matter what their initial phase shift was; if the understanding differs by more than 15%, they desynchronize.

Validity of the Paper

In this paper, the authors show two main results: desynchronization is caused by misunderstanding and happens very rapidly, and synchrony is evidence of good interaction. They use a strong and useful analysis method, time-lag analysis.
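The post does not spell out the analysis, but a generic time-lagged cross-correlation is the usual core of such a method. A sketch, with the lag range as my assumption:

```python
# Correlate two signals at a range of relative time lags; a peak away
# from lag 0 indicates one signal leading the other.
import numpy as np

def lagged_corr(x, y, max_lag):
    lags = range(-max_lag, max_lag + 1)
    corrs = []
    for lag in lags:
        if lag < 0:
            c = np.corrcoef(x[:lag], y[-lag:])[0, 1]
        elif lag > 0:
            c = np.corrcoef(x[lag:], y[:-lag])[0, 1]
        else:
            c = np.corrcoef(x, y)[0, 1]
        corrs.append(c)
    return list(lags), corrs
```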

Improvement Scopes

I think the main challenge faced by researchers in this field is human unpredictability. It will be nice to see how this model works on human-agent interaction in the future. Moreover, the main challenge in implementing this model remains: understanding the verbal and non-verbal behavior of a human subject.

Further Reading

One of the interesting articles cited by this paper is “Nonverbal synchrony and rapport” by Marianne LaFrance (DOI: 10.2307/3033875). In it, the author shows that posture sharing and rapport are positively correlated, and she presents the hypothesis that posture sharing may be an influential factor in establishing rapport.

References
[1] K. Prepin and C. Pelachaud, “Shared Understanding and Synchrony Emergence - Synchrony as an Indice of the Exchange of Meaning between Dialog Partners,” in Proceedings of ICAART 2011, International Conference on Agents and Artificial Intelligence, Rome, Italy, vol. 2, pp. 25–34.

[2] M. LaFrance, “Nonverbal synchrony and rapport: Analysis by the cross-lag panel technique,” Social Psychology Quarterly, vol. 42, no. 1, 1979, pp. 66–70.

 

Wednesday, September 19, 2012

Wireless Data Caps


In his article, Larry Dignan discusses usage-based pricing for smartphones. Smartphones have now become part of everyday life for most of us: we use them not only for voice, but also for browsing, social networking, texting, and even as a GPS. Except for voice and text, most services the operators provide need a data connection, and operators now even require smartphone users to have one.
In most cases, the operators force smartphone users to enroll in at least a minimum data plan. Often the data caps are so small that users exceed them within a week, after which they must pay a very high rate to access the internet.
In my opinion, usage-based charging is reasonable, but the charges beyond the data cap should be kept small. And if the goal is to reduce the huge data traffic on the network, one option when a user exceeds the allowed limit is to reduce that user’s bandwidth instead.

Monday, September 17, 2012

Paper Blogs 02: Towards Visual and Vocal Mimicry Recognition in Human-Human Interactions



Reference Paper
Towards Visual and Vocal Mimicry Recognition in Human-Human Interactions
Xiaofan Sun, Khiet P. Truong, Maja Pantic, Anton Nijholt
(Digital Object Identifier 10.1109/ICSMC.2011.6083693)

Overview of the Paper

Mimicry occurs in face-to-face conversation both when we agree and when we do not, but it is found that there is more mimicry when people agree than when they don’t. People try to convey a shared opinion by displaying behavior similar to their counterpart’s. In this paper, the authors present a method to detect and measure behavioral mimicry in face-to-face conversation by analyzing human actions and vocal behaviors.

They developed an audiovisual corpus specifically for mimicry research. The data are drawn from study sessions of face-to-face discussion and conversation with 43 different subjects. The experiment was divided into two sessions. In the first session, the participants were asked to present their own stance on a specific topic (the presentation episode) and then to discuss the topic with their partner (the discussion episode). In the second session, the participants were asked to talk among themselves about a non-task-oriented topic (the conversation episode). For the visual recording, 7 cameras per person and 1 camera covering both persons were used, and their voices were also recorded. The corpus was later annotated by specialists in human behavioral science.

For visual mimicry detection, they first extract motion features. They use accumulated motion images (AMI) to represent the motions; in an AMI, a higher intensity value represents a higher degree of complex motion. They then take the hand-gesture behavior of the conversation and compute the cross-correlation of movements between the two persons. Where the cross-correlations of body movements are similar, they assume that behavioral mimicry probably occurs in those time periods.
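As a rough sketch of the idea, here is my own simplified reading in Python, not the authors’ code: an AMI built from frame differences, plus windowed correlation of the two persons’ motion energy (window size and threshold are made-up):

```python
import numpy as np

def accumulated_motion_image(frames):
    # frames: array of shape (T, H, W); summing absolute frame differences
    # gives pixels involved in more motion a higher intensity
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    return diffs.sum(axis=0)

def motion_energy(frames):
    # one scalar amount of motion per time step for one person's video
    return np.abs(np.diff(frames.astype(np.float32), axis=0)).sum(axis=(1, 2))

def candidate_windows(e1, e2, win=50, thresh=0.7):
    # windows where the two motion-energy signals correlate strongly
    # are candidate periods of behavioral mimicry
    hits = []
    for start in range(0, min(len(e1), len(e2)) - win, win):
        c = np.corrcoef(e1[start:start + win], e2[start:start + win])[0, 1]
        if c > thresh:
            hits.append(start)
    return hits
```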

For detecting non-verbal vocal mimicry, the authors use the speech rate divided by the length of the signal as the feature. They calculate correlations between the speech patterns in different episodes and compare the correlations with each other. The participant’s behavior in the presentation episode is used as that participant’s baseline. Correlations decrease if the participant adapts to the partner’s speech behavior, and increase if the partner’s speech behavior becomes more similar to the participant’s.

Evaluation

To validate their result, the correlation curves are presented in the figure below. The solid line represents the correlation between the participants’ performance in the presentation and discussion episodes (curve A). The dashed line represents the correlation between the participants’ and partners’ performance in the discussion episode (curve B).

[Figure: correlation curves A (solid) and B (dashed) over sliding windows]
Three phases are observed in the result. Up to window number 8, both correlations increase. Between windows 8 and 17, curve A decreases and curve B increases; after that, curve A increases and curve B decreases. In phase 1, correlation A increases because the participant begins with a similar speech style in both the presentation and discussion episodes, and curve B also increases because the confederate starts the discussion with speech behavior more similar to the participant’s. In phase 2, correlation A decreases because the participant starts mimicking the confederate, while correlation B still increases because the participant and confederate are mimicking each other. In phase 3, both participant and confederate know the end of the discussion is approaching and start expressing their own opinions in their own styles, so correlation A increases and B decreases.
   

Validity of the Paper

In this paper, the authors show that behavioral information can be extracted from audiovisual data and used to measure mimicry. However, the correlation of visual behavior presented in the paper is not reliable enough to detect visual mimicry; from the results it can only be said that the participants show similar behavior to a certain degree. The authors also outline some future work for approaching this research problem.

 

Improvement Scopes

I think the main challenge faced by researchers in this field is annotated data. The authors generate a face-to-face meeting corpus that is very helpful for researchers in this field, although the data were collected in a highly constrained environment. There is also room to make the visual mimicry prediction more reliable. In my opinion, methods for combining the results obtained from multiple modalities will be the main improvement scope of this research area.

Further Reading

One of the interesting articles cited by this paper is “Histograms of oriented gradients for human detection” by N. Dalal and B. Triggs (DOI: 10.1109/CVPR.2005.177), which is cited by around 4000 articles. In it, the authors use the Histograms of Oriented Gradients (HOG) descriptor for human detection; their results show that HOG outperforms most other existing feature sets for this purpose.
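For a quick taste of the descriptor, scikit-image ships a ready-made HOG implementation (my choice of library; the 2005 paper of course used its own pipeline, and the sample image here is just a stand-in):

```python
# Compute a HOG descriptor for a sample image with scikit-image.
from skimage import data
from skimage.color import rgb2gray
from skimage.feature import hog

image = rgb2gray(data.astronaut())       # any grayscale image works
features, hog_image = hog(
    image,
    orientations=9,                      # gradient-direction bins per cell
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    visualize=True,                      # also return a visualization image
)
print(features.shape)                    # flattened descriptor vector
```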
 

References

[1] X. Sun, K. Truong, M. Pantic, and A. Nijholt, “Towards visual and vocal mimicry recognition in human-human interactions,” in Systems, Man, and Cybernetics (SMC), 2011 IEEE International Conference on. IEEE, 2011, pp. 367–373.

[2] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1. IEEE, 2005, pp. 886–893.