Thursday, October 2, 2014

Was That...Sarcasm?



In the past weeks, we discussed Natural Language Processing (NLP) and sentiment analysis, and I learned that the goal of sentiment analysis is to make machines understand the opinions and attitudes behind texts written in natural language.

Then I raised a question. What about sarcasm?

We know that irony is difficult even for humans to understand. If you are also a fan of the popular sitcom The Big Bang Theory (TBBT), you will know that in this series Dr. Sheldon Cooper, with an IQ of 187, can hardly recognize sarcasm. His roommate Leonard even made a sarcasm sign to let him know when what Penny said to him was sarcastic.



In fact, NLP researchers have been trying to solve this problem since the 1990s, and they call it irony detection.

If we are talking to a person face to face, we may detect irony from his or her facial expression or tone. Detecting sarcasm in text and recovering its real polarity, however, is much more challenging.

Carvalho et al. (2009) suggest that emoticons, onomatopoeic expressions for laughter, heavy punctuation, quotation marks and positive interjections can help computers identify ironic sentences in user-generated content (UGC).
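
To make these clue types concrete, here is a minimal sketch of how such surface clues might be spotted with regular expressions. The patterns, the name `CLUE_PATTERNS` and the function `find_irony_clues` are my own illustrative assumptions, not the patterns used in the paper, and a matched clue only hints at irony rather than proving it.

```python
import re

# Hypothetical surface patterns, loosely modelled on the clue types listed
# above. They are illustrative assumptions, not the patterns from the paper.
CLUE_PATTERNS = {
    "emoticon": re.compile(r"[:;=]-?[)D(P]"),
    "laughter": re.compile(r"\b(?:(?:ha){2,}|(?:he){2,}|lol)\b", re.IGNORECASE),
    "heavy_punctuation": re.compile(r"[!?]{2,}|\.{3,}"),
    "quotation_marks": re.compile(r'["“][^"”]+["”]'),
    "positive_interjection": re.compile(r"\b(?:yay|wow|bravo|great)\b", re.IGNORECASE),
}


def find_irony_clues(text):
    """Return the sorted names of the clue types that appear in the text."""
    return sorted(
        name for name, pattern in CLUE_PATTERNS.items() if pattern.search(text)
    )


if __name__ == "__main__":
    # A made-up example sentence, not taken from any dataset.
    print(find_irony_clues('Oh wow, that was "so easy"!!! ;-)'))
```

A real system would probably feed these clues into a classifier as features instead of treating any single match as a verdict.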

Another group from Brazil (Vanin et al. 2013) focuses on Twitter. They summarize 15 expression patterns of irony from Portuguese tweets, and they combine “pattern detection” and “manual tagging” to improve their system’s accuracy for irony detection.


The results achieved with three classifiers, Naïve Bayes (NB), support vector machines (SVM), and decision trees (DT), are satisfactory in terms of classification accuracy as well as precision, recall, and F-measure.
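
For readers who have not met these measures before, here is a small sketch of how precision, recall and F-measure can be computed for the "ironic" class. The function name `precision_recall_f1` and the toy labels are assumptions made up for illustration; they are not data or results from the papers.

```python
def precision_recall_f1(gold, predicted, positive="ironic"):
    """Compute precision, recall and F1 for one class from two label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


if __name__ == "__main__":
    # Toy labels invented for this example only.
    gold = ["ironic", "ironic", "literal", "literal"]
    predicted = ["ironic", "literal", "ironic", "literal"]
    print(precision_recall_f1(gold, predicted))
```

Precision tells us how many of the sentences flagged as ironic really are ironic, recall tells us how many of the ironic sentences were found, and the F-measure is the harmonic mean of the two.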

Work on automatic irony processing is scarce, and a lot of effort is still needed. How can we make computers understand sarcasm? It is an interesting but tough question.



References:

CARVALHO, P., SARMENTO, L., SILVA, M.J. AND DE OLIVEIRA, E., 2009. Clues for
    Detecting Irony in User-generated Contents: Oh...‼ It’s “So Easy” ;-). In
    Proceedings of the 1st International CIKM Workshop on Topic-sentiment
    Analysis for Mass Opinion. TSA ’09. New York, NY, USA: ACM, pp. 53–56.

REYES, A. AND ROSSO, P., 2011. Mining Subjective Knowledge from Customer Reviews: A
    Specific Case of Irony Detection. In Proceedings of the 2nd Workshop on
    Computational Approaches to Subjectivity and Sentiment Analysis. WASSA ’11.
    Stroudsburg, PA, USA: Association for Computational Linguistics, pp. 118–124.

VANIN, A.A., FREITAS, L.A., VIEIRA, R. AND BOCHERNITSAN, M., 2013. Some Clues on
    Irony Detection in Tweets. In Proceedings of the 22nd International Conference on
    World Wide Web Companion. WWW ’13 Companion. Republic and Canton of Geneva,
    Switzerland: International World Wide Web Conferences Steering Committee,
    pp. 635–636.

12 comments:

  1. Hi Jiangyi,
    I find your topic quite interesting because I am also a big fan of The Big Bang Theory. Getting a computer (even one as capable as Sheldon) to understand sarcasm or the implication of a sentence is really not easy. On my blog, I described a sentiment analysis case in which NLP is used to help with opinion mining. Maybe we can start our discussion from that simpler case.

    1. I am glad that you like my post. I am a big fan of TBBT too. However, recently I have been too busy to watch the new season. I can't wait to watch the new episodes as soon as I finish this semester.
      I have read some of your posts. All of them are very useful and informative. Thank you for sharing.

  2. Hi Jiangyi, your blog is the best blog I have ever seen, and the irony detection you mentioned is very interesting. In The Big Bang Theory, Sheldon always asks himself, "Is that sarcasm?" Even a genius with as much knowledge as he has cannot detect sarcasm, which suggests that a database alone may not help in recognizing sarcasm.

    1. Thank you. I am glad that you love my blog. Social media is very interesting. However, as IT students, we should try to look at social media in an engineering or scientific way. It is not just fun, but also demanding and challenging.

  3. The picture of Sheldon attracted me, so I clicked into this blog. The idea of analyzing sarcasm with NLP is really interesting and new.

    1. Everyone loves Dr. Sheldon Cooper. Understanding sarcasm is difficult even for super geniuses. However, in the future, computers may understand it even better than humans, and machines may be able to learn by themselves.

  4. About this topic, I have an idea. Nowadays voice recognition has developed to a new level. As we can see, when we use WeChat, voice messages can be transformed into text messages quite accurately. I think sarcasm is always related to special voice patterns. We could extract these kinds of voice messages and use their text versions to build a dictionary of sarcasm. I think it may help with this problem.

    1. Interesting idea. Some voice recognition technology has been used for lie detection. Sometimes people may like this kind of function, since no one wants to be cheated. However, it may also be annoying for many people, because sometimes people do not want others to know the truth. That's why people do not like WhatsApp's blue ticks.

  5. When I first came to NLP and sentiment analysis, the same question occurred to me. I have to admit that sentiment analysis is impressive and fantastic, but I also wondered whether it has convincing enough methods to support it, because we can easily find things that are hard for a computer to analyze, like the sarcasm you mentioned. Later, I realized that maybe we should not focus on one opinion. We can take a large number of comments as our research data and then get the mood of most people, or the trend of social attitudes, which seems reasonable and applicable.

    1. Just as you said, this is a very difficult question and very hard to solve. It is demanding and challenging. There is still a lot of work to be done to optimize this technology.

  6. Hi Johnnu, your blog is really hot because of your interesting topics. Sentiment analysis can be easily misled by the presence of words that have a strong polarity but are used sarcastically, which means that the opposite polarity was intended. Consider the following tweet on Twitter, which includes the words “yay” and “thrilled” but actually expresses a negative sentiment: “yay! it’s a holiday weekend and i’m on call for work! couldn’t be more thrilled! #sarcasm.” In this case, the hashtag #sarcasm reveals the intended sarcasm, but we don’t always have the benefit of an explicit sarcasm label.

  7. To be honest, the picture of Sheldon attracted me first, and after reading your blog I know what sarcasm is. You explain it in quite an interesting way. Thanks for sharing.
