Introduction to Natural Language Processing

The landscape of NLP was very different in the beginning of the field. “But it must be recognized that the notion ‘probability of a sentence’ is an entirely useless one, under any known interpretation of this term 1968 p 53. Noam Chomsky. Probability was not seen very well (Chomsky has said many wrong things indeed), and linguists were considered useless. Recently deep learning and computational papers are ubiquitous in major conferences in linguistics, e.g. ACL. ...

2 min Â· Xuanqiang 'Angelo' Huang

Probabilistic Parsing

Language Constituents A constituent is a word or a group of words that function as a single unit within a hierarchical structure This is because there is a lot of evidence pointing towards an hierarchical organization of human language. Example of constituents Let’s have some examples: John speaks [Spanish] fluently John speaks [Spanish and French] fluently Mary programs the homework [in the ETH computer laboratory] Mary programs the homework [in the laboratory] ...

5 min Â· Xuanqiang 'Angelo' Huang

Softmax Function

Softmax is one of the most important functions for neural networks. It also has some interesting properties that we list here. This function is part of The Exponential Family, one can also see that the sigmoid function is a particular case of this softmax, just two variables. Sometimes this could be seen as a relaxation of the action potential inspired by neuroscience (See The Neuron for a little bit more about neurons). This is because we need differentiable, for gradient descent. The action potential is an all or nothing thing. ...

3 min Â· Xuanqiang 'Angelo' Huang

Bag of words

Bag of words only takes into account the count of the words inside a document, ignoring all the syntax and boundaries. This method is very common for email classifications techniques. We can say bag of words can be some sort of pooling, it’s similar to the computer vision analogue. It’s difficult to say what is the best method (also a reason why people say NLP is difficult to teach). Introduction to bag of words Faremo una introduzione di applicazione di NaĂ¯ve Bayes applicato alla classificazione di documenti. ...

2 min Â· Xuanqiang 'Angelo' Huang