Language processing is a very complex neural function that involves several parts of the brain.
Language has a fundamental role among the cognitive capabilities because it conceptualizes the information that the brain uses: words are the labels of knowledge.
In other words, information arrives at the brain in different forms (visual, smell, touch, ...), but it is ultimately transformed into language, which the higher cognitive functions use to produce responses.
To learn more: thebrain.mcgill.ca ("brain and mind" section; explained at beginner, intermediate and advanced levels) http://thebrain.mcgill.ca/flash/a/a_10/a_10_cr/a_10_cr_lan/a_10_cr_lan.html
Most human communication happens through speech, so language carries the information that represents knowledge. Unfortunately, dealing with sentences alone is not enough to understand the whole message.
The sentence "the cat shot at the moon" is syntactically and grammatically correct, but semantically impossible (in a real situation). To understand this message correctly, it is necessary to have specific information about "cat" and "moon", and even to determine which objects can "shoot" [semantic memory]. It is also necessary to combine these concepts, and probably add others, together with the context [logic, deduction, ...] to process the message correctly [understanding, learning].
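The role of semantic memory in the "cat shot at the moon" example can be illustrated with a minimal sketch. The `SEMANTIC_MEMORY` structure, its `is_a` and `can` fields, the sample concepts and the `is_plausible` helper are all hypothetical names chosen for illustration; they are not the format this project actually uses.

```python
from typing import Optional

# Hypothetical toy semantic memory: each concept lists the actions it can perform.
# The concepts and abilities below are illustrative assumptions only.
SEMANTIC_MEMORY = {
    "cat": {"is_a": "animal", "can": {"jump", "meow", "scratch"}},
    "hunter": {"is_a": "person", "can": {"shoot", "walk"}},
    "moon": {"is_a": "celestial body", "can": set()},
}


def is_plausible(subject: str, action: str) -> Optional[bool]:
    """Return True/False if the memory knows the subject, None if it does not."""
    concept = SEMANTIC_MEMORY.get(subject)
    if concept is None:
        return None  # nothing is known about this concept yet
    return action in concept["can"]


print(is_plausible("cat", "shoot"))     # the memory has no "shoot" ability for "cat"
print(is_plausible("hunter", "shoot"))  # "shoot" is listed for "hunter"
```

A grammatical parse alone cannot reject the sentence; only the stored knowledge about what a "cat" can do makes the check possible, which is why the memory has to be populated first.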
In any case, the objective of this project is to populate the semantic memory. Fortunately, sentences carry enough information for that purpose.
Sentence analysis is a really hard problem, with many specific points still unresolved, such as:
And others, such as translation, sentiment analysis, summarization, information extraction, question answering, and so on.
In fact, there is a specific field of computer science (artificial intelligence), natural language processing (NLP), that has been investigating these challenges for years.
It is a challenging aim with many difficulties to solve, such as lexical and syntactic ambiguity (the word "jump" can be both a noun and a verb) or semantic ambiguity (the sentence "I shouted at the man with a microphone" can mean either that I shouted at a man USING a microphone or that I shouted at a man WHO HAS a microphone in his hand).
To learn more: Natural Language Processing
To go deeper: NLP course
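The lexical ambiguity of a word like "jump" can be made concrete with a small sketch. It assumes a toy lexicon (`LEXICON`) and two hand-made context rules based on the previous word's tag; both are illustrative assumptions, and real taggers such as the Stanford one rely on statistical models rather than rules like these.

```python
from typing import List

# Toy lexicon (an assumption for illustration): word -> possible Penn Treebank tags.
LEXICON = {
    "the": ["DT"],
    "a": ["DT"],
    "to": ["TO"],
    "will": ["MD"],
    "big": ["JJ"],
    "cat": ["NN"],
    "jump": ["NN", "VB"],  # ambiguous: noun or verb
}


def disambiguate(words: List[str]) -> List[str]:
    """Pick one tag per word, using the previous word's tag as context."""
    tags: List[str] = []
    for word in words:
        candidates = LEXICON.get(word.lower(), ["NN"])  # unknown words default to noun
        prev = tags[-1] if tags else None
        if len(candidates) == 1:
            tags.append(candidates[0])
        elif prev in ("DT", "JJ") and "NN" in candidates:
            tags.append("NN")  # after a determiner or adjective, prefer the noun reading
        elif prev in ("TO", "MD") and "VB" in candidates:
            tags.append("VB")  # after "to" or a modal, prefer the verb reading
        else:
            tags.append(candidates[0])  # no usable context: fall back to the first tag
    return tags


print(disambiguate(["the", "big", "jump"]))
print(disambiguate(["the", "cat", "will", "jump"]))
```

The same surface word receives a different tag in each sentence, which is exactly the kind of decision a part-of-speech tagger has to make before any semantic processing can start.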
Stanford University also has an expert group that works on this topic and publishes great material about it: the Stanford Natural Language Processing Group (Stanford NLP group). It even provides several quite interesting software tools (Stanford NLP software).
This project uses the Stanford Natural Language Processing Core software to obtain fundamental syntactic and grammatical information about English sentences.
There is also an online demo where you can check how sentences are analysed.
Let's take the following sentence as an example: "the nice cat with big claws, which is a good pet, jumped over the wall"
Stanford NLP Core uses the University of Pennsylvania (Penn) Treebank tag set to classify words (POS = part of speech).
Only nouns [NN], adjectives [JJ] and action verbs (neither auxiliaries nor modals) [VB] have semantic value, so the key to this process is to identify these "semantic words" in sentences and store them in memory in a useful format.
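The selection of "semantic words" can be sketched as a filter over tagged tokens. In this sketch the Penn Treebank tags for the example sentence "the nice cat with big claws, which is a good pet, jumped over the wall" are written by hand rather than produced by the Stanford tools, and the `AUXILIARY_FORMS` list and the `semantic_words` function are hypothetical names; it is an illustration of the idea, not the project's actual process.

```python
from typing import List, Tuple

# Hand-written Penn Treebank tags for the example sentence
# (an assumption for illustration, not real tagger output).
TAGGED: List[Tuple[str, str]] = [
    ("the", "DT"), ("nice", "JJ"), ("cat", "NN"), ("with", "IN"),
    ("big", "JJ"), ("claws", "NNS"), (",", ","), ("which", "WDT"),
    ("is", "VBZ"), ("a", "DT"), ("good", "JJ"), ("pet", "NN"),
    (",", ","), ("jumped", "VBD"), ("over", "IN"), ("the", "DT"),
    ("wall", "NN"),
]

# Forms of "be", "have" and "do" carry VB* tags in the Penn tag set, so they are
# excluded by word form here (illustrative list, not exhaustive).
AUXILIARY_FORMS = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "have", "has", "had", "having",
    "do", "does", "did",
}


def semantic_words(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep nouns (NN*), adjectives (JJ*) and non-auxiliary verbs (VB*)."""
    kept = []
    for word, tag in tagged:
        lower = word.lower()
        if tag.startswith(("NN", "JJ")):
            kept.append((lower, tag))
        elif tag.startswith("VB") and lower not in AUXILIARY_FORMS:
            kept.append((lower, tag))
        # Modals are tagged MD, so they are dropped without a special case.
    return kept


print([word for word, _ in semantic_words(TAGGED)])
```

Matching on tag prefixes keeps the inflected variants (NNS, VBD, JJR and so on) without listing each one. The filter treats every form of "be", "have" and "do" as non-semantic, which is a simplification: "have" and "do" can also be main verbs.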
. . .
[the process is detailed here]
With this extracted information, it is possible to describe the concepts as:
[the internal language that the system understands is described here]
This information must be stored (learned) in memory in an organized way.
We will then be able to make some inquiries about those concepts, for example:
[how the system resolves those questions is described here]