stephantul edited this page Feb 21, 2020 · 3 revisions

The pattern.fr module contains a fast part-of-speech tagger for French (identifies nouns, adjectives, verbs, etc. in a sentence), sentiment analysis, and tools for French verb conjugation and noun singularization & pluralization.

It can be used by itself or with other pattern modules: web | db | en | search | vector | graph.


Documentation

The functions in this module take the same parameters and return the same values as their counterparts in pattern.en. Refer to the documentation there for more details.  

Noun singularization & pluralization

For French nouns there is singularize() and pluralize(). The implementation uses a statistical approach with 93% accuracy for singularization and 92% for pluralization.

>>> from pattern.fr import singularize, pluralize
>>>  
>>> print(singularize('chats'))
>>> print(pluralize('chat'))

chat
chats 
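
To give a feel for what pluralization involves, here is a minimal rule-based sketch. It is not how pattern.fr works (the module uses a statistical approach); toy_pluralize() is a hypothetical helper that only encodes a few textbook suffix rules.

```python
def toy_pluralize(noun):
    """Pluralize a French noun with a few textbook suffix rules (toy sketch)."""
    if noun.endswith(("s", "x", "z")):
        return noun                  # nez -> nez, prix -> prix
    if noun.endswith(("eau", "au", "eu")):
        return noun + "x"            # bateau -> bateaux
    if noun.endswith("al"):
        return noun[:-2] + "aux"     # cheval -> chevaux
    return noun + "s"                # chat -> chats


for word in ("chat", "cheval", "bateau", "nez"):
    print(word, "->", toy_pluralize(word))
```

Rules like these break on exceptions such as pneu (pneus) or bal (bals), which is one reason a hand-written rule list alone does not reach perfect accuracy.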

Verb conjugation

For French verbs there is conjugate(), lemma(), lexeme() and tenses(). The lexicon for verb conjugation contains about 1,750 common French verbs (constructed with Bob Salita's verb conjugation rules). For unknown verbs it will fall back to regular expressions with an accuracy of about 83%. 

French verbs have more tenses than English verbs. In particular, the plural differs for each person, and there are additional forms for the FUTURE tense, the IMPERATIVE, CONDITIONAL and SUBJUNCTIVE moods, and the PERFECTIVE aspect:

>>> from pattern.fr import conjugate
>>> from pattern.fr import INFINITIVE, PRESENT, PAST, SG, SUBJUNCTIVE, PERFECTIVE
>>>  
>>> print(conjugate('suis', INFINITIVE))
>>> print(conjugate('suis', PRESENT, 1, SG, mood=SUBJUNCTIVE))
>>> print(conjugate('suis', PAST, 3, SG))
>>> print(conjugate('suis', PAST, 3, SG, aspect=PERFECTIVE))

être
sois
était 
fut   

For PAST tense + PERFECTIVE aspect we can also use PRETERITE (passé simple). For PAST tense + IMPERFECTIVE aspect we can also use IMPERFECT (imparfait):

>>> from pattern.fr import conjugate
>>> from pattern.fr import IMPERFECT, PRETERITE
>>>  
>>> print(conjugate('suis', IMPERFECT, 3, SG))
>>> print(conjugate('suis', PRETERITE, 3, SG))

était
fut   
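
The equivalence between the shorthand tenses and the tense + aspect pairs can be sketched as a small lookup. The string values and the normalize_tense() helper below are assumptions made for illustration; they are not taken from the pattern.fr source.

```python
# Hypothetical string stand-ins for the pattern.fr constants.
SHORTHAND_TENSES = {
    "imperfect": ("past", "imperfective"),  # imparfait
    "preterite": ("past", "perfective"),    # passé simple
}


def normalize_tense(tense, aspect="imperfective"):
    """Expand a shorthand tense into an explicit (tense, aspect) pair."""
    return SHORTHAND_TENSES.get(tense, (tense, aspect))


print(normalize_tense("preterite"))           # ('past', 'perfective')
print(normalize_tense("past", "perfective"))  # ('past', 'perfective')
```

Under this reading, PRETERITE and PAST with aspect=PERFECTIVE select the same verb form, which matches the two examples above both printing fut.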

The conjugate() function takes the following optional parameters:

Tense Person Number Mood Aspect Alias Example
INFINITIVE None None None None "inf" être
PRESENT 1 SG INDICATIVE IMPERFECTIVE "1sg" je __suis__
PRESENT 2 SG INDICATIVE IMPERFECTIVE "2sg" tu __es__
PRESENT 3 SG INDICATIVE IMPERFECTIVE "3sg" il __est__
PRESENT 1 PL INDICATIVE IMPERFECTIVE "1pl" nous __sommes__
PRESENT 2 PL INDICATIVE IMPERFECTIVE "2pl" vous __êtes__
PRESENT 3 PL INDICATIVE IMPERFECTIVE "3pl" ils __sont__
PRESENT None None INDICATIVE PROGRESSIVE "part" étant
 
PRESENT 2 SG IMPERATIVE IMPERFECTIVE "2sg!" sois
PRESENT 1 PL IMPERATIVE IMPERFECTIVE "1pl!" soyons
PRESENT 2 PL IMPERATIVE IMPERFECTIVE "2pl!" soyez
 
PRESENT 1 SG CONDITIONAL IMPERFECTIVE "1sg->" je __serais__
PRESENT 2 SG CONDITIONAL IMPERFECTIVE "2sg->" tu __serais__
PRESENT 3 SG CONDITIONAL IMPERFECTIVE "3sg->" il __serait__
PRESENT 1 PL CONDITIONAL IMPERFECTIVE "1pl->" nous __serions__
PRESENT 2 PL CONDITIONAL IMPERFECTIVE "2pl->" vous __seriez__
PRESENT 3 PL CONDITIONAL IMPERFECTIVE "3pl->" ils __seraient__
 
PRESENT 1 SG SUBJUNCTIVE IMPERFECTIVE "1sg?" je __sois__
PRESENT 2 SG SUBJUNCTIVE IMPERFECTIVE "2sg?" tu __sois__
PRESENT 3 SG SUBJUNCTIVE IMPERFECTIVE "3sg?" il __soit__
PRESENT 1 PL SUBJUNCTIVE IMPERFECTIVE "1pl?" nous __soyons__
PRESENT 2 PL SUBJUNCTIVE IMPERFECTIVE "2pl?" vous __soyez__
PRESENT 3 PL SUBJUNCTIVE IMPERFECTIVE "3pl?" ils __soient__
 
PAST 1 SG INDICATIVE IMPERFECTIVE "1sgp" j' __étais__
PAST 2 SG INDICATIVE IMPERFECTIVE "2sgp" tu __étais__
PAST 3 SG INDICATIVE IMPERFECTIVE "3sgp" il __était__
PAST 1 PL INDICATIVE IMPERFECTIVE "1ppl" nous __étions__
PAST 2 PL INDICATIVE IMPERFECTIVE "2ppl" vous __étiez__
PAST 3 PL INDICATIVE IMPERFECTIVE "3ppl" ils __étaient__
PAST None None INDICATIVE PROGRESSIVE "ppart" été
 
PAST 1 SG INDICATIVE PERFECTIVE "1sgp+" je __fus__
PAST 2 SG INDICATIVE PERFECTIVE "2sgp+" tu __fus__
PAST 3 SG INDICATIVE PERFECTIVE "3sgp+" il __fut__
PAST 1 PL INDICATIVE PERFECTIVE "1ppl+" nous __fûmes__
PAST 2 PL INDICATIVE PERFECTIVE "2ppl+" vous __fûtes__
PAST 3 PL INDICATIVE PERFECTIVE "3ppl+" ils __furent__
 
PAST 1 SG SUBJUNCTIVE IMPERFECTIVE "1sgp?" je __fusse__
PAST 2 SG SUBJUNCTIVE IMPERFECTIVE "2sgp?" tu __fusses__
PAST 3 SG SUBJUNCTIVE IMPERFECTIVE "3sgp?" il __fût__
PAST 1 PL SUBJUNCTIVE IMPERFECTIVE "1ppl?" nous __fussions__
PAST 2 PL SUBJUNCTIVE IMPERFECTIVE "2ppl?" vous __fussiez__
PAST 3 PL SUBJUNCTIVE IMPERFECTIVE "3ppl?" ils __fussent__
 
FUTURE 1 SG INDICATIVE IMPERFECTIVE "1sgf" je __serai__
FUTURE 2 SG INDICATIVE IMPERFECTIVE "2sgf" tu __seras__
FUTURE 3 SG INDICATIVE IMPERFECTIVE "3sgf" il __sera__
FUTURE 1 PL INDICATIVE IMPERFECTIVE "1plf" nous __serons__
FUTURE 2 PL INDICATIVE IMPERFECTIVE "2plf" vous __serez__
FUTURE 3 PL INDICATIVE IMPERFECTIVE "3plf" ils __seront__

Instead of the optional parameters, you can pass a single short alias, or PARTICIPLE or PAST+PARTICIPLE. With no parameters, the infinitive form of the verb is returned.
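
As a rough illustration of how an alias stands in for the full parameter set, the sketch below maps a handful of aliases from the table to their (tense, person, number, mood, aspect) values. The string values and the expand_alias() helper are hypothetical; the actual lookup inside pattern.fr may be organized differently.

```python
# A few aliases from the table, with hypothetical string stand-ins
# for the pattern.fr constants (PRESENT, SG, INDICATIVE, ...).
ALIAS_TO_PARAMETERS = {
    "inf":   ("infinitive", None, None, None, None),
    "1sg":   ("present", 1, "singular", "indicative", "imperfective"),
    "2sg!":  ("present", 2, "singular", "imperative", "imperfective"),
    "3sg->": ("present", 3, "singular", "conditional", "imperfective"),
    "1pl?":  ("present", 1, "plural", "subjunctive", "imperfective"),
    "3sgp":  ("past", 3, "singular", "indicative", "imperfective"),
    "3sgp+": ("past", 3, "singular", "indicative", "perfective"),
    "1sgf":  ("future", 1, "singular", "indicative", "imperfective"),
}


def expand_alias(alias="inf"):
    """Return the parameters an alias stands for; defaults to the infinitive."""
    tense, person, number, mood, aspect = ALIAS_TO_PARAMETERS[alias]
    return {"tense": tense, "person": person, "number": number,
            "mood": mood, "aspect": aspect}


print(expand_alias("3sgp+"))
```

The alias suffixes follow the pattern visible in the table: "!" marks the imperative, "->" the conditional, "?" the subjunctive, "p" the past, "+" the perfective aspect and "f" the future.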

Reference: Salita, B. (2011). French Verb Conjugation Rules. Retrieved from: http://fvcr.sourceforge.net.

Attributive & predicative adjectives 

French adjectives inflect with an -e, -s or -es suffix depending on gender and number. There are many irregular cases (e.g., curieux → une fille curieuse). You can get the base form with the predicative() function. A statistical approach is used with an accuracy of 95%.

>>> from pattern.fr import predicative
>>> print(predicative('curieuse'))

curieux  

Sentiment analysis

For opinion mining there is sentiment(), which returns a (polarity, subjectivity)-tuple, based on a lexicon of adjectives. Polarity is a value between -1.0 and +1.0, subjectivity between 0.0 and 1.0. The accuracy is around 74% (P 0.77, R 0.73) for book reviews:

>>> from pattern.fr import sentiment
>>> print(sentiment('Un livre magnifique!'))

(1.0, 1.0) 
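
The idea of scoring text from a lexicon of adjectives can be sketched as follows. This is a simplified illustration, not the pattern.fr implementation: TOY_LEXICON, its scores and toy_sentiment() are all made up, and the sketch simply averages the scores of the known adjectives it finds.

```python
import re

# Hypothetical mini-lexicon: adjective -> (polarity, subjectivity).
# The scores are invented for illustration; they are not pattern.fr's data.
TOY_LEXICON = {
    "magnifique": (1.0, 1.0),
    "mauvais": (-0.7, 0.9),
    "ennuyeux": (-0.5, 0.8),
}


def toy_sentiment(text):
    """Average the (polarity, subjectivity) scores of known adjectives."""
    words = re.findall(r"\w+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not scores:
        return (0.0, 0.0)  # no known adjective: neutral and objective
    polarity = sum(p for p, _ in scores) / len(scores)
    subjectivity = sum(s for _, s in scores) / len(scores)
    return (polarity, subjectivity)


print(toy_sentiment("Un livre magnifique!"))  # (1.0, 1.0)
```

A text without any adjective from the lexicon comes out as (0.0, 0.0) in this sketch, which is why lexicon coverage matters for this kind of approach.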

Parser

For parsing there is parse(), parsetree() and split(). The parse() function annotates words in the given string with their part-of-speech tags (e.g., NN for nouns and VB for verbs). The parsetree() function takes a string and returns a tree of nested objects (Text → Sentence → Chunk → Word). The split() function takes the output of parse() and returns a Text. See the pattern.en documentation for how to manipulate Text objects.

>>> from pattern.fr import parse, split
>>>  
>>> s = parse(u"Le chat noir s'était assis sur le tapis.")
>>> for sentence in split(s):
>>>     print(sentence)

Sentence('Le/DT/B-NP/O chat/NN/I-NP/O noir/JJ/I-NP/O'
         "s'/PRP/B-NP/O était/VB/B-VP/O assis/VBN/I-VP/O"
         'sur/IN/B-PP/B-PNP le/DT/B-NP/I-PNP tapis/NN/I-NP/I-PNP ././O/O')
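
Each token in the output above has the form word/POS/chunk/PNP. The sketch below reads such a tagged string with plain string operations, without using pattern.fr at all; read_tokens() is a hypothetical helper, and it assumes tokens are separated by single spaces and carry exactly four slash-separated fields.

```python
tagged = ("Le/DT/B-NP/O chat/NN/I-NP/O noir/JJ/I-NP/O "
          "s'/PRP/B-NP/O était/VB/B-VP/O assis/VBN/I-VP/O "
          "sur/IN/B-PP/B-PNP le/DT/B-NP/I-PNP tapis/NN/I-NP/I-PNP ././O/O")


def read_tokens(tagged_sentence):
    """Split a tagged sentence into (word, pos, chunk, pnp) tuples."""
    tokens = []
    for token in tagged_sentence.split(" "):
        # Split from the right so a word containing "/" keeps its slash.
        word, pos, chunk, pnp = token.rsplit("/", 3)
        tokens.append((word, pos, chunk, pnp))
    return tokens


nouns = [word for word, pos, _, _ in read_tokens(tagged) if pos.startswith("NN")]
print(nouns)  # ['chat', 'tapis']
```

For anything beyond quick inspection, the Text and Sentence objects returned by split() and parsetree() are the more convenient interface.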

The parser is based on Lefff. For words in Lefff that can have multiple part-of-speech tags, we used Lexique to find the most frequent POS-tag. 
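
Picking the most frequent tag for an ambiguous word can be sketched as below. The candidate tags and frequency counts are invented for illustration and are not taken from Lefff or Lexique; most_frequent_tag() is a hypothetical helper.

```python
# Hypothetical data: candidate tags per word (the kind of information a
# lexicon provides) and made-up counts per (word, tag) pair (the kind of
# information a frequency database provides).
CANDIDATE_TAGS = {"ferme": ["NN", "JJ", "VB"], "chat": ["NN"]}
TAG_FREQUENCY = {("ferme", "NN"): 40, ("ferme", "JJ"): 25, ("ferme", "VB"): 55}


def most_frequent_tag(word, default="NN"):
    """Return the candidate tag with the highest count for the word."""
    tags = CANDIDATE_TAGS.get(word)
    if not tags:
        return default  # unknown word: fall back to a default tag
    return max(tags, key=lambda tag: TAG_FREQUENCY.get((word, tag), 0))


print(most_frequent_tag("ferme"))  # VB, given the made-up counts above
```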

References

Sagot, B. (2010). The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French. Proceedings of LREC'10.

New, B., Pallier, C., Ferrand, L. & Matos, R. (2001). A lexical database for contemporary French: LEXIQUE. L'Année Psychologique.