Combinatorics on words (~1900) is a subdiscipline of mathematics and
theoretical computer science whose object is the study (search for patterns, computation of complexities, etc.) of [possibly infinite] words. An infinite word is a string containing an infinite number of characters, for example
w_1 = 0100101001001010010100100101001001...
or
w_2 = DUMLOQUIMURFUGERITINVIDAAETAS...
(the ellipsis represents the continuation of the infinite word, which of course cannot be displayed here). The combinatorics of words feeds on problems coming from genomics (both RNA and DNA sequences are words on the alphabet ATGC),
text algorithms, or even linguistics. In mathematics, infinite words appear naturally when one seeks to represent numbers (for example, the decimal expansion of pi is 3.14159265358979323846...);
or – and this is one topic I specifically work on – to encode dynamical systems (for example, if one decides to encode by "1" a rebound on a
vertical face, and by "2" a rebound on a horizontal face, the word 11211121... represents the
trajectory of a ball thrown on a square billiard table according to the figure below).
Domain of expertise: Sturmian words and their generalizations
Sturmian words form a class of infinite words over the binary alphabet which sheds light, through their equivalent definitions, on remarkable interactions between combinatorics, dynamical systems, and number theory. These definitions give rise to various classes of words over the d-letter alphabet, and a large program, initiated in the 80s, consists in determining whether some of these classes of words coincide, and which properties of Sturmian words they still satisfy.
Keywords:
[Objects] Words generated by multidimensional continued fraction algorithms (Arnoux-Rauzy & episturmian, Cassaine-Selmer...), hypercubic billiard words. Associated subshifts and languages.
[Properties] Growth of complexity (~entropy), imbalance and abelian complexity (~discrepancy), algebraic properties of frequencies, stability under renormalization processes.
[Tools] Substitutive structure and S-adicity, topology, graphs and automata, numerical experimentation.
Some results:
Contruction of an Arnoux-Rauzy word whose Rauzy fractal is unbounded in all directions of the plane.
This conflicts the intuition given by the Oseledets theorem for infinite products of matrices, which suggests that there should be a contractant direction.
Let d>0. Any d-ary strict episturmian word has rationally independent letter frequencies.
More generally, there is an elegant link between the dimension of the rational linear space spanned by the letter frequencies, and the S-adic structure of episturmian words.
[With L. Vivion] Proof of a conjecture of Rauzy from 1983.
Namely: there exists no ternary words with rationally independent letter frequencies and abelian complexity constant, equal to 3.
[With L. Vivion] Computation of the abelian complexity of hypercubic billiard words.
One of the few abelian complexity we explicitely know. Moreover, the expression is surprisingly simple, and echoes Sturmian words.
PhD Thesis:
Exceptional trajectories in the symbolic dynamics of multidimensional continued
fraction algorithms
The continued fraction is a numeration system (i.e., a way of representing numbers by
infinite sequences of integers, like the decimal expansion) based on Euclid's algorithm. This
numeration system is extremely interesting: it systematically provides the best approximations of
real numbers (for example: Pi, squareroot of 2, the golden ratio...) by ratios of integers - “better” meaning
that these are the most precise approximations that we can hope with a limited number of digits.
This property of best approximation has many applications, both in mathematics and beyond.
Since the 19th century, one has sought to generalize the continued fraction in higher
dimension. Several algorithms have been proposed in order to approach two real numbers by two
rational numbers with the same “small” denominator. My thesis is a part of a large program,
consisting in studying these algorithms from the standpoint of the symbolic dynamical systems they
generate. More precisely, I develop tools to detect and study a convergence anomaly identified for
the first time in 2000: the imbalance. This anomaly is liable to affect the quality of the simultaneous
approximations provided.