The way sentences containing the German character ß get longer when uppercased was specially designed to create memory problems in C programs doing string handling
@foone I was part of an upgrade project where program parts written in C were moved to Java. The DB layer of the program used two VARCHAR(n) columns for text - one as-is, and one in upper case for indices; both with the same n. The client truncated the string.
The upgrade was a long project and tested extensively, but on the day of the go-live the DB connection suddenly hung.
Reason: Java did The Right Thing and converted ß → SS, and the DB interface didn't deal well with too long strings.
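A minimal sketch of that failure mode, in Python rather than the Java of the original project. The column width `n` and the helper `upper_fits` are made up for illustration; the only thing taken from the story is that the uppercased copy shares the same n as the original:

```python
def upper_fits(text: str, n: int) -> bool:
    """Return True if the uppercased text still fits in n characters."""
    return len(text.upper()) <= n


name = "Straße"
n = len(name)  # hypothetical VARCHAR(n): exactly wide enough for the as-is text

upper = name.upper()  # full Unicode case mapping turns ß into SS
print(name, len(name))    # Straße 6
print(upper, len(upper))  # STRASSE 7
print(upper_fits(name, n))  # False: the uppercase column overflows by one
```

Truncating the as-is string to n before uppercasing does not help, because the growth happens afterwards; the check has to run on the uppercased result.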
@foone In some official capacities, where things have to be uppercased (from typewriter days), the "ß" has to be transformed into "(SS)" (so as to differentiate "Assman" -> "ASSMAN" and "Aßman" -> "A(SS)MAN", and yes it is hilarious for English readers).
It is its own pumping lemma of sorts.
@foone I was gonna say "just use ẞ" but depending on encoding, that might also add another byte or two I guess? Then again, is this really the only case where the uppercase variant of a character would require more bytes than the lowercase variant?
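A rough way to check both questions, assuming UTF-8 and Python's built-in `str.upper()` (which applies the full Unicode case mappings). The helper names `utf8_len` and `grows_when_uppercased` are made up for this sketch:

```python
def utf8_len(s: str) -> int:
    """Number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def grows_when_uppercased(ch: str) -> bool:
    """True if the uppercase form needs more UTF-8 bytes than ch itself."""
    return utf8_len(ch.upper()) > utf8_len(ch)


# ß is 2 bytes in UTF-8 and so is "SS"; the capital ẞ (U+1E9E) takes 3.
for s in ["ß", "ß".upper(), "ẞ"]:
    print(s, len(s), utf8_len(s))

# Scan the Basic Multilingual Plane, skipping surrogates (not encodable).
growing = [
    chr(cp)
    for cp in range(0x10000)
    if not 0xD800 <= cp <= 0xDFFF and grows_when_uppercased(chr(cp))
]
print(len(growing), growing[:10])
```

So in UTF-8, ß to SS adds a character but no bytes, while "just use ẞ" adds a byte. In a single-byte encoding such as Latin-1, ß is 1 byte and SS is 2. And ß is not alone: the scan turns up others, for example ŉ (U+0149), whose uppercase form is the two-character sequence ʼN.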
@foone Probably not related that much, but I remember that in the early days of mobile phones, when every text message was expensive, there was an outrage that Polish diacritics (ąęźćńż) were counted as more than one character within the 140-character limit.