Alexander Bass

C Caesar Cipher

I’ve been curious about the C programming language for a while, so I decided to put my fragmented knowledge of it to use by making a Caesar cipher transcoder. The Caesar cipher takes an input message and shifts each character by a fixed offset within the alphabet, wrapping around at the end. For example, with an offset of 1, abcd becomes bcde; with an offset of 2, it becomes cdef. The message hello world, shifted by 1, becomes ifmmp xpsme. To decode ifmmp xpsme, it is fed back into the cipher, this time with the negative of the offset used to encode it.
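The shifting step can be sketched in a few lines of C. This is only an illustration, not the code from the ACaesar repository: the names shift_letter and caesar are made up here, and the sketch assumes an ASCII-compatible character set where a to z are contiguous.

```c
#include <stddef.h>

/* Hypothetical helper: shift one lowercase letter by `offset` places,
 * wrapping around the 26-letter alphabet. Anything outside a-z is
 * returned unchanged. Assumes a-z are contiguous, as in ASCII. */
char shift_letter(char c, int offset)
{
    if (c < 'a' || c > 'z')
        return c;
    /* C's % can yield a negative result, so normalize into 0..25. */
    int shifted = ((c - 'a') + offset % 26) % 26;
    if (shifted < 0)
        shifted += 26;
    return (char)('a' + shifted);
}

/* Hypothetical helper: transcode `msg` in place.
 * A negative offset undoes a positive one. */
void caesar(char *msg, int offset)
{
    for (size_t i = 0; msg[i] != '\0'; i++)
        msg[i] = shift_letter(msg[i], offset);
}
```

Calling caesar on a writable copy of hello world with an offset of 1 gives ifmmp xpsme, and calling it again with -1 restores the original. The extra check after % matters because in C the remainder of a negative number is negative.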

Illustration from Wikipedia showing a Caesar cipher with an offset of -3

Beyond basic arithmetic and printf("Hello World!"), this is the first real program I’ve written in C. The basic idea is that you provide a message and an offset as parameters, and the program prints the transcoded message. While the idea is simple, I had one design criterion that made the program complicated: it must work with any user-specified alphabet, even one containing special characters.

Supporting custom alphabets is not particularly hard; the hard part was the special characters. To support more exotic characters like β and 📯, I had to support Unicode, specifically text encoded as UTF-8, where a single character can take up to four bytes. In almost any other language, Unicode support wouldn’t be terribly hard, but C comes from an era before Unicode and does not have good standardized support for it. While a normal (wise) person would reach for a library to handle special characters, I decided to write my own (terrible) UTF-8 handling library.
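To give a feel for what hand-rolled UTF-8 handling involves, here is a minimal decoding sketch. It is an assumption about how such a library might look, not the code from the project: utf8_len and utf8_decode are hypothetical names, and the sketch deliberately skips checks a real decoder needs, such as rejecting overlong encodings and surrogate code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes in the UTF-8 sequence that
 * starts with `lead`, or 0 if `lead` cannot start a sequence. */
size_t utf8_len(unsigned char lead)
{
    if (lead < 0x80)           return 1; /* 0xxxxxxx: plain ASCII */
    if ((lead & 0xE0) == 0xC0) return 2; /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3; /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4; /* 11110xxx */
    return 0;                            /* continuation or invalid byte */
}

/* Hypothetical helper: decode one code point from the NUL-terminated
 * string `s` into *out. Returns the number of bytes consumed, or 0 if
 * the sequence is malformed or cut short. */
size_t utf8_decode(const char *s, uint32_t *out)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = utf8_len(p[0]);
    if (n == 0)
        return 0;
    if (n == 1) {
        *out = p[0];
        return 1;
    }
    /* Keep only the payload bits of the lead byte. */
    uint32_t cp = p[0] & (0xFF >> (n + 1));
    for (size_t i = 1; i < n; i++) {
        /* Continuation bytes look like 10xxxxxx; a premature NUL
         * terminator fails this check too, so we never read past it. */
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    *out = cp;
    return n;
}
```

With this, β (bytes CE B2) decodes to U+03B2 in two bytes and 📯 (bytes F0 9F 93 AF) decodes to U+1F4EF in four. Working with whole code points rather than raw bytes is what lets a cipher treat an emoji as a single letter of the alphabet.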

The program works as follows:

> acaesar -o 3 "hello world"
khoor zruog

By default, the program uses the lowercase Latin alphabet, but any sequence of characters can be supplied as the charset using the -c <charset> flag:

> acaesar -o 3 -c 123456789 8675309
2918603

Even strings containing multi-byte UTF-8 characters can be used:

> acaesar -o 3 -c 😀😃😄😁😆😅😂🤣🥲🥹😊😇🙂🙃😉😌 😀😊😆🙂😁
😁🙃🤣😌😂

When the program encounters a character not found in the charset, it prints it as-is to the output:

> acaesar -o 3 "Hello... World!"
Hhoor... Wruog!
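The charset lookup and pass-through behaviour shown above can be sketched as follows. This is a simplified illustration rather than the project's actual code: transcode is a hypothetical name, and the sketch assumes a charset of single-byte characters only, so it would not handle the emoji example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: transcode `msg` in place, using `charset` as
 * the alphabet. Bytes that do not appear in the charset are left
 * untouched. Assumes every charset entry is a single byte. */
void transcode(char *msg, const char *charset, int offset)
{
    int n = (int)strlen(charset);
    if (n == 0)
        return;

    /* Reduce the offset into 0..n-1 once, so negative offsets work. */
    int shift = offset % n;
    if (shift < 0)
        shift += n;

    for (size_t i = 0; msg[i] != '\0'; i++) {
        const char *hit = strchr(charset, msg[i]);
        if (hit == NULL)
            continue; /* not in the alphabet: pass through as-is */
        int idx = ((int)(hit - charset) + shift) % n;
        msg[i] = charset[idx];
    }
}
```

Run on a writable copy of 8675309 with the charset 123456789 and an offset of 3, this produces 2918603: the 0 is not in the charset, so it is copied through unchanged, just like the capital letters and punctuation in Hello... World! under the default alphabet.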

The source code to this project can be found at git.alexanderbass.com/Alexander/ACaesar

Missing Features

I ran out of time making this project, and wish I could’ve added more to it. Here are some of the things:

Reflections

C feels as if it’s barely a programming language. In JavaScript, ‘everything is an object’, but in C, everything is a number. Characters are numbers, booleans are numbers, strings are really just pointers, pointers are numbers pointing to an address in memory, and arrays are syntactic sugar around pointers. Everything is very loose in C. Implicit type conversions happen without warning unless you turn on all the compiler’s warnings. The language allows you to do things you should never do. It has roughly four features: numbers, pointers, structs, and functions. Everything else has to be built by careful composition of those fundamentals. When writing my hacky UTF-8 parser, I found myself constantly taking shortcuts to make my code work. This project has certainly made me respect the people who can build huge, complex systems with only C.
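A few tiny functions make the ‘everything is a number’ point concrete. The function names below are made up for illustration, and the remark about warnings is based on my understanding of GCC and Clang, where this kind of narrowing is typically reported only when an extra flag such as -Wconversion is enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* A character is a small integer, so subtracting 'a' from it is
 * ordinary arithmetic. Assumes a-z are contiguous, as in ASCII. */
int letter_index(char c)
{
    return c - 'a';
}

/* A comparison evaluates to the int 1 or 0, so it can be added
 * straight into a running total. */
int count_spaces(const char *s)
{
    int total = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        total += (s[i] == ' ');
    return total;
}

/* a[i] is defined as *(a + i), so indexing and pointer arithmetic
 * always name the same element. */
bool index_is_pointer_math(const int *a, size_t i)
{
    return a[i] == *(a + i) && &a[i] == a + i;
}

/* Implicit conversion: the fractional part is dropped with no cast
 * and, by default, usually no complaint from the compiler. */
int truncate_silently(double d)
{
    int n = d;
    return n;
}
```

Here letter_index('d') is 3, count_spaces("a b c") is 2, and truncate_silently(3.9) is 3. None of these needs a cast or a library call; the language simply treats all of them as numbers.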

I’ve pointed out some of my gripes with C, and some of its flaws, but I don’t mean to say that it’s all bad. I really did enjoy making the cipher transcoder. I constantly found myself surprised at the things the C compiler would happily let me do.