QR code encoding is the process of converting input data into the matrix of dark and light modules that constitutes a QR code. The encoding process involves selecting the most efficient data mode, adding error correction, placing the resulting bits in the matrix as the specification prescribes, and adding function patterns (the finder, timing, and alignment patterns) that let a scanner locate and orient the symbol.
QR codes support four data modes, each optimized for a different character set: Numeric mode (digits 0-9, the most compact), Alphanumeric mode (0-9, uppercase A-Z, and nine symbols: space, $, %, *, +, -, ., /, and :), Byte mode (any 8-bit data, in practice often UTF-8 text), and Kanji mode (Shift JIS encoded Japanese characters). The encoder selects the most efficient mode for the input data and can switch between modes within a single QR code, so a string that mixes digits and letters can be split into segments that each use the cheapest mode.
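The mode choice described above can be sketched as a check against each character set, from most to least compact. This is a minimal illustration rather than a full encoder: the function name `select_mode` is an assumption made for this example, it picks one mode for the whole string instead of splitting the input into segments, and it leaves out Kanji mode.

```python
# The 45 characters of Alphanumeric mode: digits, uppercase letters, nine symbols.
ALPHANUMERIC_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"
NUMERIC_CHARS = "0123456789"


def select_mode(data: str) -> str:
    """Return the most compact single mode that can represent `data`.

    Hypothetical helper for illustration: no Kanji detection and no
    mixed-mode segmentation, both of which a real encoder may apply.
    """
    if data and all(ch in NUMERIC_CHARS for ch in data):
        return "numeric"
    if data and all(ch in ALPHANUMERIC_CHARS for ch in data):
        return "alphanumeric"
    # Anything else (lowercase letters, other symbols, non-ASCII) needs Byte mode.
    return "byte"


print(select_mode("0123456789"))
print(select_mode("HTTPS://EXAMPLE.COM/AB12"))
print(select_mode("https://example.com/ab12"))
```

The explicit digit string is used instead of `str.isdigit()` because the latter also accepts non-ASCII digits, which Numeric mode cannot encode.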
For URL shortening, the encoding mode affects how many modules the QR code needs. A short URL containing lowercase letters must be encoded in Byte mode at 8 bits per character, because Alphanumeric mode has no lowercase letters. A URL made up only of uppercase letters, digits, and the symbols ":", "/", and "." can use Alphanumeric mode, which packs two characters into 11 bits. Some URL shortening services offer uppercase-only short codes specifically to enable Alphanumeric encoding and therefore smaller QR codes.
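The size difference can be estimated by counting the bits each mode needs for one segment. The sketch below assumes a version 1 to 9 symbol, where the character-count field is 10, 9, or 8 bits for Numeric, Alphanumeric, and Byte mode respectively, and adds the 4-bit mode indicator. The function name `segment_bits` and the example.com URLs are illustrative assumptions, and Byte mode is assumed to carry UTF-8.

```python
ALPHANUMERIC_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

# Character-count field widths, assuming QR versions 1 through 9.
COUNT_BITS = {"numeric": 10, "alphanumeric": 9, "byte": 8}


def segment_bits(data: str, mode: str) -> int:
    """Bits for one segment: mode indicator + character count + payload."""
    if mode == "numeric":
        if not all(ch in "0123456789" for ch in data):
            raise ValueError("numeric mode accepts digits 0-9 only")
        n = len(data)
        # 3 digits -> 10 bits; a trailing group of 1 or 2 digits -> 4 or 7 bits.
        payload = 10 * (n // 3) + (0, 4, 7)[n % 3]
    elif mode == "alphanumeric":
        if not all(ch in ALPHANUMERIC_CHARS for ch in data):
            raise ValueError("character outside the alphanumeric set")
        n = len(data)
        # 2 characters -> 11 bits; a trailing single character -> 6 bits.
        payload = 11 * (n // 2) + 6 * (n % 2)
    elif mode == "byte":
        # Assumption: the text is carried as UTF-8, 8 bits per byte.
        payload = 8 * len(data.encode("utf-8"))
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return 4 + COUNT_BITS[mode] + payload


upper = "HTTPS://EXAMPLE.COM/AB12"
lower = "https://example.com/ab12"
print(segment_bits(upper, "alphanumeric"), segment_bits(lower, "byte"))  # 145 204
```

For this 24-character URL the uppercase form needs 145 bits against 204 for the lowercase form, which is the kind of saving that can let the symbol drop to a smaller version at the same error correction level.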
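The encoding process also includes adding Reed-Solomon error correction codewords, masking the data pattern, and adding format and version information. Masking XORs the data modules with one of eight predefined patterns so that dark and light modules are evenly distributed and the data does not form large uniform areas or shapes that resemble the finder patterns. The format information records the error correction level and the mask that was chosen, so the decoder can reverse the mask.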
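The masking step can be sketched with the eight mask conditions from the QR specification, where `r` is the row and `c` the column of a module. This is a simplified illustration: the helper name `apply_mask` is an assumption, the grid is treated as containing only data modules, and a real encoder would skip function patterns and score every mask with the specification's penalty rules before choosing one.

```python
# The eight mask conditions; a module is flipped where the condition is true.
MASK_CONDITIONS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(grid: list[list[int]], mask_id: int) -> list[list[int]]:
    """XOR every module (0 = light, 1 = dark) with the selected mask.

    Simplification: assumes every cell in `grid` is a data module.
    """
    condition = MASK_CONDITIONS[mask_id]
    return [
        [bit ^ int(condition(r, c)) for c, bit in enumerate(row)]
        for r, row in enumerate(grid)
    ]


blank = [[0] * 4 for _ in range(4)]
masked = apply_mask(blank, 0)
for row in masked:
    print(row)
```

Because masking is an XOR, applying the same mask a second time restores the original modules, which is how a decoder recovers the data once it has read the mask number from the format information.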