Gem #83: Type-Based Security 2: Validating the Input
by Yannick Moy —AdaCore
Let's get started…
Input validation consists of checking a set of properties on the input which guarantee it is well-formed. This usually involves excluding a set of ill-formed inputs (black-list) or matching the input against an exhaustive set of well-formed patterns (white-list).
Here, we consider the task of validating an input for inclusion in an SQL command. This is a well-known defense against SQL injection attacks, where an attacker passes in a specially crafted string that is interpreted as a command rather than a plain string when executing the initial SQL command.
The basic idea is to define a new type SQL_Input derived from type String. Function Validate checks that the input is properly validated and fails if not. Function Valid_String returns the raw data inside a validated string, as follows:
package Inputs is type SQL_Input is new String; function Validate (Input : String) return SQL_Input; function Valid_String (Input : SQL_Input) return String; end Inputs;
The implementation of Validate simply checks that the input string does not contain a dangerous character before returning it as an SQL_Input, while Valid_String is a simple type conversion:
with Ada.Strings.Fixed; use Ada.Strings.Fixed; with Ada.Strings.Maps; use Ada.Strings.Maps; package body Inputs is Dangerous_Characters : constant Character_Set := To_Set ("""*^';&><</"); function Validate (Input : String) return SQL_Input is begin if Index (Input, Dangerous_Characters) /= 0 then raise Constraint_Error with "Invalid input " & Input & " for an SQL query "; else return SQL_Input (Input); end if; end Validate; function Valid_String (Input : SQL_Input) return String is begin return String (Input); end Valid_String; end Inputs;
Now, this does not prevent future uses of such type conversions in the program, whether malicious or unintended. To guard against such possibilities, we must make type SQL_Input private. To make sure we do not ourselves inadvertently convert an input string into a valid one in the implementation of package Inputs, we use this opportunity to make SQL_Input a discriminated record parameterized by the validation status.
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded; package Inputs is type SQL_Input (<>) is private; function Validate (Input : String) return SQL_Input; function Valid_String (Input : SQL_Input) return String; function Is_Valid (Input : SQL_Input) return Boolean; private type SQL_Input (Validated : Boolean) is record case Validated is when True => Valid_Input : Unbounded_String; when False => Raw_Input : Unbounded_String; end case; end record; end Inputs;
Each time we access field Valid_Input, a discriminant check will be performed to ensure that the operand of type SQL_Input has been validated. Observe the use of Unbounded_String for the type of the input component, which is more convenient and flexible than using a constrained string.
Note in the implementation of Validate, that instead of raising an exception when the string cannot be validated, as in the first implementation, here we create corresponding validated or invalid input values based on the result of the check against dangerous characters. Also, an Is_Valid function has been added to allow clients to query validity of an SQL_Input value.
with Ada.Strings.Fixed; use Ada.Strings.Fixed; with Ada.Strings.Maps; use Ada.Strings.Maps; package body Inputs is Dangerous_Characters : constant Character_Set := To_Set ("""*^';&><</"); function Validate (Input : String) return SQL_Input is Local_Input : constant Unbounded_String := To_Unbounded_String (Input); begin if Index (Input, Dangerous_Characters) /= 0 then return (Validated => False, Raw_Input => Local_Input); else return (Validated => True, Valid_Input => Local_Input); end if; end Validate; function Valid_String (Input : SQL_Input) return String is begin return To_String (Input.Valid_Input); end Valid_String; function Is_Valid (Input : SQL_Input) return Boolean is begin return Input.Validated; end Is_Valid; end Inputs;
That's it! As long as this interface is used, no errors can result in improper input being interpreted as a command, while ensuring that future maintainers of the code won't inadvertently be able to insert inappropriate conversions.
Of course, this minimal interface does not really provide anything other than the validation of the input. Simply having an Is_Valid function to tell whether a string is valid input data would seem to give you much the same functionality. However, you can now safely extend this package with additional capabilities, such as transformations on valid SQL inputs (for example, to optimize queries before sending them to the database), or to resolve queries faster using a local cache, and so forth. By using the private encapsulation, you are guaranteed that no client package will tamper with the validity of the SQL inputs you are manipulating.
Incidentally, the similar but distinct problem of input sanitization, where possibly invalid data is transformed into something that is known valid prior to use, can be handled in the same way.
About the Author
Yannick Moy’s work focuses on software source code analysis, mostly to detect bugs or verify safety/security properties. Yannick previously worked for PolySpace (now The MathWorks) where he started the project C++ Verifier. He then joined INRIA Research Labs/Orange Labs in France to carry out a PhD on automatic modular static safety checking for C programs. Yannick joined AdaCore in 2009, after a short internship at Microsoft Research.Yannick holds an engineering degree from the Ecole Polytechnique, an MSc from Stanford University and a PhD from Université Paris-Sud. He is a Siebel Scholar.