Input Interaction Modeling

As a Software Engineer, it is worth considering each input for more than just threats. Many Software Engineers already do this without extra effort, but it may be informed only by their own isolated experiences. This post will examine some formalization to enrich this existing practice. That way, with a little practice we can significantly improve our initial software security posture and overall quality.

Input

We will call the process “Input Modeling”. The goal is to understand the value, its context, and its usage. This can take place to a limited degree at design time, but it is more often the responsibility of the programmer at implementation time. Please note this is explicitly not a security practice, even though it has security implications, as any engineering activity does.

There are many similarities between human and application interfaces when it comes to input. At the most fundamental level, most input is either data to be reflected back to the user (human or application) or values used to carry out tasks like flow control.

Text

Text may seem complex to take as input and act on. However, when this must be done, it is good to examine how the input will be used. If the data is to be computed on, the scope of what is allowed is automatically reduced. It is easy to get caught in the snare of creating strict rules that make enhancement and maintenance difficult. However, normalizing the text (e.g. ToUpper()), comparing it to the expected inputs, and only then propagating it as a strong type or Enum value rather than converting it directly is a solid method of extracting reliable, actionable values.
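
Here is a minimal sketch of that approach in C# (the SortMode enum and its accepted spellings are hypothetical, made up for illustration):

    // Hypothetical example: mapping free-text input onto a known set of values.
    public enum SortMode { Ascending, Descending }

    public static class SortModeParser
    {
        // Normalize, compare against the expected inputs, and only then
        // hand the value on as a strong type. Anything unrecognized is rejected.
        public static bool TryParse(string input, out SortMode mode)
        {
            switch (input?.Trim().ToUpperInvariant())
            {
                case "ASC":
                case "ASCENDING":
                    mode = SortMode.Ascending;
                    return true;
                case "DESC":
                case "DESCENDING":
                    mode = SortMode.Descending;
                    return true;
                default:
                    mode = default;
                    return false;
            }
        }
    }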

When the data is something like a formatted blog post for the user, it is important to encode and escape it appropriately for the particular persistence. Improper escaping can result in data loss or corruption. Then, as the data leaves the code to go elsewhere, it must likewise be escaped or encoded for that particular destination. That is to say, the encoding must be decided at the point in the code where the output destination is known. No method of processing should be considered universal, or the data could lose integrity and cause issues on the receiving end.
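
As a sketch, using the HtmlEncoder and JavaScriptEncoder types from System.Text.Encodings.Web (the PostRenderer class itself is hypothetical), the choice of encoding lives at the boundary where the destination is known:

    using System.Text.Encodings.Web;

    public static class PostRenderer
    {
        // The raw post body is persisted as-is; encoding happens only at the
        // point where the output destination is known.
        public static string ForHtmlPage(string rawBody) =>
            HtmlEncoder.Default.Encode(rawBody);

        public static string ForJavaScriptString(string rawBody) =>
            JavaScriptEncoder.Default.Encode(rawBody);
    }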

It is helpful to be consistent within each context and to avoid multiple implementations of the same handling. More paths lead to less adoption. It is also important to make clear what is expected when processing the data; this leads to a quicker turnaround when flaws are found later.

Numbers

Numbers can be deceptively simple to deal with. When the value is typed by a user, or the transport carries it as text, the conversion back to a number can introduce issues if not done carefully. It is good to avoid writing custom code for these conversions, as the framework likely has something already.
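
For example, in .NET the built-in TryParse already handles overflow and malformed text, so a hypothetical quantity field might be converted like this:

    using System.Globalization;

    public static class QuantityParser
    {
        // Let the framework do the conversion and simply reject anything
        // it cannot parse, rather than hand-rolling the logic.
        public static bool TryParseQuantity(string input, out int quantity) =>
            int.TryParse(input, NumberStyles.Integer, CultureInfo.InvariantCulture, out quantity);
    }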

The second thing to consider is the bounds of the number. Even if the incoming value is large, or as large as the data type allows, cap it at a maximum that makes sense for your use. The stability gained is worth having to recompile when that bound needs to change.

Whenever possible, the smallest type should be used for the variable taking this input. Setting realistic bounds on the value can also prevent performance issues later. It should also be considered whether any numbers less than zero are needed; if not, reject them or use absolute values.
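
A sketch of those constraints combined, assuming a hypothetical page-size input that never legitimately exceeds 100:

    using System;

    public static class PageSizeInput
    {
        private const byte MaxPageSize = 100; // realistic upper bound; recompile to change it

        // Reject negatives outright, cap anything above the bound, and store
        // the result in the smallest type that fits.
        public static bool TryAccept(int requested, out byte pageSize)
        {
            pageSize = 0;
            if (requested < 0)
                return false; // values below zero are never meaningful here
            pageSize = (byte)Math.Min(requested, (int)MaxPageSize);
            return true;
        }
    }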

The bottom line is to take what will be used and leave everything else to pass from the memory of the request. Thoroughly examine the system’s use of the data and be specific about what is accepted.

Complex Types

Fundamentally speaking, all complex values are built from the primitives above. For this reason, and in the context of the system itself, further examination should be done on how the input value will interact and be combined with other values. If serialization is to be used, safety should be given at least the same importance as performance and ease of implementation. Further constraints may need to be applied before these interactions to ensure the integrity of the outcome.
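
As one hedged sketch using System.Text.Json (the DateRangeRequest shape and the one-year limit are made up for illustration), deserialize into a narrow type and constrain how the values combine before they reach the rest of the system:

    using System;
    using System.Text.Json;

    public sealed record DateRangeRequest(DateTime Start, DateTime End);

    public static class DateRangeReader
    {
        // Case-insensitive property matching; System.Text.Json already ignores
        // unknown members by default, so only the modeled properties survive.
        private static readonly JsonSerializerOptions Options =
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

        public static bool TryRead(string json, out DateRangeRequest request)
        {
            request = null;
            try
            {
                var candidate = JsonSerializer.Deserialize<DateRangeRequest>(json, Options);

                // Constrain how the values interact before they go any further.
                if (candidate is null ||
                    candidate.End <= candidate.Start ||
                    candidate.End - candidate.Start > TimeSpan.FromDays(365))
                    return false;

                request = candidate;
                return true;
            }
            catch (JsonException)
            {
                return false;
            }
        }
    }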

Successfully accepting input in a consistent manner is like an animal learning to walk. If it is not willing and able to do so, it is a liability to the herd and will be culled by predators. In the software world, this liability doesn’t go away when a coder does; it persists in the code until it is attacked. That attack could come from a clumsy user or from an attacker after your valuable data. Either could result in instability of the application and a loss of credibility and money.

Not Threat Modeling

If you are familiar with threat modeling, you may be able to see its relationship to Input Modeling. In the context of input, if adequate Input Modeling has happened, many threats would be ineffective without ever being considered specifically. Furthermore, Threat Modeling is often done without intimate working knowledge of the system. Because of this, the exercise has to start from widely known existing threats, producing less-than-creative threat theories, and extra work is then required to investigate those theories.

…But First

Before you leave the Input Model, once you have determined how to receive the value cleanly and consistently from the user, determine its sensitivity. This is a small thought exercise with huge value. Ask: Does it identify a person? If so, how directly (think PII, PHI, and GDPR)? How would the user feel if someone else saw it? Is there regulation around how it is used? Should you audit changes made to it?

Data sensitivity is something that should be considered from the beginning because it should determine what is done with the value. This level of detail does not always emerge from the architectural step.
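
One lightweight way to make those answers explicit in code (the attribute and the levels below are entirely hypothetical, not a standard API) is to tag the value where it is declared, so that logging, auditing, and storage decisions can key off the classification:

    using System;

    // Hypothetical sensitivity levels driven by the questions above.
    public enum Sensitivity { Public, Personal, Regulated }

    [AttributeUsage(AttributeTargets.Property)]
    public sealed class DataSensitivityAttribute : Attribute
    {
        public Sensitivity Level { get; }
        public DataSensitivityAttribute(Sensitivity level) => Level = level;
    }

    public sealed class UserProfile
    {
        [DataSensitivity(Sensitivity.Public)]
        public string DisplayName { get; set; }

        [DataSensitivity(Sensitivity.Regulated)] // identifies a person directly (PII)
        public string Email { get; set; }
    }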

When I get a chance, I will publish a checklist to help you build your security sensitivity.
