Web API Security Considerations

This is the second post in a series on Web APIs and Security. Why should Web APIs matter to security? What makes them different?

First, they have been around for a long time, and they are not going to go away. They cannot be ignored, and there is plenty of legacy folly to go around.

Secondly, since a Web API is easy to expose and consume through many flexible means, it is also easy to do so without understanding everything exposed. Nearly anything can interact with a Web API, so the caller should not be trusted to send clean data or even act the right way, not even after an Auth process, not even on a trusted network.

Then, there is the fact that they are meant to be automation friendly. This means that any cobbled-together bot can quickly examine the attack surface of an API, probe for more information, and attack with little investment from the bot’s human owner.

In addition, because any exposed function can be called directly, every function must enforce auth and properly handle raw user input, two things that are often glossed over in Software Engineering education and training. In complex Web API offerings, it is easy to see how areas can be neglected or missed entirely. For some, just keeping an inventory of what is there seems difficult.
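
As a minimal sketch of handling raw input, assume a hypothetical endpoint that receives an object ID as a string. The function name `parse_object_id` and the 64-bit upper bound are assumptions made for illustration, not part of any particular framework.

```python
MAX_ID = 2**63 - 1  # assumed upper bound: a signed 64-bit key


def parse_object_id(raw):
    """Turn an untrusted value into a positive integer ID, or raise ValueError."""
    if not isinstance(raw, str):
        raise ValueError("id must be a string")
    # isascii() plus isdigit() rejects signs, whitespace, decimals, Unicode
    # digits and anything resembling an injection payload.
    if not (raw.isascii() and raw.isdigit()) or len(raw) > 19:
        raise ValueError("id must be 1 to 19 ASCII digits")
    value = int(raw)
    if not 1 <= value <= MAX_ID:
        raise ValueError("id out of range")
    return value
```

Calling `parse_object_id("42")` yields the integer 42, while inputs such as `"-1"` or `"1; DROP TABLE users"` raise `ValueError` before they reach any query or business logic. Validation like this complements the auth check; it does not replace it.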

A Dash of Unlearn

When software engineers enter the workplace, their more experienced coworkers are usually buried in a day-to-day grind that does not explicitly include herding new talent. This is a fundamental flaw in many businesses’ software engineering cultures, and it leads teams to create rules that address the common mistakes programmers make early on. This is itself a mistake, one that instructors make all the time as well.

Some examples are: “If you have more than 7 parameters in your constructor, you need to divide it into two classes.” or “No fields, auto-properties only.” or my favorite, “Never use sequential Ids.” The issue with all these rules (especially that last one) is that they often hide the real issue without giving an opportunity to explain why. It is worse when these arbitrary rules come from somewhere outside engineering, from well-intentioned security workers, for example.

Then there are the rules that clearly came out of a catastrophic failure of some sort, after which someone handed down the command to “Never, ever, ever do X”. These find their way into software engineering and stink up the culture. Don’t do it. An ironic rule for how to treat rules is: if there is not a clear time to break the rule, it is probably invalid. If you come across a blanket rule, the effort spent finding out why it exists and when to break it will be well worth your time.

Dismantle – an exercise in mature engineering

There are better ways to work through the intricacies of protecting APIs, but first let’s dismantle one of those rules we were talking about: “Never use sequential identifiers.”

Personally, I think this one continues to proliferate because it is something even the most non-technical person can superficially understand. It has even made it into high profile documents and findings descriptions.

The basis for this rule seems to be simplicity itself, if not elegance: People and machines can count. Therefore, if I can count to the next number, I can guess the ID of the next object, even if it is not mine. This only matters, of course, if the system does not want you to access certain objects.

However, the fact that you have the ID, regardless of how you got it, should not give you access. The system must manage this authorization. It must be acknowledged that an ID can be obtained via shoulder-surfing, various browser and man-in-the-middle attacks, in addition to guessing. Therefore, the act of guessing is not the whole exploit, which also means it is not the mitigation.

The mitigation is to properly (read: securely) verify the authorization of the requestor to access the resource in the requested way. This is a suitable time to point out that a significant amount of implicit functionality is needed to do so. These Implicit Requirements are specific to the platforms and technologies involved in the Web API.
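
A minimal sketch of that check follows, assuming a hypothetical in-memory store of invoices keyed by sequential ID. The names `Invoice`, `INVOICES`, `NotFound` and `get_invoice` are invented for illustration; a real system would resolve the requester from its own auth layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    owner_id: int
    total_cents: int


class NotFound(Exception):
    """Raised for missing AND forbidden objects so callers cannot tell them apart."""


# Hypothetical data: sequential IDs, different owners.
INVOICES = {
    1: Invoice(1, owner_id=100, total_cents=2500),
    2: Invoice(2, owner_id=200, total_cents=9900),
}


def get_invoice(requester_id, invoice_id):
    """Return the invoice only if the authenticated requester owns it."""
    invoice = INVOICES.get(invoice_id)
    # Knowing (or guessing) the ID is never enough: ownership is checked on every call.
    if invoice is None or invoice.owner_id != requester_id:
        raise NotFound(f"invoice {invoice_id} not found")
    return invoice
```

With this in place, requester 100 can read invoice 1, but counting up to invoice 2 produces the same `NotFound` as asking for an invoice that does not exist.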

“All good?”

Does this mean it is always ok to use sequential IDs? No. Without a broad rule, how do you determine if it is ok to use a numeric and sequential ID? It helps to understand why sequential IDs are used in the first place. One may assume that it is simply convenient, but that would be oversimplifying.

Outside of a relational database, sequential IDs may be difficult to implement reliably. Because of the high performance of numeric indexes, relational database management systems (RDBMS) usually offer a way to quickly create fields that generate unique numbers. The most performant way to do this is to generate the numbers sequentially. So, it is convenient to simply use this contextually unique identifier. More importantly, it offers performance much greater than full-text indexes or even the special GUID indexes that some RDBMS support.

Ideas are more powerful than guns. We would not let our enemies have guns, why should we let them have ideas?

From a perspective that anyone can understand, seeing a number signals the possibility of a sequence. This can motivate anyone curious to test whether that is true. If an attacker substitutes another number in the sequence and is granted access to values they are not authorized to view, the Web API is said to have Broken Object Level Authorization. As the name indicates, the sequential ID is not the vulnerability. The vulnerability is that the system serves the data (aka the Object) to a requester who is not authorized to access it.

For the sake of discussion, imagine the insecurity is fixed. In this case, what would be gained by not using sequential IDs? If the system denies them access (with a ‘not found’ error), the attack fails. On one hand, we wasted an attacker’s time. On the other hand, we got them drooling. This could be seen as a net wash. You may be concerned that you got an attacker’s attention. It is unfortunate that poor software engineering has taught attackers that this is a straightforward way to get at what should otherwise be inaccessible. Be assured it is possible to teach them the opposite.

When considering performance, it may be that the performance cost of a GUID/UUID over a sequential ID is worth paying to avoid attracting attention. Or it may be possible to use a sequential ID for referential integrity within the relational data and another unique identifier in the Web API for a hybrid approach. This is how some object databases store data.
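
The hybrid approach could look roughly like the sketch below, which uses Python’s built-in `sqlite3` and `uuid` modules. The table and column names (`customer`, `invoice`, `public_id`) and the helper functions are assumptions for illustration only.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    id        INTEGER PRIMARY KEY,   -- sequential, internal only
    public_id TEXT NOT NULL UNIQUE,  -- random UUID, the only ID the API exposes
    name      TEXT NOT NULL
);
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),  -- join on the fast integer key
    total_cents INTEGER NOT NULL
);
""")


def create_customer(name):
    """Insert a customer and return the public identifier for API responses."""
    public_id = str(uuid.uuid4())
    conn.execute("INSERT INTO customer (public_id, name) VALUES (?, ?)", (public_id, name))
    return public_id


def add_invoice(public_id, total_cents):
    """Translate the public ID to the internal key once, then store by integer key."""
    row = conn.execute("SELECT id FROM customer WHERE public_id = ?", (public_id,)).fetchone()
    if row is None:
        raise LookupError("customer not found")
    conn.execute("INSERT INTO invoice (customer_id, total_cents) VALUES (?, ?)", (row[0], total_cents))


def invoice_totals(public_id):
    """Look up by public ID at the boundary; relational work stays on integer keys."""
    rows = conn.execute(
        "SELECT i.total_cents FROM invoice i "
        "JOIN customer c ON c.id = i.customer_id "
        "WHERE c.public_id = ? ORDER BY i.id",
        (public_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

The sequential key never leaves the database layer, so joins keep their integer-index performance while callers only ever see a value that gives no hint of a sequence. Authorization is still required on every lookup; the random ID only removes the invitation to count.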

Not All IDs Have the Same Value

Consider, for example, a sequential primary key that is also used as the ‘Customer Number’, which may be printed on statements and used over the phone. This makes the key both more sensitive and easier to obtain. It is good practice to have a well-defined, single purpose for each value, especially IDs.

Using a primary key as a ‘Customer Number’ could cause issues when performing data migration or schema changes. Object IDs (often called Primary Keys in relational databases) have the purpose of referential integrity and are used by the application. Other identifiers with the purpose of being used outside the system should be generated separately and be secured as an object value. URIs are logged as a reference in many situations in Web APIs. If the purposes are not kept separate it could lead to sensitive identifiers being leaked via standard logging.
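
One way to picture this separation is the sketch below. The `Customer` type, the 10-digit format and the URI layout are all assumptions for illustration, and a real system would also enforce uniqueness of the generated number in storage.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    row_id: int           # primary key: referential integrity only
    customer_number: str  # printed on statements; stored and protected as data


def new_customer_number():
    """Generate 10 random decimal digits, independent of the primary key."""
    return "".join(secrets.choice("0123456789") for _ in range(10))


def statement_uri(customer):
    """Build the URI from the object ID, so access logs never capture the customer number."""
    return f"/customers/{customer.row_id}/statement"
```

Because the customer number travels only as a field of the object, routine URI logging records the object ID and nothing that could be read back over the phone.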

To sum it all up, there is a case for not using sequential identifiers. However, sequential identifiers are not themselves a vulnerability. Therefore, an absolute ban on them driven by security brings the solution to a problem that does not exist. Software Engineering should drive the decision, weighing performance against exposure and the cost of engineering effort.

TL;DR

In many cases Software Engineering has been plagued by rules that hide genuine issues and rob coders of the opportunity to learn.

Where significant code is being created, the software industry is slow to mature when it comes to Security. This is despite significant advances in quality and process. The Application Security space is broken and steadily alienates itself from the rest of the process. Using buzzwords like DevSecOps is not going to fix it. It is time for development teams to bring security home to where it belongs.

What about Threat Modeling? Unfortunately, Threat Modeling is inadequate as a software engineering practice. As a Software Engineer, it is more appropriate to model values in the broader sense of how they are used, applying standards that handle those situations. Many Software Engineers do this already without thinking too much about it. However, the exercise may only be informed by one’s own experiences. Next, we will add some formalization to broaden and inform it. This way, with a little practice, we can significantly improve our initial software security posture.

The Code Snob

My adventures in snobbish Code Huffing
