How to Prevent Web Attacks Using Input Sanitization

Despite all the security measures you might take, a codebase can be the weakest link in any business’s cybersecurity. Sanitizing and validating inputs is usually the first layer of defense. Sanitizing removes or neutralizes unsafe characters in user input, while validating checks that the data matches the expected format and type.
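
To make the distinction concrete, here is a minimal sketch in PHP. The helper names, the allowed character set, and the age range are assumptions chosen for illustration, not rules taken from any particular application. The first function sanitizes (it rewrites the input), while the second validates (it only accepts or rejects).

```php
<?php
// Hypothetical helper: sanitizing transforms the input into a safer form.
function sanitizeUsername(string $raw): string
{
    // Trim whitespace, then drop every character outside a conservative allow-list.
    return preg_replace('/[^A-Za-z0-9_.-]/', '', trim($raw)) ?? '';
}

// Hypothetical helper: validating never rewrites; it only says yes or no.
function isValidAge($raw): bool
{
    // The 0-150 range is an assumption for this example.
    return filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]) !== false;
}

var_dump(sanitizeUsername('  bob<script> '));  // string(9) "bobscript"
var_dump(isValidAge('42'));                    // bool(true)
var_dump(isValidAge('42; DROP TABLE users'));  // bool(false)
```

Which of the two is appropriate depends on the field: rejecting invalid data outright is often safer than silently rewriting it.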

Attackers have been using classic flaws for years with a pretty high success rate. While advanced threat actors have more sophisticated approaches such as adversarial machine learning, advanced obfuscation, or even zero-day exploits, classic approaches such as SQL injection, XSS, RFI (remote file inclusion), or directory traversal are still the most common attacks.

These attacks are often the first step that allows privilege escalation and lateral movement. That’s why developers must sanitize and validate data correctly before saving any entry in a database or processing transactions.

While this guide focuses on sanitizing and validating inputs, other elements such as the server’s configurations must also be taken into account to secure forms.

Never Trust User Inputs

Some websites don’t bother checking user inputs at all, which leaves the app wide open. Fortunately, that’s becoming increasingly rare thanks to security awareness and code analysis. However, incomplete sanitization is not much better.

Here are a few of the possible attack paths to think about.

GET requests

If developers don’t sanitize strings correctly, attackers can take advantage of XSS flaws such as:

https://mysite.com/?s=<script>console.log('you are in trouble!');</script>

Security awareness material usually illustrates this with a harmless console.log or alert. The point, however, is that anyone can execute arbitrary JavaScript on your page simply by sending a shortened version of the malicious URL to unsuspecting victims.

Some XSS flaws are even persistent (stored in the database, for example). In that case the attacker no longer needs to trick a victim into clicking a link, because the site itself serves the malicious payload to every user who loads the affected page.
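
As a rough sketch of the fix for the reflected case, the search term from the s parameter can be escaped before it is written back into the page. The renderSearchHeading() helper and the surrounding markup are hypothetical; htmlspecialchars() is the standard PHP function doing the work.

```php
<?php
// Hypothetical helper that echoes the search term from ?s=... back into the page.
function renderSearchHeading(string $term): string
{
    // ENT_QUOTES escapes both quote styles; ENT_SUBSTITUTE replaces invalid
    // UTF-8 sequences instead of returning an empty string.
    $safe = htmlspecialchars($term, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    return '<h1>Results for "' . $safe . '"</h1>';
}

// Query parameters can arrive as arrays (?s[]=x), so check the type first.
$term = isset($_GET['s']) && is_string($_GET['s']) ? $_GET['s'] : '';
echo renderSearchHeading($term);
```

With the payload from the URL above, the browser receives &lt;script&gt; as inert text rather than an executable script tag.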

Cookies

Websites often use HTTP cookies for session management, customization, and tracking. For example, developers can log in users, remember their preferences, and analyze their behaviors.

The server generates a cookie, a small piece of data, and sends it to the browser, which saves it and returns it with later requests. As a result, attackers who steal a session cookie can impersonate the victim and gain immediate access to the targeted account without logging in.

Moreover, hackers don’t have to compromise the victim’s computer. Because HTTP cookies are sent along with each request, attackers can intercept those requests to steal data during man-in-the-middle (MITM) attacks, for example.

A more sophisticated approach can use an XSS attack to insert malicious code into the targeted website to ultimately copy users’ cookies and perform harmful actions in their name.

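While Google has announced plans to restrict third-party cookies in its Chrome browser, first-party cookies such as session cookies are not going away, so it’s still important to follow best practices. For example, as of 2022, TLS (Transport Layer Security, the successor to SSL) is no longer an optional layer. If the code sends requests over plain HTTP, cookies travel in plain text, so serve everything over HTTPS and set the Secure attribute so the browser never sends the cookie over an unencrypted connection.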

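Another good practice is to always set the HttpOnly attribute, which stops JavaScript from reading the cookie and therefore limits cookie theft through XSS. The SameSite attribute is also recommended, because it restricts when the browser sends the cookie on cross-site requests.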
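
In PHP, these attributes can be set through the options array that setcookie() accepts since PHP 7.3. The sketch below is one possible setup: the cookie name, the lifetime, and both helper functions are assumptions made for the example.

```php
<?php
// Hypothetical helper; the attribute values are a reasonable default, not a universal rule.
function sessionCookieOptions(int $lifetimeSeconds): array
{
    return [
        'expires'  => time() + $lifetimeSeconds,
        'path'     => '/',
        'secure'   => true,   // only sent over HTTPS
        'httponly' => true,   // not readable through document.cookie
        'samesite' => 'Lax',  // withheld on most cross-site requests
    ];
}

// Hypothetical helper; must be called before any output is sent to the browser.
function issueSessionCookie(): bool
{
    // random_bytes() provides a cryptographically secure, unguessable identifier.
    return setcookie('session_id', bin2hex(random_bytes(32)), sessionCookieOptions(3600));
}
```

SameSite can also be set to Strict for cookies that never need to accompany a request coming from another site.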

While cookies are convenient for both users and developers, modern authentication and APIs allow better approaches. Because data stored on the client side is exposed to many security and privacy risks, it’s better to keep sensitive data on the server and implement more secure practices instead.

POST requests

POST requests carry their data in the request body, so they do not expose it in the URL. Browsers send them, for example, when you upload an image to your online account or submit a contact form such as <form action="https://my-website.com/contact" method="POST">.

A common misconception is that POST requests are more secure than GET requests. At most, POST requests offer security through obscurity. While POST is the right choice for requests that modify data, it does not serve any security-related purpose on its own and won’t magically harden the application.

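One very simple way to sanitize and validate POST data in PHP is through filter_var(). Note that FILTER_SANITIZE_STRING, which used to be the common choice for sanitizing, is deprecated as of PHP 8.1; FILTER_SANITIZE_FULL_SPECIAL_CHARS is a supported alternative. The address in the second call is a placeholder:

filter_var($_POST['message'], FILTER_SANITIZE_FULL_SPECIAL_CHARS);

filter_var('user@example.com', FILTER_VALIDATE_EMAIL);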

Another good practice in PHP is to use htmlentities() to escape any unwanted HTML character in a string.
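
Putting these pieces together, a contact form handler might look roughly like the sketch below. The field names (email, message), the 2,000-byte limit, and both helper functions are assumptions for illustration only.

```php
<?php
// Hypothetical validator for a contact form with "email" and "message" fields.
function validateContactForm(array $post): array
{
    $errors = [];

    // FILTER_VALIDATE_EMAIL returns the address on success and false otherwise.
    if (filter_var($post['email'] ?? '', FILTER_VALIDATE_EMAIL) === false) {
        $errors[] = 'Invalid email address.';
    }

    // Reject non-string values (for example, arrays), empty text, and oversized text.
    $message = $post['message'] ?? '';
    if (!is_string($message) || trim($message) === '' || strlen($message) > 2000) {
        $errors[] = 'Message must be between 1 and 2000 bytes.';
    }

    return $errors;
}

// Hypothetical helper: escape only when the value is printed into HTML.
function escapeForHtml(string $value): string
{
    return htmlentities($value, ENT_QUOTES, 'UTF-8');
}

$errors = validateContactForm($_POST);
```

An empty array means the submission passed the checks; the message should still go through escapeForHtml() whenever it is displayed.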

As with cookies, always use TLS to encrypt data in transit, so only TCP/IP information is left unencrypted.

Directory traversal

If the codebase includes an image tag such as <img src="/getImages?filename=image12.png">, then hackers may try a request such as https://yourwebsite.com/getImages?filename=../../../etc/passwd to read sensitive files on the server, in this case the list of system accounts.

However, if your server is configured correctly, such attempts to disclose confidential information will be blocked. You should also consider filtering user inputs and ensuring only the expected formats and data types are transmitted.
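
One way to filter the filename parameter on the application side is sketched below. The resolveImagePath() helper, the allowed extensions, and the idea of a single image directory are assumptions; the sketch combines an allow-list pattern with a realpath() check as defense in depth.

```php
<?php
// Hypothetical resolver for /getImages?filename=...
function resolveImagePath(string $baseDir, string $filename): ?string
{
    // Accept only a bare file name with an expected extension: no slashes, no "..".
    if (preg_match('/\A[A-Za-z0-9_-]+\.(png|jpe?g|gif)\z/', $filename) !== 1) {
        return null;
    }

    // realpath() resolves symlinks and ".." and returns false for missing files.
    $base = realpath($baseDir);
    $path = realpath($baseDir . DIRECTORY_SEPARATOR . $filename);
    if ($base === false || $path === false) {
        return null;
    }

    // The resolved file must still live inside the image directory.
    $prefix = $base . DIRECTORY_SEPARATOR;
    if (strncmp($path, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $path;
}
```

A null result should translate into a 404 response, and the request for ../../../etc/passwd never gets past the first check.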

Don’t Trust Client-Side Validation

A common mistake, especially for beginners, is to rely only on HTML and JavaScript to validate form data. HTML does allow defining patterns and required fields, such as setting a character limit or requiring specific fields to be filled; however, there is no HTML attribute or JavaScript code that can’t be modified on the client side.

Hackers might also submit the form using cURL or any HTTP client, so the client side is absolutely not a secure layer to validate forms.
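
In practice, this means every constraint declared in the markup has to be repeated on the server. As a small sketch, assume a form with a <select name="plan"> element offering three options; the option values and the validatePlan() helper are hypothetical.

```php
<?php
// Hypothetical allow-list mirroring the <option> values of <select name="plan">.
const ALLOWED_PLANS = ['free', 'pro', 'team'];

function validatePlan($raw): ?string
{
    // The third argument enables strict comparison, so no type juggling occurs.
    return is_string($raw) && in_array($raw, ALLOWED_PLANS, true) ? $raw : null;
}

var_dump(validatePlan('pro'));    // string(3) "pro"
var_dump(validatePlan('admin'));  // NULL
```

A request crafted with cURL can send any value for plan, but only the three expected ones survive this check.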

Enable Strict Mode

Whenever you can, enable strict mode, whether in PHP, JavaScript, SQL, or any other language. However, because strict mode disallows many convenient syntaxes, it can be difficult to enable in a codebase that carries significant technical debt and legacy code.

On the other hand, if you don’t code in strict mode, the engine starts making guesses and can even modify values automatically to make the code work. This opens up vulnerabilities hackers can utilize to inject malicious commands.
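
In PHP, the closest equivalents are the strict_types declaration and the strict comparison operator. The sketch below uses two hypothetical functions to show the difference; nothing here is specific to a given framework.

```php
<?php
declare(strict_types=1);

// Hypothetical function used only to show how strict typing rejects ambiguous input.
function applyDiscount(int $percent): string
{
    return "Discount: {$percent}%";
}

function tryDiscount($value): string
{
    try {
        // Without strict_types, PHP would silently coerce the string "15" to 15.
        return applyDiscount($value);
    } catch (TypeError $e) {
        return 'rejected';
    }
}

var_dump(tryDiscount(15));    // string(13) "Discount: 15%"
var_dump(tryDiscount('15'));  // string(8) "rejected"

// Loose comparison guesses that both strings are numbers; strict comparison does not.
var_dump('1e3' == '1000');    // bool(true)
var_dump('1e3' === '1000');   // bool(false)
```

Note that declare(strict_types=1) has to be the first statement of the file and only affects function calls made from that file.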

For example, in 2015, Andrew Nacin, a major contributor to WordPress, explained how a critical security bug could have been avoided just by enabling strict mode in SQL. During the conference, he demonstrated how hackers could exploit a critical vulnerability by using four-byte characters to force MySQL truncation and then inject malicious code in the database.

While a simple solution to prevent such an attack would be to execute the command SET SESSION sql_mode = "STRICT_ALL_TABLES"; WordPress could not enable it across the board without risking breaking the existing websites it powers.
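
When strict mode cannot be enabled, the application can at least refuse values the database would otherwise truncate. The sketch below is one possible application-side guard, not the patch WordPress shipped; the fitsColumn() helper is hypothetical and assumes a column that stores at most three bytes per character.

```php
<?php
// Hypothetical guard: reject values a three-byte-per-character column could truncate.
function fitsColumn(string $value, int $maxChars): bool
{
    // With the /u modifier, preg_match() returns 1 when a four-byte character
    // is present, 0 when there is none, and false when the UTF-8 is invalid.
    if (preg_match('/[\x{10000}-\x{10FFFF}]/u', $value) !== 0) {
        return false;
    }

    // Count characters (not bytes) and refuse anything longer than the column.
    $length = preg_match_all('/./us', $value);

    return $length !== false && $length <= $maxChars;
}

var_dump(fitsColumn('hello', 10));          // bool(true)
var_dump(fitsColumn("hi \u{1F600}", 10));   // bool(false)
```

Rejecting the value keeps what was validated identical to what ends up stored, which is the property the truncation attack relies on breaking.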

Read the OWASP WSTG

The Open Web Application Security Project, or OWASP, maintains comprehensive documentation called the Web Security Testing Guide (WSTG), which includes a section on input validation testing.

This guide offers information on how to test various injections and other sneaky attacks on inputs. The content is frequently updated, and there are detailed explanations for various scenarios.

For example, you can check out their page on Testing for Stored Cross Site Scripting to learn how persistent XSS works and how to reproduce the exploit.

[Figure: XSS example]

Learn Your Craft, Escape Late

Sanitizing and validating inputs is mandatory, but you cannot apply a generic solution to all entries. You have to consider the specific context to block injections. Moreover, don’t store anything in the database without validating it, and still escape values before displaying them, because some injections can poison database records.

Another essential practice is to escape data as late as possible, preferably just before display. This way, you know the final context exactly and are far less likely to leave data unescaped.
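
The sketch below shows why the final context matters: the same stored value needs a different escaping function depending on where it is printed. The sample value and variable names are made up for the example; the functions are standard PHP.

```php
<?php
// Value as stored: unescaped, exactly as it was validated.
$displayName = 'Tom & "Jerry" <admin>';

// HTML body and attribute context.
$escaped = htmlspecialchars($displayName, ENT_QUOTES, 'UTF-8');
$html = '<p title="' . $escaped . '">' . $escaped . '</p>';

// URL query-string context.
$url = '/search?q=' . rawurlencode($displayName);

// Inline JavaScript context: JSON with HTML-significant characters hex-escaped.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
$js = 'const name = ' . json_encode($displayName, $flags) . ';';

echo $url; // /search?q=Tom%20%26%20%22Jerry%22%20%3Cadmin%3E
```

Escaping at storage time would force a single format on all three contexts, and at least two of them would end up wrong.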

Lastly, spend time fine-tuning static code analysis. It tends to generate a lot of false positives, such as XSS flaws that can’t actually be exploited; even so, every HTML attribute and tag that gets its value dynamically should be escaped.

While hackers won’t be able to exploit every tag to grab sensitive data or trick logged-in users, you should still incorporate static analysis to prevent as many vulnerabilities as possible.

Julien Maury
Julien Maury is a backend developer, a mentor and a technical writer. He loves sharing his knowledge and learning new concepts.
