Somehow technology seems to evolve at a rapid pace, even when the standards bodies that help define it do not. Consider that most of today's websites are built on HTML4, a standard that was introduced in 1997. In the thirteen years since, the way we use the Web has changed dramatically, even if the underlying standard has not.

To bridge the gap, Web developers have adopted and embraced a variety of additional technologies, everything from using client-side JavaScript to build needed features, to relying on server-side scripts to process data in ways the browser could not, to using third-party plug-ins, such as Flash, to extend the browser even further. All of these developments reflect the shift from browser as document delivery platform to browser as Web application platform.

Now, with the nearly complete HTML5 standard being implemented (at least in part) in the latest or beta versions of all the major browsers, including Internet Explorer, Firefox, Safari, Chrome, and Opera, many of the advanced Web app features developers need will be available in native HTML.

But as with any major introduction of new features, HTML5 also brings with it potential security vulnerabilities – which is not to say that HTML5 is "flawed," but that, invariably, there will be new attack vectors for hackers to exploit. Some originate from elements of the standard itself, some from implementations of the standard in each browser, and some from the care that developers do (or do not) take in building their HTML5 code.

We haven't yet seen real-world attacks on HTML5, but among security researchers, several areas of the sprawling new feature set are emerging as the most likely targets for potential threats.

1. Cross-Document Messaging

In an earlier effort to promote security on the Web, HTML4 does not allow pages from one domain to pass or access data in pages from another domain. For example, if a page loaded from domain1.com contains JavaScript code that reads the position of the mouse pointer after a click, it cannot pass that data to a page loaded from domain2.com, which may be in another window (say a pop-up spawned by the first page).

This prevents a malicious site from intercepting data from a legitimate page, but it also presents an obstacle when legitimate pages hosted at different domains need to exchange information with each other. Today, many Web apps consolidate content from multiple domains, but their ability to do so without third-party means is limited by this constraint, requiring cumbersome workarounds like Flash or complicated tricks that can expose new vulnerabilities.

HTML5 introduces an API called postMessage that creates a framework for a script in one domain to pass data to a script running on another domain. To help ensure that requests are not malicious, postMessage includes object properties that the developer can use to verify the origin of the request, to ensure that it matches the expected domain.

But HTML5 does not itself enforce this origin check, meaning that a careless developer might not actually implement origin verification, essentially leaving the script exposed to postMessage requests from malicious sites.
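As a rough illustration of the origin check described above, here is a minimal TypeScript sketch. The allowlist contents and the names isTrustedOrigin and acceptMessage are hypothetical, and the IncomingMessage interface is a stand-in that mirrors only the origin and data properties of the browser's message event.

```typescript
// Hypothetical allowlist: the only origins this page expects messages from.
const TRUSTED_ORIGINS: ReadonlySet<string> = new Set(["https://domain1.com"]);

// Stand-in for the two fields we read from a browser message event.
interface IncomingMessage {
  origin: string;
  data: unknown;
}

// Exact comparison against the allowlist. A substring or prefix test would
// also accept look-alikes such as "https://domain1.com.evil.example".
function isTrustedOrigin(origin: string): boolean {
  return TRUSTED_ORIGINS.has(origin);
}

// Returns the payload if the sender is trusted, otherwise null.
function acceptMessage(event: IncomingMessage): unknown {
  if (!isTrustedOrigin(event.origin)) {
    return null; // drop messages from unexpected origins
  }
  return event.data;
}

// In a browser, this would be wired up roughly as:
//   window.addEventListener("message", (e) => {
//     const payload = acceptMessage(e);
//     if (payload !== null) { /* handle payload */ }
//   });
```

The point of the sketch is that the check is the developer's responsibility: if the handler never looks at the origin, every sender is treated as trusted.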

2. Local Storage

New to HTML5 is offline storage, a client-side SQL database that can be accessed by JavaScript in a Web page. Like many other HTML5 features, local storage is something that has existed by virtue of third-party development (Google Gears), but is now being adopted into the HTML standard.

Providing access to local storage can significantly accelerate Web applications, especially when they need to query from the same set of data repeatedly. But it also presents several possible threats that can be exposed by careless developers.

When storing sensitive data in an offline database, such as email messages or passwords, developers need to use SSL, and they need to generate unique database names so that hackers cannot formulate a predictable attack. Developers should also use prepared SQL statements, with user-supplied values passed as parameters rather than spliced into query strings built in JavaScript code; otherwise hackers could intercept or emulate these queries to execute "SQL injection" attacks.
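The difference between a prepared statement and a query assembled in script can be sketched as follows. This assumes the browser's client-side database exposes a transaction object with an executeSql(sql, args) method that binds "?" placeholders, as the Web SQL Database API did. The messages table, the function names, and the RecordingTransaction stand-in are all hypothetical and exist only so the sketch can run outside a browser.

```typescript
type SqlArg = string | number;

// Minimal shape of the transaction object assumed by this sketch.
interface SqlTransaction {
  executeSql(sql: string, args?: ReadonlyArray<SqlArg>): void;
}

// Unsafe: the user's input becomes part of the SQL text itself.
function findMessagesUnsafe(tx: SqlTransaction, sender: string): void {
  tx.executeSql("SELECT * FROM messages WHERE sender = '" + sender + "'");
}

// Safer: the SQL text is constant and the value travels separately
// as a bound argument for the "?" placeholder.
function findMessagesSafe(tx: SqlTransaction, sender: string): void {
  tx.executeSql("SELECT * FROM messages WHERE sender = ?", [sender]);
}

// Hypothetical stand-in that records calls instead of touching a database.
class RecordingTransaction implements SqlTransaction {
  calls: Array<{ sql: string; args: ReadonlyArray<SqlArg> }> = [];

  executeSql(sql: string, args: ReadonlyArray<SqlArg> = []): void {
    this.calls.push({ sql, args });
  }
}
```

With a hostile value such as x' OR '1'='1, the unsafe version produces SQL text whose WHERE clause has been rewritten by the input, while the safe version sends the same fixed statement every time and hands the value over as data.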

3. Attribute Abuse

In addition to providing many new tags, HTML5 also introduces new attributes, some of which apply to familiar tags and may be subject to abuse. A particular threat is when attributes can be used to trigger automatic script execution.

For example, the new HTML5 attribute "autofocus" will automatically switch browser focus to the specified element, a trick that is sometimes useful for user interface design and previously had to be implemented using JavaScript. But a malicious site could use the autofocus attribute to steal focus from an unwitting end user, possibly giving focus to a window that is rigged to execute malicious code when active.

Likewise, other new attributes, including "poster" and "srcdoc," allow page elements to point to external resources—resources that may be malicious in nature. Again, it is not that these attributes are flawed—they exist to enable richer functionality in Web applications—but that they also could be abused by bad actors.
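One defensive pattern for sites that accept markup from untrusted users is to drop such attributes before the markup is rendered. The TypeScript sketch below assumes the markup has already been parsed into attribute name/value pairs by some other component. The function name and the blocklist are hypothetical, and stripping inline event handlers (names beginning with "on") is an extra assumption of this sketch, made because a focus-stealing attribute is typically paired with a handler to run script.

```typescript
// Attributes named in the text that this sketch drops from untrusted markup.
const BLOCKED_ATTRIBUTES: ReadonlySet<string> = new Set([
  "autofocus",
  "poster",
  "srcdoc",
]);

// Returns a copy of the attribute map without blocked attributes
// and without inline event handlers such as onfocus or onerror.
function stripRiskyAttributes(
  attrs: Record<string, string>
): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const name of Object.keys(attrs)) {
    const lower = name.toLowerCase(); // attribute names are case-insensitive
    if (BLOCKED_ATTRIBUTES.has(lower) || lower.startsWith("on")) {
      continue;
    }
    safe[name] = attrs[name];
  }
  return safe;
}
```

A blocklist like this is only a sketch of the idea. An allowlist of known-safe attributes is generally the more conservative design, since new attributes keep being added to the standard.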

4. Inline Multimedia and SVG

HTML5 is significantly more multimedia-savvy than its predecessor. Until now, browsers needed to rely on third-party plug-ins (such as Flash) to embed most major media formats, including MP3 audio and MP4 video.

With its new <audio>, <video>, and <svg> tags, HTML5 can natively render popular formats and vector graphics without external plug-ins that consume extra resources and sometimes add instability to the browser. But this puts the onus on browser developers to implement complex multimedia rendering that may result in bugs that open new vulnerabilities.

For example, an earlier version of Google Chrome contained a documented bug in its SVG parser which, if tickled a certain way, could allow scripts to access the object properties of a page hosted on a different domain, in other words, in violation of cross-domain security policy.

Because each browser will need to implement native multimedia handling to support the new tags, it is possible for different bugs to crop up in each and, therefore, multiple attack vectors could be exposed.

5. Input Validation

Managing user input always requires care, especially when that input will later be displayed or rendered. Malicious users can sometimes exploit poor input validation to sneak executable code or other bug triggers into a page, potentially exposing the site to attack.

Web developers have had to rely on server-side processing to implement rigorous input validation, but this method provides a poor user experience, even though Ajax practices have improved the situation. HTML5 provides rich client-side input validation, empowering Web developers to define input boundaries alongside the forms themselves, with instant feedback provided to users.

But input validation can also give developers a false sense of security. Flawed validation definitions could allow users to bypass the checks. While this problem is not specific to HTML5, because input validation syntax is new to HTML5, developers may be more prone to make mistakes in their validation code.

Additionally, hackers may be able to exploit client-side validation, such as flawed regular expression (regex) syntax in page code, to mount a Denial of Service (DoS) attack by sending the browser into an infinite loop.
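Both risks can be reduced with a validation routine that is simple enough to audit and that is repeated on the server, since anything checked only in the browser can be bypassed by a user who submits the request directly. The TypeScript sketch below is one way to do that for a hypothetical "username" field; the length limit, the pattern, and the function name are illustrative assumptions.

```typescript
// Hypothetical limits for a "username" field.
const MAX_USERNAME_LENGTH = 32;

// One character class with a single quantifier. Patterns with nested
// quantifiers, such as /^(a+)+$/, are the kind that can take an enormous
// number of steps on crafted input and tie up the page.
const USERNAME_PATTERN = /^[a-z0-9_]+$/i;

function isValidUsername(input: string): boolean {
  // Check the length first so the regex never sees an oversized string.
  if (input.length === 0 || input.length > MAX_USERNAME_LENGTH) {
    return false;
  }
  return USERNAME_PATTERN.test(input);
}
```

The same function can back both the instant feedback in the form and the authoritative check on the server, so the two never drift apart.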

The price of progress

Let's emphasize again that the security threats in HTML5 do not necessarily represent a flawed standard, but they do represent a new standard. Between vendor implementation and developer expertise, introducing new features always brings with it a cost, and that price is new threats.

The good news is that by talking about HTML5 vulnerabilities early and often, we can minimize the most harmful attacks, and force hackers to find more obscure exploits.

Aaron Weiss is a technology writer and frequent contributor to eSecurityPlanet and other Internet.com sites, such as Wi-Fi Planet where he is the Wi-Fi Guru.
