Security Archives - Mozilla Hacks - the Web developer blog

Getting lively with Firefox 90

As the summer rolls around for those of us in the northern hemisphere, temperatures are high and unwinding with a cool iced tea is high on the agenda. Isn’t it lucky, then, that Background Update is here for Windows? It means Firefox can update even when it’s not running, so we can just sit back and relax!

Also in this release, we see a few nice JavaScript additions, including private fields and methods for classes, and the at() method for the Array, String and TypedArray global objects.

This blog post just provides a set of highlights; for all the details, check out Firefox 90 for developers on MDN.

Classes go private

A feature JavaScript has lacked since its inception, private fields and methods are now enabled by default in Firefox 90. These allow you to declare private properties within a class. You cannot reference these private properties from outside of the class; they can only be read or written within the class body.

Private names must be prefixed with a ‘hash mark’ (#) to distinguish them from any public properties a class might hold.

This shows how to declare private fields as opposed to public ones within a class:

class ClassWithPrivateProperties {
  #privateField;
  publicField;

  constructor() {
    // can be referenced within the class, but not accessed outside
    this.#privateField = 42;
    // can be referenced within the class as well as outside
    this.publicField = 52;
  }

  // again, can only be used within the class
  #privateMethod() {
    return 'hello world';
  }

  // can be called when using the class
  getPrivateMessage() {
    return this.#privateMethod();
  }
}

Static fields and methods can also be private. For a more detailed overview and explanation, check out the great guide: Working with private class features. You can also read what it takes to implement such a feature in our previous blog post Implementing Private Fields for JavaScript.

JavaScript at() method

The relative indexing method at() has been added to the Array, String and TypedArray global objects.

Passing a positive integer to the method returns the item or character at that position. The highlight, however, is that this method also accepts negative integers, which count back from the end of the array or string. For example, 1 returns the second item or character, and -1 returns the last item or character.

This example declares an array of values and uses the at() method to select an item in that array from the end.

const myArray = [5, 12, 8, 130, 44];

let arrItem = myArray.at(-2);
// arrItem = 130

It’s worth mentioning that there are other common ways of doing this; however, this one looks quite neat.
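
For comparison, here is a quick sketch of a couple of the more common pre-at() approaches, using the same array as above:

const myArray = [5, 12, 8, 130, 44];

// Classic approaches to "second item from the end":
let viaIndex = myArray[myArray.length - 2]; // 130
let viaSlice = myArray.slice(-2)[0];        // 130

// at() expresses the same thing directly:
let viaAt = myArray.at(-2);                 // 130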

Conic gradients for Canvas

The 2D Canvas API has a new createConicGradient() method, which creates a gradient around a point (rather than from it, like createRadialGradient() ). This feature allows you to specify where you want the center to be and in which direction the gradient should start. You then add the colours you want and where they should begin (and end).

This example creates a conic gradient with 5 colour stops, which we use to fill a rectangle.

var canvas = document.getElementById('canvas');
var ctx = canvas.getContext('2d');

// Create a conic gradient
// The start angle is 0
// The centre position is 100, 100
var gradient = ctx.createConicGradient(0, 100, 100);

// Add five color stops
gradient.addColorStop(0, "red");
gradient.addColorStop(0.25, "orange");
gradient.addColorStop(0.5, "yellow");
gradient.addColorStop(0.75, "green");
gradient.addColorStop(1, "blue");

// Set the fill style and draw a rectangle
ctx.fillStyle = gradient;
ctx.fillRect(20, 20, 200, 200);

The result: a rainbow conic gradient filling the rectangle.

New Request Headers

Fetch metadata request headers provide information about the context from which a request originated. This allows the server to make decisions about whether a request should be allowed based on where the request came from and how the resource will be used. Firefox 90 enables the following by default:

  • Sec-Fetch-Site
  • Sec-Fetch-Mode
  • Sec-Fetch-User
  • Sec-Fetch-Dest
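
On the server side, these headers can inform a simple allow/deny decision before any work is done. Here is a minimal sketch (Node.js; the specific policy and port are illustrative, not a recommendation):

const http = require('http');

http.createServer((req, res) => {
  // Supporting browsers send e.g. 'same-origin', 'same-site',
  // 'cross-site', or 'none' (direct navigation).
  const site = req.headers['sec-fetch-site'];

  // Example policy: refuse state-changing requests initiated cross-site.
  if (req.method === 'POST' && site === 'cross-site') {
    res.writeHead(403);
    res.end('Cross-site POST requests are not allowed');
    return;
  }

  res.writeHead(200);
  res.end('ok');
}).listen(8080);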

Eliminating Data Races in Firefox – A Technical Report

We successfully deployed ThreadSanitizer in the Firefox project to eliminate data races in our remaining C/C++ components. In the process, we found several impactful bugs and can safely say that data races are often underestimated in terms of their impact on program correctness. We recommend that all multithreaded C/C++ projects adopt the ThreadSanitizer tool to enhance code quality.

What is ThreadSanitizer?

ThreadSanitizer (TSan) is compile-time instrumentation to detect data races according to the C/C++ memory model on Linux. It is important to note that these data races are considered undefined behavior within the C/C++ specification. As such, the compiler is free to assume that data races do not happen and perform optimizations under that assumption. Detecting bugs resulting from such optimizations can be hard, and data races often have an intermittent nature due to thread scheduling.

Without a tool like ThreadSanitizer, even the most experienced developers can spend hours locating such a bug. With ThreadSanitizer, you get a comprehensive data race report that often contains all of the information needed to fix the problem.

An example ThreadSanitizer report, showing where each thread is reading/writing, the location they both access, and where the threads were created (output shortened for this article).

One important property of TSan is that, when properly deployed, the data race detection does not produce false positives. This is incredibly important for tool adoption, as developers quickly lose faith in tools that produce uncertain results.

Like other sanitizers, TSan is built into Clang and can be used with any recent Clang/LLVM toolchain. If your C/C++ project already uses e.g. AddressSanitizer (which we also highly recommend), deploying ThreadSanitizer will be very straightforward from a toolchain perspective.

Challenges in Deployment

Benign vs. Impactful Bugs

Despite ThreadSanitizer being a very well designed tool, we had to overcome a variety of challenges at Mozilla during the deployment phase. The most significant issue we faced was that it is really difficult to prove that data races are actually harmful at all and that they impact the everyday use of Firefox. In particular, the term “benign” came up often. Calling a data race benign acknowledges that it is indeed a race, but assumes that it does not have any negative side effects.

While benign data races do exist, we found (in agreement with previous work on this subject [1] [2]) that data races are very easily misclassified as benign. The reasons for this are clear: It is hard to reason about what compilers can and will optimize, and confirmation for certain “benign” data races requires you to look at the assembler code that the compiler finally produces.

Needless to say, this procedure is often much more time-consuming than fixing the actual data race, and it is not future-proof either. As a result, we decided that the ultimate goal should be a “no data races” policy that declares even benign data races as undesirable due to their risk of misclassification, the time required for investigation, and the potential risk from future compilers (with better optimizations) or future platforms (e.g. ARM).

However, it was clear that establishing such a policy would require a lot of work, both on the technical side as well as in convincing developers and management. In particular, we could not expect a large amount of resources to be dedicated to fixing data races with no clear product impact. This is where TSan’s suppression list came in handy:

We knew we had to stop the influx of new data races but at the same time get the tool usable without fixing all legacy issues. The suppression list (in particular the version compiled into Firefox) allowed us to temporarily ignore data races once we had them on file and ultimately bring up a TSan build of Firefox in CI that would automatically avoid further regressions. Of course, security bugs required specialized handling, but were usually easy to recognize (e.g. racing on non-thread safe pointers) and were fixed quickly without suppressions.

To help us understand the impact of our work, we maintained an internal list of all the most serious races that TSan detected (ones that had side-effects or could cause crashes). This data helped convince developers that the tool was making their lives easier while also clearly justifying the work to management.

In addition to this qualitative data, we also took a more quantitative approach: we looked at all the bugs we found over a year and how they were classified. Of the 64 bugs we looked at, 34% were classified as “benign” and 22% were “impactful” (the rest hadn’t been classified).

We knew there was a certain amount of misclassified benign issues to be expected, but what we really wanted to know was: Do benign issues pose a risk to the project? Assuming that all of these issues truly had no impact on the product, are we wasting a lot of resources on fixing them? Thankfully, we found that the majority of these fixes were trivial and/or improved code quality.

The trivial fixes were mostly turning non-atomic variables into atomics (20%), adding permanent suppressions for upstream issues that we couldn’t address immediately (15%), or removing overly complicated code (20%). Only 45% of the benign fixes actually required some sort of more elaborate patch (as in, the diff was larger than just a few lines of code and did not just remove code).

We concluded that benign issues were not a major resource sink, and that the cost of fixing them was well acceptable for the overall gains the project provided.

False Positives?

As mentioned in the beginning, TSan does not produce false positive data race reports when properly deployed, which includes instrumenting all code that is loaded into the process and avoiding primitives that TSan doesn’t understand (such as atomic fences). For most projects these conditions are trivial, but larger projects like Firefox require a bit more work. Thankfully this work largely amounted to a few lines in TSan’s robust suppression system.

Instrumenting all code in Firefox isn’t currently possible because it needs to use shared system libraries like GTK and X11. Fortunately, TSan offers the “called_from_lib” feature that can be used in the suppression list to ignore any calls originating from those shared libraries. Our other major source of uninstrumented code was build flags not being properly passed around, which was especially problematic for Rust code (see the Rust section below).

As for unsupported primitives, the only issue we ran into was the lack of support for fences. Most fences were the result of a standard atomic reference counting idiom which could be trivially replaced with an atomic load in TSan builds. Unfortunately, fences are fundamental to the design of the crossbeam crate (a foundational concurrency library in Rust), and the only solution for this was a suppression.

We also found that there is a (well known) false positive in deadlock detection that is however very easy to spot and also does not affect data race detection/reporting at all. In a nutshell, any deadlock report that only involves a single thread is likely this false positive.

The only true false positive we found so far turned out to be a rare bug in TSan and was fixed in the tool itself. However, developers claimed on various occasions that a particular report must be a false positive. In all of these cases, it turned out that TSan was indeed right and the problem was just very subtle and hard to understand. This is again confirming that we need tools like TSan to help us eliminate this class of bugs.

Interesting Bugs

Currently, the TSan bug-o-rama contains around 20 bugs. We’re still working on fixes for some of these bugs and would like to point out several particularly interesting/impactful ones.

Beware Bitfields

Bitfields are a handy little convenience to save space for storing lots of different small values. For instance, rather than having 30 bools taking up 30 bytes, they can all be packed into 4 bytes. For the most part this works fine, but it has one nasty consequence: different pieces of data now alias. This means that accessing “neighboring” bitfields is actually accessing the same memory, and therefore a potential data race.

In practical terms, this means that if two threads are writing to two neighboring bitfields, one of the writes can get lost, because both of those writes are actually read-modify-write operations on all of the bitfields in the same word.

If you’re familiar with bitfields and actively thinking about them, this might be obvious, but when you’re just saying myVal.isInitialized = true you may not think about or even realize that you’re accessing a bitfield.
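
The underlying lost-update hazard is not specific to C++; any packed representation shared between threads has it. As a rough JavaScript sketch (sharedBuffer stands in for a SharedArrayBuffer handed to several workers):

// Two flags packed into one shared 32-bit word, like adjacent bitfields.
const FLAG_A = 1 << 0;
const FLAG_B = 1 << 1;
const flags = new Int32Array(sharedBuffer);

// Racy: |= is a read-modify-write of the whole word, so two workers
// setting different flags concurrently can lose one of the writes.
flags[0] |= FLAG_A;

// Safe: perform the read-modify-write atomically instead.
Atomics.or(flags, 0, FLAG_B);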

We have had many instances of this problem, but let’s look at bug 1601940 and its (trimmed) race report.

When we first saw this report, it was puzzling because the two threads in question touch different fields (mAsyncTransformAppliedToContent vs. mTestAttributeAppliers). However, as it turns out, these two fields are both adjacent bitfields in the class.

This was causing intermittent failures in our CI and cost a maintainer of this code valuable time. We find this bug particularly interesting because it demonstrates how hard it is to diagnose data races without appropriate tooling and we found more instances of this type of bug (racy bitfield write/write) in our codebase. One of the other instances even had the potential to cause network loads to supply invalid cache content, another hard-to-debug situation, especially when it is intermittent and therefore not easily reproducible.

We encountered this enough that we eventually introduced a MOZ_ATOMIC_BITFIELDS macro that generates bitfields with atomic load/store methods. This allowed us to quickly fix problematic bitfields for the maintainers of each component without having to redesign their types.

Oops That Wasn’t Supposed To Be Multithreaded

We also found several instances of components which were explicitly designed to be single-threaded accidentally being used by multiple threads, such as bug 1681950.

The race itself here is rather simple: we are racing on the same file through stat64, and understanding the report was not the problem this time. However, as can be seen from frame 10, this call originates from the PreferencesWriter, which is responsible for writing changes to the prefs.js file, the central storage for Firefox preferences.

It was never intended for this to be called on multiple threads at the same time and we believe that this had the potential to corrupt the prefs.js file. As a result, during the next startup the file would fail to load and be discarded (reset to default prefs). Over the years, we’ve had quite a few bug reports related to this file magically losing its custom preferences but we were never able to find the root cause. We now believe that this bug is at least partially responsible for these losses.

We think this is a particularly good example of a failure for two reasons: it was a race that had more harmful effects than just a crash, and it caught a larger logic error of something being used outside of its original design parameters.

Late-Validated Races

On several occasions we encountered a pattern that lies on the boundary of benign and that we think merits some extra attention: intentionally racily reading a value, but then later doing checks that properly validate it.
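
The shape of the pattern, sketched here with JavaScript shared memory rather than the original C++ (sharedBuffer, state, MAX_SIZE, and use are illustrative names):

const state = new Int32Array(sharedBuffer); // written by another thread

let maybeSize = state[0];                   // intentionally unsynchronized read
if (maybeSize > 0 && maybeSize <= MAX_SIZE) {
  use(maybeSize);                           // "validated" only after the fact
}

// The proper version simply makes the read itself atomic:
let size = Atomics.load(state, 0);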

See, for example, this instance we encountered in SQLite.

Please Don’t Do This. These patterns are really fragile and they’re ultimately undefined behavior, even if they generally work right. Just write proper atomic code — you’ll usually find that the performance is perfectly fine.

What about Rust?

Another difficulty that we had to solve during TSan deployment was due to part of our codebase now being written in Rust, which has much less mature support for sanitizers. This meant that we spent a significant portion of our bringup with all Rust code suppressed while that tooling was still being developed.

We weren’t particularly concerned with our Rust code having a lot of races, but rather races in C++ code being obfuscated by passing through Rust. In fact, we strongly recommend writing new projects entirely in Rust to avoid data races altogether.

The hardest part in particular is the need to rebuild the Rust standard library with TSan instrumentation. On nightly there is an unstable feature, -Zbuild-std, that lets us do exactly that, but it still has a lot of rough edges.

Our biggest hurdle with build-std was that it’s currently incompatible with vendored build environments, which Firefox uses. Fixing this isn’t simple because cargo’s tools for patching in dependencies aren’t designed for affecting only a subgraph (i.e. just std and not your own code). So far, we have mitigated this by maintaining a small set of patches on top of rustc/cargo which implement this well-enough for Firefox but need further work to go upstream.

But with build-std hacked into working for us we were able to instrument our Rust code and were happy to find that there were very few problems! Most of the things we discovered were C++ races that happened to pass through some Rust code and had therefore been hidden by our blanket suppressions.

We did however find two pure Rust races:

The first was bug 1674770, which was a bug in the parking_lot library. This Rust library provides synchronization primitives and other concurrency tools and is written and maintained by experts. We did not investigate the impact, but the issue was a couple of atomic orderings being too weak; it was fixed quickly by the authors. This is yet another example that proves how difficult it is to write bug-free concurrent code.

The second was bug 1686158, which was some code in WebRender’s software OpenGL shim. They were maintaining some hand-rolled shared-mutable state using raw atomics for part of the implementation but forgot to make one of the fields atomic. This was easy enough to fix.

Overall Rust appears to be fulfilling one of its original design goals: allowing us to write more concurrent code safely. Both WebRender and Stylo are very large and pervasively multi-threaded, but have had minimal threading issues. What issues we did find were mistakes in the implementations of low-level and explicitly unsafe multithreading abstractions — and those mistakes were simple to fix.

This is in contrast to many of our C++ races, which often involved things being randomly accessed on different threads with unclear semantics, necessitating non-trivial refactorings of the code.

Conclusion

Data races are an underestimated problem. Due to their complexity and intermittency, we often struggle to identify them, locate their cause and judge their impact correctly. In many cases, this is also a time-consuming process, wasting valuable resources. ThreadSanitizer has proven to be not just effective in locating data races and providing adequate debug information, but also to be practical even on a project as large as Firefox.

Acknowledgements

We would like to thank the authors of ThreadSanitizer for providing the tool and in particular Dmitry Vyukov (Google) for helping us with some complex, Firefox-specific edge cases during deployment.

Browser fuzzing at Mozilla

Introduction

Mozilla has been fuzzing Firefox and its underlying components for a while. It has proven to be one of the most efficient ways to identify quality and security issues. In general, we apply fuzzing on different levels: there is fuzzing the browser as a whole, but a significant amount of time is also spent on fuzzing isolated code (e.g. with libFuzzer) or whole components such as the JS engine using separate shells. In this blog post, we will talk specifically about browser fuzzing only, and go into detail on the pipeline we’ve developed. This single pipeline is the result of years of work that the fuzzing team has put into aggregating our browser fuzzing efforts to provide consistently actionable issues to developers and to ease integration of internal and external fuzzing tools as they become available.

Diagram showing interaction of systems used in Mozilla's browser fuzzing workflow

Build instrumentation

To be as effective as possible we make use of different methods of detecting errors. These include sanitizers such as AddressSanitizer (with LeakSanitizer), ThreadSanitizer, and UndefinedBehaviorSanitizer, as well as using debug builds that enable assertions and other runtime checks. We also make use of debuggers such as rr and Valgrind. Each of these tools provides a different lens to help uncover specific bug types, but many are incompatible with each other or require their own custom build to function or provide optimal results. Beyond debugging and error detection, some tools, such as code coverage and libFuzzer, cannot work at all without build instrumentation. Each operating system and architecture combination requires a unique build and may only support a subset of these tools.

Lastly, each variation has multiple active branches, including Release, Beta, Nightly, and Extended Support Release (ESR). The Firefox CI Taskcluster instance builds each of these periodically.

Downloading builds

Taskcluster makes it easy to find and download the latest build to test. We discussed above the number of variants created by different instrumentation types, and we need to fuzz them in automation. Because of the large number of combinations of builds, artifacts, architectures, and operating systems, and the need to unpack each one, downloading is a non-trivial task.

To help reduce the complexity of build management, we developed a tool called fuzzfetch. Fuzzfetch makes it easy to specify the required build parameters and it will download and unpack the build. It also supports downloading specified revisions to make it useful with bisection tools.

How we generate the test cases

As the goal of this blog post is to explain the whole pipeline, we won’t spend much time explaining fuzzers. If you are interested, please read “Fuzzing Firefox with WebIDL” and the in-tree documentation. We use a combination of publicly available and custom-built fuzzers to generate test cases.

How we execute, report, and scale

For fuzzers that target the browser, Grizzly manages and runs test cases and monitors for results. Creating an adapter allows us to easily run existing fuzzers in Grizzly.

Simplified Python code for a Grizzly adapter using an external fuzzer.

To make full use of available resources on any given machine, we run multiple instances of Grizzly in parallel.

For each fuzzer, we create containers to encapsulate the configuration required to run it. These exist in the Orion monorepo. Each fuzzer has a configuration with deployment specifics and resource allocation depending on the priority of the fuzzer. Taskcluster continuously deploys these configurations to distribute work and manage fuzzing nodes.

Grizzly Target handles the detection of issues such as hangs, crashes, and other defects. Target is an interface between Grizzly and the browser. Detected issues are automatically packaged and reported to a FuzzManager server. The FuzzManager server provides automation and a UI for triaging the results.

Other, more targeted fuzzers use the JS shell, and libFuzzer-based targets use the fuzzing interface. Many third-party libraries are also fuzzed in OSS-Fuzz. These deserve mention but are outside of the scope of this post.

Managing results

Running multiple fuzzers against various targets at scale generates a large amount of data. These raw crash results are not suitable for direct entry into a bug tracking system like Bugzilla. We have tools to manage this data and get it ready to report.

The FuzzManager client library filters out crash variations and duplicate results before they leave the fuzzing node. Unique results are reported to a FuzzManager server. The FuzzManager web interface allows for the creation of signatures that help group reports together in buckets to aid the client in detecting duplicate results.

Fuzzers commonly generate test cases that are hundreds or even thousands of lines long. FuzzManager buckets are automatically scanned to queue reduction tasks in Taskcluster. These reduction tasks use Grizzly Reduce and Lithium to apply different reduction strategies, often removing the majority of the unnecessary data. Each bucket is continually processed until a successful reduction is complete. Then an engineer can do a final inspection of the minimized test case and attach it to a bug report. The final result is often used as a crash test in the Firefox test suite.

Animation showing an example testcase reduction using Grizzly

Code coverage of the fuzzer is also measured periodically. FuzzManager is used again to collect code coverage data and generate coverage reports.

Creating optimal bug reports

Our goal is to create actionable bug reports to get issues fixed as soon as possible while minimizing overhead for developers.

We do this by providing:

  • crash information such as logs and a stack trace
  • build and environment information
  • reduced test case
  • Pernosco session
  • regression range (bisections via Bugmon)
  • verification via Bugmon

Grizzly Replay is a tool that forms the basic execution engine for Bugmon and Grizzly Reduce, and makes it easy to collect rr traces to submit to Pernosco. It makes re-running browser test cases easy both in automation and for manual use. It simplifies working with stubborn test cases and test cases that trigger multiple results.

As mentioned, we have also been making use of Pernosco. Pernosco is a tool that provides a web interface for rr traces and makes them available to developers without the need for direct access to the execution environment. It is an amazing tool developed by a company of the same name which significantly helps to debug massively parallel applications. It is also very helpful when test cases are too unreliable to reduce or attach to bug reports. Creating an rr trace and uploading it can make stalled bug reports actionable.

The combination of Grizzly and Pernosco has had the added benefit of making infrequent, hard-to-reproduce issues actionable. A test case for a very inconsistent issue can be run hundreds or thousands of times until the desired crash occurs under rr. The trace is automatically collected and ready to be submitted to Pernosco and fixed by a developer, instead of being passed over because it was not actionable.

How we interact with developers

To ensure new features get a proper assessment, the fuzzing team can be reached at fuzzing@mozilla.com or on Matrix. This is also a great way to get in touch for any reason. We are happy to help you with any fuzzing related questions or ideas. We will also reach out when we receive information about new initiatives and features that we think will require attention. Once fuzzing of a component begins, we communicate mainly via Bugzilla. As mentioned, we strive to open actionable issues or enhance existing issues logged by others.

Bugmon is used to automatically bisect regression ranges. This notifies the appropriate people as quickly as possible and verifies bugs once they are marked as FIXED. Closing a bug automatically removes it from FuzzManager, so if a similar bug finds its way into the code base, it can be identified again.

Some issues found during fuzzing will prevent us from effectively fuzzing a feature or build variant. These are known as fuzz-blockers, and they come in a few different forms. These issues may seem benign from a product perspective, but they can block fuzzers from targeting important code paths or even prevent fuzzing a target altogether. Prioritizing these issues appropriately and getting them fixed quickly is very helpful and much appreciated by the fuzzing team.

PrefPicker manages the set of Firefox preferences used for fuzzing. When adding features behind a pref, consider adding it to the PrefPicker fuzzing template to have it enabled during fuzzing. Periodic audits of the PrefPicker fuzzing template can help ensure areas are not missed and resources are used as effectively as possible.

Measuring success

As in other fields, measurement is a key part of evaluating success. We leverage the meta bug feature of Bugzilla to help us keep track of the issues identified by fuzzers. We strive to have a meta bug per fuzzer and for each new component fuzzed.

For example, the meta bug for Domino lists all the issues (over 1100!) identified by this tool. Using this Bugzilla data, we are able to show the impact over the years of our various fuzzers.

Number of bugs reported by Domino over time

These dashboards help evaluate the return on investment of a fuzzer.

Conclusion

There are many components in the fuzzing pipeline. These components are constantly evolving to keep up with changes in debugging tools, execution environments, and browser internals. Developers are always adding, removing, and updating browser features. Bugs are being detected, triaged, and logged. Keeping everything running continuously and targeting as much code as possible requires constant and ongoing efforts.

If you work on Firefox, you can help by keeping us informed of new features and initiatives that may affect or require fuzzing, by prioritizing fuzz-blockers, and by curating fuzzing preferences in PrefPicker. If fuzzing interests you, please take part in the bug bounty program. Our tools are available publicly, and we encourage bug hunting.

Changes to SameSite Cookie Behavior – A Call to Action for Web Developers

We are changing the default value of the SameSite attribute for cookies from None to Lax. This will greatly improve security for users. However, some web sites may depend (even unknowingly) on the old default, potentially resulting in breakage for those sites. At Mozilla, we are slowly introducing this change. And we are strongly encouraging all web developers to test their sites with the new default.

Background

SameSite is an attribute on cookies that allows web developers to declare that a cookie should be restricted to a first-party, or same-site, context. The attribute can have any of the following values:

  • None – The browser will send cookies with both cross-site and same-site requests.
  • Strict – The browser will only send cookies for same-site requests (i.e., requests originating from the site that set the cookie).
  • Lax – Cookies will be withheld on cross-site requests (such as calls to load images or frames). However, cookies will be sent when a user navigates to the URL from an external site; for example, by following a link.

Currently, the absence of the SameSite attribute implies that cookies will be attached to any request for a given origin, no matter who initiated that request. This behavior is equivalent to setting SameSite=None. However, this “open by default” behavior leaves users vulnerable to Cross-Site Request Forgery (CSRF) attacks. In a CSRF attack, a malicious site attempts to use valid cookies from legitimate sites to carry out attacks.

Making the Web Safer

To protect users from CSRF attacks, browsers need to change the way cookies are handled. The two primary changes are:

  • When not specified, cookies will be treated as SameSite=Lax by default
  • Cookies that explicitly set SameSite=None in order to enable cross-site delivery must also set the Secure attribute. (In other words, they must require HTTPS.)

Web sites that depend on the old default behavior must now explicitly set the SameSite attribute to None. In addition, they are required to include the Secure attribute. Once this change is made inside of Firefox, if web sites fail to set SameSite correctly, it is possible those sites could break for users.
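
For example, a site that genuinely needs cross-site cookie delivery now has to set both attributes explicitly. A minimal sketch (Node.js; the cookie contents and port are illustrative):

const http = require('http');

http.createServer((req, res) => {
  // Cross-site delivery now requires both SameSite=None and Secure,
  // which means the cookie will only be sent over HTTPS.
  res.setHeader('Set-Cookie',
    'session=abc123; SameSite=None; Secure; Path=/; HttpOnly');
  res.end('ok');
}).listen(8443);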

Introducing the Change

The new SameSite behavior has been the default in Firefox Nightly since Nightly 75 (February 2020). At Mozilla, we’ve been able to explore the implications of this change. Starting with Firefox 79 (June 2020), we rolled it out to 50% of the Firefox Beta user base so that we could monitor the scope of any potential breakage.

There is currently no timeline to ship this feature to the release channel of Firefox. First, we want to see that the Beta population is not experiencing an unacceptable amount of site breakage, indicating that most sites have adapted to the new default behavior. Since there is no exact definition of “breakage” and it can be difficult to determine via telemetry, we are watching for reports of site breakage in several channels (e.g. Bugzilla, social media, blogs).

Additionally, we’d like to see the proposal advance further in the IETF. As proponents of the open web, it is important that changes to the web ecosystem are properly standardized.

Industry Coordination

This is an industry-wide change for browsers and is not something Mozilla is undertaking alone. Google has been rolling this change out to Chrome users since February 2020, with SameSite=Lax being the default for a certain (unpublished) percentage of all their channels (release, beta, canary).

Mozilla is cooperating with Google to track and share reports of website breakage in our respective bug tracking databases. Together, we are encouraging all web developers to start explicitly setting the SameSite attribute as a best practice.

Call to Action for Web Developers

Testing in the Firefox Nightly and Beta channels has shown that website breakage does occur. While we have reached out to those sites we’ve encountered and encouraged them to set the SameSite attribute on their web properties, the web is clearly too big to do this on a case-by-case basis.

It is important that all web developers test their sites against this new default. This will prepare you for when both Firefox and Chrome browsers make the switch in their respective release channels.

Test your site in Firefox

To test in Firefox:

  1. Enable the new default behavior (works in any version past 75):
    1. In the URL bar, navigate to about:config (accept the warning prompt, if shown).
    2. Type SameSite into the “Search Preference Name” bar.
    3. Set network.cookie.sameSite.laxByDefault to true using the toggle icon.
    4. Set network.cookie.sameSite.noneRequiresSecure to true using the toggle icon.
    5. Restart Firefox.
  2. Verify the browser is using the new SameSite default behavior:
    1. Navigate to https://samesite-sandbox.glitch.me/.
    2. Verify that all rows are green.

At this point, test your site thoroughly. In particular, pay attention to anything involving login flows, multiple domains, or cross-site embedded content (images, videos, etc.). For any flows involving POST requests, you should test with and without a long delay. This is because both Firefox and Chrome implement a two-minute threshold that permits newly created cookies without the SameSite attribute to be sent on top-level, cross-site POST requests (a common login flow).

Check your site for breakage

To see if your site is impacted by the new cookie behavior, examine the Firefox Web Console and look for either of these messages:

  • Cookie rejected because it has the “sameSite=none” attribute but is missing the “secure” attribute.
  • Cookie has “sameSite” policy set to “lax” because it is missing a “sameSite” attribute, and “sameSite=lax” is the default value for this attribute.

Seeing either of these messages does not necessarily mean your site will no longer work, as the new cookie behavior may not be important to your site’s functionality. It is critical, therefore, that each site test under the new conditions. Then, verify that the new SameSite behavior does not break anything. As a general rule, explicitly setting the SameSite attribute for cookies is the best way to guarantee that your site continues to function predictably.

Additional Resources

SameSite cookies explained

SameSite Cookies – Are you Ready?

MDN – SameSite Cookies and Common Warnings

Tracking Chrome’s rollout of the SameSite change


Safely reviving shared memory

At Mozilla, we want the web to be capable of running high-performance applications so that users and content authors can choose the safety, agency, and openness of the web platform. One essential low-level building block for many high-performance applications is shared-memory multi-threading. That’s why it was so exciting to deliver shared memory to JavaScript and WebAssembly in 2016. This provided extremely fast communication between threads.

However, we also want the web to be secure from attackers. Keeping users safe is paramount, which is why shared memory and high-resolution timers were effectively disabled at the start of 2018, in light of Spectre. Unfortunately, Spectre attacks are made significantly more effective with high-resolution timers, and such timers can be created with shared memory. (This is accomplished by having one thread increment a shared memory location in a tight loop that another thread can sample as a nanosecond-resolution timer.)
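
To make that parenthetical concrete, here is a sketch of the construction (worker.js is a hypothetical file name):

// worker.js: bump a shared counter as fast as possible.
onmessage = ({ data: sab }) => {
  const ticks = new Int32Array(sab);
  while (true) ticks[0]++;
};

// main thread: sample the counter to measure elapsed "time" in ticks.
const sab = new SharedArrayBuffer(4);
new Worker('worker.js').postMessage(sab);
const ticks = new Int32Array(sab);
const before = ticks[0];
// ... run the operation being timed ...
const elapsed = ticks[0] - before;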

Back to the drawing board

Fundamentally, for a Spectre attack to work, an attacker and victim need to reside in the same process. Like most applications on your computer, browsers used to use a single process. This would allow two open sites, say attacker.example and victim.example, to Spectre-attack each other’s data as well as other data the browser might keep such as bookmarks or history. Browsers have long since become multi-process. With Chrome’s Site Isolation and Firefox’s Project Fission, browsers will isolate each site into its own process. This is possible due to the web platform’s retrofitted same-origin policy.

Unfortunately, isolating each site into its own process is still not sufficient for these reasons:

  1. The same-origin policy has a number of holes, two of which strongly informed our thinking during the design process:
    1. attacker.example can fetch arbitrary victim.example resources into attacker.example’s process, e.g., through the <img> element.
    2. Due to the existence of document.domain, the minimal isolation boundary is a site (roughly the scheme and registrable domain of a website’s host) and not an origin (roughly a website’s scheme, host, and port).
  2. At this point, we don’t know if it’s feasible to isolate each site into its own process across all platforms. It is still a challenging endeavor on mobile. While possibly not a long-term problem, we would prefer a solution that allows reviving shared memory on mobile soon.

Distilling requirements

We need to address the issues above to revive shared memory and high-resolution timers. As such, we have been working on a system that meets the following requirements:

  1. It allows a website to process-isolate itself from attackers and thereby shield itself from intra-process high-resolution timer attacks.
  2. If a website wants to use these high-performance features, it also needs to process-isolate itself from victims. In particular, this means that it has to give up the ability to fetch arbitrary subresources from any site (e.g., through an <img> element) because these end up in the same process. Instead, it can only fetch cross-origin resources from consenting origins.
  3. It allows a browser to run the entire website, including all of its frames and popups, in a single process. This is important to keep the web platform a consistent system across devices.
  4. It allows a browser to run each participating origin (i.e., not site) in its own process. This is the ideal end state across devices and it is important for the design to not prevent this.
  5. The system maintains backwards compatibility. We cannot ask billions of websites to rewrite their code.

Due to these requirements, the system must provide an opt-in mechanism. We cannot forbid websites from fetching cross-origin subresources, as this would not be backwards compatible. Sadly, restricting document.domain is not backwards compatible either. More importantly, it would be unsafe to allow a website to embed cross-origin documents via an <iframe> element and have those cross-origin resources end up in the same process without opting in.

Cross-origin isolated

New headers

Together with others in the WHATWG community, we designed a set of headers that meet these requirements.

The Cross-Origin-Opener-Policy header allows you to process-isolate yourself from attackers. It also has the desirable effect that attackers cannot have access to your global object if they were to open you in a popup. This prevents XS-Leaks and various navigation attacks. Adopt this header even if you have no intention of using shared memory!

The Cross-Origin-Embedder-Policy header with value require-corp tells the browser to only allow this document to fetch cross-origin subresources from consenting websites. Technically, the way that this works is that those cross-origin resources need to specify the Cross-Origin-Resource-Policy header with value cross-origin to indicate consent.
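
Concretely, opting a top-level document into both headers might look like this on the server (a Node.js sketch; app.js stands in for your actual application script):

const http = require('http');

http.createServer((req, res) => {
  res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
  res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
  // Any cross-origin subresource this page loads must consent by
  // sending: Cross-Origin-Resource-Policy: cross-origin
  res.setHeader('Content-Type', 'text/html');
  res.end('<!doctype html><script src="app.js"></script>');
}).listen(8080);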

Impact on documents

If the Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy headers are set for a top-level document with the same-origin and require-corp values respectively, then:

  1. That document will be cross-origin isolated.
  2. Any descendant documents that also set Cross-Origin-Embedder-Policy to require-corp will be cross-origin isolated. (Not setting it results in a network error.)
  3. Any popups these documents open will either be cross-origin isolated or will not have a direct relationship with these documents. This is to say that there is no direct access through window.opener or equivalent (i.e., it’s as if they were created using rel="noopener").

A document that is cross-origin isolated will have access to shared memory, both in JavaScript and WebAssembly. It will only be able to share memory with same-origin documents and dedicated workers in the same “tab” and its popups (technically, same-origin agents in a single browsing context group). It will also have access to the highest-resolution performance.now() available. Evidently, it will not have access to a functional document.domain.
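
In a page, you can feature-detect this state before reaching for shared memory; a small sketch:

// crossOriginIsolated is exposed on both window and worker globals.
if (self.crossOriginIsolated) {
  const sab = new SharedArrayBuffer(1024);
  // ... hand sab to a worker via postMessage ...
} else {
  // Fall back, e.g. by transferring (not sharing) ArrayBuffers.
}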

The way these headers ensure mutual consent between origins gives browsers the freedom to put an entire website into a single process or put each of the origins into their own process, or something in between. While process-per-origin would be ideal, this is not always feasible on all devices. So having everything that is pulled into these one-or-more processes consent is a decent middle ground.

Safety backstop

We created a safety backstop to be able to deal with novel cross-process attacks, using an approach that avoids having to disable shared memory entirely so that we remain web compatible.

The result is Firefox’s JSExecutionManager. This allows us to regulate the execution of different JavaScript contexts with relation to each other. The JSExecutionManager can be used to throttle CPU and power usage by background tabs. Using the JSExecutionManager, we created a dynamic switch (dom.workers.serialized-sab-access in about:config) that prevents all JavaScript threads that share memory from ever running code concurrently, effectively executing these threads as if on a single-core machine. Because creating a high-resolution timer using shared memory requires two threads to run simultaneously, this switch effectively prevents the creation of a high-resolution timer without breaking websites.

By default, this switch is off, but in the case of a novel cross-process attack, we could quickly flip it on. With this switch as a backstop, we can feel confident enabling shared memory in cross-origin isolated websites even when considering unlikely future worst-case scenarios.

Acknowledgments

Many thanks to Bas Schouten and Luke Wagner for their contributions to this post. And also, in no particular order, many thanks to Nika Layzell, Tom Tung, Valentin Gosu, Eden Chuang, Jens Manuel Stutte, Luke Wagner, Bas Schouten, Neha Kochar, Andrew Sutherland, Andrew Overholt, 蔡欣宜 (Hsin-Yi Tsai), Perry Jiang, Steve Fink, Mike Conca, Lars Thomas Hansen, Jeff Walden, Junior Hsu, Selena Deckelmann, and Eric Rescorla for their help getting this done in Firefox!

Securing Gamepad API

Firefox release dates for Gamepad API updates

As part of Mozilla’s ongoing commitment to improve the privacy and security of the web platform, over the next few months we will be making some changes to how the Gamepad API works.

Here are the important dates to keep in mind:

  • 25 August 2020 (Firefox 81 Beta/Developer Edition): the .getGamepads() method will only return gamepads if called in a “secure context” (e.g., https://).
  • 22 September 2020 (Firefox 82 Beta/Developer Edition): switch to requiring a permission policy for third-party contexts/iframes.

We are collaborating on making these changes with folks from the Chrome team and other browser vendors. We will update this post with links to their announcements as they become available.

Restricting gamepads to secure contexts

Starting with Firefox 81, the Gamepad API will be restricted to what are known as “secure contexts” (bug 1591329). Basically, this means that the Gamepad API will only work on sites served as “https://”. But don’t worry: it also works on http://localhost while you are debugging!

For the next few months, we will show a developer console warning whenever the .getGamepads() method is called from an insecure context.

From Firefox 81, we plan to require a secure context for .getGamepads() by default. To avoid significant code breakage, calling .getGamepads() from an insecure context will return an empty array. We will display this console warning indefinitely:

The developer console now shows a warning when the .getGamepads() method is called from insecure contexts.
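
If your site polls for gamepads, a defensive check along these lines keeps it working across these changes (a sketch; pollGamepads is an illustrative name):

function pollGamepads() {
  // In insecure contexts, Firefox 81+ returns an empty array here.
  if (!window.isSecureContext) {
    console.warn('Gamepad API requires a secure context (https:// or localhost)');
    return [];
  }
  try {
    // getGamepads() uses empty slots for disconnected pads; drop them.
    return Array.from(navigator.getGamepads()).filter(Boolean);
  } catch (e) {
    // From Firefox 82, a disallowed third-party context throws a security error.
    console.warn('Gamepad access not permitted in this context', e);
    return [];
  }
}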

Permission Policy integration

From Firefox 82, third-party contexts (i.e., <iframe>s that are not same origin) that require access to the Gamepad API will have to be explicitly granted access by the hosting website via a Permissions Policy.

In order for a third-party context to be able to use the Gamepad API, you will need to add an “allow” attribute to your HTML like so:

  <iframe allow="gamepad" src="https://example.com/">
  </iframe>

Once this ships, calling .getGamepads() from a disallowed third-party context will throw a JavaScript security error.

You can track our implementation progress in bug 1640086.

WebVR/WebXR

As WebVR and WebXR already require a secure context to work, these changes shouldn’t affect any sites relying on .getGamepads(). In fact, everything should continue to work as it does today.

Future improvements to privacy and security

When we ship APIs we often find that sites use them in unintended ways – mostly creatively, sometimes maliciously. As new privacy and security capabilities are added to the web platform, we retrofit those solutions to better protect users from malicious sites and third-party trackers.

Adding “secure contexts” and “permission policy” to the Gamepad API is part of this ongoing effort to improve the overall privacy and security of the web. Although we know these changes can be a short-term inconvenience to developers, we believe it’s important to constantly evolve the web to be as secure and privacy-preserving as it can be for all users.

Fuzzing Firefox with WebIDL

TL;DR, An Introduction

Fuzzing, or fuzz testing, is an automated approach for testing the safety and stability of software. It’s typically performed by supplying specially crafted inputs to identify unexpected or even dangerous behavior.  If you’re unfamiliar with the basics of fuzzing, you can find lots more information in the Firefox Fuzzing Docs and the Fuzzing Book.

For the past 3 years, the Firefox fuzzing team has been developing a new fuzzer to help identify security vulnerabilities in the implementation of WebAPIs in Firefox.  This fuzzer, which we’re calling Domino, leverages the WebAPIs’ own WebIDL definitions as a fuzzing grammar.  Our approach has led to the identification of over 850 bugs. 116 of those bugs have received a security rating.  In this post, I’d like to discuss some of Domino’s key features and how they differ from our previous WebAPI fuzzing efforts.

Fuzzing Basics

Before we begin discussing what Domino is and how it works, we first need to discuss the types of fuzzing techniques available to us today.

Types of Fuzzers

Fuzzers are typically classified as either blackbox, greybox, or whitebox.  These designations are based upon the level of communication between the fuzzer and the target application.  The two most common types are blackbox and greybox fuzzers.

Blackbox Fuzzing

Blackbox fuzzing submits data to the target application with essentially no knowledge of how that data affects the target. Because of this restriction, the effectiveness of a blackbox fuzzer is based entirely on the fitness of the generated data.

Blackbox fuzzing is often used for large, non-deterministic applications or those which process highly structured data.

Whitebox Fuzzing

Whitebox fuzzing enables direct correlation between the fuzzer and the target application in order to generate data that satisfies the application’s “requirements”.  This typically involves the use of theorem solvers to evaluate branch conditions and generate data to intentionally exercise all branches.  In doing so, the fuzzer can test hard-to-reach branches that might never be tested by blackbox or greybox fuzzers.

The downside of this type of fuzzing is that it is computationally expensive. Large applications with complex branching may require a significant amount of time to solve. This greatly reduces the number of inputs tested. Outside of academic exercises, whitebox fuzzing is often not feasible for real-world applications.

Greybox Fuzzing

Greybox fuzzing has emerged as one of the most popular and effective fuzzing techniques.  These fuzzers implement a feedback mechanism, typically via instrumentation, to inform decisions on what data to generate in the future.  Inputs which appear to cover more code are reused as the basis for later tests.  Inputs which decrease coverage are discarded.

This method is incredibly popular due to its speed and efficiency in reaching obscure code paths.  However, not all targets are good candidates for greybox fuzzing.  Greybox fuzzing typically works best with smaller, deterministic targets that can process a large number of inputs quickly (several hundred a second).

We often use these types of fuzzers to test individual components within Firefox such as media parsers.  If you’re interested in learning how to leverage these fuzzers to test your code, take a look at the Fuzzing Interface documentation here.

Unfortunately, we are somewhat limited in the techniques that we can use when fuzzing WebAPIs.  The browser by nature is non-deterministic and the input is highly structured. Additionally, the process of starting the browser, executing tests, and monitoring for faults is slow (several seconds to minutes per test).  With these limitations, blackbox fuzzing is the most appropriate solution.

However, since the inputs expected by these APIs are highly structured, we need to ensure that our fuzzer generates data that is considered valid.

Grammar-Based Fuzzing

Grammar-based fuzzing is a fuzzing technique that uses a formal language grammar to define the structure of the data to be generated.  These grammars are typically represented in plain-text and use a combination of symbols and constants to represent the data.  The fuzzer can then parse the grammar and use it to generate fuzzed output.

A screenshot showing a side-by-side comparison of the grammars of two fuzzers, Domato and Dharma

The examples here demonstrate two simplified grammar excerpts from the Domato and Dharma fuzzers. These grammars describe the process of creating an HTMLCanvasElement and manipulating its properties and operations.
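A toy version of this idea fits in a few lines of JavaScript. The grammar below is invented for illustration; real grammars, like the Domato and Dharma excerpts above, are far larger:

const grammar = {
  '<canvas>': ["document.createElement('canvas')"],
  '<stmt>': ['c.width = <int>;', 'c.height = <int>;', "c.getContext('2d');"],
  '<int>': [() => String(Math.floor(Math.random() * 4096))]
};

function expand(symbol) {
  const options = grammar[symbol];
  const picked = options[Math.floor(Math.random() * options.length)];
  const text = typeof picked === 'function' ? picked() : picked;
  // Keep expanding until no grammar symbols remain in the chosen production.
  return text.replace(/<\w+>/g, (match) => expand(match));
}

console.log(`const c = ${expand('<canvas>')};`);
console.log(expand('<stmt>'));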

Issues with Traditional Grammars

Unfortunately, the level of effort required to develop a grammar is directly proportional to the size and complexity of the data you’re attempting to represent. This is the biggest downside of grammar-based fuzzing. For reference, WebAPIs in Firefox expose over 730 interfaces with approximately 6300 members. Keep in mind, this number does not account for other required data structures such as callbacks, enums, and dictionaries.  Creating a grammar to describe these APIs accurately would be a huge undertaking; not to mention error-prone and difficult to maintain.

To more effectively fuzz these APIs, we wanted to avoid as much manual grammar development as possible.

 

WebIDL as a Fuzzing Grammar

typedef (BufferSource or Blob or USVString) BlobPart;

[Exposed=(Window,Worker)]
interface Blob {
 [Throws]
 constructor(optional sequence<BlobPart> blobParts,
             optional BlobPropertyBag options = {});

 [GetterThrows]
 readonly attribute unsigned long long size;
 readonly attribute DOMString type;

 [Throws]
 Blob slice(optional [Clamp] long long start,
            optional [Clamp] long long end,
            optional DOMString contentType);
 [NewObject, Throws] ReadableStream stream();
 [NewObject] Promise<USVString> text();
 [NewObject] Promise<ArrayBuffer> arrayBuffer();

};

enum EndingType { "transparent", "native" };

dictionary BlobPropertyBag {
 DOMString type = "";
 EndingType endings = "transparent";
};

A simplified example of the Blob WebIDL definition

WebIDL is an interface description language (IDL) for describing the APIs implemented by browsers. It lists the interfaces, members, and values exposed by those APIs, as well as their syntax.

The WebIDL definitions are well known among the browser fuzzing community because of the wealth of information contained within them.  Previous work has been done in this area to extract the data from these IDLs for use as a fuzzing grammar, namely the WADI fuzzer from Sensepost.  However, in each example we investigated, we found that the information from these definitions was extracted and re-implemented using the fuzzer’s native grammar syntax.  This approach still requires a significant amount of manual effort. Further, the fuzzing grammars’ syntax makes it difficult, if not impossible in some instances, to describe behaviors specific to WebAPIs.

Based on these issues, we decided to use the WebIDL definitions directly, rather than converting them to an existing fuzzing grammar syntax. This approach provides us with a number of benefits.

Standardized Grammar

First and foremost, the WebIDL specification defines a standardized grammar to which these definitions must adhere.  This lets us leverage existing tools, such as WebIDL2.js, for parsing the raw WebIDL definitions and converting them into an abstract syntax tree (AST).  Then this AST can be interpreted by the fuzzer to generate testcases.
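For example, assuming the webidl2 package from npm, a fragment of the Blob definition can be parsed into an AST in a couple of lines:

const { parse } = require('webidl2');

const ast = parse(`
  [Exposed=(Window,Worker)]
  interface Blob {
    readonly attribute unsigned long long size;
  };
`);

// Each top-level definition becomes a node; members are nested beneath it.
console.log(ast[0].type);            // "interface"
console.log(ast[0].name);            // "Blob"
console.log(ast[0].members[0].name); // "size"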

Simplified Grammar Development

Second, the WebIDL defines the structure and behavior of the APIs we intend to target. Thus, we significantly reduce the amount of required rule development.  In contrast, if we were to describe these APIs using one of the previously mentioned grammars, we would have to create individual rules for each interface, member, and value defined by the API.

ECMAScript Extended Attributes

Unlike traditional grammars, which only define the structure of data, the WebIDL specification provides additional information regarding the interface’s behavior via ECMAScript extended attributes. Extended attributes can describe a variety of behaviors including:

  • The contexts where a particular interface can be used.
  • Whether the returned object is a new or duplicate instance.
  • If the member instance can be replaced.

These types of behaviors are not typically represented by traditional grammars.

Automatic Detection of API Changes

Finally, since the WebIDL files are linked with the interfaces implemented by the browser, we can ensure that updates to the WebIDL reflect updates to the interface.

 

Transforming IDL to JavaScript

screenshot of an AST generated using the WebIDL2.js library to parse the IDL

In order to leverage WebIDL for fuzzing, we first need to parse it.  Fortunately for us, we can use the WebIDL2.js library to convert the raw IDL files into an abstract-syntax tree (AST).  The AST generated by WebIDL2.js describes the data as a series of nodes on a tree. Each of these nodes defines some construct of the WebIDL syntax.

Further information on the WebIDL2 AST structure can be found here.

Once we have our AST, we simply need to define translations for each of these constructs.  In Domino, we’ve implemented a series of tools for traversing the AST and translating AST nodes into JavaScript.  The diagram above demonstrates a few of these translations.

Most of these nodes can be represented using a static translation. This means that a construct in the AST will always have the same representation in JavaScript.  For example, the constructor keyword will always be replaced with the JavaScript “new” operator in combination with the interface name.  There are, however, several instances where the WebIDL construct can have many meanings and must be generated dynamically.

Generic Types

The WebIDL specification lists a number of types used for representing generic values.  For each of these types, Domino implements a function that will either return a randomly generated value matching the requested type or a previously recorded object of the same type.  For example, when iterating over the AST, occurrences of the numeric types octet, short, and long will return values within those numeric ranges.
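A sketch of what one of these functions might look like for the integer types follows; the ranges come from the WebIDL specification, while the function itself is our own illustration rather than Domino’s actual code:

function randomNumericValue(idlType) {
  const randIn = (min, max) => Math.floor(Math.random() * (max - min + 1)) + min;
  switch (idlType) {
    case 'octet': return randIn(0, 255);                   // unsigned 8-bit
    case 'short': return randIn(-32768, 32767);            // signed 16-bit
    case 'long': return randIn(-2147483648, 2147483647);   // signed 32-bit
    default: throw new Error(`no generator for type: ${idlType}`);
  }
}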

Object References

In places where the construct type references another IDL definition and is used as an argument, these values require an object instance of that IDL type.  When one of these values is identified, Domino will attempt to create a new instance of the object (via its constructor), or to obtain one by identifying and accessing another member which returns an object of that type.

Callback Handlers

The WebIDL specification also defines a number of types which represent functions (i.e., promises, callbacks, and event listeners).  For each of these types, Domino will generate a unique function that performs random operations on the supplied arguments (if present).
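A rough sketch of such a callback factory is shown below; the shape is illustrative rather than Domino’s actual output:

function makeFuzzCallback(recordedValues) {
  return function (...args) {
    for (const arg of args) {
      if (arg !== null && typeof arg === 'object') {
        // Poke at whatever the API handed us; getters may throw, so guard each read.
        for (const key of Object.keys(arg)) {
          try { void arg[key]; } catch (_) { /* ignore */ }
        }
      }
      // Record the argument so later operations can reuse it as an input.
      recordedValues.push(arg);
    }
  };
}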

Of course the steps above only account for a small fraction of what is necessary to fully translate the IDLs to JavaScript. Domino’s generator implements support for the entire WebIDL specification.  Let’s take a look at what our output might look like using the Blob WebIDL as a fuzzing grammar.

Zero Configuration Fuzzing

> const { Domino } = require('~/domino/dist/src/index.js')
> const { Random } = require('~/domino/dist/src/strategies/index.js')
> const domino = new Domino(blob, { strategy: Random, output: '~/test/' })
> domino.generateTestcase()
…

const o = []
o[2] = new ArrayBuffer(8484)
o[1] = new Float64Array(o[2])
o[0] = new Blob([o[1]])
o[0].text().then(function (arg0) {
 o[0].text().then(function (arg1) {
   o[3] = o[0].slice()
   o[3].stream()
   o[3].slice(65535, 1, 'foobar')
 })
})
o[0].arrayBuffer().then(function (arg2) {
 o[3].text().then(function (arg3) {
   o[4] = arg3
   o[0].slice()
 })
})

As we can see here, the information provided by the IDL is enough to generate valid testcases. These cases exercise a fairly large portion of the Blob-related code. In turn, this allows us to quickly develop baseline fuzzers for new APIs with zero manual intervention.

Unfortunately, not everything is as precise as we would prefer.  Take, for instance, the values supplied to the slice operation.  After reviewing the Blob specification, we see that the start and end arguments are expected to be byte-order positions relative to the size of the Blob.  We’re currently generating these numbers at random. As such, it seems unlikely that we’ll be able to return values within the limits of the Blob length.

Furthermore, both the contentType argument of the slice operation and the type property on the BlobPropertyBag dictionary are defined as <a href="https://developer.mozilla.org/en-US/docs/Web/API/DOMString" target="_blank" rel="noopener noreferrer">DOMString</a>.  Similar to our numeric values, we generate strings at random.  However, further review of the specification indicates that these values are used to represent the media type of the Blob data.  Now, it doesn’t appear that this value has much effect on the Blob object directly. Nevertheless, we can’t be certain that these values won’t have an effect on the APIs which consume these Blobs.

To address these issues, we needed to develop a way of differentiating between these generic types.

Rule Patching with GrIDL

diagram showing the relationship between Domino and GrIDL

Out of this need, we developed another tool named GrIDL.  GrIDL leverages the WebIDL2.js library for converting our IDL definitions into an AST.  It also makes several optimizations to the AST to better support its use as a fuzzing grammar.

However, the most interesting feature of GrIDL is this: We can dynamically patch IDL declarations where a more precise value is required.  Using a rule-based matching system, GrIDL identifies the target value and inserts a unique identifier.  Those identifiers correspond with a matching generator implemented by Domino.  While iterating over the AST, if one of these identifiers is encountered, Domino calls the matching generator and emits the value returned.

diagram showing the correlation between GrIDL identifiers and Domino generators, by defining two generators

The diagram above demonstrates the correlation between GrIDL identifiers and Domino generators.  Here we’ve defined two generators.  One returns byte offsets and the other returns a valid MIME type.

It’s important to note that each generator will also receive access to a live representation of the current object being fuzzed.  This provides us with the ability to generate values informed by the current state of the object.

In the example above, we leverage this object to generate byte offsets for the slice function that are relative to its length.  However, consider any of the attributes or operations associated with the WebGLRenderingContextBase interface.  This interface could be implemented by either a WebGL or WebGL2 context. The arguments required by each may vary drastically.  By referencing the current object being fuzzed, we can determine the context type and return values accordingly.
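To illustrate, generators of this shape might look like the sketch below. The registry layout and names are hypothetical, not Domino’s actual API; the point is that each generator emits a value (here, source text) for the testcase, informed by the object being fuzzed:

const generators = {
  // Hypothetical: reduce a random number modulo the live Blob's size, so the
  // emitted slice() offsets always land within the object's current length.
  byteOffset(targetExpr) {
    const n = Math.floor(Math.random() * 0x100000000);
    return `(${n} % ${targetExpr}.size)`;
  },
  // Hypothetical: pick a plausible media type rather than a random string.
  mimeType() {
    const types = ['text/plain', 'text/html', 'text/xml', 'application/xhtml+xml', 'image/*'];
    return `'${types[Math.floor(Math.random() * types.length)]}'`;
  }
};

// Emits testcase source such as: o[0].slice((1642420336 % o[0].size), ..., 'text/plain')
console.log(`o[2] = o[0].slice(${generators.byteOffset('o[0]')}, ${generators.byteOffset('o[0]')}, ${generators.mimeType()})`);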

> domino.generateTestcase()
…
const o = []
o[1] = new Uint8Array(14471)
o[0] = new Blob([null, null, o[1]], {
  'type': 'image/*',
  'endings': 'transparent'
})
o[2] = o[0].slice((1642420336 % o[0].size), (3884321603 % o[0].size), 'application/xhtml+xml')
o[0].arrayBuffer().then(function (arg0) {
  setTimeout(function () { o[0].text().then(function (arg1) { o[0].stream() }) }, 180)
  o[2].arrayBuffer().then(function (arg2) {
    o[0].slice((3412050218 % o[0].size), (646665894 % o[0].size), 'text/plain')
    o[0].stream()
  })
  o[2].text().then(function (arg3) {
    o[2].slice((2025414481 % o[2].size), (2615146387 % o[2].size), 'text/html')
    o[3] = o[0].slice((753872984 % o[0].size), (883984089 % o[0].size), 'text/xml')
    o[3].stream()
  })
})

With our newly created rules, we’re now able to generate values that more closely resemble those described by the specification.

 

Real-World Examples

The examples included in this post have been greatly simplified.  It can often be hard to see how an approach like this might be applied to more complex APIs.  With that, I’d like to leave you with an example of one of the more complex vulnerabilities uncovered by Domino.

screenshot of the code involved in a complex vulnerability identified by Domino, as described more fully in bug #1558522

In bug 1558522, we identified a critical use-after-free vulnerability affecting the IndexedDB API.  This vulnerability is very interesting from a fuzzing perspective due to the level of complexity required to trigger the issue.  Domino was able to trigger this vulnerability by creating a file in the global context, then passing the file object to a worker context where an IndexedDB database connection is established.

This level of coordination between contexts would often be difficult to describe using traditional grammars.  However, due to the detailed descriptions of these APIs provided by the WebIDL, Domino can identify vulnerabilities like this with ease.

Contributing

A final note: Domino continues to find security-sensitive vulnerabilities in our code. Unfortunately, this means we cannot release it yet for public use.  However, we have plans to release a more generic version in the near future. Stay tuned. If you’d like to get started contributing code to the development of Firefox, there are plenty of open opportunities. And, if you are a Mozilla employee or NDA’d code contributor and you’d like to work on Domino, feel free to reach out to the team in the Fuzzing room on Riot (Matrix)!

The post Fuzzing Firefox with WebIDL appeared first on Mozilla Hacks - the Web developer blog.

Twitter Direct Message Caching and Firefox https://hacks.mozilla.org/2020/04/twitter-direct-message-caching-and-firefox/ Fri, 03 Apr 2020 21:55:10 +0000 https://hacks.mozilla.org/?p=45983 Distinguished engineer Martin Thomson explains how this problem occurred, the implications for people who might be affected, and how problems of this nature might be avoided in future. To get there, we need to dig a little into how web caching works.

The post Twitter Direct Message Caching and Firefox appeared first on Mozilla Hacks - the Web developer blog.

Editor’s Note: April 6, 7:00pm pt – After some more investigation into this problem, it appears that the initial analysis pointing to the Content-Disposition was based on bad information.  The reason that some browsers were not caching direct messages was that Twitter includes the non-standard Pragma: no-cache header in responses. Using Pragma in responses is invalid, as it is defined to be equivalent to Cache-Control: no-cache only for requests. Though it is counter-intuitive, ‘no-cache’ does not prevent a cache from storing content; ‘no-cache’ only means that the cache needs to check with the server before reusing that response. That doesn’t change the conclusion: limited observations of behavior are no substitute for building to standards.

Twitter is telling its users that their personal direct messages might be stored in Firefox’s web cache.

This problem affects anyone who uses Twitter on Firefox from a shared computer account. Those users should clear their cache.

This post explains how this problem occurred, what the implications are for those people who might be affected, and how problems of this nature might be avoided in future. To get there, we need to dig a little into how web caching works.

Over on The Mozilla Blog, Eric Rescorla, the CTO of Firefox, shares insights on What you need to know about Twitter on Firefox, with this important reminder:

The web is complicated and it’s hard to know everything about it. However, it’s also a good reminder of how important it is to have web standards rather than just relying on whatever one particular browser happens to do.

Web Caching Privacy Basics

Caching is critical to performance on the web. Browsers cache content so that it can be reused without talking to servers, which can be slow. However, the way that web content is cached can be quite confusing.

The Internet Engineering Task Force published RFC 7234, which defines how web caching works. A key mechanism is the Cache-Control header, which allows web servers to say how they want caches to treat content.

Sites can use Cache-Control to let browsers know what is safe to store in caches. Some content needs to be fetched every time; other content is only valid for a short time. Cache-Control tells the browser what can be cached and for how long. Or, as is relevant to this case, Cache-Control can tell the browser that content is sensitive and that it should not be stored.

Separately, in the absence of Cache-Control instructions from sites, browsers often make guesses about what can be cached. Sites often do not provide any caching information for content. But caching content makes the web faster. So browsers cache most content unless they are told not to. This is referred to as “heuristic caching”, and differs from browser to browser.

Heuristic caching involves the browser guessing what content to cache, and for how long. Firefox’s heuristic caching stores most content that arrives without explicit caching information for 7 days.

There are a bunch of controls that Cache-Control provides, but most relevant to this case is a directive called ‘no-store’. When a site says ‘no-store’, that tells the browser never to save a copy of the content in its cache. Using ‘no-store’ is the only way to guarantee that information is never cached.
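For example, a response that may safely be reused for up to an hour could send:

Cache-Control: max-age=3600

whereas a response carrying sensitive content, such as a private message, should send:

Cache-Control: no-store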

The Case with Twitter

In this case, Twitter did not include a ‘no-store’ directive for direct messages. The content of direct messages is sensitive and so should not have been stored in the browser cache. Without Cache-Control or Expires, however, browsers used heuristic caching logic.

Testing from Twitter showed that the request was not being cached in other browsers. This is because some other browsers disable heuristic caching if an unrelated HTTP header, Content-Disposition, is present. Content-Disposition is a feature that allows sites to identify content for download and to suggest a name for the file to save that content to.

In comparison, Firefox legitimately treats Content-Disposition as unrelated and so does not disable heuristic caching when it is present.

The HTTP messages Twitter used for direct messages did not include any Cache-Control directives. For Firefox users, that meant that even when a Twitter user logged out, direct messages were stored in the browser cache on their computer.

Who is Affected?

As much as possible, Firefox maintains separate caches.

People who have different user accounts on the same computer will have their own caches that are completely inaccessible to each other. People who share an account but use different Firefox profiles will have different caches.

Firefox also provides controls that allow control over what is stored. Using Private Browsing means that cached data is not stored to permanent storage and any cache is discarded when the window is closed. Firefox also provides other controls, like Clear Recent History, Forget About This Site, and automatic clearing of history. These options are all documented here.

This problem only affects people who share an account on the same computer and who use none of these privacy techniques to clear their cache. Though they might have logged out of Twitter, their direct messages will remain in their stored cache.

It is not likely that other users who later use the same Firefox profile would inadvertently access the cached direct messages. However, a user that shares the same account on the computer might be able to find and access the cache files that contain those messages.

What Users Can Do

People who don’t share accounts on their computer with anyone else can be assured that their direct messages are safe. No action is required.

People who do use shared computer accounts can clear their Firefox cache. Clearing just the browser cache using Clear Recent History will remove any Twitter direct messages.

What Website Developers Can Do

We recommend that sites carefully identify information that is private using Cache-Control: no-store.

A common misconception here is that Cache-Control: private will address this problem. The ‘private’ directive is used for shared caches, such as those provided by CDNs. Marking content as ‘private’ will not prevent browser caching.
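As a minimal sketch of that recommendation, here is what it might look like on a Node.js server built with Express; any server-side stack can set the same header:

const express = require('express');
const app = express();

app.get('/direct-messages', (req, res) => {
  // Sensitive content: instruct browsers and shared caches never to store a copy.
  res.set('Cache-Control', 'no-store');
  res.json({ messages: [] });
});

app.listen(3000);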

More generally, developers that build sites need to understand the difference between standards and observed behavior. What browsers do today can be observed and measured, but unless behavior is based on a documented standard, there is no guarantee that it will remain that way forever.

The post Twitter Direct Message Caching and Firefox appeared first on Mozilla Hacks - the Web developer blog.

Security means more with Firefox 74 https://hacks.mozilla.org/2020/03/security-means-more-with-firefox-74-2/ https://hacks.mozilla.org/2020/03/security-means-more-with-firefox-74-2/#comments Tue, 10 Mar 2020 15:13:53 +0000 https://hacks.mozilla.org/?p=45930 The release of Firefox 74 is focused on security enhancements: Feature Policy, the Cross-Origin-Resource-Policy header, and removal of TLS 1.0/1.1 support. We’ve also got some new CSS text property features, the JS optional chaining operator, and additional 2D canvas text metric features, along with the usual wealth of DevTools enhancements and bug fixes.

The post Security means more with Firefox 74 appeared first on Mozilla Hacks - the Web developer blog.

Today sees the release of Firefox number 74. The most significant new features we’ve got for you this time are security enhancements: Feature Policy, the Cross-Origin-Resource-Policy header, and removal of TLS 1.0/1.1 support. We’ve also got some new CSS text property features, the JS optional chaining operator, and additional 2D canvas text metric features, along with the usual wealth of DevTools enhancements and bug fixes.

As always, read on for the highlights, or find the full list of additions in the following articles:

Note: In the Security enhancements section below, we detail the removal of TLS 1.0/1.1 in Firefox 74. However, we reverted this change for an undetermined amount of time, to better enable access to critical government sites sharing COVID-19 information. We are keeping the information below intact because it still gives a useful picture of our future intentions. (Updated Monday, 30 March.)

Security enhancements

Let’s look at the security enhancement we’ve got in 74.

Feature Policy

We’ve finally enabled Feature Policy by default. You can now use the <iframe> allow attribute and the Feature-Policy HTTP header to set feature permissions for your top level documents and IFrames. Syntax examples follow:

<iframe src="https://example.com" allow="fullscreen"></iframe>
Feature-Policy: microphone 'none'; geolocation 'none'

CORP

We’ve also enabled support for the Cross-Origin-Resource-Policy (CORP) header, which allows web sites and applications to opt in to protection against certain cross-origin requests (such as those coming from <script> and <img> elements). This can help to mitigate speculative side-channel attacks (like Spectre and Meltdown) as well as Cross-Site Script Inclusion attacks.

The available values are same-origin and same-site. same-origin only allows requests that share the same scheme, host, and port to read the relevant resource. This provides an additional level of protection beyond the web’s default same-origin policy. same-site only allows requests that share the same site.

To use CORP, set the header to one of these values, for example:

Cross-Origin-Resource-Policy: same-site

TLS 1.0/1.1 removal

Last but not least, Firefox 74 sees the removal of TLS 1.0/1.1 support, to help raise the overall level of security of the web platform. This is vital for moving the TLS ecosystem forward, and getting rid of a number of vulnerabilities that existed as a result of TLS 1.0/1.1 not being as robust as we’d really like — they’re in need of retirement.

The change was first announced in October 2018 as a shared initiative of Mozilla, Google, Microsoft, and Apple. Now in March 2020 we are all acting on our promises (with the exception of Apple, who will be making the change slightly later on).

The upshot is that you’ll need to make sure your web server supports TLS 1.2 or 1.3 going forward. Read TLS 1.0 and 1.1 Removal Update to find out how to test and update your TLS/SSL configuration. From now on, Firefox will return a Secure Connection Failed error when connecting to servers using the older TLS versions. Upgrade now, if you haven’t already!

secure connection failed error message, due to connected server using TLS 1.0 or 1.1

Note: For a couple of release cycles (and longer for Firefox ESR), the Secure Connection Failed error page will feature an override button allowing you to Enable TLS 1.0 and 1.1 in cases where a server is not yet upgraded, but you won’t be able to rely on it for too long.

To find out more about TLS 1.0/1.1 removal and the background behind it, read It’s the Boot for TLS 1.0 and TLS 1.1.

Other web platform additions

We’ve got a host of other web platform additions for you in 74.

New CSS text features

For a start, the text-underline-position property is enabled by default. This is useful for positioning underlines set on your text in certain contexts to achieve specific typographic effects.

For example, if your text is in a horizontal writing mode, you can use text-underline-position: under; to put the underline below all the descenders, which is useful for ensuring legibility with chemical and mathematical formulas, which make frequent use of subscripts.

.horizontal {
  text-underline-position: under;
}

In text with a vertical writing-mode set, we can use values of left or right to make the underline appear to the left or right of the text as required.

.vertical {
  writing-mode: vertical-rl;
  text-underline-position: left;
}

In addition, the text-underline-offset and text-decoration-thickness properties now accept percentage values, for example:

text-decoration-thickness: 10%;

For these properties, this is a percentage of 1em in the current font’s size.

Optional chaining in JavaScript

We now have the JavaScript optional chaining operator (?.) available. When you are trying to access an object deep in a chain, this allows for implicit testing of the existence of the objects higher up in the chain, avoiding errors and the need to explicitly write testing code.

let nestedProp = obj.first?.second;
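For comparison, this is roughly the check you would otherwise write by hand, testing each level of the chain before accessing the next:

let nestedProp;
if (obj.first !== null && obj.first !== undefined) {
  nestedProp = obj.first.second;
}

The operator can also guard method calls; for example, obj.someMethod?.() only invokes someMethod if it is defined.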

New 2D canvas text metrics

The TextMetrics interface (retrieved using the CanvasRenderingContext2D.measureText() method) has been extended to contain four more properties measuring the actual bounding box — actualBoundingBoxLeft, actualBoundingBoxRight, actualBoundingBoxAscent, and actualBoundingBoxDescent.

For example:

const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
const text = ctx.measureText('Hello world');

text.width;                    // 56.08333206176758
text.actualBoundingBoxAscent;  // 8
text.actualBoundingBoxDescent; // 0
text.actualBoundingBoxLeft;    // 0
text.actualBoundingBoxRight;   // 55.733333333333334

DevTools additions

Next up, DevTools additions.

Device-like rendering in Responsive Design Mode

While Firefox for Android is being relaunched with GeckoView to be faster and more private, the DevTools need to stay ahead. Testing on mobile should be as frictionless as possible, both when using Responsive Design Mode on your desktop and on-device with Remote Debugging.

Correctness is important for Responsive Design Mode, so developers can trust the output without a device at hand. Over the past releases, we rolled out major improvements that ensure meta viewport is correctly applied with Touch Simulation. This ties in with improved device presets, which automatically enable touch simulation for mobile devices.

animated gif showing how responsive design mode now represents view meta settings better

Fun fact: The team managed to make this simulation so accurate that it has already helped to identify and fix rendering bugs for Firefox on Android.

DevTools Tip: Open Responsive Design Mode without DevTools via the tools menu or Ctrl + Shift + M on Windows/Cmd + Opt + M on macOS.

We’d love to hear about your experiences when giving your site a spin in RDM or on your Android phone with Firefox Nightly for Developers.

CSS tools that work for you

The Page Inspector’s new in-context warnings for inactive CSS rules have received a lot of positive feedback. They help you solve gnarly CSS issues while teaching you about the intricate interdependencies of CSS rules.

Since its launch, we have continued to tweak and add rules, often based on user feedback. One highlight for 74 is a new detection setting that warns you when properties depend on positioned elements – namely z-index, top, left, bottom, and right.

Firefox Page Inspector now showing inactive position-related properties such as z-index and top

Your feedback will help to further refine and expand the rules. Say hi to the team in the DevTools chat on Mozilla’s Matrix instance or follow our work via @FirefoxDevTools.

Debugging for Nested Workers

Firefox’s JavaScript Debugger team has been focused on optimizing Web Workers over the past few releases to make them easier to inspect and debug. The more developers and frameworks that use workers to move processing off the main thread, the easier it will be for browsers to prioritize running code that is fired as a result of user input actions.

Nested web workers, which allow workers to spawn and control their own worker instances, are now displayed in the Debugger:

Firefox JavaScript debugger now shows nested workers

Improved React DevTools integration

The React Developer Tools add-on is one of many developer add-ons that integrate tightly with Firefox DevTools. Thanks to the WebExtensions API, developers can create and publish add-ons for all browsers from the same codebase.

In collaboration with the React add-on maintainers, we worked to re-enable and improve the context menus in the add-on, including Go to definition. This action lets developers jump from React Components directly to their source files in the Debugger. The same functionality has already been enabled for jumping to elements in the Inspector. We want to build this out further, to make framework workflows seamless with the rest of the tools.

Early-access DevTools features in Developer Edition

Developer Edition is Firefox’s pre-release channel which gets early access to tooling and platform features. Its settings also enable more functionality for developers by default. We like to bring new features quickly to Developer Edition to gather your feedback, including the following highlights.

Instant evaluation for Console expressions

Exploring JavaScript objects, functions, and the DOM feels like magic with instant evaluation. As long as expressions typed into the Web Console are side-effect free, their results will be previewed while you type, allowing you to identify and fix errors more rapidly than before.

Async Stack Traces for Debugger & Console

Modern JavaScript code depends heavily upon stacking async/await on top of other async operations like events, promises, and timeouts. Thanks to better integration with the JavaScript engine, async execution is now captured to give a more complete picture.

Async call stacks in the Debugger let you step through events, timeouts, and promise-based function calls that are executed over time. In the Console, async stacks make it easier to find the root causes of errors.

async call stack shown in the Firefox JavaScript debugger

Sneak-peek Service Worker Debugging

This one has been in Nightly for a while, and we are more than excited to get it into your hands soon. Expect it in Firefox 76, which will become Developer Edition in 4 weeks.

The post Security means more with Firefox 74 appeared first on Mozilla Hacks - the Web developer blog.

Securing Firefox with WebAssembly https://hacks.mozilla.org/2020/02/securing-firefox-with-webassembly/ https://hacks.mozilla.org/2020/02/securing-firefox-with-webassembly/#comments Tue, 25 Feb 2020 14:04:18 +0000 https://hacks.mozilla.org/?p=34163 Protecting the security and privacy of individuals is a central tenet of Mozilla’s mission. While we continue to make extensive use of both sandboxing and Rust in Firefox to address security challenges in the browser, each has its limitations. Today we’re adding a third approach to our arsenal. RLBox, a new sandboxing technology developed by researchers at the University of California, San Diego, and the University of Texas, Austin, allows us to quickly and efficiently convert existing Firefox components to run inside a WebAssembly sandbox.

The post Securing Firefox with WebAssembly appeared first on Mozilla Hacks - the Web developer blog.

Protecting the security and privacy of individuals is a central tenet of Mozilla’s mission, and so we constantly endeavor to make our users safer online. With a complex and highly-optimized system like Firefox, memory safety is one of the biggest security challenges. Firefox is mostly written in C and C++. These languages are notoriously difficult to use safely, since any mistake can lead to complete compromise of the program. We work hard to find and eliminate memory hazards, but we’re also evolving the Firefox codebase to address these attack vectors at a deeper level. Thus far, we’ve focused primarily on two techniques:

A new approach

While we continue to make extensive use of both sandboxing and Rust in Firefox, each has its limitations. Process-level sandboxing works well for large, pre-existing components, but consumes substantial system resources and thus must be used sparingly. Rust is lightweight, but rewriting millions of lines of existing C++ code is a labor-intensive process.

Consider the Graphite font shaping library, which Firefox uses to correctly render certain complex fonts. It’s too small to put in its own process.  And yet, if a memory hazard were uncovered, even a site-isolated process architecture wouldn’t prevent a malicious font from compromising the page that loaded it. At the same time, rewriting and maintaining this kind of domain-specialized code is not an ideal use of our limited engineering resources.

So today, we’re adding a third approach to our arsenal. RLBox, a new sandboxing technology developed by researchers at the University of California, San Diego, the University of Texas, Austin, and Stanford University, allows us to quickly and efficiently convert existing Firefox components to run inside a WebAssembly sandbox. Thanks to the tireless efforts of Shravan Narayan, Deian Stefan, Tal Garfinkel, and Hovav Shacham, we’ve successfully integrated this technology into our codebase and used it to sandbox Graphite.

This isolation will ship to Linux users in Firefox 74 and to Mac users in Firefox 75, with Windows support following soon after. You can read more about this work in the press releases from UCSD and UT Austin along with the joint research paper.  Read on for a technical overview of how we integrated it into Firefox.

Building a wasm sandbox

The core implementation idea behind wasm sandboxing is that you can compile C/C++ into wasm code, and then you can compile that wasm code into native code for the machine your program actually runs on.  These steps are similar to what you’d do to run C/C++ applications in the browser, but we’re performing the wasm to native code translation ahead of time, when Firefox itself is built.  Each of these two steps relies on a significant piece of software in its own right, and we add a third step to make the sandboxing conversion more straightforward and less error prone.

First, you need to be able to compile C/C++ into wasm code.  As part of the WebAssembly effort, a wasm backend was added to Clang and LLVM.  Having a compiler is not enough, though; you also need a standard library for C/C++.  This component is provided via wasi-sdk.  With those pieces, we have enough to translate C/C++ into wasm code.

Second, you need to be able to convert the wasm code into native object files.  When we first started implementing wasm sandboxing, we were often asked, “why do you even need this step?  You could distribute the wasm code and compile it on-the-fly on the user’s machine when Firefox starts.” We could have done that, but that method requires the wasm code to be freshly compiled for every sandbox instance.  Per-sandbox compiled code is unnecessary duplication in a world where every origin resides in a separate process. Our chosen approach enables sharing compiled native code between multiple processes, resulting in significant memory savings.  This approach also improves the startup speed of the sandbox, which is important for fine-grained sandboxing, e.g. sandboxing the code associated with every font accessed or image loaded.

Ahead-of-time compilation with Cranelift and friends

This approach does not imply that we have to write our own wasm to native code compiler!  We implemented this ahead-of-time compilation using the same compiler backend that will eventually power the wasm component of Firefox’s JavaScript engine: Cranelift, via the Bytecode Alliance’s Lucet compiler and runtime.  This code sharing ensures that improvements benefit both our JavaScript engine and our wasm sandboxing compiler.  These two pieces of code currently use different versions of Cranelift for engineering reasons. As our sandboxing technology matures, however, we expect to modify them to use the exact same codebase.

Now that we’ve translated the wasm code into native object code, we need to be able to call into that sandboxed code from C++.  If the sandboxed code was running in a separate virtual machine, this step would involve looking up function names at runtime and managing state associated with the virtual machine.  With the setup above, however, sandboxed code is native compiled code that respects the wasm security model. Therefore, sandboxed functions can be called using the same mechanisms as calling regular native code.  We have to take some care to respect the different machine models involved: wasm code uses 32-bit pointers, whereas our initial target platform, x86-64 Linux, uses 64-bit pointers. But there are other hurdles to overcome, which leads us to the final step of the conversion process.

Getting sandboxing correct

Calling sandboxed code with the same mechanisms as regular native code is convenient, but it hides an important detail.  We cannot trust anything coming out of the sandbox, as an adversary may have compromised the sandbox.

For instance, for a sandboxed function:

/* Returns values between zero and sixteen.  */

int return_the_value();

We cannot guarantee that this sandboxed function follows its contract.  Therefore, we need to ensure that the returned value falls in the range that we expect.

Similarly, for a sandboxed function returning a pointer:

extern const char* do_the_thing();

We cannot guarantee that the returned pointer actually points to memory controlled by the sandbox.  An adversary may have forced the returned pointer to point somewhere in the application outside of the sandbox.  Therefore, we validate the pointer before using it.

There are additional runtime constraints that are not obvious from reading the source.  For instance, the pointer returned above may point to dynamically allocated memory from the sandbox.  In that case, the pointer should be freed by the sandbox, not by the host application. We could rely on developers to always remember which values are application values and which values are sandbox values.  Experience has shown that approach is not feasible.

Tainted data

The above two examples point to a general principle: data returned from the sandbox should be specifically identified as such.  With this identification in hand, we can ensure the data is handled in appropriate ways.

We label data associated with the sandbox as “tainted”.  Tainted data can be freely manipulated (e.g. pointer arithmetic, accessing fields) to produce more tainted data.  But when we convert tainted data to non-tainted data, we want those operations to be as explicit as possible. Taintedness is valuable not just for managing memory returned from the sandbox.  It’s also valuable for identifying data returned from the sandbox that may need additional verification, e.g. indices pointing into some external array.

We therefore model all exposed functions from the sandbox as returning tainted data.  Such functions also take tainted data as arguments, because anything they manipulate must belong to the sandbox in some way.  Once function calls have this interface, the compiler becomes a taintedness checker. Compiler errors will occur when tainted data is used in contexts that want untainted data, or vice versa.  These contexts are precisely the places where tainted data needs to be propagated and/or data needs to be validated. RLBox handles all the details of tainted data and provides features that make incremental conversion of a library’s interface to a sandboxed interface straightforward.
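RLBox itself is a C++ library, but the pattern translates to any language. Here is a toy JavaScript sketch of the idea, with all names invented for illustration:

class Tainted {
  #value;
  constructor(value) { this.#value = value; }

  // Manipulating tainted data only ever produces more tainted data.
  map(fn) { return new Tainted(fn(this.#value)); }

  // The single escape hatch: untainting requires an explicit verifier.
  verify(check) {
    if (!check(this.#value)) {
      throw new Error('sandbox returned a value outside its contract');
    }
    return this.#value;
  }
}

// Pretend this value came back from the sandboxed return_the_value() above.
const result = new Tainted(17);
const safe = result.verify((v) => v >= 0 && v <= 16); // throws: 17 breaks the contract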

Next Steps

With the core infrastructure for wasm sandboxing in place, we can focus on increasing its impact across the Firefox codebase – both by bringing it to all of our supported platforms, and by applying it to more components. Since this technique is lightweight and easy to use, we expect to make rapid progress sandboxing more parts of Firefox in the coming months. We’re focusing our initial efforts on third-party libraries bundled with Firefox.  Such libraries generally have well-defined entry points and don’t pervasively share memory with the rest of the system. In the future, however, we also plan to apply this technology to first-party code.

Acknowledgements

We are deeply grateful for the work of our research partners at UCSD, UT Austin, and Stanford, who were the driving force behind this effort. We’d also like to extend a special thanks to our partners at the Bytecode Alliance – particularly the engineering team at Fastly, who developed Lucet and helped us extend its capabilities to make this project possible.

The post Securing Firefox with WebAssembly appeared first on Mozilla Hacks - the Web developer blog.

It’s the Boot for TLS 1.0 and TLS 1.1 https://hacks.mozilla.org/2020/02/its-the-boot-for-tls-1-0-and-tls-1-1/ https://hacks.mozilla.org/2020/02/its-the-boot-for-tls-1-0-and-tls-1-1/#comments Thu, 06 Feb 2020 14:35:56 +0000 https://hacks.mozilla.org/?p=34135 The Transport Layer Security (TLS) protocol is the de facto means for establishing security on the Web. The newest version, TLS 1.3, improves efficiency and remedies the flaws and weaknesses present in earlier versions. In October 2018, we announced our plans regarding TLS 1.0 and TLS 1.1 deprecation. Now's the time for us to make this change together and move the TLS ecosystem forward.

The post It’s the Boot for TLS 1.0 and TLS 1.1 appeared first on Mozilla Hacks - the Web developer blog.

Editor’s Update: June 24, 11:40am PDT – We will be moving ahead with disabling TLS 1.0 and TLS 1.1 by default in Firefox 78, releasing June 30th. If you see a “Secure Connection Failed” message as displayed in the post below, then hit the button to re-enable TLS 1.0 and TLS 1.1. You should only need to hit this button once, the change will be global.

Earlier Update: March 23, 10:43am PDT – We have re-enabled TLS 1.0 and 1.1 in Firefox 74 and 75 Beta to better enable access to sites sharing critical and important information during this time.

Coming to a Firefox near you in March

The Transport Layer Security (TLS) protocol is the de facto means for establishing security on the Web. The protocol has a long and colourful history, starting with its inception as the Secure Sockets Layer (SSL) protocol in the early 1990s, right up until the recent release of the jazzier (read faster and safer) TLS 1.3. The need for a new version of the protocol was born out of a desire to improve efficiency and to remedy the flaws and weaknesses present in earlier versions, specifically in TLS 1.0 and TLS 1.1. See the BEAST, CRIME and POODLE attacks, for example.

With limited support for newer, more robust cryptographic primitives and cipher suites, it doesn’t look good for TLS 1.0 and TLS 1.1. With the safer TLS 1.2 and TLS 1.3 at our disposal to adequately protect web traffic, it’s time to move the TLS ecosystem into a new era, namely one which doesn’t support weak versions of TLS by default. This has been the abiding sentiment of browser vendors – Mozilla, Google, Apple and Microsoft have committed to disabling TLS 1.0 and TLS 1.1 as default options for secure connections. In other words, browser clients will aim to establish a connection using TLS 1.2 or higher. For more on the rationale behind this decision, see our earlier blog post on the subject.

What does this look like in Firefox?

We deployed this in Firefox Nightly, the experimental version of our browser, towards the end of 2019. It is now also available in Firefox Beta 73. In Firefox, this means that the minimum TLS version allowable by default is TLS 1.2. This has been executed in code by setting security.tls.version.min=3, a preference indicating the minimum TLS version supported. Previously, this value was set to 1. If you’re connecting to sites that support TLS 1.2 and up, you shouldn’t notice any connection errors caused by TLS version mismatches.

What if a site only supports lower versions of TLS?

In cases where only lower versions of TLS are supported, i.e., when the more secure TLS 1.2 and TLS 1.3 versions cannot be negotiated, we allow for a fallback to TLS 1.0 or TLS 1.1 via an override button. As a Firefox user, if you find yourself in this position, you’ll see this:

screenshot showing "Secure Connection Failed" message that allows user to override the TLS 1.0 and 1.1 deprecation

As a user, you will have to actively initiate this override. But the override button offers you a choice. You can, of course, choose not to connect to sites that don’t offer you the best possible security.

This isn’t ideal for website operators. We would like to encourage operators to upgrade their servers so as to offer users a secure experience on the Web. We announced our plans regarding TLS 1.0 and TLS 1.1 deprecation over a year ago, in October 2018, and now the time has come to make this change. Let’s work together to move the TLS ecosystem forward.

Deprecation timeline

We plan to monitor telemetry over two Firefox Beta cycles, and then we’re going to let this change ride to Firefox Release. So, expect Firefox 74 to offer TLS 1.2 as its minimum version for secure connections when it ships on 10 March 2020. We plan to keep the override button for now; the telemetry we’re collecting will tell us more about how often this button is used. These results will then inform our decision regarding when to remove the button entirely. It’s unlikely that the button will stick around for long. We’re committed to completely eradicating weak versions of TLS because at Mozilla we believe that user security should not be treated as optional.

Again, we would like to stress the importance of upgrading web servers over the coming months, as we bid farewell to TLS 1.0 and TLS 1.1. R.I.P, you’ve served us well.

The post It’s the Boot for TLS 1.0 and TLS 1.1 appeared first on Mozilla Hacks - the Web developer blog.

TLS 1.0 and 1.1 Removal Update https://hacks.mozilla.org/2019/05/tls-1-0-and-1-1-removal-update/ https://hacks.mozilla.org/2019/05/tls-1-0-and-1-1-removal-update/#comments Wed, 15 May 2019 14:01:57 +0000 https://hacks.mozilla.org/?p=33453 As you may have read last year, Safari, Firefox, Edge and Chrome browsers are removing support for TLS 1.0 and 1.1 in March of 2020. That means there’s less than a year to enable TLS 1.2 (and, ideally, 1.3) on your servers, otherwise all major browsers will display error pages, rather than the content your users came to see.

The post TLS 1.0 and 1.1 Removal Update appeared first on Mozilla Hacks - the Web developer blog.

tl;dr Enable support for Transport Layer Security (TLS) 1.2 today!

 

Editor’s Note: We updated this post on July 1, 2019 to mention the newly updated SSL Configuration Generator. This service from Mozilla provides boilerplate SSL configurations for the most popular server software setups, with multiple TLS compatibility options. It’s a great starting place for updating your existing server configs, or for standing up new servers.

As you may have read last year in the original announcement posts, Safari, Firefox, Edge and Chrome are removing support for TLS 1.0 and 1.1 in March of 2020. If you manage websites, this means there’s less than a year to enable TLS 1.2 (and, ideally, 1.3) on your servers, otherwise all major browsers will display error pages, rather than the content your users were expecting to find.

Screenshot of a Secure Connection Failed error page

In this article we provide some resources to check your sites’ readiness, and start planning for a TLS 1.2+ world in 2020.

Check the TLS “Carnage” list

Once a week, the Mozilla Security team runs a scan on the Tranco list (a research-focused top sites list) and generates a list of sites still speaking TLS 1.0 or 1.1, without supporting TLS ≥ 1.2.

Tranco list top sites with TLS <= 1.1

As of this week, there are just over 8,000 affected sites from the one million listed by Tranco.

There are a few potential gotchas to be aware of, if you do find your site on this list:

  • 4% of the sites are using TLS ≤ 1.1 to redirect from a bare domain (https://example.com) to www (https://www.example.com) on TLS ≥ 1.2 (or vice versa). If you were to only check your site post-redirect, you might miss a potential footgun.
  • 2% of the sites don’t redirect from bare to www (or vice versa), but do support TLS ≥ 1.2 on one of them.

The vast majority (94%), however, are just bad—it’s TLS ≤ 1.1 everywhere.

If you find that a site you work on is in the TLS “Carnage” list, you need to come up with a plan for enabling TLS 1.2 (and 1.3, if possible). However, this list only covers 1 million sites. Depending on how popular your site is, you might have some work to do even if you’re not listed by Tranco.
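If you’d rather check a host directly, a quick probe with Node’s tls module will show which protocol versions a server is willing to negotiate. This is a rough sketch; note that very old protocol versions may also be refused by your local OpenSSL configuration, independent of the server:

const tls = require('tls');

function probe(host, minVersion, maxVersion) {
  return new Promise((resolve) => {
    const socket = tls.connect(
      { host, port: 443, servername: host, minVersion, maxVersion },
      () => {
        resolve(socket.getProtocol()); // e.g. "TLSv1.2"
        socket.end();
      }
    );
    socket.on('error', () => resolve(null)); // handshake refused
  });
}

(async () => {
  // Check the bare domain and the www host separately to catch redirect gotchas.
  for (const host of ['example.com', 'www.example.com']) {
    console.log(host, 'TLS 1.2+:', await probe(host, 'TLSv1.2', 'TLSv1.3'));
    console.log(host, 'TLS <= 1.1 accepted:', await probe(host, 'TLSv1', 'TLSv1.1'));
  }
})();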

Run an online test

Even if you’re not on the “Carnage” list, it’s a good idea to test your servers all the same. There are a number of online services that will do some form of TLS version testing for you, but only a few will clearly flag a lack of support for modern TLS versions. We recommend using one or more of the following:

Check developer tools

Another way to do this is to open up Firefox (versions 68+) or Chrome (versions 72+) DevTools, and look for the following warnings in the console as you navigate around your site.

Firefox DevTools console warning

Chrome DevTools console warning

Update your SSL Configuration

Now that you know which servers need to be updated, it’s time to start the work.

Mozilla maintains an SSL Configuration Generator service that provides boilerplate SSL configurations for the most popular server software setups, with multiple TLS compatibility options. It’s a great starting place for updating your existing server configs, or for standing up new servers.

What’s Next?

This October, we plan on disabling old TLS in Firefox Nightly, and you can expect the same for Chrome and Edge Canaries. We hope this will give sites enough time to upgrade before the change reaches users on the release channels.

The post TLS 1.0 and 1.1 Removal Update appeared first on Mozilla Hacks - the Web developer blog.

Implications of Rewriting a Browser Component in Rust https://hacks.mozilla.org/2019/02/rewriting-a-browser-component-in-rust/ https://hacks.mozilla.org/2019/02/rewriting-a-browser-component-in-rust/#comments Thu, 28 Feb 2019 14:10:27 +0000 https://hacks.mozilla.org/?p=33198 There have been 69 security bugs in Firefox’s style component since the browser was first released in 2002. If we'd had a time machine and could have written this component in Rust from the start, 51 (73.9%) of these bugs would not have been possible. Rust isn't foolproof, but by removing the burden of memory safety, Rust lets programmers focus on logical correctness and soundness.

The post Implications of Rewriting a Browser Component in Rust appeared first on Mozilla Hacks - the Web developer blog.

The previous posts in this Fearless Security series examine memory safety and thread safety in Rust. This closing post uses the Quantum CSS project as a case study to explore the real world impact of rewriting code in Rust.

The style component is the part of a browser that applies CSS rules to a page. This is a top-down process on the DOM tree: given the parent style, the styles of children can be calculated independently—a perfect use-case for parallel computation. By 2017, Mozilla had made two previous attempts to parallelize the style system using C++. Both had failed.

Quantum CSS resulted from a need to improve page performance. Improving security is a happy byproduct.

Rewriting code to make it faster also makes it more secure

There’s a large overlap between memory safety violations and security-related bugs, so we expected this rewrite to reduce the attack surface in Firefox. In this post, I will summarize the potential security vulnerabilities that have appeared in the styling code since Firefox’s initial release in 2002. Then I’ll look at what could and could not have been prevented by using Rust.

Over the course of its lifetime, there have been 69 security bugs in Firefox’s style component. If we’d had a time machine and could have written this component in Rust from the start, 51 (73.9%) of these bugs would not have been possible. While Rust makes it easier to write better code, it’s not foolproof.

Rust

Rust is a modern systems programming language that is type- and memory-safe. As a side effect of these safety guarantees, Rust programs are also known to be thread-safe at compile time. Thus, Rust can be a particularly good choice when:

✅ processing untrusted input safely.
✅ introducing parallelism to improve performance.
✅ integrating isolated components into an existing codebase.

However, there are classes of bugs that Rust explicitly does not address—particularly correctness bugs. In fact, during the Quantum CSS rewrite, engineers accidentally reintroduced a critical security bug that had previously been patched in the C++ code, regressing the fix for bug 641731. This allowed global history leakage via SVG image documents, resulting in bug 1420001. As a trivial history-stealing bug, this is rated security-high. The original fix was an additional check to see if the SVG document was being used as an image. Unfortunately, this check was overlooked during the rewrite.

While there were automated tests intended to catch :visited rule violations like this, in practice, they didn’t detect this bug. To speed up our automated tests, we temporarily turned off the mechanism that tested this feature—tests aren’t particularly useful if they aren’t run. The risk of re-implementing logic errors can be mitigated by good test coverage (and actually running the tests). There’s still a danger of introducing new logic errors.

As developer familiarity with the Rust language increases, best practices will improve. Code written in Rust will become even more secure. While it may not prevent all possible vulnerabilities, Rust eliminates an entire class of the most severe bugs.

Quantum CSS Security Bugs

Overall, bugs related to memory, bounds, null/uninitialized variables, or integer overflow would be prevented by default in Rust. The remaining miscellaneous bug, a crash caused by a failed allocation, would not have been prevented.

Security bugs by category

All of the bugs in this analysis are related to security, but only 43 received official security classifications. (These are assigned by Mozilla’s security engineers based on educated “exploitability” guesses.) Normal bugs might indicate missing features or problems like crashes. While undesirable, crashes don’t result in data leakage or behavior modification. Official security bugs can range from low severity (highly limited in scope) to critical vulnerability (might allow an attacker to run arbitrary code on the user’s platform).

There’s a significant overlap between memory vulnerabilities and severe security problems. Of the 34 critical/high bugs, 32 were memory-related.

Security rated bug breakdown

Comparing Rust and C++ code

Bug 955914 is a heap buffer overflow in the GetCustomPropertyNameAt function. The code used the wrong variable for indexing, which resulted in interpreting memory past the end of the array. This could either crash while accessing a bad pointer or copy memory to a string that is passed to another component.

The ordering of all CSS properties (both longhand and custom) is stored in an array, mOrder. Each element is either represented by its CSS property value or, in the case of custom properties, by a value that starts at eCSSProperty_COUNT (the total number of non-custom CSS properties). To retrieve the name of a custom property, first, you have to retrieve the custom property value from mOrder, then access the name at the corresponding index of the mVariableOrder array, which stores the custom property names in order.

Vulnerable C++ code:

    void GetCustomPropertyNameAt(uint32_t aIndex, nsAString& aResult) const {
      MOZ_ASSERT(mOrder[aIndex] >= eCSSProperty_COUNT);

      aResult.Truncate();
      aResult.AppendLiteral("var-");
      aResult.Append(mVariableOrder[aIndex]);
    }

The problem occurs in the final Append call, where aIndex is used to access an element of the mVariableOrder array. aIndex is intended for use with the mOrder array, not the mVariableOrder array. The correct index into mVariableOrder for the custom property represented by aIndex is actually mOrder[aIndex] - eCSSProperty_COUNT.

Fixed C++ code:

    void GetCustomPropertyNameAt(uint32_t aIndex, nsAString& aResult) const {
      MOZ_ASSERT(mOrder[aIndex] >= eCSSProperty_COUNT);

      uint32_t variableIndex = mOrder[aIndex] - eCSSProperty_COUNT;
      aResult.Truncate();
      aResult.AppendLiteral("var-");
      aResult.Append(mVariableOrder[variableIndex]);
    }

Equivalent Rust code

While Rust is similar to C++ in some ways, idiomatic Rust uses different abstractions and data structures. Rust code will look very different from C++ (see below for details). First, let’s consider what would happen if we translated the vulnerable code as literally as possible:

    fn GetCustomPropertyNameAt(&self, aIndex: usize) -> String {
        assert!(self.mOrder[aIndex] >= self.eCSSProperty_COUNT);

        let mut result = "var-".to_string();
        result += &self.mVariableOrder[aIndex];
        result
    }

The Rust compiler would accept the code: unlike arrays, whose length must be known at compile time, the Vec type in Rust is dynamically sized, so an invalid index can't be rejected before runtime. However, the standard library vector implementation has built-in bounds checking. When an invalid index is used, the program immediately terminates in a controlled fashion (a panic), preventing any illegal access.
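
A minimal sketch of that behavior (the values are arbitrary, and this is plain standard-library code rather than anything from Quantum CSS):

    fn main() {
        let order = vec![10, 20, 30];

        // An out-of-bounds index compiles, but panics at runtime with a
        // controlled "index out of bounds" error instead of reading memory
        // past the end of the allocation:
        // let oops = order[5];

        // The non-panicking alternative returns an Option, forcing the
        // caller to handle the invalid-index case explicitly.
        match order.get(5) {
            Some(value) => println!("found {}", value),
            None => println!("index 5 is out of bounds"),
        }
    }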

The actual code in Quantum CSS uses very different data structures, so there’s no exact equivalent. For example, we use Rust’s powerful built-in data structures to unify the ordering and property name data. This allows us to avoid having to maintain two independent arrays. Rust data structures also improve data encapsulation and reduce the likelihood of these kinds of logic errors. Because the code needs to interact with C++ code in other parts of the browser engine, the new GetCustomPropertyNameAt function doesn’t look like idiomatic Rust code. It still offers all of the safety guarantees while providing a more understandable abstraction of the underlying data.
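
As a rough sketch of what such unification can look like (hypothetical types, not the actual Servo code), storing both kinds of property in a single vector makes the mismatched-index bug from the C++ version unrepresentable:

    // Hypothetical sketch: one entry per declared property, so there is
    // no second array to index incorrectly.
    enum PropertyEntry {
        Standard(u32),   // a built-in CSS property id
        Custom(String),  // a custom property name, stored inline
    }

    struct Declarations {
        order: Vec<PropertyEntry>,
    }

    impl Declarations {
        // Returns None for a bad index or a non-custom property, instead
        // of reading past the end of an array.
        fn custom_property_name_at(&self, index: usize) -> Option<&str> {
            match self.order.get(index)? {
                PropertyEntry::Custom(name) => Some(name.as_str()),
                PropertyEntry::Standard(_) => None,
            }
        }
    }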

tl;dr

Due to the overlap between memory safety violations and security-related bugs, we can say that Rust code should result in fewer critical CVEs (Common Vulnerabilities and Exposures). However, even Rust is not foolproof. Developers still need to be aware of correctness bugs and data leakage attacks. Code review, testing, and fuzzing still remain essential for maintaining secure libraries.

Compilers can’t catch every mistake that programmers can make. However, Rust has been designed to remove the burden of memory safety from our shoulders, allowing us to focus on logical correctness and soundness instead.

The post Implications of Rewriting a Browser Component in Rust appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2019/02/rewriting-a-browser-component-in-rust/feed/ 16 33198
Fearless Security: Thread Safety https://hacks.mozilla.org/2019/02/fearless-security-thread-safety/ https://hacks.mozilla.org/2019/02/fearless-security-thread-safety/#comments Thu, 14 Feb 2019 15:48:10 +0000 https://hacks.mozilla.org/?p=33181 Multithreading allows programs to do more faster, but adds synchronization bugs and attacks. Programming languages have evolved different concurrency strategies to help developers manage both the performance and security challenges of multi-threaded applications. Diane Hosfelt explores the challenges of thread safety, and the approach that Rust takes.

The post Fearless Security: Thread Safety appeared first on Mozilla Hacks - the Web developer blog.

]]>
In Part 2 of my three-part Fearless Security series, I’ll explore thread safety.

Today’s applications are multi-threaded—instead of sequentially completing tasks, a program uses threads to perform multiple tasks simultaneously. We all use concurrency and parallelism every day:

  • Web sites serve multiple simultaneous users.
  • User interfaces perform background work that doesn’t interrupt the user. (Imagine if your application froze each time you typed a character because it was spell-checking).
  • Multiple applications can run at the same time on a computer.

While this allows programs to do more faster, it comes with a set of synchronization problems, namely deadlocks and data races. From a security standpoint, why do we care about thread safety? Memory safety bugs and thread safety bugs have the same core problem: invalid resource use. Concurrency attacks can lead to similar consequences as memory attacks, including privilege escalation, arbitrary code execution (ACE), and bypassing security checks.

Concurrency bugs, like implementation bugs, are closely related to program correctness. While memory vulnerabilities are nearly always dangerous, implementation/logic bugs don’t always indicate a security concern, unless they occur in the part of the code that deals with ensuring security contracts are upheld (e.g. allowing a security check bypass). However, while security problems stemming from logic errors often occur near the error in sequential code, concurrency bugs often happen in different functions from their corresponding vulnerability, making them difficult to trace and resolve. Another complication is the overlap between mishandling memory and concurrency flaws, which we see in data races.

Programming languages have evolved different concurrency strategies to help developers manage both the performance and security challenges of multi-threaded applications.

Problems with concurrency

It’s a common axiom that parallel programming is hard—our brains are better at sequential reasoning. Concurrent code can have unexpected and unwanted interactions between threads, including deadlocks, race conditions, and data races.

A deadlock occurs when multiple threads are each waiting on the other to take some action in order to proceed, leading to the threads becoming permanently blocked. While this is undesirable behavior and could cause a denial of service attack, it wouldn’t cause vulnerabilities like ACE.

A race condition is a situation in which the timing or ordering of tasks can affect the correctness of a program, while a data race happens when multiple threads attempt to concurrently access the same location in memory and at least one of those accesses is a write. There’s a lot of overlap between data races and race conditions, but they can also occur independently. There are no benign data races.

Potential consequences of concurrency bugs:

  1. Deadlock
  2. Information loss: another thread overwrites information
  3. Integrity loss: information from multiple threads is interlaced
  4. Loss of liveness: performance problems resulting from uneven access to shared resources

The best-known type of concurrency attack is called a TOCTOU (time of check to time of use) attack, which is a race condition between checking a condition (like a security credential) and using the results. TOCTOU attacks are examples of integrity loss.
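
As a concrete sketch of the pattern (a hypothetical file-system example, not one from the paper cited below), the vulnerability lives in the gap between two calls:

    use std::fs::{self, File};
    use std::io;

    // TOCTOU-prone: the path can be swapped (e.g. for a symlink to a
    // sensitive file) between the metadata check and the open.
    fn open_if_regular_file(path: &str) -> io::Result<File> {
        let metadata = fs::metadata(path)?; // time of check
        if !metadata.is_file() {
            return Err(io::Error::new(io::ErrorKind::Other, "not a regular file"));
        }
        File::open(path) // time of use: the race window sits between these calls
    }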

Deadlocks and loss of liveness are considered performance problems, not security issues, while information and integrity loss are both more likely to be security-related. This paper from Red Balloon Security examines some exploitable concurrency errors. One example is a pointer corruption that allows privilege escalation or remote execution—a function that loads a shared ELF (Executable and Linkable Format) library holds a semaphore correctly the first time it’s called, but the second time it doesn’t, enabling kernel memory corruption. This attack is an example of information loss.

The trickiest part of concurrent programming is testing and debugging—concurrency bugs have poor reproducibility. Event timings, operating system decisions, network traffic, etc. can all cause different behavior each time you run a program that has a concurrency bug.

Not only can behavior change each time we run a concurrent program, but inserting print or debugging statements can also modify the behavior, causing heisenbugs (nondeterministic, hard to reproduce bugs that are common in concurrent programming) to mysteriously disappear. These operations are slow compared to others and change message interleaving and event timing accordingly.

Concurrent programming is hard. Predicting how concurrent code interacts with other concurrent code is difficult to do. When bugs appear, they’re difficult to find and fix. Instead of relying on programmers to worry about this, let’s look at ways to design programs and use languages to make it easier to write concurrent code.

First, we need to define what “threadsafe” means:

“A data type or static method is threadsafe if it behaves correctly when used from multiple threads, regardless of how those threads are executed, and without demanding additional coordination from the calling code.” MIT

How programming languages manage concurrency

In languages that don’t statically enforce thread safety, programmers must remain constantly vigilant when interacting with memory that can be shared with another thread and could change at any time. In sequential programming, we’re taught to avoid global variables in case another part of code has silently modified them. Like manual memory management, requiring programmers to safely mutate shared data is problematic.

Generally, programming languages are limited to two approaches for managing safe concurrency:

  1. Confining mutability or limiting sharing
  2. Manual thread safety (e.g. locks, semaphores)

Languages that limit threading either confine mutable variables to a single thread or require that all shared variables be immutable. Both approaches eliminate the core problem of data races—improperly mutating shared data—but this can be too limiting. To solve this, languages have introduced low-level synchronization primitives like mutexes. These can be used to build threadsafe data structures.
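
In Rust, for instance, a minimal sketch of the mutex approach looks like this (the counter and thread count are arbitrary):

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Arc provides shared ownership across threads; Mutex serializes
        // access to the data it wraps.
        let counter = Arc::new(Mutex::new(0));
        let mut handles = Vec::new();

        for _ in 0..4 {
            let counter = Arc::clone(&counter);
            handles.push(thread::spawn(move || {
                // The lock must be acquired before the data can be touched,
                // so unsynchronized access is impossible by construction.
                *counter.lock().unwrap() += 1;
            }));
        }

        for handle in handles {
            handle.join().unwrap();
        }
        println!("final count: {}", *counter.lock().unwrap());
    }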

Python and the global interpreter lock

The reference implementation of Python, CPython, has a mutex called the Global Interpreter Lock (GIL), which only allows a single thread to access a Python object. Multi-threaded Python is notorious for being inefficient because of the time spent waiting to acquire the GIL. Instead, most parallel Python programs use multiprocessing, meaning each process has its own GIL.

Java and runtime exceptions

Java is designed to support concurrent programming via a shared-memory model. Each thread has its own execution path, but is able to access any object in the program—it’s up to the programmer to synchronize accesses between threads using Java built-in primitives.

While Java has the building blocks for creating thread-safe programs, thread safety is not guaranteed by the compiler (unlike memory safety). If an unsynchronized memory access occurs (aka a data race), then Java will raise a runtime exception—however, this still relies on programmers appropriately using concurrency primitives.

C++ and the programmer’s brain

While Python avoids data races by synchronizing everything with the GIL, and Java raises runtime exceptions if it detects a data race, C++ relies on programmers to manually synchronize memory accesses. Prior to C++11, the standard library did not include concurrency primitives.

Most programming languages provide programmers with the tools to write thread-safe code, and post hoc methods exist for detecting data races and race conditions; however, this does not result in any guarantees of thread safety or data race freedom.

How does Rust manage concurrency?

Rust takes a multi-pronged approach to eliminating data races, using ownership rules and type safety to guarantee data race freedom at compile time.

The first post of this series introduced ownership—one of the core concepts of Rust. Each variable has a unique owner and can either be moved or borrowed. If a different thread needs to modify a resource, then we can transfer ownership by moving the variable to the new thread.

Moving enforces exclusion, allowing multiple threads to write to the same memory, but never at the same time. Since an owner is confined to a single thread, what happens if another thread borrows a variable?

In Rust, you can have either one mutable borrow or as many immutable borrows as you want. You can never simultaneously have a mutable borrow and an immutable borrow (or multiple mutable borrows). When we talk about memory safety, this ensures that resources are freed properly, but when we talk about thread safety, it means that only one thread can ever modify a variable at a time. Furthermore, we know that no other threads will try to reference an out of date borrow—borrowing enforces either sharing or writing, but never both.
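
A small sketch of these borrowing rules in action; the last line, if uncommented, is rejected at compile time because it would overlap a mutable borrow with an immutable one:

    fn main() {
        let mut data = vec![1, 2, 3];

        let shared_a = &data; // any number of immutable borrows is fine
        let shared_b = &data;
        println!("{:?} {:?}", shared_a, shared_b);

        let exclusive = &mut data; // one mutable borrow is fine on its own
        exclusive.push(4);

        // error[E0502]: cannot borrow `data` as mutable because it is
        // also borrowed as immutable
        // println!("{:?} {:?}", shared_a, exclusive);
    }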

Ownership was designed to mitigate memory vulnerabilities. It turns out that it also prevents data races.

While many programming languages have methods to enforce memory safety (like reference counting and garbage collection), they usually rely on manual synchronization or prohibitions on concurrent sharing to prevent data races. Rust’s approach addresses both kinds of safety by attempting to solve the core problem of identifying valid resource use and enforcing that validity during compilation.

Either one mutable borrow or infinitely many immutable borrows

But wait! There’s more!

The ownership rules prevent multiple threads from writing to the same memory and disallow simultaneous sharing between threads and mutability, but this doesn’t necessarily provide thread-safe data structures. Every data structure in Rust is either thread-safe or it’s not. This is communicated to the compiler using the type system.

A well-typed program can’t go wrong. Robin Milner, 1978

In programming languages, type systems describe valid behaviors. In other words, a well-typed program is well-defined. As long as our types are expressive enough to capture our intended meaning, then a well-typed program will behave as intended.

Rust is a type safe language—the compiler verifies that all types are consistent. For example, the following code would not compile:

    let mut x = "I am a string";
    x = 6;
    error[E0308]: mismatched types
     --> src/main.rs:6:5
      |
    6 | x = 6; //
      |     ^ expected &str, found integral variable
      |
      = note: expected type `&str`
                 found type `{integer}`

All variables in Rust have a type—often, they’re implicit. We can also define new types and describe what capabilities a type has using the trait system. Traits provide an interface abstraction in Rust. Two important built-in traits are Send and Sync, which are exposed by default by the Rust compiler for every type in a Rust program:

  • Send indicates that a struct may safely be sent between threads (required for an ownership move)
  • Sync indicates that a struct may safely be shared between threads

This example is a simplified version of the standard library code that spawns threads:

    fn spawn<Closure: Fn() + Send>(closure: Closure) { ... }

    let x = std::rc::Rc::new(6);
    spawn(move || { x; });

The spawn function takes a single argument, closure, and requires that closure has a type that implements the Send and Fn traits. When we try to spawn a thread and pass a closure value that makes use of the variable x, the compiler rejects the program for not fulfilling these requirements with the following error:

    error[E0277]: `std::rc::Rc<i32>` cannot be sent between threads safely
     --> src/main.rs:8:1
      |
    8 | spawn(move || { x; });
      | ^^^^^ `std::rc::Rc<i32>` cannot be sent between threads safely
      |
      = help: within `[closure@src/main.rs:8:7: 8:21 x:std::rc::Rc<i32>]`, the trait `std::marker::Send` is not implemented for `std::rc::Rc<i32>`
      = note: required because it appears within the type `[closure@src/main.rs:8:7: 8:21 x:std::rc::Rc<i32>]`
    note: required by `spawn`

The Send and Sync traits allow the Rust type system to reason about what data may be shared. By including this information in the type system, thread safety becomes type safety. Instead of relying on documentation, thread safety is part of the compiler’s law.

This allows programmers to be opinionated about what can be shared between threads, and the compiler will enforce those opinions.

While many programming languages provide tools for concurrent programming, preventing data races is a difficult problem. Requiring programmers to reason about complex instruction interleaving and interaction between threads leads to error prone code. While thread safety and memory safety violations share similar consequences, traditional memory safety mitigations like reference counting and garbage collection don’t prevent data races. In addition to statically guaranteeing memory safety, Rust’s ownership model prevents unsafe data modification and sharing across threads, while the type system propagates and enforces thread safety at compile time.
Pikachu finally discovers fearless concurrency with Rust

The post Fearless Security: Thread Safety appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2019/02/fearless-security-thread-safety/feed/ 1 33181
Fearless Security: Memory Safety https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/ https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/#comments Wed, 23 Jan 2019 15:00:57 +0000 https://hacks.mozilla.org/?p=33087 Memory safety violations leave programs vulnerable to security threats like unintentional data leakage and remote code execution. There are ways to ensure memory safety, including smart pointers and garbage collection. Research engineer Diane Hosfelt explains how Rust’s ownership system achieves memory safety while minimizing performance costs.

The post Fearless Security: Memory Safety appeared first on Mozilla Hacks - the Web developer blog.

]]>
Fearless Security

Last year, Mozilla shipped Quantum CSS in Firefox, which was the culmination of 8 years of investment in Rust, a memory-safe systems programming language, and over a year of rewriting a major browser component in Rust. Until now, all major browser engines have been written in C++, mostly for performance reasons. However, with great performance comes great (memory) responsibility: C++ programmers have to manually manage memory, which opens a Pandora’s box of vulnerabilities. Rust not only prevents these kinds of errors, but the techniques it uses to do so also prevent data races, allowing programmers to reason more effectively about parallel code.

With great performance comes great memory responsibility

In the coming weeks, this three-part series will examine memory safety and thread safety, and close with a case study of the potential security benefits gained from rewriting Firefox’s CSS engine in Rust.

What Is Memory Safety

When we talk about building secure applications, we often focus on memory safety. Informally, this means that in all possible executions of a program, there is no access to invalid memory. Violations include:

  • use after free
  • null pointer dereference
  • using uninitialized memory
  • double free
  • buffer overflow

For a more formal definition, see Michael Hicks’ What is memory safety post and The Meaning of Memory Safety, a paper that formalizes memory safety.

Memory violations like these can cause programs to crash unexpectedly and can be exploited to alter intended behavior. Potential consequences of a memory-related bug include information leakage, arbitrary code execution, and remote code execution.

Managing Memory

Memory management is crucial to both the performance and the security of applications. This section will discuss the basic memory model. One key concept is pointers. A pointer is a variable that stores a memory address. If we visit that memory address, there will be some data there, so we say that the pointer is a reference to (or points to) that data. Just like a home address shows people where to find you, a memory address shows a program where to find data.

Everything in a program is located at a particular memory address, including code instructions. Pointer misuse can cause serious security vulnerabilities, including information leakage and arbitrary code execution.

Allocation/free

When we create a variable, the program needs to allocate enough space in memory to store the data for that variable. Since the memory owned by each process is finite, we also need some way of reclaiming resources (or freeing them). When memory is freed, it becomes available to store new data, but the old data can still exist until it is overwritten.

Buffers

A buffer is a contiguous area of memory that stores multiple instances of the same data type. For example, the phrase “My cat is Batman” would be stored in a 16-byte buffer. Buffers are defined by a starting memory address and a length; because the data stored in memory next to a buffer could be unrelated, it’s important to ensure we don’t read or write past the buffer boundaries.

Control Flow

Programs are composed of subroutines, which are executed in a particular order. At the end of a subroutine, the computer follows a stored pointer (called the return address) to the next part of code that should be executed. When we jump to the return address, one of three things happens:

  1. The process continues as expected (the return address was not corrupted).
  2. The process crashes (the return address was altered to point at non-executable memory).
  3. The process continues, but not as expected (the return address was altered and control flow changed).

How languages achieve memory safety

We often think of programming languages on a spectrum. On one end, languages like C/C++ are efficient, but require manual memory management; on the other, interpreted languages use automatic memory management (like reference counting or garbage collection [GC]), but pay the price in performance. Even languages with highly optimized garbage collectors can’t match the performance of non-GC’d languages.

Manually

Some languages (like C) require programmers to manually manage memory by specifying when to allocate resources, how much to allocate, and when to free the resources. This gives the programmer very fine-grained control over how their implementation uses resources, enabling fast and efficient code. However, this approach is prone to mistakes, particularly in complex codebases.

Mistakes that are easy to make include:

  • forgetting that resources have been freed and trying to use them
  • not allocating enough space to store data
  • reading past the boundary of a buffer

Shake hands with danger!
A safety video candidate for manual memory management

Smart pointers

A smart pointer is a pointer with additional information to help prevent memory mismanagement. These can be used for automated memory management and bounds checking. Unlike raw pointers, a smart pointer is able to self-destruct, instead of waiting for the programmer to manually destroy it.

There’s no single smart pointer type—a smart pointer is any type that wraps a raw pointer in some practical abstraction. Some smart pointers use reference counting to count how many variables are using the data owned by a variable, while others implement a scoping policy to constrain a pointer lifetime to a particular scope.

In reference counting, the object’s resources are reclaimed when the last reference to the object is destroyed. Basic reference counting implementations can suffer from performance and space overhead, and can be difficult to use in multi-threaded environments. Situations where objects refer to each other (cyclical references) can prohibit either object’s reference count from ever reaching zero, which requires more sophisticated methods.
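
A minimal sketch of reference counting using Rust's single-threaded Rc type (the values are arbitrary):

use std::rc::Rc;

fn main() {
    let data = Rc::new(String::from("shared"));
    println!("owners: {}", Rc::strong_count(&data)); // 1

    {
        let another_owner = Rc::clone(&data); // bumps the count; no deep copy
        println!("owners: {}", Rc::strong_count(&another_owner)); // 2
    } // `another_owner` goes out of scope here; the count drops back to 1

    // When the last Rc is dropped, the String's resources are reclaimed.
    println!("owners: {}", Rc::strong_count(&data)); // 1
}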

Garbage Collection

Some languages (like Java, Go, Python) are garbage collected. A part of the runtime environment, named the garbage collector (GC), traces variables to determine what resources are reachable in a graph that represents references between objects. Once an object is no longer reachable, its resources are not needed and the GC reclaims the underlying memory to reuse in the future. All allocations and deallocations occur without explicit programmer instruction.

While a GC ensures that memory is always used validly, it doesn’t reclaim memory in the most efficient way. The last time an object is used could occur much earlier than when it is freed by the GC. Garbage collection has a performance overhead that can be prohibitive for performance critical applications; it requires up to 5x as much memory to avoid a runtime performance penalty.

Ownership

To achieve both performance and memory safety, Rust uses a concept called ownership. More formally, the ownership model is an example of an affine type system. All Rust code follows certain ownership rules that allow the compiler to manage memory without incurring runtime costs:

  1. Each value has a variable, called the owner.
  2. There can only be one owner at a time.
  3. When the owner goes out of scope, the value will be dropped.

Values can be moved or borrowed between variables. These rules are enforced by a part of the compiler called the borrow checker.

When a variable goes out of scope, Rust frees that memory. In the following example, when s1 and s2 go out of scope, they would both try to free the same memory, resulting in a double free error. To prevent this, when a value is moved out of a variable, the previous owner becomes invalid. If the programmer then attempts to use the invalid variable, the compiler will reject the code. This can be avoided by creating a deep copy of the data or by using references.

Example 1: Moving ownership

let s1 = String::from("hello");
let s2 = s1;

// won't compile because s1 is now invalid
println!("{}, world!", s1);

Another set of rules verified by the borrow checker pertains to variable lifetimes. Rust prohibits the use of uninitialized variables and dangling pointers, which can cause a program to reference unintended data. If the code in the example below compiled, r would reference memory that is deallocated when x goes out of scope—a dangling pointer. The compiler tracks scopes to ensure that all borrows are valid, occasionally requiring the programmer to explicitly annotate variable lifetimes.

Example 2: A dangling pointer

let r;
{
  let x = 5;
  r = &x;
}
println!("r: {}", r);

The ownership model provides a strong foundation for ensuring that memory is accessed appropriately, preventing undefined behavior.

Memory Vulnerabilities

The main consequences of memory vulnerabilities include:

  1. Crash: accessing invalid memory can make applications terminate unexpectedly
  2. Information leakage: inadvertently exposing non-public data, including sensitive information like passwords
  3. Arbitrary code execution (ACE): allows an attacker to execute arbitrary commands on a target machine; when this is possible over a network, we call it a remote code execution (RCE)

Another type of problem that can appear is memory leakage, which occurs when memory is allocated, but not released after the program is finished using it. It’s possible to use up all available memory this way. Without any remaining memory, legitimate resource requests will be blocked, causing a denial of service. This is a memory-related problem, but one that can’t be addressed by programming languages.

The best case scenario with most memory errors is that an application will crash harmlessly—this isn’t a good best case. However, the worst case scenario is that an attacker can gain control of the program through the vulnerability (which could lead to further attacks).

Misusing Free (use-after-free, double free)

This subclass of vulnerabilities occurs when some resource has been freed, but its memory position is still referenced. It’s a powerful exploitation method that can lead to out of bounds access, information leakage, code execution and more.

Garbage-collected and reference-counted languages prevent the use of invalid pointers by only destroying unreachable objects (which can have a performance penalty), while manually managed languages are particularly susceptible to invalid pointer use (particularly in complex codebases). Rust’s borrow checker doesn’t allow object destruction as long as references to the object exist, which means bugs like these are prevented at compile time.

Uninitialized variables

If a variable is used prior to initialization, the data it contains could be anything—including random garbage or previously discarded data, resulting in information leakage (these are sometimes called wild pointers). Often, memory managed languages use a default initialization routine that is run after allocation to prevent these problems.

Like C, most variables in Rust are uninitialized until assignment—unlike C, you can’t read them prior to initialization. The following code will fail to compile:

Example 3: Using an uninitialized variable

fn main() {
    let x: i32;
    println!("{}", x);
}

Null pointers

When an application dereferences a pointer that turns out to be null, usually this means that it simply accesses garbage that will cause a crash. In some cases, these vulnerabilities can even lead to arbitrary code execution. Rust has two types of pointers, references and raw pointers. References are safe to access, while raw pointers could be problematic.

Rust prevents null pointer dereferencing two ways:

  1. Avoiding nullable pointers
  2. Avoiding raw pointer dereferencing

Rust avoids nullable pointers by replacing them with a special Option type. In order to manipulate the possibly-null value inside of an Option, the language requires the programmer to explicitly handle the null case or the program will not compile.
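
A short sketch of what that looks like in practice (the lookup function is hypothetical):

fn find_user(id: u32) -> Option<String> {
    // Hypothetical lookup; returns None when there is no such user.
    if id == 42 { Some(String::from("alice")) } else { None }
}

fn main() {
    // The value inside the Option can't be used without acknowledging
    // the None case, so there is no accidental null dereference.
    match find_user(7) {
        Some(name) => println!("found {}", name),
        None => println!("no such user"),
    }
}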

When we can’t avoid nullable pointers (for example, when interacting with non-Rust code), what can we do? Try to isolate the damage. Any dereferencing of raw pointers must occur inside an unsafe block. This keyword relaxes Rust’s guarantees to allow some operations that could cause undefined behavior (like dereferencing a raw pointer).
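
A minimal sketch of that isolation; creating a raw pointer is safe, but dereferencing it only compiles inside an unsafe block:

fn main() {
    let value = 10;
    let raw: *const i32 = &value; // safe to create

    // The dereference must be explicitly marked, which confines the code
    // that needs careful auditing to small, searchable blocks.
    let read_back = unsafe { *raw };
    println!("{}", read_back);
}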

Everything the borrow checker touches...what about that shadowy place? That's an unsafe block. You must never go there Simba.

Buffer overflow

While the other vulnerabilities discussed here are prevented by methods that restrict access to undefined memory, a buffer overflow is different: it inappropriately accesses memory that is legally allocated, just not the memory the programmer intended. Like a use-after-free bug, an out-of-bounds access can also reach memory that still holds sensitive data from a previous use, data that is supposed to no longer exist.

A buffer overflow simply means an out-of-bounds access. Due to how buffers are stored in memory, they often lead to information leakage, which could include sensitive data such as passwords. More severe instances can allow ACE/RCE vulnerabilities by overwriting the instruction pointer.

Example 4: Buffer overflow (C code)

#include <stdio.h>

int main() {
  int buf[] = {0, 1, 2, 3, 4};

  // read out of bounds
  printf("Out of bounds: %d\n", buf[10]);

  // write out of bounds
  buf[10] = 10;
  printf("Out of bounds: %d\n", buf[10]);

  return 0;
}

The simplest defense against a buffer overflow is to always require a bounds check when accessing elements, but this adds a runtime performance penalty.

How does Rust handle this? The built-in buffer types in Rust’s standard library require a bounds check for any random access, but also provide iterator APIs that can reduce the impact of these bounds checks over multiple sequential accesses. These choices ensure that out-of-bounds reads and writes are impossible for these types. Rust promotes patterns that lead to bounds checks only occurring in those places where a programmer would almost certainly have to manually place them in C/C++.
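
A short sketch of both access styles on a standard Vec:

fn main() {
    let buf = vec![0, 1, 2, 3, 4];

    // Random access is bounds-checked: buf[10] would panic in a controlled
    // way rather than read adjacent memory like the C example above.
    println!("in bounds: {}", buf[4]);

    // Iterators walk the buffer without a per-element index check, since
    // the iterator itself can never step outside the allocation.
    let sum: i32 = buf.iter().sum();
    println!("sum: {}", sum);
}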

Memory safety is only half the battle

Memory safety violations open programs to security vulnerabilities like unintentional data leakage and remote code execution. There are various ways to ensure memory safety, including smart pointers and garbage collection. You can even formally prove memory safety. While some languages have accepted slower performance as a tradeoff for memory safety, Rust’s ownership system achieves memory safety while minimizing the performance costs.

Unfortunately, memory errors are only part of the story when we talk about writing secure code. The next post in this series will discuss concurrency attacks and thread safety.

Exploiting Memory: In-depth resources

Heap memory and exploitation
Smashing the stack for fun and profit
Analogies of Information Security
Intro to use after free vulnerabilities

The post Fearless Security: Memory Safety appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/feed/ 6 33087
Private by Design: How we built Firefox Sync https://hacks.mozilla.org/2018/11/firefox-sync-privacy/ https://hacks.mozilla.org/2018/11/firefox-sync-privacy/#comments Tue, 13 Nov 2018 15:09:17 +0000 https://hacks.mozilla.org/?p=32922 Firefox Sync lets you share your bookmarks, browsing history, passwords and other browser data between different devices, and send tabs from one device to another. We think it’s important to highlight the privacy aspects of Sync, which protects all your synced data by default so Mozilla can’t read it, ever. In this post, we take a closer look at some of the technical design choices we made in order to put user privacy first.

The post Private by Design: How we built Firefox Sync appeared first on Mozilla Hacks - the Web developer blog.

]]>
What is Firefox Sync and why would you use it

That shopping rabbit hole you started on your laptop this morning? Pick up where you left off on your phone tonight. That dinner recipe you discovered at lunchtime? Open it on your kitchen tablet, instantly. Connect your personal devices, securely. – Firefox Sync

Firefox Sync lets you share your bookmarks, browsing history, passwords and other browser data between different devices, and send tabs from one device to another. It’s a feature that millions of our users take advantage of to streamline their lives and how they interact with the web.

But on an Internet where sharing your data with a provider is the norm, we think it’s important to highlight the privacy aspects of Firefox Sync.

Firefox Sync by default protects all your synced data so Mozilla can’t read it. We built Sync this way because we put user privacy first. In this post, we take a closer look at some of the technical design choices we made and why.

When building a browser and implementing a sync service, we think it’s important to look at what one might call ‘Total Cost of Ownership’.  Not just what users get from a feature, but what they give up in exchange for ease of use.

We believe that by making the right choices to protect your privacy, we’ve also lowered the barrier to trying out Sync. When you sign up and choose a strong passphrase, your data is protected from both attackers and from Mozilla, so you can try out Sync without worry. Give it a shot, it’s right up there in the menu bar!

Sign in to Sync Button in the Firefox Menu

Why Firefox Sync is safe

Encryption allows one to protect data so that it is entirely unreadable without the key used to encrypt it. The math behind encryption is strong, has been tested for decades, and every government in the world uses it to protect its most valuable secrets.

The hard part of encryption is that key. What key do you encrypt with, where does it come from, where is it stored, and how does it move between places? Lots of cloud providers claim they encrypt your data, and they do. But they also have the key! While the encryption is not meaningless, it is a small measure, and does not protect the data against the most concerning threats.

The encryption key is the essential element. The service provider must never receive it – even temporarily – and must never know it. When you sign into your Firefox Account, you enter a username and passphrase, which are sent to the server. How is it that we can claim to never know your encryption key if that’s all you ever provide us?  The difference is in how we handle your passphrase.

A typical login flow for an internet service is to send your username and passphrase up to the server, which hashes the passphrase, compares it to a stored hash, and, if they match, sends you your data. (Hashing refers to converting a password into an unreadable string of characters that cannot be reversed.)

Typical Web Provider Login Flow

The crux of the difference in how we designed Firefox Accounts, and Firefox Sync (our underlying syncing service), is that you never send us your passphrase. We transform your passphrase on your computer into two different, unrelated values. With one value, you cannot derive the other0. We send an authentication token, derived from your passphrase, to the server as the password-equivalent. And the encryption key derived from your passphrase never leaves your computer.

Firefox Sync Login Flow

Interested in the technical details?  We use 1000 rounds of PBKDF2 to derive your passphrase into the authentication token1. On the server, we additionally hash this token with scrypt (parameters N=65536, r=8, p=1)2 to make sure our database of authentication tokens is even more difficult to crack.

We derive your passphrase into an encryption key using the same 1000 rounds of PBKDF2. It is domain-separated from your authentication token by using HKDF with separate info values. We use this key to unwrap an encryption key (which you generated during setup and which we never see unwrapped), and that encryption key is used to protect your data.  We use the key to encrypt your data using AES-256 in CBC mode, protected with an HMAC3.
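
To make the shape of this derivation concrete, here is a rough sketch in Rust. It is illustrative only, not Mozilla’s implementation: the salt, the info strings, and the community pbkdf2, hkdf, and sha2 crates are all assumptions, and the real parameters live in the protocol specification linked in the footnotes.

    use hkdf::Hkdf;
    use pbkdf2::pbkdf2_hmac;
    use sha2::Sha256;

    fn derive_keys(passphrase: &[u8], salt: &[u8]) -> ([u8; 32], [u8; 32]) {
        // Stretch the passphrase with 1000 rounds of PBKDF2, as above.
        let mut stretched = [0u8; 32];
        pbkdf2_hmac::<Sha256>(passphrase, salt, 1000, &mut stretched);

        // Domain-separate two outputs with HKDF using distinct info values
        // (placeholders here), so the server-bound token reveals nothing
        // about the encryption key.
        let hkdf = Hkdf::<Sha256>::new(None, &stretched);
        let mut auth_token = [0u8; 32];
        let mut encryption_key = [0u8; 32];
        hkdf.expand(b"example/authToken", &mut auth_token).unwrap();
        hkdf.expand(b"example/encryptionKey", &mut encryption_key).unwrap();

        // Only `auth_token` ever leaves the computer.
        (auth_token, encryption_key)
    }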

This cryptographic design is solid, but the constants need to be updated. One thousand rounds of PBKDF2 can be improved, and we intend to do so in the future (Bug 1320222). This token is only ever sent over an HTTPS connection (with preloaded HPKP pins) and is not stored, so when we initially developed this and needed to support low-power, low-resource devices, a trade-off was made. AES-CBC + HMAC is acceptable; it would be nice to upgrade this to an authenticated mode sometime in the future.

Other approaches

This isn’t the only approach to building a browser sync feature. There are at least three other options:

Option 1: Share your data with the browser maker

In this approach, the browser maker is able to read your data, and use it to provide services to you. For example,  when you sync your browser history in Chrome it will automatically go into your Web & App Activity unless you’ve changed the default settings. As Google Chrome Help explains, “Your activity may be used to personalize your experience on other Google products, like Search or ads. For example, you may see a news story recommended in your feed based on your Chrome history.”4

Option 2: Use a separate password for sign-in and encryption

We developed Firefox Sync to be as easy to use as possible, so we designed it from the ground up to derive an authentication token and an encryption key – and we never see the passphrase or the encryption key. One cannot safely derive an encryption key from a passphrase if the passphrase is sent to the server.

One could, however, add a second passphrase that is never sent to the server, and encrypt the data using that. Chrome provides this as a non-default option5. You can sign in to sync with your Google Account credentials; but you choose a separate passphrase to encrypt your data. It’s imperative you choose a separate passphrase though.

All-in-all, we don’t care for the design that requires a second passphrase. This approach is confusing to users. It’s very easy to choose the same (or a similar) passphrase and negate the security of the design. It’s hard to determine which is more confusing: requiring a second passphrase or making it optional! Making it optional means it will be used very rarely. We don’t believe users should have to opt in to privacy.

Option 3: Manual key synchronization

The key (pun intended) to auditing a cryptographic design is to ask about the key: “Where does it come from? Where does it go?” With the Firefox Sync design, you enter a passphrase of your choosing and it is used to derive an encryption key that never leaves your computer.

Another option for Sync is to remove user choice, and provide a passphrase for you (that never leaves your computer). This passphrase would be secure and unguessable – which is an advantage, but it would be near-impossible to remember – which is a disadvantage.

When you want to add a new device to sync to, you’d need your existing device nearby in order to manually read and type the passphrase into the new device. (You could also scan a QR code if your new device has a camera).

Other Browsers

Overall, Sync works the way it does because we feel it’s the best design choice. Options 1 and 2 don’t provide thorough user privacy protections by default. Option 3 results in lower user adoption and thus reduces the number of people we can help (more on this below).

As noted above, Chrome implements Option 1 by default, which means unless you change the settings before you enable sync, Google will see all of your browsing history and other data, and use it to market services to you. Chrome also implements Option 2 as an opt-in feature.

Opera and Vivaldi follow Chrome’s lead, implementing Option 1 by default and Option 2 as an opt-in feature. Update: Vivaldi actually prompts you for a separate password by default (Option 2), and allows you to opt-out and use your login password (Option 1).

Brave, also a privacy-focused browser, has implemented Option 3. And, in fact, Firefox also implemented a form of Option 3 in its original Sync Protocol, but we changed our design in April 2014 (Firefox 29) in response to user feedback6. For example, our original design (and Brave’s current design) makes it much harder to regain access to your data if you lose your device or it gets stolen. Passwords or passphrases make that experience substantially easier for the average user, and significantly increased Sync adoption by users.

Brave’s sync protocol has some interesting wrinkles7. One distinct minus is that you can’t change your passphrase, if it were to be stolen by malware. Another interesting wrinkle is that Brave does not keep track of how many or what types of devices you have. This is a nuanced security trade-off: having less information about the user is always desirable… The downside is that Brave can’t allow you to detect when a new device begins receiving your sync data or allow you to deauthorize it. We respect Brave’s decision. In Firefox, however, we have chosen to provide this additional security feature for users (at the cost of knowing more about their devices).

Conclusion

We designed Firefox Sync to protect your data – by default – so Mozilla can’t read it. We built it this way – despite trade-offs that make development and offering features more difficult – because we put user privacy first. At Mozilla, this priority is a core part of our mission to “ensure the Internet is a global public resource… where individuals can shape their own experience and are empowered, safe and independent.”


0 It is possible to use one to guess the other, but only if you choose a weak password.

1 You can find more details in the full protocol specification or by following the code starting at this point. There are a few details we have omitted to simplify this blog post, including the difference between kA and kB keys, and application-specific subkeys.

2 Server hashing code is located here.

3 The encryption code can be seen here.

4 https://support.google.com/chrome/answer/165139 Section “Use your Chrome history to personalize Google”

5 Chrome 71 says “For added security, Google Chrome will encrypt your data” and describes these two options as “Encrypt synced passwords with your Google username and password” and “Encrypt synced data with your own sync passphrase”.  Despite this wording, only the sync passphrase option protects your data from Google.

6 One of the original engineers of Sync has written two blog posts about the transition to the new sync protocol, and why we did it. If you’re interested in the usability aspects of cryptography, we highly recommend you read them to see what we learned.

7 You can read more about Brave sync on Brave’s Design page.

The post Private by Design: How we built Firefox Sync appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2018/11/firefox-sync-privacy/feed/ 36 32922
A cartoon intro to DNS over HTTPS https://hacks.mozilla.org/2018/05/a-cartoon-intro-to-dns-over-https/ https://hacks.mozilla.org/2018/05/a-cartoon-intro-to-dns-over-https/#comments Thu, 31 May 2018 14:04:51 +0000 https://hacks.mozilla.org/?p=32285 At Mozilla, we closely track threats to users' privacy and security. This is why we've added tracking protection to Firefox and created the Facebook container extension. In today's cartoon intro, Lin Clark describes two new initiatives we're championing to close data leaks that have been part of the domain name system since it was created 35 years ago: DNS over HTTPS, a new IETF standard, and Trusted Recursive Resolver, a new secure way to resolve DNS that we’ve partnered with Cloudflare to provide.

The post A cartoon intro to DNS over HTTPS appeared first on Mozilla Hacks - the Web developer blog.

]]>
Threats to users’ privacy and security are growing. At Mozilla, we closely track these threats. We believe we have a duty to do everything we can to protect Firefox users and their data.

We’re taking on the companies and organizations that want to secretly collect and sell user data. This is why we added tracking protection and created the Facebook container extension. And you’ll be seeing us do more things to protect our users over the coming months.

Icons for security projects that we’ve introduced

Two more protections we’re adding to that list are:

  • DNS over HTTPS, a new IETF standards effort that we’ve championed
  • Trusted Recursive Resolver, a new secure way to resolve DNS that we’ve partnered with Cloudflare to provide

With these two initiatives, we’re closing data leaks that have been part of the domain name system since it was created 35 years ago. And we’d like your help in testing them. So let’s look at how DNS over HTTPS and Trusted Recursive Resolver protect our users.

But first, let’s look at how web pages move around the Internet.

If you already know how DNS and HTTPS work, you can skip to how DNS over HTTPS helps.

A brief HTTP crash course

When people explain how a browser downloads a web page, they usually explain it this way:

  1. Your browser makes a GET request to a server.
  2. The server sends a response, which is a file containing HTML.

browser GET request + response

This system is called HTTP.

But this diagram is a little oversimplified. Your browser doesn’t talk directly to the server. That’s because they probably aren’t close to each other.

Instead, the server could be thousands of miles away. And there’s likely no direct link between your computer and the server.

image of client and server on opposite ends of the network

So this request needs to get from the browser to that server, and it will go through multiple hands before it gets there. And the same is true for the response coming back from the server.

I think of this like kids passing notes to each other in class. On the outside, the note will say who it’s supposed to go to. The kid who wrote the note will pass it to their neighbor. Then that next kid passes it to one of their neighbors — probably not the eventual recipient, but someone who’s in that direction.

kids passing notes

The problem with this is that anyone along the path can open up the note and read it. And there’s no way to know in advance which path the note is going to take, so there’s no telling what kind of people will have access to it.

It could end up in the hands of people who do harmful things…

Like sharing the contents of the note with everyone.

kid saying “Ooo, hey everybody… Danny loves Sandy!”

Or changing the response.

kid saying “Do you like me? Y/N… Heh, I’m going to prank him and put no here”

To fix these issues, a new, secure version of HTTP was created. This is called HTTPS. With HTTPS, it’s kind of like each message has a lock on it.

open envelope next to locked envelope

Both the browser and the server know the combination to that lock, but no one in between does.

With this, even if the messages go through multiple routers in between, only you and the web site will actually be able to read the contents.

This solves a lot of the security issues. But there are still some messages going between your browser and the server that aren’t encrypted. This means people along the way can still pry into what you’re doing.

One place where data is still exposed is in setting up the connection to the server. When you send your initial message to the server, you send the server name as well (in a field called “Server Name Indication”). This lets server operators run multiple sites on the same machine while still knowing who you are trying to talk to. This initial request is part of setting up encryption, but the initial request itself isn’t encrypted.

The other place where data is exposed is in DNS. But what is DNS?

DNS: the Domain Name System

In the passing notes metaphor above, I said that the name of the recipient had to be on the outside of the note. This is true for HTTP requests too… they need to say who they are going to.

But you can’t use a name for them. None of the routers would know who you were talking about. Instead, you have to use an IP address. That’s how the routers in between know which server you want to send your request to.

network with IP addresses

This causes a problem. You don’t want users to have to remember your site’s IP address. Instead, you want to be able to give your site a catchy name… something that users can remember.

This is why we have the domain name system (DNS). Your browser uses DNS to convert the site name to an IP address. This process — converting the domain name to an IP address — is called domain name resolution.

domain and address equivalence

How does the browser know how to do this?

One option would be to have a big list, like a phone book in the browser. But as new web sites came online, or as sites moved to new servers, it would be hard to keep that list up-to-date.

So instead of having one list which keeps track of all of the domain names, there are lots of smaller lists that are linked to each other. This allows them to be managed independently.

one list, vs lots of smaller lists

In order to get the IP address that corresponds to a domain name, you have to find the list that contains that domain name. Doing this is kind of like a treasure hunt.

What would this treasure hunt look like for a site like the English version of Wikipedia, en.wikipedia.org?

We can split this domain into parts.

domain split into top level, second level, and subdomain.

With these parts, we can hunt for the list that contains the IP address for the site. We need some help in our quest, though. The tool that will go on this hunt for us and find the IP address is called a resolver.

First, the resolver talks to a server called the Root DNS. It knows of a few different Root DNS servers, so it sends the request to one of them. The resolver asks the Root DNS where it can find more info about addresses in the .org top-level domain.

The Root DNS will give the resolver an address for a server that knows about .org addresses.

resolver talking to Root DNS

This next server is called a top-level domain (TLD) name server. The TLD server knows about all of the second-level domains that end with .org.

It doesn’t know anything about the subdomains under wikipedia.org, though, so it doesn’t know the IP address for en.wikipedia.org.

The TLD name server will tell the resolver to ask Wikipedia’s name server.

resolver talking to TLD DNS

The resolver is almost done now. Wikipedia’s name server is what’s called the authoritative server. It knows about all of the domains under wikipedia.org. So this server knows about en.wikipedia.org, and other subdomains like the German version, de.wikipedia.org. The authoritative server tells the resolver which IP address has the HTML files for the site.

resolver talking to authoritative DNS

The resolver will return the IP address for en.wikipedia.org to the operating system.

This process is called recursive resolution, because you have to go back and forth asking different servers what’s basically the same question.

I said we need a resolver to help us in our quest. But how does the browser find this resolver? In general, it asks the computer’s operating system to set it up with a resolver that can help.

browser asking OS for resolver

How does the operating system know which resolver to use? There are two possible ways.

You can configure your computer to use a resolver you trust. But very few people do this.

Instead, most people just use the default. And by default, the OS will just use whatever resolver the network told it to. When the computer connects to the network and gets its IP address, the network recommends a resolver to use.

operating system getting a recommendation from the network

This means that the resolver that you’re using can change multiple times per day. If you head to the coffee shop for an afternoon work session, you’re probably using a different resolver than you were in the morning. And this is true even if you have configured your own resolver, because there’s no security in the DNS protocol.

How can DNS be exploited?

So how can this system make users vulnerable?

Usually a resolver will tell each DNS server what domain you are looking for. This request sometimes includes your full IP address. Or if not your full IP address, increasingly often the request includes most of your IP address, which can easily be combined with other information to figure out your identity.

[Cartoon: a DNS request]

This means that every server that you ask to help with domain name resolution sees what site you’re looking for. But more than that, it also means that anyone on the path to those servers sees your requests, too.

There are a few ways that this system puts users’ data at risk. The two major risks are tracking and spoofing.

Tracking

Like I said above, it’s easy to take the full or partial IP address info and figure out who’s asking for that web site. This means that the DNS server and anyone along the path to that DNS server — called on-path routers — can create a profile of you. They can create a record of all of the web sites that they’ve seen you look up.

And that data is valuable. Many people and companies will pay lots of money to see what you are browsing for.

[Cartoon: a router offering to sell data]

Even if you didn’t have to worry about the possibly nefarious DNS servers or on-path routers, you still risk having your data harvested and sold. That’s because the resolver itself — the one that the network gives to you — could be untrustworthy.

Even if you trust your network’s recommended resolver, you’re probably only using that resolver when you’re at home. Like I mentioned before, whenever you go to a coffee shop or hotel or use any other network, you’re probably using a different resolver. And who knows what its data collection policies are?

Beyond having your data collected and then sold without your knowledge or consent, there are even more dangerous ways the system can be exploited.

Spoofing

With spoofing, someone on the path between the DNS server and you changes the response. Instead of telling you the real IP address, a spoofer will give you the wrong IP address for a site. This way, they can block you from visiting the real site or send you to a scam one.

[Cartoon: a spoofer sending a user to the wrong site]

Again, this is a case where the resolver itself might act nefariously.

For example, let’s say you’re shopping for something at Megastore. You want to do a price check to see if you can get it cheaper at a competing online store, big-box.com.

But if you’re on Megastore WiFi, you’re probably using their resolver. That resolver could hijack the request to big-box.com and lie to you, saying that the site is unavailable.

How can we fix this with Trusted Recursive Resolver (TRR) and DNS over HTTPS (DoH)?

At Mozilla, we feel strongly that we have a responsibility to protect our users and their data. We’ve been working on fixing these vulnerabilities.

We are introducing two new features to fix this — Trusted Recursive Resolver (TRR) and DNS over HTTPS (DoH). Because really, there are three threats here:

  1. You could end up using an untrustworthy resolver that tracks your requests, or tampers with responses from DNS servers.
  2. On-path routers can track or tamper in the same way.
  3. DNS servers can track your DNS requests.

[Cartoon: the three threats (resolvers, on-path routers, and DNS servers)]

So how do we fix these?

  1. Avoid untrustworthy resolvers by using Trusted Recursive Resolver.
  2. Protect against on-path eavesdropping and tampering using DNS over HTTPS.
  3. Transmit as little data as possible to protect users from deanonymization.

Avoid untrustworthy resolvers by using Trusted Recursive Resolver

Networks can get away with providing untrustworthy resolvers that steal your data or spoof DNS because very few users know the risks or how to protect themselves.

Even for users who do know the risks, it’s hard for an individual user to negotiate with their ISP or other entity to ensure that their DNS data is handled responsibly.

However, we’ve spent time studying these risks… and we have negotiating power. We worked hard to find a company to work with us to protect users’ DNS data. And we found one: Cloudflare.

Cloudflare is providing a recursive resolution service with a pro-user privacy policy. They have committed to throwing away all personally identifiable data after 24 hours, and to never pass that data along to third parties. And there will be regular audits to ensure that data is being cleared as expected.

With this, we have a resolver that we can trust to protect users’ privacy. This means Firefox can ignore the resolver that the network provides and just go straight to Cloudflare. With this trusted resolver in place, we don’t have to worry about rogue resolvers selling our users’ data or tricking our users with spoofed DNS.

Why are we picking one resolver? Cloudflare is as excited as we are about building a privacy-first DNS service. They worked with us to build a DoH resolution service that would serve our users well in a transparent way. They’ve been very open to adding user protections to the service, so we’re happy to be able to collaborate with them.

But this doesn’t mean you have to use Cloudflare. Users can configure Firefox to use whichever DoH-supporting recursive resolver they want. As more offerings crop up, we plan to make it easy to discover and switch to them.

Protect against on-path eavesdropping and tampering using DNS over HTTPS

The resolver isn’t the only threat, though. On-path routers can track and spoof DNS because they can see the contents of the DNS requests and responses. But the Internet already has technology for ensuring that on-path routers can’t eavesdrop like this. It’s the encryption that I talked about before.

By using HTTPS to exchange the DNS packets, we ensure that no one can spy on the DNS requests that our users are making.
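
To make that concrete, here is a minimal sketch of a DNS query carried over HTTPS, using Cloudflare's JSON flavor of the endpoint (the URL and response shape are specific to Cloudflare's service; the DoH protocol itself carries binary DNS messages inside the HTTPS request):

// One DNS question, wrapped in an ordinary encrypted HTTPS request.
fetch('https://cloudflare-dns.com/dns-query?name=en.wikipedia.org&type=A', {
  headers: { 'Accept': 'application/dns-json' }
})
  .then((response) => response.json())
  .then((result) => {
    // result.Answer is a list of answer records; each record's "data"
    // field holds an IP address. On-path routers see only encrypted
    // TLS traffic, not the question or the answer.
    console.log(result.Answer);
  });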

Transmit as little data as possible to protect users from deanonymization

In addition to providing a trusted resolver which communicates using the DoH protocol, Cloudflare is working with us to make this even more secure.

Normally, a resolver would send the whole domain name to each server—to the Root DNS, the TLD name server, the second-level name server, etc. But Cloudflare will be doing something different. It will only send the part that is relevant to the DNS server it’s talking to at the moment. This is called QNAME minimization.

[Cartoon: resolver asking each server only the relevant part of the question]
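
Roughly, the difference looks like this (a simplified sketch; real resolvers ask the intermediate servers for name server records):

Without QNAME minimization, the resolver asks every server the full question:

  to the Root DNS:         "where is en.wikipedia.org?"
  to the .org TLD server:  "where is en.wikipedia.org?"
  to Wikipedia's server:   "where is en.wikipedia.org?"

With QNAME minimization, each server sees only the part it needs:

  to the Root DNS:         "who handles .org?"
  to the .org TLD server:  "who handles wikipedia.org?"
  to Wikipedia's server:   "where is en.wikipedia.org?"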

The resolver will also often include the first 24 bits of your IP address in the request. This helps the DNS server know where you are and pick a CDN closer to you. But this information can be used by DNS servers to link different requests together.

Instead of doing this, Cloudflare will make the request from one of their own IP addresses near the user. This provides geolocation without tying it to a particular user. In addition to this, we’re looking into how we can enable even better, very fine-grained load balancing in a privacy-sensitive way.

Doing this — removing the irrelevant parts of the domain name and not including your IP address — means that DNS servers have much less data that they can collect about you.

[Cartoon: a DNS request with the client subnet and the first parts of the domain crossed out]

What isn’t fixed by TRR with DoH?

With these fixes, we’ve reduced the number of people who can see what sites you’re visiting. But this doesn’t eliminate data leaks entirely.

After you do the DNS lookup to find the IP address, you still need to connect to the web server at that address. To do this, you send an initial request. This request includes a server name indication, which says which site on the server you want to connect to. And this request is unencrypted.

That means that your ISP can still figure out which sites you’re visiting, because it’s right there in the server name indication. Plus, the routers that pass that initial request from your browser to the web server can see that info too.

However, once you’ve made that connection to the web server, then everything is encrypted. And the neat thing is that this encrypted connection can be used for any site that is hosted on that server, not just the one that you initially asked for.

This is sometimes called HTTP/2 connection coalescing, or simply connection reuse. When you open a connection to a server that supports it, that server will tell you what other sites it hosts. Then you can visit those other sites using that existing encrypted connection.

Why does this help? You don’t need to start up a new connection to visit these other sites. This means you don’t need to send that unencrypted initial request with its server name indication saying which site you’re visiting. Which means you can visit any of the other sites on the same server without revealing what sites you’re looking at to your ISP and on-path routers.

With the rise of CDNs, more and more independent sites are being served by a single server. And since you can have multiple coalesced connections open, you can be connected to multiple shared servers or CDNs at once, visiting all of the sites across the different servers without leaking data. This means this will be more and more effective as a privacy shield.

What is the status?

You can enable DNS over HTTPS in Firefox today, and we encourage you to.
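
If you want to flip the switch yourself, the settings live in about:config. This is a sketch of the relevant preferences as they exist at the time of writing (names and defaults may change as the feature evolves):

  network.trr.mode = 2    (try DoH first, falling back to regular DNS; 3 means DoH only)
  network.trr.uri  = https://mozilla.cloudflare-dns.com/dns-query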

We'd like to turn this on as the default for all of our users. We believe that every one of our users deserves this privacy and security, whether they understand DNS leaks or not.

But it’s a big change and we need to test it out first. That’s why we’re conducting a study. We’re asking half of our Firefox Nightly users to help us collect data on performance.

We’ll use the default resolver, as we do now, but we’ll also send the request to Cloudflare’s DoH resolver. Then we’ll compare the two to make sure that everything is working as we expect.

For participants in the study, the Cloudflare DNS response won’t be used yet. We’re simply checking that everything works, and then throwing away the Cloudflare response.

[Diagram: a person timing both lookups and then throwing away the Cloudflare response]

We are thankful to have the support of our Nightly users — the people who help us test Firefox every day — and we hope that you will help us test this, too.

The post A cartoon intro to DNS over HTTPS appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2018/05/a-cartoon-intro-to-dns-over-https/feed/ 62 32285
Firefox 60 – Modules and More https://hacks.mozilla.org/2018/05/firefox-60-modules-and-more/ https://hacks.mozilla.org/2018/05/firefox-60-modules-and-more/#comments Wed, 09 May 2018 15:04:45 +0000 https://hacks.mozilla.org/?p=32252 Firefox 60 continues the evolution of Quantum. The parallel processing of Quantum CSS comes to Firefox for Android, while WebRender work is ongoing. Potch reports on two security upgrades - support for the Web Authentication API and for the Same-Site attribute for cookies - as well the arrival of ES modules. Firefox Quantum for Enterprise, our Extended Support Release, is now available for large installations. Read all about it!

The post Firefox 60 – Modules and More appeared first on Mozilla Hacks - the Web developer blog.

]]>
Firefox 60 is here, and the Quantum lineage continues apace. The parallel processing prowess of Quantum CSS is now available on Firefox for Android, and work continues on WebRender, which modernizes the whole idea of what it means to draw a web page. But we’re not just spreading the love on internals. Firefox 60 boasts a number of web platform and developer-facing improvements as well. Here are a few highlights:

ES Modules are Here!

[Image: a Code Cartoon of a module tree]

Modular code isn't just a good idea, it's a great idea! Being able to separate functional units of software allows for cleaner re-use of individual modules and easier inclusion of third-party code. Many languages have support for modules, and if you're familiar with Node.js, they've been available there in some form via the CommonJS require API, but a standardized syntax was created as part of ES2015 (ES6).

Although the syntax for ES modules was standardized, it was left as an exercise for browsers to understand and retrieve the modules. This took a bit of extra time, but now that the browser loading behavior is standardized, support has started rolling out, and this release brings that support to SpiderMonkey, Firefox's JavaScript engine. You can check out the docs on MDN, and of course don't miss Lin Clark's breakdown of ES modules either!
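
For instance, a minimal module pair might look like this (hypothetical file names; the entry point is loaded in HTML with <script type="module" src="main.js"></script>):

// greeting.js: a module exporting one function
export function greet(name) {
  return `Hello, ${name}!`;
}

// main.js: pulls it in with the standardized import syntax
import { greet } from './greeting.js';
console.log(greet('Firefox'));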

Keep Your Cookies to Yourself

Firefox 60 supports the Same-Site attribute when setting cookies. When set, the browser will not send cookies along with a cross-origin request to the issuing server, e.g. during a fetch or when loading an image. This helps mitigate common silent forms of Cross-Site Request Forgery (CSRF). There is a "lax" mode that does the above, as well as a strict mode that, in addition to the lax behavior, will also not send cookies with an inbound navigation. This helps prevent a malicious site from deep-linking to a page where unintentional behavior could occur when cookies are included.
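
Setting the attribute is a one-line change in the response headers. For example (hypothetical cookie names and values):

Set-Cookie: sessionid=38afes7a8; SameSite=Lax; Secure; HttpOnly
Set-Cookie: csrftoken=a9x31bc24; SameSite=Strict; Secure; HttpOnly

The first cookie still accompanies top-level navigations to the issuing site, but not cross-origin subresource requests; the second is withheld from inbound cross-site navigations as well.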

Read more on the Mozilla Security Blog.

Web Authentication API

It’s been known for a while now that in many contexts, a well-known username (like an email address) and a user-generated password are not sufficiently secure for authentication. This has led to the rise of Multi-Factor Authentication, usually 2-factor authentication, in which in addition to a password, users must also provide information from an additional source. Many sites will send an SMS message with a code to a mobile device, and some also accept tokens generated by a code-generator app or purpose-built hardware “key”. This whole exchange has required the user to copy numbers from a screen into a text field, or at minimum the hardware key has had to simulate key presses.

The Web Authentication API (WebAuthn for short) seeks to eliminate the clunkier aspects of this process by letting a multi-factor authentication device or app communicate directly with a requesting site. The particulars of making this work securely are a bit too complex to cover in this post, but you can learn more about WebAuthn on MDN or here on the Hacks Blog.
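
At the API level, registering a new credential boils down to a single call. Here is a heavily trimmed sketch (the relying-party and user values are hypothetical, and a real implementation must generate the challenge on the server):

navigator.credentials.create({
  publicKey: {
    // Must be random bytes generated server-side; this placeholder only
    // keeps the sketch self-contained.
    challenge: new Uint8Array(32),
    rp: { name: 'Example Site' },
    user: {
      id: new Uint8Array(16), // a server-assigned opaque user handle
      name: 'user@example.com',
      displayName: 'Example User'
    },
    // -7 is the COSE identifier for the ES256 signature algorithm.
    pubKeyCredParams: [{ type: 'public-key', alg: -7 }]
  }
}).then((credential) => {
  // The resulting credential is sent to the server, which verifies and
  // stores the public key for future authentications.
  console.log(credential.id);
});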

A Stroke of Style

The (as-yet non-standard) text-stroke property defines a solid, fixed-width stroke centered along the path of the characters of text. It allows for effects that aren't achievable with text-shadow alone. A wide stroke will occlude portions of the characters because, by default, the stroke is drawn on top of the font glyph. This can be a bit ugly. To fix this, browsers are borrowing the paint-order property from the SVG standard. When properly set, browsers will draw the stroke underneath the text glyphs. For example, a rule along these lines keeps a wide stroke from swallowing the letterforms (a sketch; the property currently ships in its prefixed -webkit-text-stroke form):
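
h1 {
  -webkit-text-stroke: 4px navy; /* a wide stroke that would otherwise cover the glyphs */
  paint-order: stroke;           /* paint the stroke first, so the fill sits on top */
}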

It's super nifty, but don't forget that it's not yet a standard, and you should always check that text is legible without stroke effects applied! You can read more on MDN and check out the compatibility matrix there.

ESR / Group Policy

Firefox 60 is the next version of Firefox to be designated an “Extended Support Release”, or ESR. ESR releases are intended for system administrators who deploy and maintain desktop environments in large organizations. They receive security and stability updates in sync with the latest Release versions of Firefox, and each ESR release’s support overlaps with the next one. This overlap period allows a large organization to certify and deploy new ESR versions before leaving the support window for the prior release.

Firefox 60 ships along with the first incarnation of a new Policy Engine that allows organizational administrators to configure Firefox for all their users en masse. On Windows, this is accomplished using Windows Group Policy, and via a configuration file on other platforms. It's not a feature that most Firefox users will ever need, but if your job is managing thousands of installations of Firefox, we hope you'll find this a welcome addition.
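
On the non-Windows platforms, that configuration file is a policies.json placed in a distribution folder inside the Firefox installation directory. Here is a minimal sketch (the two policy names are examples drawn from the engine's documented set):

{
  "policies": {
    "DisableTelemetry": true,
    "BlockAboutConfig": true
  }
}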

Would You Like to Know More?

As always, the full list of developer-facing changes is on MDN, and you can find the release notes here.

Keep on rocking the free web!

The post Firefox 60 – Modules and More appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2018/05/firefox-60-modules-and-more/feed/ 3 32252
Shipping a security update of Firefox in less than a day https://hacks.mozilla.org/2018/03/shipping-a-security-update-of-firefox-in-less-than-a-day/ https://hacks.mozilla.org/2018/03/shipping-a-security-update-of-firefox-in-less-than-a-day/#comments Thu, 22 Mar 2018 20:17:02 +0000 https://hacks.mozilla.org/?p=32046 One of Mozilla’s top priorities is to keep our users safe; this commitment is written into our mission. As soon as we discover a critical issue in Firefox, we plan a rapid mitigation. This post describes how we fixed a Pwn2Own exploit discovery and released new builds of the browser in less than 22 hours, through the collaborative and well-coordinated efforts of a global cross-functional team.

The post Shipping a security update of Firefox in less than a day appeared first on Mozilla Hacks - the Web developer blog.

]]>
One of Mozilla’s top priorities is to keep our users safe; this commitment is written into our mission. As soon as we discover a critical issue in Firefox, we plan a rapid mitigation. This post will describe how we fixed a Pwn2Own exploit discovery in less than 22 hours, through the collaborative and well-coordinated efforts of a global cross-functional team of release and QA engineers, security experts, and other stakeholders.

Pwn2Own is an annual computer hacking contest. The goal of this event is to find security vulnerabilities in major software such as browsers. Last week, this event took place in Vancouver. Without getting into technical details of the exploit here, this blog post will describe how Mozilla responded quickly to ship updated builds of Firefox once an exploit was found during Pwn2Own.

We will share some of the processes that enable us to update and release a new version of the Firefox browser to hundreds of millions of users on a regular recurring basis.

This browser is a huge piece of software: 18 million+ lines of code, 6 platforms (Windows 32 & 64bit, GNU/Linux 32 & 64bit, Mac OS X and Android), 90 languages, plus installers, updaters, etc. Releasing such a beast involves coordination among many people from several cross-functional teams spanning locations such as San Francisco, Philadelphia, Paris, Cluj in Romania, and Rangiora in New Zealand.

The timing of the Pwn2Own event is known weeks beforehand, and so Mozilla is prepared! The Firefox train release calendar takes into consideration the timing of Pwn2Own. We try not to ship a new version of Firefox to end users on the release channel on the same day as Pwn2Own.

A Firefox Chemspill

A chemspill is a “security-driven dot release of our product.”  It’s an internal name for the Mozilla machinery that produces updated builds of Firefox on all channels (Nightly, Beta, Release, ESR) in response to an event that negatively impacts browser stability or user security.

Our rapid response model is similar to the way emergency personnel organize and mobilize to deal with a chemical spill and its hazards. All key people stop working on their current tasks and focus only on the cleanup itself. Because our focus is our end users, we need to ensure that they are using the safest and fastest version of Firefox!

This year, we created a private Slack channel prior to Pwn2Own to coordinate all the activity related to the event. The initial Slack group consisted only of security experts, directors of engineering, senior engineers, release managers and release engineers – essential staff.

We prepared a release checklist in advance, with added items and a specific focus on the potential for a chemspill triggered by Pwn2Own. This document tracked the cross-functional tasks, their owners, status, and due dates, and helped coordinate the work. It also let stakeholders view and report chemspill status down to the minute.

[Screenshot: the release checklist]

One of the members of our security team was attending the Pwn2Own event. After it was announced that one of the participants, Richard Zhu, found the security issue in Firefox, this Mozilla representative received the exploit directly from Richard Zhu as part of the regular Pwn2Own disclosure process for affected vendors. The bug was added to our bug tracking system at 10:59AM PDT on March 15th with the necessary privacy settings. Soon after, the chemspill team reviewed the issue and made a decision to ship updated builds ASAP.

In parallel, there was a discussion happening on the private Slack channel. When we saw the tweet from cybersecurity reporter @howelloneill that made the news public, we knew it was time to identify the developer who’d be getting to work on fixing the bug…

And so, quickly, the developer got to work.

The fix: planning, risk analysis, go-live timelines

While engineers were investigating the exploit and coming up with a fix, the cross-functional coordination needed to ship updated builds had already begun. The chemspill team met within 2 hours of the event. We discussed the next steps in terms of fix readiness, test plans, go-to-build, QA sign-offs, and determined the sequence of steps along with rough timelines. We needed to ensure a smooth hand-off from folks in North America to folks in Europe (France, Romania, UK) and then back to California by morning.

From the moment we had information about the exploit, two discussions began in parallel: a technical discussion on the bug tracking system; and a release-oriented discussion, driven by the release and security managers, on the Slack channel.

12 minutes later, at 11:11AM, a relevant developer is contacted.

11:17AM: The bug is updated to confirm that our long-term support release (ESR) is also impacted by the issue.
12:32PM: Less than 3 hours after the disclosure, the developer provides a first patch addressing the issue.
2:21PM: An improved version of the fix is pushed.
3:23PM: The patch is pushed to the development branch. Then, in the next 70 minutes, we go through the process of landing the patch in the other release and pre-release repositories.

5:16PM: A little more than 6 hours after the publication of the exploit, the Beta and Release builds (desktop and Android) are in progress.

During the build phase

Let’s take a step back to describe the regular workflow that happens every time a new build of Firefox is released. Building the Firefox browser with our complete test suite for all platforms takes about 5 hours. While the builds are in progress, many teams are working in parallel.

Test plan

The QA team designs a test plan with the help of engineering. When fixing security issues, we always have two goals in mind:

  1. Verify that the fix addresses the security issue,
  2. Catch any other potential regressions due to the fix.

With these two goals, the QA team aims to cover a wide range of cases using different inputs.

For example, the following test case #3 was run against the various impacted versions and platforms:

Test Case 3 (ogg enabled false – Real .ogg File)

  • Select a channel
  • Navigate to about:config
  • Set pref “media.ogg.enabled” to false
  • Download an .ogg file
  • Drag the .ogg file into the Mozilla build
  • Observe an error message/prompt: “You have chosen to open [name of file].ogg”
  • Try and open the file with Firefox as the application
  • Observe that Firefox does not play the selected .ogg file (or any sound)
  • Repeat step 1 for all builds (ESR, RC, Beta/DevEdition, Fennec)

Exploit analysis

In parallel, our security experts jumped on the exploit to analyze it.

They look closely at several things:

  • How the exploit works technically
  • How we could have detected the issue ourselves
  • The in-progress efforts: How to mitigate this kind of attack
  • The stalled efforts: What we started but didn’t finish
  • The future efforts: Scoping the long-term work to eliminate or mitigate this category of attacks

Outreach

The vulnerability was found to be in a library that did not originate with the Mozilla project, and is used by other software. Because we didn’t want to 0-day the vulnerable software library and make the vulnerability more widely known, we reached out to the maintainer of the library directly. Then, we investigated which other applications use this code and we tried to notify them and make them aware of the issue.

In parallel, we worked with the library maintainers to prepare a new version of the standalone library code.

Last but not least, as GNU/Linux distributions provide packages of this library, we also informed these distributions about the issue.

Once the builds are ready

After roughly 5 hours, the builds were ready. This is when the QA team started executing the test plans.

They verified all the scenarios on a bunch of different platforms and operating systems.

[Screenshot: chart showing the readiness of all builds]

In a matter of 22 hours, less than a day from when the exploit was found, Mozilla was ready to push updated builds of Firefox for Desktop and Android on our Nightly, Beta, ESR and release update channel.

For the release go live, the security team wrote the security advisories and created an entry for the CVE (Common Vulnerabilities and Exposures), a public reference that lists publicly known cybersecurity vulnerabilities.

And then, at the last moment, we discovered a second variant of the affected code and had to rebuild the Android version. This also impacted Firefox ESR on ARM devices. We shipped this fix as well, at 11:10PM.

Nobody likes to see their product get pwned, but as with so much in software development, preparation and coordination can make the difference between a chemspill where no damage is done, and a potentially endangering situation.

Through the combined work of several distributed teams, and good planning and communication, Mozilla was able to test and release a fix for the vulnerability as fast as possible, ensuring the security of users around the world. That’s a story we think is worth sharing.

Related Resources

If you’re interested in learning more about Mozilla’s security initiatives or Firefox security, here are some resources to help you get started:

Mozilla Security
Mozilla Security Blog
Bug Bounty Program
Mozilla Security playlist on YouTube

The post Shipping a security update of Firefox in less than a day appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2018/03/shipping-a-security-update-of-firefox-in-less-than-a-day/feed/ 3 32046
Hands-On Web Security: Capture the Flag with OWASP Juice Shop https://hacks.mozilla.org/2018/03/hands-on-web-security-capture-the-flag-with-owasp-juice-shop/ https://hacks.mozilla.org/2018/03/hands-on-web-security-capture-the-flag-with-owasp-juice-shop/#comments Fri, 09 Mar 2018 06:28:49 +0000 https://hacks.mozilla.org/?p=31965 A CTF (Capture the Flag) event is a type of security challenge or competition that can be used to teach or test online security. In this post, Mozilla security engineer and OWASP developer Simon Bennetts describes a recent CTF he hosted at a Mozilla event, and how to set up your own web security CTF with OWASP Juice Shop.

The post Hands-On Web Security: Capture the Flag with OWASP Juice Shop appeared first on Mozilla Hacks - the Web developer blog.

]]>

As a developer, are you confident that you know what you need to know about web security? Wait, maybe you work in infosec. As a security specialist, are you confident that the developers you work with know enough to do the right thing?

[Screenshot: OWASP Juice Shop]

Often, these aren’t easy questions to answer, even for seasoned security professionals working with world class software engineers as we do at Mozilla.

OK, you can watch tutorial videos and take a variety of online tests, but it’s always more fun to try things in real life with a group of friends or colleagues. Our recent Mozilla all-hands was one of those opportunities.

A Capture the Flag (CTF) event offers a sociable, hands-on way to learn about security, and CTFs are often a tradition at security conferences.

I’m part of the Mozilla Firefox Operations Security team and we work closely with all Mozilla developers to make sure that the core services Mozilla relies on to build, ship, and run Firefox are as secure as possible.

In this retrospective, I'll show how you can easily set up a CTF event using free and open source software, as the Security team did back in December, when we gathered in Austin for the Mozilla All Hands event.

Customizing OWASP Juice Shop

We chose OWASP Juice Shop, a web app intentionally designed to be insecure, for training purposes. Juice Shop uses modern technologies like Node.js, Express and AngularJS, and provides a wide range of security challenges, from the simple to the complex. This was important for us, since our participants had a wide range of skills, ranging from developers with little formal security training to professional penetration testers.

Juice Shop is a “single user application,” but it comes with a CTF mode and detailed instructions for Hosting a CTF Event. When this is turned on, the application generates “CTF-tokens” anytime someone solves one of the challenges. These can then be uploaded to a central scoring server. The CTF mode also disables the hints which might have made some of the challenges too easy for our more advanced players.

Juice Shop can be run in a wide variety of ways, but to make it easy for your participants I recommend using a docker image, as this has only one dependency: docker.

You can find the official Juice Shop docker image here: https://hub.docker.com/r/bkimminich/juice-shop/ or you can build your own if you want to customize it. You can find customization instructions online.

We enabled the built-in CTF mode and changed the application name and the example products in order to make it feel more Firefox-y and to hide its origin (as solutions for the Juice Shop challenges are easily found on the internet).

Once we were happy with our changes we uploaded our image to dockerhub: mozilla/ctf-austin

[Screenshot: Mozilla-customized OWASP Juice Shop]

Setting Up a Scoring Server

You’ll want to set up a scoring server, to allow participants to upload their CTF-tokens and compare their scores with everyone else. It definitely helped encourage competition among our participants!

A scoring server should also provide a summary of each of the challenges and the points each challenge is worth. For this we used CTFd – it’s easy to install and there’s an officially supported tool for importing the Juice Shop challenges into CTFd which can be run using:

npm install -g juice-shop-ctf-cli
juice-shop-ctf

You’re then presented with a set of questions that allow you to tune the setup to your requirements.

Running the CTF

To get your CTF event underway you just need to tell participants the URL of your CTFd server and how to get Juice Shop running locally. If you are using the official image, here’s how to go about running Juice Shop locally:

docker pull bkimminich/juice-shop
docker run -d -e "NODE_ENV=ctf" -p 3000:3000 bkimminich/juice-shop

If you're using your own image then change the image name, and if you have the CTF option enabled then your command won't need the -e "NODE_ENV=ctf" part:

docker pull mozilla/ctf-austin
docker run -d -p 3000:3000 mozilla/ctf-austin

In either case, participants will now be able to access their own local copy of Juice Shop via http://localhost:3000/
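
A quick way to confirm that a participant's container came up properly (assuming curl is available) is to request the front page and check for a 200 status code:

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/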

Although some of the Juice Shop security challenges can be solved just by using Firefox, a security tool that proxies your browser will really help.

A good option for this is OWASP ZAP (for which I’m the project leader), a free and open source security tool specifically designed to find security vulnerabilities in web applications.

ZAP sits between your browser and the application you want to test and shows all of the traffic that flows between them. It also allows you to intercept and change that traffic and provides a wide range of automated and manual features that can be used to test the application. If you use ZAP you won’t need to change your browser settings, as ZAP can launch Firefox (or any other locally installed browser) preconfigured to proxy through ZAP.

[Screenshot: OWASP ZAP dev build]

Remind all participants to explore Juice Shop as thoroughly as they can – you can't find all the issues if there are features that you are not aware of. Suggest that they start with the easiest challenges (the ones with the fewest points) and work upwards, as the challenges are designed to get progressively harder.

[Chart: the top 10 teams and their results on the challenges]

If you are running the CTF over several days (as we did), it's a good idea to be available for help and advice. We set up a private IRC channel and a Google group, and held daily check-in sessions where anyone could come along, ask us questions about the event, and get help on solving the challenges.

[Screenshots: CTFd graphs showing score charts, key percentages, and category breakdowns]

On the last day of our event, we held a final session to congratulate the winners, revealed the app’s origin and handed out Juice Shop stickers kindly provided by Björn Kimminich (the JuiceShop project lead).

Outcomes and Next Steps

Running a Capture the Flag event is a great way to raise security awareness and knowledge within a team, a company, or an organization.

Juice Shop is an ideal application for a CTF, as it's based on modern web technologies and includes a wide range of challenges. It's very well thought out and well supported. The fact that it's a real application with realistic vulnerabilities, rather than a set of convoluted tasks, makes it ideal for learning about application security.

Our Mozilla/Firefox custom Juice Shop app is available at https://github.com/mozilla/ctf-austin. Unless you particularly want to use a Mozilla-branded version, we recommend the original Juice Shop app: https://github.com/bkimminich/juice-shop. (Note: It has already been updated since we forked our copy.)
And if you haven’t played with it yet, then I strongly recommend doing so. It’s a lot of fun and you’ll almost certainly learn something.

In the end, over 20 people registered for our event and their feedback was very positive:

“The cookie / JWT stuff is the most illuminating part of this.”

“This whole thing is excellent thanks for putting it together.”

“I hate the fact I can’t focus on my things because I’d like to solve more ctf tasks and learn something.”

“It’s awesome because I’m planning to improve my sec skills.”

“This has been a lot of fun – thanks for setting it up.”

[Photo: Mozilla Y’All-Hands CTF participants]

Not surprisingly, two of our pen testers who took part did very well, but they were given a run for their money by one of our operations staff who clearly knows a lot about security!

Do you have a knack for uncovering security vulnerabilities? At Mozilla, we have a Web and Services Bug Bounty Program. We welcome your help in making Mozilla even more secure. You could even earn some bounty rewards for your efforts. And we’re always looking for contributors to help us make ZAP better, so if that sounds interesting, have a look at Contributing to OWASP ZAP.

The post Hands-On Web Security: Capture the Flag with OWASP Juice Shop appeared first on Mozilla Hacks - the Web developer blog.

]]>
https://hacks.mozilla.org/2018/03/hands-on-web-security-capture-the-flag-with-owasp-juice-shop/feed/ 1 31965