FAQ
What's fuzzing?
Fuzz testing is a dynamic testing method for identifying bugs and vulnerabilities early in software development. By executing programs with invalid or random inputs, it enables developers to ensure secure and stable software deployment. With these inputs, fuzzers uncover potential crashes and offer detailed feedback on code coverage. The process not only detects bugs without false positives but also generates the inputs needed to reproduce each Finding.
Memory unsafe languages
Fuzzing is particularly effective when applied to memory unsafe languages like C and C++. These languages are more prone to memory-related vulnerabilities, such as buffer overflows and uninitialized memory usage. Fuzzing can identify these issues early in the development cycle, enabling developers to fix them before they can be exploited by attackers. Furthermore, the process of fuzzing can reveal complex code paths or edge cases that other testing methods might miss.
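As an illustration, here's a minimal libFuzzer-style fuzz target (the parse_header function is hypothetical) containing an off-by-one stack-buffer-overflow of the kind a fuzzer running with AddressSanitizer typically flags within seconds:
#include <cstddef>
#include <cstdint>
#include <cstring>
// Hypothetical function under test with an off-by-one bug: the bounds
// check allows len == 8, so the trailing NUL byte is written one past
// the end of the 8-byte buffer.
static void parse_header(const uint8_t *data, size_t len) {
  char buf[8];
  if (len > sizeof(buf)) return;
  memcpy(buf, data, len);
  buf[len] = '\0'; // stack-buffer-overflow when len == 8
}
// libFuzzer entry point: called repeatedly with mutated inputs. With
// AddressSanitizer enabled, the overflow is reported as soon as the
// fuzzer generates an 8-byte input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  parse_header(data, size);
  return 0;
}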
Memory safe languages
While memory safe languages like Java, JavaScript, Python, or Rust have built-in mechanisms to prevent many common memory-related vulnerabilities, fuzzing can identify logic errors, incorrect assumptions, or other application-specific vulnerabilities that might not be related to memory management. Additionally, fuzzing can uncover issues related to third-party libraries or dependencies, which could introduce security risks. For applications developed in memory safe languages, fuzzing contributes to increased reliability, performance, and security by discovering and allowing developers to address these potential weaknesses.
What goals can fuzzing achieve?
Fuzzing is a versatile and effective testing technique that can be applied to a wide range of programming languages and applications. Regardless of whether a language is memory safe or memory unsafe, fuzzing aids in achieving key goals such as strengthening security, enhancing stability, and ensuring robustness. By incorporating fuzzing into the software development process, developers can proactively identify and resolve vulnerabilities, leading to more reliable and secure applications.
What results can you expect from fuzzing?
Fuzzing provides valuable feedback and output in the form of uncovered software vulnerabilities and inconsistencies. As you input a variety of unexpected or random data into your system, the system responds, often in unforeseen ways. These responses can reveal hidden bugs, exceptions, memory leaks, or even system crashes. Some fuzzing tools offer detailed reports on each fuzzing session, including information on input data that led to unexpected behavior, system status, and potential vulnerabilities discovered.
It's essential to remember that fuzzing doesn't assure you of the absence of bugs but instead reveals their presence. Its strength lies in its capacity to expose unforeseen edge cases that could lead to potential security breaches or system crashes. Interpreting the output from fuzzing requires a solid understanding of the system under test (SUT) and an analytical mindset to determine the potential impact of the identified issues.
How long should a fuzz test run?
The ideal duration for a fuzz test depends on various factors such as the complexity of the target application, the resources available, and the specific goals of the testing process. As a general rule of thumb, the longer you fuzz, the higher the chances of discovering bugs and vulnerabilities. Considering some key factors can help guide you in deciding how long to run your fuzz tests:
- Complexity of the target application: Larger and more complex applications typically require longer fuzzing sessions to effectively explore various code paths and potential vulnerabilities. A more comprehensive testing process increases the likelihood of discovering hidden issues and ensuring the application’s robustness.
- Available resources: The amount of time you can allocate to fuzzing depends on your available resources, such as computing power and personnel. If resources are limited, you need to prioritize certain components of the application or focus on specific vulnerabilities. You can also consider using distributed fuzzing to scale your testing efforts and reduce the time required.
- Goals and priorities: Your fuzz testing objectives influence the duration of the test. If you’re aiming to meet specific security requirements or achieve a certain level of code coverage, you may need to run fuzz tests for a longer duration to achieve those goals. On the other hand, if you’re running fuzz tests as part of a continuous integration process, shorter, more frequent fuzzing sessions may be more appropriate.
Guidelines for fuzzing duration
As a starting point, consider running individual fuzz tests for at least 24 hours. This often provides sufficient time to discover many common issues. However, for more critical or complex applications, it's not uncommon to run fuzz tests for several weeks or even months. Continuously monitoring the progress of your fuzz testing and analyzing the results can help you determine if additional time is needed.
For ongoing development projects, integrating fuzz testing into your CI/CD pipeline can help ensure that newly introduced code is regularly tested. By continuously fuzzing your application, you can proactively identify and fix vulnerabilities before they make their way into production.
Fuzz testing isn't a one-shot solution: to make the most of your fuzz testing efforts, start with a baseline duration and then adjust as needed based on your Findings and priorities. Continuous fuzz testing can help ensure the ongoing security and stability of your application.
The metric “x since last new branch” provides a good indicator of the saturation of a fuzz test. If a fuzz test has been running for three days and no new path has been found for two and a half days, it's unlikely that a new path will be found. This could be because all paths have already been discovered or because the fuzz test can't reach the remaining paths. You can review the coverage of the fuzz test to determine the cause and improve the fuzz test.
How often should you run a fuzz test?
Many factors play a role in the estimation:
- Magnitude and complexity of the software: The bigger and more complex the software, the more often it should be fuzzed. With each new fuzz run, the coverage can be increased and the likelihood of finding hidden bugs increases.
- New or old software: New software should be fuzzed often and continuously as it changes and expands frequently. With older software, it makes sense to test it when something has changed.
- Feature frequency: Software projects that require many changes and/or new features in a short period of time should be tested in an aligned manner. Bugs should be found and fixed immediately after the feature has been created. The later they're found, the more effort is required to fix them.
- Project resources and budget: More fuzzing likely leads to finding bugs earlier. Finding and fixing bugs late in the project is expensive and can lead to additional cost as resources and budgets need to be reallocated.
Generally, fuzzing can be done at the unit test level, at the system test level, or at least before a major release.
At the unit test level, fuzzing can be integrated into the CI/CD pipeline, for example triggering a fuzzing run with each pull request.
At the system test level, fuzzing can be performed less frequently, but it exercises the entire system. The fuzz duration is significantly longer here, and it could be expensive to fix a bug at that point in time.
Fuzzing can be automated to a great extent, so that it can be used as a regression test in many cases.
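As an illustration of unit-level integration, a hypothetical GitHub Actions workflow could trigger a fuzzing run on each pull request. The workflow and fuzz test names below are placeholders, and it assumes cifuzz and the project's build dependencies are already installed on the runner:
name: pr-fuzzing
on: pull_request
jobs:
  fuzz:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Run the fuzz test; my_fuzz_test is a placeholder name.
      - run: cifuzz run my_fuzz_test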
What's a good coverage percent for a project?
There is no absolute target percentage as it depends on the requirements of the project.
Higher code coverage generally indicates that the fuzzer is going through more paths and therefore has a higher likelihood of discovering bugs.
For example, if the target coverage is 80% for a project with 1,000 lines of code, that's only 800 lines, which isn't difficult to achieve. For a project with 1,000,000 lines of code, 80% coverage would be 800,000 lines, which is very optimistic and not easily achievable. Therefore, for a project with over a million lines of code, even a small percentage of coverage is good coverage.
While code coverage is a useful metric to measure the effectiveness of fuzzing, it shouldn't be the main criterion. Improving the corpus and the fuzz test with more real-world scenarios can be much more effective at finding critical bugs.
How to improve code coverage?
To improve code coverage, it's helpful to generate a source line coverage report and investigate the report in detail. Usually the report format can be specified to enable you to view it in a browser or your IDE. You can create a coverage report in CI Fuzz with the following command:
cifuzz coverage <fuzz_test>
When reviewing code coverage, analyze why a specific code part is not covered:
- The executed fuzz test is theoretically unable to reach the code. Write more tests or adjust the existing tests to reach new code parts.
- The fuzzing time was too short. Fuzzing isn't deterministic and the fuzzing engine might have focused on another part of the code during that specific run. Check the fuzzer statistics and when the last new path was found. If coverage during the run is still increasing, that's a good indicator to let the fuzzer continue to run until coverage plateaus.
- The SUT expects complex input data and edge cases are never covered by the fuzzer. Value profiling can help the fuzzer get more precise feedback from the application. If you want to quickly increase the code coverage, add seed corpora and a dictionary for the fuzz test to help the fuzzer generate syntactically and semantically valid inputs.
- The fuzzer could regularly run into a specific blocker. This can be an issue in the tested code that the fuzzer re-discovers regularly and that needs to be fixed to unblock the fuzzer. You can also aid the fuzzer by including valid checksums in the inputs so they pass integrity verification (see the sketch after this list).
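As an example of removing such a blocker, here's a sketch with a hypothetical checksum scheme: the fuzz test computes a valid checksum over the generated input before handing it to the SUT, so inputs aren't rejected by the integrity check:
#include <cstddef>
#include <cstdint>
#include <vector>
// Hypothetical SUT entry point that expects the last byte of the
// message to be a checksum over all preceding bytes.
static void parse_message(const uint8_t *data, size_t size) {
  (void)data;
  (void)size; // ... the parsing logic we actually want to fuzz ...
}
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  if (size < 2) return 0;
  // Copy the fuzzer input and overwrite the trailing byte with a valid
  // checksum, so every generated input passes the integrity check and
  // reaches the parsing logic behind it.
  std::vector<uint8_t> input(data, data + size);
  uint8_t sum = 0;
  for (size_t i = 0; i + 1 < input.size(); i++) sum += input[i];
  input.back() = sum;
  parse_message(input.data(), input.size());
  return 0;
}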
How to determine a good fuzz test?
- Determinism: A good fuzz test needs to have the same behavior given the same input and shouldn't use any additional source of randomness.
- Speed: A good fuzz test should be efficient, since fuzzing requires many iterations.
- Memory consumption: For CPU-efficient fuzzing, the fuzz test should consume less RAM per CPU core than is available on the given machine.
- Coverage discoverability: It's important to ensure that the fuzz test can discover a large subset of reachable control flow edges without using the seed corpus. If a fuzz test without a seed corpus doesn't provide coverage comparable to one with a seed corpus, consider splitting it into smaller tests and using dictionaries or structure-aware fuzzing.
- I/O: A good fuzz test shouldn't use I/O, because it can introduce non-deterministic behavior and make Findings harder to reproduce. Avoid debugging output to StdErr or StdOut, as it slows down fuzzing. Avoid writing to disk generally, and reading from disk other than during initialization.
See the Google fuzzing guide for further information.
What types of bugs does the fuzzer catch automatically?
Each fuzzer typically has a set of sanitizers or bug-detectors to automatically detect specific types of bugs.
LibFuzzer
LibFuzzer works in conjunction with several sanitizers.
AddressSanitizer (ASan) is capable of finding various memory related bugs:
- Out-of-bounds accesses to heap, stack, and globals
- Use-after-free, use-after-return, use-after-scope
- Double-free, invalid free
- Memory leaks (experimental)
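For instance, a contrived use-after-free like the following is reported by ASan as heap-use-after-free, together with the stack traces of the allocation, the free, and the invalid read:
#include <cstddef>
#include <cstdint>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  if (size == 0) return 0;
  uint8_t *buf = new uint8_t[size];
  buf[0] = data[0];
  delete[] buf;
  // Invalid read of freed memory: ASan aborts here with a detailed report.
  volatile uint8_t v = buf[0];
  (void)v;
  return 0;
}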
UndefinedBehaviorSanitizer (UBSan) is used to detect several types of undefined behavior at runtime:
- Array subscript out of bounds, where the bounds can be statically determined
- Bitwise shifts that are out of bounds for their data type
- Dereferencing misaligned or null pointers
- Signed integer overflow
- Conversion to, from, or between floating-point types which would overflow the destination
Other sanitizers:
- ThreadSanitizer (data races)
- MemorySanitizer (uninitialized reads)
- LeakSanitizer (memory leaks)
- DataFlowSanitizer (data flow analysis; doesn't detect bugs itself)
Jazzer
Since Java is a memory safe language, there are no sanitizers that focus on memory corruption bugs, but Jazzer can detect uncaught exceptions, memory leaks, and infinite loops that can lead to DoS attacks.
It can also detect a variety of bug classes common to the Java software ecosystem:
- Deserialization - unsafe deserialization that leads to attacker-controlled method calls
- Expression Language Injection - injectable inputs to an expression language interpreter
- LDAP Injection - LDAP DN and search filter injections
- Naming Context Lookup - unsafe JNDI lookups, as exploited in log4j
- Command Injection - unsafe execution of OS commands using ProcessBuilder
- Reflective Call - unsafe calls that lead to attacker-controlled class loading
- Regular Expression Injection - regular expression based injection that leads to OOM
- Script Engine Injection - insecure user input in script engine invocation
- Server Side Request Forgery - unsafe network connection attempts
- SQL Injection - SQL injections
- XPath Injection - XPath injections
Jazzer.js
JavaScript is also memory safe, so Jazzer.js, like Jazzer, focuses on detecting uncaught exceptions, memory leaks, and infinite loops. Through its sanitizers, Jazzer.js can also detect a variety of bug classes common to the JavaScript software ecosystem:
- Command Injection - unsafe execution of OS commands
- Path Traversal - inputs triggering the access of files and directories outside the current one
How to assess Findings?
You can list all local Findings CI Fuzz detected with the following command:
cifuzz findings
You can review detailed information about a specific Finding, including the stack trace, sanitizer details, severity, and description & mitigation, with the following command:
cifuzz finding <name>
In general, all Findings discovered by a fuzzer can be assessed by analyzing the generated stack trace and crashing input. For every Finding, the fuzzer generates a crashing input that can reproduce the issue. This can be used as a regression test, after a fix is implemented.
Static assessment of Finding details is often insufficient and requires additional dynamic analysis. A debugger should be attached during execution of the fuzz test running with the crashing input of the Finding. The stack traces are valuable in the debugging process to identify interesting code positions to place breakpoints and investigate the memory state.
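For example, libFuzzer-based fuzz test binaries re-execute a crashing input passed as a command-line argument, which makes it straightforward to attach a debugger. The binary and input file names below are placeholders:
# Re-run the fuzz test binary on the crashing input under gdb
gdb --args ./my_fuzz_test crash-<hash>
(gdb) run
(gdb) bt    # inspect the stack trace at the crash site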
How to continue fuzzing after a Finding is triggered?
Java
For Java, pass the following flag to the fuzzing engine:
--keep_going=<number of crashes allowed>
You can also set this in the cifuzz.yaml instead of adding the flag to the command. This is applied for each local fuzz test:
engine-args:
- --keep_going=10
C/C++
For C/C++, use the following environment variable:
ASAN_OPTIONS=detect_leaks=0:halt_on_error=0
Due to technical limitations, certain bugs can still cause the fuzzing engine to stop or affect the overall behavior and cause false positives.
Which code to instrument and which code to ignore?
When it comes to fuzzing, deciding which code to instrument and which code to ignore is critical to maximizing the performance and effectiveness of your fuzzing campaign. Proper instrumentation can lead to better code coverage, more efficient execution, and improved vulnerability discovery.
Code coverage vs. bug detection instrumentation
Instrumentation has two primary goals: maximizing code coverage and detecting bugs. These goals are distinct, and each requires careful consideration.
Code coverage
Instrumentation for code coverage guides the fuzzer while exploring as many code paths as possible within a target application. The fuzzer collects information about code execution paths and uses this to generate new inputs to trigger previously unexplored paths. To maximize code coverage, it's essential to instrument critical and complex parts of an application, as well as any custom libraries or components.
Depending on the fuzzer and its configuration, code coverage can be collected at different granularities, for example with function-, basic block-, or instruction-based metrics, which all impact performance differently. To minimize this impact, prioritize instrumenting the most critical parts of your application first. Avoid instrumentation of widely tested third-party libraries or system libraries to reduce overhead and improve the overall performance of a fuzzing campaign. If you suspect vulnerabilities in third-party libraries, you may choose to instrument them selectively on top of your own application.
Bug detection
To maximize the effectiveness of a fuzzing campaign in discovering vulnerabilities, you can add additional code instrumentation. This is typically worth the significant performance impact.
An example for C/C++ applications is AddressSanitizer, which introduces on average a 2x slowdown on its own. When using bug detection tools, it's advisable to primarily instrument code that's more likely to contain memory-related vulnerabilities, such as custom memory management implementations or components that handle complex data structures.
If a partial instrumentation isn't possible, avoid stacking multiple bug detectors in a single fuzzing campaign. Instead, run multiple fuzzing campaigns either in parallel or in sequence, each with a different bug detector instrumentation.
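For example, assuming a plain libFuzzer setup, the same target can be built once per bug detector, since some sanitizers such as ASan and MSan can't be combined in one binary:
# Build one binary per bug detector and run the campaigns separately
clang++ -g -O1 -fsanitize=fuzzer,address fuzz_test.cpp -o fuzz_test_asan
clang++ -g -O1 -fsanitize=fuzzer,memory  fuzz_test.cpp -o fuzz_test_msan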
Deciding which code to instrument and which code to ignore during a fuzzing campaign is crucial to optimizing performance and vulnerability discovery. By focusing on instrumenting critical components, security-sensitive code, and areas prone to memory-related issues, you can maximize the effectiveness of your fuzzing efforts. Balancing the goals of code coverage and bug detection, along with being mindful of the impact on execution performance, helps to ensure a successful fuzzing campaign. Code coverage instrumentation should almost always take priority over specific bug detection capabilities: without it, the chance of hitting deep code paths is minimized, and the code areas instrumented with a bug detector may never be reached at all.
What's a dictionary?
If the code expects certain keywords, formats, or syntax as input, you can provide these to the fuzzing engine in a dictionary file. This can improve the speed at which the fuzzing engine generates new inputs that trigger new code paths during execution. You can find examples of dictionaries for various input types and formats in this GitHub repository.
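For example, a dictionary for a hypothetical HTTP parser could look like this, using the libFuzzer/AFL dictionary format with one quoted token per line:
# keywords the parser expects
verb_get="GET "
verb_post="POST "
proto="HTTP/1.1"
header_host="Host: "
crlf="\x0d\x0a"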
What's a corpus?
A corpus is a set of test input values that trigger code paths in the target code. These come in two primary forms:
- Seed corpus: A small corpus, hand-crafted or generated, that triggers initial code path execution at the first fuzz test execution. If you're fuzzing a well-known protocol, sample data from real-world examples can help the fuzzer quickly generate syntactically and semantically valid input data. You can find a number of corpora for various data formats in this GitHub repository.
- Test or fuzzing corpus: In addition to the seed corpus, whenever the fuzzing engine generates a new input that leads to new code paths, it saves this input for further mutation and uses it in future runs. At the beginning of every fuzz test, the engine re-evaluates the test corpus to check which inputs still lead to unique code paths and takes only those into account for mutations in the run.
Incorrectly selected or tuned corpora can slow down the fuzzing process. If the inputs from the corpus don’t lead to triggering unique code paths, these executions are “wasted”. The process of tuning a corpus to only include relevant inputs that lead to unique paths is called “corpus minimization”.
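With a plain libFuzzer target, for example, corpus minimization can be done with the engine's merge mode, which copies only the inputs that add unique coverage into a fresh directory. The binary and directory names below are placeholders:
mkdir corpus_min
./my_fuzz_test -merge=1 corpus_min corpus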
How to mock/replace functions?
When fuzz testing projects, it may be necessary to mock or replace functions, similar to unit testing. There are two main reasons for mocking functions:
- The SUT depends on external functions from other software that are outside the scope of the fuzzing setup
- The SUT contains functions that can't or shouldn't be executed in their original form when fuzzing. For example, functions with specific hardware dependencies that aren't satisfied on the hardware you are running the fuzz tests on.
Creating mocks for fuzz testing
Starting with a simple approach and improving over time is often the best way to create mocks for fuzz testing. You can create mocks by defining a function that always returns a default value, such as 0. If this method doesn't achieve the desired goals, you can adjust the mocks to return data taken from the input generated by the fuzzer instead of returning fixed values.
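As a sketch of this progression (the function and SUT names are hypothetical), a mock can start by returning a fixed default and later be refined to serve bytes from the fuzzer input:
#include <cstddef>
#include <cstdint>
// Fuzzer input made available to mocks.
static const uint8_t *g_data;
static size_t g_size;
static size_t g_pos;
// Mock for a hypothetical hardware-dependent function the SUT links
// against. A first version could simply "return 0;"; this refined
// version feeds fuzzer-generated bytes instead, letting the fuzzer
// explore code paths that depend on the sensor value.
extern "C" int read_sensor(void) {
  if (g_pos < g_size) return g_data[g_pos++];
  return 0; // default once the fuzzer input is exhausted
}
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  g_data = data;
  g_size = size;
  g_pos = 0;
  // run_control_loop(); // hypothetical SUT entry point calling read_sensor()
  return 0;
}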
Replacing functions in C/C++ projects
You can replace already existing function definitions without compiler or linker errors with the following techniques. The appropriate approach depends on the type of the function you want to replace.
- Wrapping functions
  - Works well for statically linked functions called from other files
  - To wrap a function, add linker flags to the project (see the sketch after this list)
  - The function name mangling in C++ can be confusing
  - Allows calling the original function
- Overwriting functions
  - Works well for statically linked functions, even if called from within the same file
  - Requires that the symbols of the functions to be replaced are weak (use tools like objdump to check this)
  - The original function can't be called
- LD_PRELOAD
  - Works well for dynamically linked functions
  - Requires the LD_PRELOAD environment variable to be set when executing the fuzz test
  - Allows calling the original function
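As a sketch of the wrapping technique (the function name get_config_value is hypothetical): building with the GNU ld flag -Wl,--wrap=get_config_value redirects all calls to get_config_value() to __wrap_get_config_value(), while __real_get_config_value() still resolves to the original definition:
extern "C" {
// Provided by the linker: refers to the original implementation.
int __real_get_config_value(const char *key);
// Called instead of get_config_value() everywhere in the linked binary.
int __wrap_get_config_value(const char *key) {
  (void)key;
  // Return a fixed default for fuzzing, or delegate to the original:
  // return __real_get_config_value(key);
  return 0;
}
}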
If none of the preceding techniques are applicable, as a last resort, change the code of the SUT to support fuzzing.
How to use the FuzzedDataProvider?
By default, the fuzzer provides an array of raw bytes to the fuzz test. It would be up to the fuzz test creator to allocate and convert the byte array as needed. The FuzzedDataProvider (FDP) is a class that provides the ability to easily split the input data generated by the fuzzer. This is useful whenever you need to easily generate different data types (string, int, float, etc.) as part of your fuzz test. An FDP is provided for multiple languages / fuzzers.
FDP data types
All implementations of the FDP provide similar convenience functions for the most common data types (string, int, float, etc.). You can use a variety of methods:
- If you need a single value, you can use methods like ConsumeBool, ConsumeIntegral or ConsumeFloatingPoint. Many of these methods allow you to specify a range of valid values as well.
- If you need a sequence of bytes, you can use methods like ConsumeBytes or ConsumeBytesAsString.
- There are also methods that consume the remaining data, such as ConsumeRemainingBytesAsString, which returns a string from the bytes remaining in the fuzzing input buffer. Call these methods as the last FDP method in the fuzz test, because the buffer is empty after the call.
- If you need to pick a random value from an array, you can use the PickValueInArray methods.
C++
The FDP for C++ is a single-header library that's included as part of LLVM. You can include it in your fuzz test with #include <fuzzer/FuzzedDataProvider.h>. To create the FDP object, pass it the raw fuzzer input buffer and size. Below is an example of including, creating, and using an FDP object:
#include "src/explore_me.h"
#include <cifuzz/cifuzz.h>
#include <fuzzer/FuzzedDataProvider.h>
#include <stdio.h>
FUZZ_TEST_SETUP() {}
FUZZ_TEST(const uint8_t *data, size_t size) {
//create FuzzedDataProvider object from fuzzer input buffer and size
FuzzedDataProvider fuzzed_data(data, size);
//create an int from FDP
int a = fuzzed_data.ConsumeIntegral<int>();
//create a 2nd int from FDP
int b = fuzzed_data.ConsumeIntegral<int>();
//generate a string from the remaining bytes in the input buffer
std::string c = fuzzed_data.ConsumeRemainingBytesAsString();
exploreMe(a, b, c);
}
Java
The FDP for Java is part of Jazzer. You can find the Javadocs for FDP here. To use the FDP, just import it in your fuzz test and create a method that accepts an FDP as an argument:
package com.example;
import com.code_intelligence.jazzer.api.FuzzedDataProvider;
import com.code_intelligence.jazzer.junit.FuzzTest;
public class FuzzTestCase {
@FuzzTest
void myFuzzTest(FuzzedDataProvider data) {
//create an int from FDP
int a = data.consumeInt();
//create a 2nd int from FDP
int b = data.consumeInt();
//generate a string from the remaining bytes in the input buffer
String c = data.consumeRemainingAsString();
ExploreMe ex = new ExploreMe(a);
ex.exploreMe(b, c);
}
}
JavaScript
Jazzer.js provides an implementation of FuzzedDataProvider
.
To use the FDP, just import it and create a function that accepts the raw fuzzer input. Create a FuzzedDataProvider
object from the raw fuzzer input.
const { FuzzedDataProvider } = require("@jazzer.js/core");
/**
* @param { Buffer } fuzzerInputData
*/
module.exports.fuzz = function (fuzzerInputData) {
const data = new FuzzedDataProvider(fuzzerInputData);
//create a string with max length between 10-15 and utf-8 encoding
const s1 = data.consumeString(data.consumeIntegralInRange(10, 15), "utf-8");
//consume 1 byte to create an unsigned integer
const i1 = data.consumeIntegral(1);
//consume 2 bytes to create an unsigned integer
const i2 = data.consumeIntegral(2);
//consume 4 bytes to create an unsigned integer
let i3 = data.consumeIntegral(4);
if (i3 === 1000) {
if (s1 === "Hello World!") {
if (i1 === 3) {
if (i2 === 3) {
throw new Error("Crash!");
}
}
}
}
};