Fuzzing Utilities, and bson2json+json2bson tools #1000

Open
wants to merge 7 commits into
base: master
3 changes: 3 additions & 0 deletions CMakeLists.txt
@@ -82,6 +82,9 @@ include (CMakeDependentOption)

include(MongoC-Warnings)

# "Fuzzing" must be included before "Sanitizers," to enable fuzzer sanitizer
include (Fuzzing)

# Enable CCache, if possible
include (CCache)

1 change: 1 addition & 0 deletions build/cmake/CMakeLists.txt
@@ -6,6 +6,7 @@ set (build_cmake_MODULES
FindSASL2.cmake
FindSnappy.cmake
FindSphinx.cmake
Fuzzing.cmake
LoadVersion.cmake
MaintainerFlags.cmake
MongoCPackage.cmake
131 changes: 131 additions & 0 deletions build/cmake/Fuzzing.cmake
@@ -0,0 +1,131 @@
option (ENABLE_FUZZING "Enable fuzzing using LLVM libFuzzer" OFF)

if (ENABLE_FUZZING)
# This will add another sanitizer when we later include Sanitizers.cmake
list (APPEND MONGO_SANITIZE "fuzzer-no-link")
endif ()

include (ProcessorCount)
ProcessorCount (_FUZZER_PARALLELISM)

set (_FUZZERS_OUT_DIR "${CMAKE_CURRENT_BINARY_DIR}/fuzzers")


#[[
Generate an executable target that links and runs with LLVM libFuzzer.

Amongst the given source files there must be one definition of LLVMFuzzerTestOneInput.
Refer: https://www.llvm.org/docs/LibFuzzer.html#fuzz-target

This will define an executable with the given name, and all additional
arguments will be given as source files to that executable. This executable
will be linked with the '-fsanitize=fuzzer' command-line option.

This will additionally define a custom target "run-fuzzer-${name}", which,
when executed, will run the fuzzer executable with a set of pre-defined
libFuzzer command-line options.

The following target properties can be used to control the 'run-fuzzer'
target:

FUZZER_FORK (integer)
Set the number of parallel fuzzer tasks to run. The default is the
parallelism of the host plus four.

FUZZER_TIMEOUT (integer, seconds)
Set the maximum amount a single fuzzer task should run before the fuzzer
consideres it to be "stuck" and to generate a timeout report for the
Collaborator

Suggested change
consideres it to be "stuck" and to generate a timeout report for the
considers it to be "stuck" and to generate a timeout report for the

given input.

FUZZER_LEN_CONTROL (integer, 1-100)
Set the len_control option for the libFuzzer run. Lower values tend to
generate larger inputs. Default is 50.

FUZZER_MAX_LEN (integer, bytes)
Set the maximum input size for a fuzzer input. The default is 4096.

FUZZER_ONLY_ASCII (boolean)
If TRUE, only valid ASCII will be given as fuzzer input.
The default is FALSE.

FUZZER_DICT (filepath)
Set to a filepath of a fuzzer dictionary.
Refer: https://www.llvm.org/docs/LibFuzzer.html#dictionaries
Default is to have no dictionary.

Fuzzer executables are written to the <BUILD_DIR>/fuzzers directory.

This will unconditionally define the target and the custom target that
executes it, but it will be EXCLUDE_FROM_ALL=TRUE if the CMake setting
ENABLE_FUZZING is not true.
]]
function (mongoc_add_fuzzer name)
Collaborator

Is the run-fuzzer-bson-fuzz target expected to complete quickly? I suspect I am doing something wrong or misunderstanding. But running cmake --build cmake-build --target run-fuzzer-bson-fuzz produces this output:

% cmake --build cmake-build --target run-fuzzer-bson-fuzz                
[0/1] cd /Users/kevin.albertson/review/mongo-c-driver-1000/cmake-build/fuzze...bertson/review/mongo-c-driver-1000/cmake-build/fuzzers/bson-fuzz.out//corpus

  Running fuzzer program : /Users/kevin.albertson/review/mongo-c-driver-1000/cmake-build/fuzzers/bson-fuzz.debug
     Corpus is stored in : /Users/kevin.albertson/review/mongo-c-driver-1000/cmake-build/fuzzers/bson-fuzz.out//corpus
  Crashes will appear in : /Users/kevin.albertson/review/mongo-c-driver-1000/cmake-build/fuzzers/bson-fuzz.out/

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2410153351
INFO: -fork=12: fuzzing in separate process(s)
INFO: -fork=12: 0 seed inputs, starting to fuzz in /var/folders/pv/p1jss0l97mq0ddr7rjbcbdt00000gp/T//libFuzzerTemp.FuzzWithFork16948.dir
#0: cov: 0 ft: 0 corp: 0 exec/s 0 oom/timeout/crash: 0/0/0 time: 0s job: 6 dft_time: 0
INFO: log from the inner process:
INFO: Seed: 2410369315
INFO:        0 files found in /var/folders/pv/p1jss0l97mq0ddr7rjbcbdt00000gp/T//libFuzzerTemp.FuzzWithFork16948.dir/C6
INFO: DataFlowTrace: reading from '/var/folders/pv/p1jss0l97mq0ddr7rjbcbdt00000gp/T//libFuzzerTemp.FuzzWithFork16948.dir/DFT'
INFO: A corpus is not provided, starting from an empty corpus
#2      INITED exec/s: 0 rss: 29Mb
INFO: 0/0 inputs touch the focus function
INFO: 0/0 inputs have the Data Flow Trace
ERROR: no interesting inputs were found. Is the code instrumented for coverage? Exiting.
INFO: exiting: 256 time: 0s

I expected the fuzzer to run for some time, and this completed instantly.

Contributor Author

That is abnormal. It means the execution is not generating any coverage data. I'll investigate when I get free time.

add_executable ("${name}" ${ARGN})
# Run with 4 more jobs than hardware parallelism
math (EXPR default_fork "${_FUZZER_PARALLELISM} + 4")
set_target_properties("${name}" PROPERTIES
# Qualify the filename with the build type:
DEBUG_POSTFIX ".debug"
RELEASE_POSTFIX ".opt"
RELWITHDEBINFO_POSTFIX ".opt-debug"
# Put them all in the fuzzers/ directory:
RUNTIME_OUTPUT_DIRECTORY "${_FUZZERS_OUT_DIR}"
RUNTIME_OUTPUT_DIRECTORY_RELEASE "${_FUZZERS_OUT_DIR}"
RUNTIME_OUTPUT_DIRECTORY_DEBUG "${_FUZZERS_OUT_DIR}"
RUNTIME_OUTPUT_DIRECTORY_RELWITHDEBINFO "${_FUZZERS_OUT_DIR}"
# Target options to control the fuzzer run:
FUZZER_FORK "${default_fork}"
FUZZER_TIMEOUT "10"
FUZZER_LEN_CONTROL "50"
FUZZER_MAX_LEN "4096"
FUZZER_ONLY_ASCII "FALSE"
)
# Link with the libFuzzer runtime:
target_link_libraries ("${name}" PRIVATE -fsanitize=fuzzer)

set (dict "$<TARGET_PROPERTY:${name},FUZZER_DICT>")
set (art_dir "$<TARGET_FILE_DIR:${name}>/${name}.out/")
add_custom_target(run-fuzzer-${name}
COMMAND "${CMAKE_COMMAND}" -E make_directory "${art_dir}/corpus"
# Print some useful info for the user:
COMMAND "${CMAKE_COMMAND}" -E echo
COMMAND "${CMAKE_COMMAND}" -E echo
" Running fuzzer program : $<TARGET_FILE:${name}>"
COMMAND "${CMAKE_COMMAND}" -E echo
" Corpus is stored in : ${art_dir}/corpus"
Collaborator

Suggested change
" Corpus is stored in : ${art_dir}/corpus"
" Corpus is stored in : ${art_dir}corpus"

A trailing slash is already present in ${art_dir}.

COMMAND "${CMAKE_COMMAND}" -E echo
" Crashes will appear in : ${art_dir}"
COMMAND "${CMAKE_COMMAND}" -E echo
# Run the fuzzer:
COMMAND
"${CMAKE_COMMAND}" -E chdir "${art_dir}"
"$<TARGET_FILE:${name}>"
-create_missing_dirs=1
-collect_data_flow=1
-shrink=1 # Try to shrink the test corpus
-use_value_profile=1
-ignore_timeouts=0 # Do not ignore timeouts
-ignore_ooms=0 # Do not ignore OOMs
-reload=10 # Reload every ten seconds
"-artifact_prefix=${art_dir}"
# Target property options:
"-fork=$<TARGET_PROPERTY:${name},FUZZER_FORK>"
"-timeout=$<TARGET_PROPERTY:${name},FUZZER_TIMEOUT>"
"-max_len=$<TARGET_PROPERTY:${name},FUZZER_MAX_LEN>"
"-len_control=$<TARGET_PROPERTY:${name},FUZZER_LEN_CONTROL>"
"-only_ascii=$<BOOL:$<TARGET_PROPERTY:${name},FUZZER_ONLY_ASCII>>"
"-analyze_dict=$<BOOL:${dict}>"
"$<IF:$<BOOL:${dict}>,-dict=${dict},${art_dir}/corpus>"
"${art_dir}/corpus"
WORKING_DIRECTORY "${_FUZZERS_OUT_DIR}"
DEPENDS "${name}"
VERBATIM USES_TERMINAL
)

# We might not want to build by default:
if (NOT ENABLE_FUZZING)
# Fuzzing is not enabled. Exclude the target from being built by default, but still define
# it so that CMake can verify that it is used correctly.
set_property (TARGET "${name}" PROPERTY EXCLUDE_FROM_ALL TRUE)
endif ()
endfunction ()
19 changes: 17 additions & 2 deletions src/libbson/CMakeLists.txt
@@ -473,22 +473,37 @@ endif ()
add_subdirectory (build)
# sub-directory 'doc' was already included above
add_subdirectory (examples)
add_subdirectory (fuzz)
add_subdirectory (src)
add_subdirectory (tests)

set_local_dist (src_libbson_DIST_local
CMakeLists.txt
NEWS
THIRD_PARTY_NOTICES
fuzz/bson.fuzz.c
fuzz/json.fuzz.c
)

mongoc_add_fuzzer (json-fuzz fuzz/json.fuzz.c)
target_link_libraries(json-fuzz PRIVATE bson_static)

mongoc_add_fuzzer (bson-fuzz fuzz/bson.fuzz.c)
target_link_libraries(bson-fuzz PRIVATE bson_static)
set_property(TARGET bson-fuzz PROPERTY FUZZER_MAX_LEN 65536)

add_executable (bson2json tools/bson2json.main.c)
add_executable (json2bson tools/json2bson.main.c)
target_link_libraries(bson2json PRIVATE bson_static)
target_link_libraries(json2bson PRIVATE bson_static)
set_target_properties(bson2json json2bson PROPERTIES
RUNTIME_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}"
)

set (src_libbson_DIST
${src_libbson_DIST_local}
${src_libbson_build_DIST}
${src_libbson_doc_DIST}
${src_libbson_examples_DIST}
${src_libbson_fuzz_DIST}
${src_libbson_src_DIST}
${src_libbson_tests_DIST}
Collaborator

The make-release-archive task failure may be resolved by including bson2json.main.c and json2bson.main.c in the distribution tarball.

I suggest adding a CMakeLists.txt in the src/libbson/tools directory to set the variable ${src_libbson_tools_DIST}.

PARENT_SCOPE
4 changes: 0 additions & 4 deletions src/libbson/fuzz/CMakeLists.txt

This file was deleted.

15 changes: 15 additions & 0 deletions src/libbson/fuzz/bson.fuzz.c
@@ -0,0 +1,15 @@
#include <bson/bson.h>

#include <stdint.h>

int
LLVMFuzzerTestOneInput (const uint8_t *data, size_t len)
{
bson_t *b = bson_new_from_data (data, len);
if (!b) {
return 0;
}
bson_validate (b, 0xffffff, NULL);
bson_destroy (b);
return 0;
}
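
A libFuzzer-linked executable can replay a saved input when it is given the file path as an argument. When the target is built without the libFuzzer runtime, a small replay helper can do the same job. The following is a minimal sketch under stated assumptions, not part of this PR: `fuzz_target_fn`, `replay_file`, `sample_target`, and `last_replayed_len` are hypothetical names, and `sample_target` is a stand-in that only records the input length instead of calling libbson.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Same signature as the libFuzzer entry point LLVMFuzzerTestOneInput. */
typedef int (*fuzz_target_fn) (const uint8_t *data, size_t len);

/* Stand-in target (hypothetical): records the length of the last input. */
size_t last_replayed_len = 0;

int
sample_target (const uint8_t *data, size_t len)
{
   (void) data;
   last_replayed_len = len;
   return 0;
}

/* Hypothetical helper: load one saved input file and pass it to the target
 * exactly once. Returns 0 on success, -1 if the file cannot be read. */
int
replay_file (const char *path, fuzz_target_fn target)
{
   assert (path != NULL && target != NULL);
   FILE *f = fopen (path, "rb");
   if (!f) {
      return -1;
   }
   /* Find the file size, then read the whole file into one buffer. */
   if (fseek (f, 0, SEEK_END) != 0) {
      fclose (f);
      return -1;
   }
   const long size = ftell (f);
   if (size < 0) {
      fclose (f);
      return -1;
   }
   rewind (f);
   /* Allocate at least one byte so an empty input still gets a valid pointer. */
   uint8_t *buf = malloc (size > 0 ? (size_t) size : 1);
   if (!buf) {
      fclose (f);
      return -1;
   }
   const size_t nread = fread (buf, 1, (size_t) size, f);
   fclose (f);
   if (nread != (size_t) size) {
      free (buf);
      return -1;
   }
   target (buf, nread);
   free (buf);
   return 0;
}
```

In a real build, the function passed as `target` would be the `LLVMFuzzerTestOneInput` defined in bson.fuzz.c, and `path` would point at a crash artifact from the fuzzer's output directory.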
19 changes: 0 additions & 19 deletions src/libbson/fuzz/fuzz_test_libbson.c

This file was deleted.

11 changes: 11 additions & 0 deletions src/libbson/fuzz/json.fuzz.c
@@ -0,0 +1,11 @@
#include <bson/bson.h>

#include <stdint.h>

int
LLVMFuzzerTestOneInput (const uint8_t *data, size_t len)
{
bson_t *b = bson_new_from_json (data, (ssize_t) len, NULL);
bson_destroy (b);
return 0;
}
54 changes: 54 additions & 0 deletions src/libbson/tools/bson2json.main.c
@@ -0,0 +1,54 @@
#include <bson/bson.h>

#include "./common.h"


int
main (int argc, char **argv)
{
if (argc != 1) {
fputs ("Usage:\n"
" Pipe a BSON document through standard input, and this program\n"
" will write JSON data to standard output.\n",
stderr);
return 1;
}

int retcode = 0;

read_result read = read_stream (stdin);
if (read.error) {
fprintf (stderr, "Failed to read from stdin: %s\n", strerror (read.error));
retcode = 2;
goto read_fail;
}

bson_t b;
if (!bson_init_static (&b, read.data, read.len)) {
fputs ("Failed to read BSON: Invalid header\n", stderr);
retcode = 3;
goto bson_init_fail;
}

size_t len;
char *json = bson_as_canonical_extended_json (&b, &len);
if (!json) {
fputs ("Failed to create JSON data\n", stderr);
retcode = 4;
goto json_fail;
}

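const char *jptr = json;
for (size_t remain = len; remain;) {
size_t nwritten = fwrite (jptr, 1, remain, stdout);
if (nwritten == 0) {
// No progress means a write error; stop instead of looping forever.
fputs ("Failed to write JSON to stdout\n", stderr);
retcode = 5;
break;
}
remain -= nwritten;
jptr += nwritten;
}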

json_fail:
bson_free (json);
bson_init_fail:
free (read.data);
read_fail:
return retcode;
}
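
The output loop in bson2json retries until every byte has been handed to `fwrite`. The same pattern can be factored into a helper that also reports a stalled write. This is a minimal sketch, assuming a hypothetical helper name `write_all` that does not exist in this PR:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: write all len bytes of buf to out, retrying after
 * short writes. Returns false if fwrite stops making progress. */
bool
write_all (FILE *out, const char *buf, size_t len)
{
   assert (out != NULL);
   while (len > 0) {
      const size_t n = fwrite (buf, 1, len, out);
      if (n == 0) {
         /* Zero bytes written with data remaining: treat it as an error. */
         return false;
      }
      buf += n;
      len -= n;
   }
   return true;
}
```

A caller would check the return value and set a non-zero exit code on failure, for example `if (!write_all (stdout, json, len)) { retcode = 5; }`, where the exit code 5 is an arbitrary choice for illustration.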
58 changes: 58 additions & 0 deletions src/libbson/tools/common.h
@@ -0,0 +1,58 @@
#ifndef BSON_TOOLS_COMMON_H_INCLUDED
#define BSON_TOOLS_COMMON_H_INCLUDED

#include <errno.h>   // ENOMEM
#include <stdbool.h> // true
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

enum { PRINT_TRACE = 0 };
#define TRACE(S, ...) \
if (PRINT_TRACE) { \
fprintf (stderr, S "\n", __VA_ARGS__); \
} else \
((void) (0))

typedef struct read_result {
uint8_t *data;
size_t len;
int error;
} read_result;

static inline read_result
read_stream (FILE *strm)
{
size_t buf_size = 0;
uint8_t *data = NULL;
size_t total_nread = 0;
while (true) {
// Calculate how much space is left in our buffer:
const size_t buf_remain = buf_size - total_nread;
if (buf_remain == 0) {
// Increase the buffer size:
buf_size += 1024;
TRACE ("Increase buffer size to %zu bytes", buf_size);
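uint8_t *const new_data = realloc (data, buf_size);
if (!new_data) {
fputs ("Failed to allocate a buffer for input\n", stderr);
// realloc leaves the old block untouched on failure, so free it here:
free (data);
return (read_result){.error = ENOMEM};
}
data = new_data;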
// Try again
continue;
}
// Set the output pointer to the beginning of the unread area:
uint8_t *const ptr = data + total_nread;
// Read some more
TRACE ("Try to read %zu bytes", buf_remain);
const size_t part_nread = fread (ptr, 1, buf_remain, strm);
TRACE ("Read %zu bytes", part_nread);
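if (part_nread == 0) {
if (ferror (strm)) {
// A read error, not EOF: release the buffer and report it.
free (data);
return (read_result){.error = EIO};
}
// EOF
break;
}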
total_nread += part_nread;
}
return (read_result){.data = data, .len = total_nread};
}

#endif // BSON_TOOLS_COMMON_H_INCLUDED
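
The grow-and-read loop in `read_stream` can be exercised on its own. The following is a minimal sketch rather than the PR's code: `slurp_result` and `slurp_stream` are hypothetical names, the 1024-byte growth step mirrors the header above, and `EIO` is an assumed choice of error code for a failed read.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical result type with the same shape as read_result. */
typedef struct slurp_result {
   uint8_t *data; /* heap buffer owned by the caller, or NULL on error */
   size_t len;    /* number of valid bytes in data */
   int error;     /* 0 on success, otherwise an errno-style code */
} slurp_result;

/* Read everything from strm into a heap buffer that grows in fixed steps. */
slurp_result
slurp_stream (FILE *strm)
{
   assert (strm != NULL);
   uint8_t *data = NULL;
   size_t cap = 0;
   size_t len = 0;
   for (;;) {
      if (len == cap) {
         /* Grow through a temporary pointer so the old block is not
          * leaked if realloc fails. */
         const size_t new_cap = cap + 1024;
         uint8_t *grown = realloc (data, new_cap);
         if (!grown) {
            free (data);
            return (slurp_result){.error = ENOMEM};
         }
         data = grown;
         cap = new_cap;
      }
      const size_t n = fread (data + len, 1, cap - len, strm);
      len += n;
      if (n == 0) {
         if (ferror (strm)) {
            /* A read error, not end of file. */
            free (data);
            return (slurp_result){.error = EIO};
         }
         break; /* end of file */
      }
   }
   return (slurp_result){.data = data, .len = len};
}
```

On success the caller owns `data` and releases it with `free`, matching how bson2json frees `read.data`.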