\def\hs{\hspace{0.4 cm}} \documentclass[aspectratio=169,usenames,dvipsnames]{beamer} \usetheme{lined} \usecolortheme{beaver} \setbeamertemplate{footline}[default] %\beamertemplatenavigationsymbolsempty \title{\huge Fuzzing Simplicity: A Story} \author{ ~\\ \small Andrew Poelstra\\~\\ Blockstream Engineering Call\\ June 25, 2021 } \date{ \includegraphics[scale=0.2]{blockstream-1.png} } \usepackage{graphicx} \setlength{\fboxsep}{1ex} \begin{document} \frame{ \maketitle } %%% 1. Options \frame { \frametitle{Preamble} Elements (from Bitcoin) has fuzzing infrastructure. \\~\\ \begin{itemize} \item Run without corpus in 2020 or so. Not since(?)\\~\\ \item No pre-built fuzzer input corpus. Didn't know how to build it. (doc/fuzzer.md)\\~\\ \item Elements needs old boost, etc., which means nix.\\~\\ \item But lcov has weird expectations about its environment. \end{itemize} } \frame { \frametitle{Preamble} Byron threw together a fuzz target (\#1332 on ElementsProject/elements) \\~\\ \begin{itemize} \item \texttt{PrecomputedTransactionData} requires calling an \texttt{Init} method or it doesn't work.\\~\\ \item And actually it still won't work; needs a complete list of input data or it will silently not initialize the data.\\~\\ \item Also we have a \texttt{tapEnv} object that needs a valid CMR (can't be obtained by the fuzzer). \end{itemize} } \frame { \frametitle{A First Start} Ok, so fix these things:\\~\\ \begin{itemize} \item Decode some input scriptpubkeys \alert{etc} from the fuzzer.\\~\\ \item For the input under test, decode a Simplicity program, \alert{tweak a key}, make a scriptpubkey, etc.\\~\\ \item Just disable the CMR check.\\~\\ \end{itemize} After six days, achieve 36\% coverage of the \texttt{simplicity/} directory. } \frame { \frametitle{Idea: Generate, Don't Decode} Coverage shows that we are only hitting a couple combinators and struggling to get nontrivial things that typecheck.\\~\\ \begin{itemize} \item It's hard for the fuzzer to generate valid diverse transactions from bytes.\\~\\ \item And even harder to generate Simplicity which is bit-aligned, and length prefixed with bit-aligned prefix-free encoding. \item Also we don't need to do taptweaks so we should not do taptweaks. \end{itemize} } \frame { \frametitle{Idea: Generate, Don't Decode} So instead take each byte from the fuzzer and switch on it, generating all valid possibilities. Easy to do in Rust.\\~\\ \begin{itemize} \item Do this for every Elements transaction data structure including rangeproofs and surjectionproofs (should upstream this..).\\~\\ \item Learn weird cool stuff about Elements consensus logic (did you know that coinbase inputs cannot carry asset issuances?)\\~\\ \item File a bug against rust-elements about parsing pegin witnesses. \end{itemize} } \frame { \frametitle{Idea: Generate, Don't Decode} Then just link the Rust code to the C++ fuzzing harness. Easy.\\~\\ \begin{itemize} \item Both Rust and C++ can handle byteslices. C cannot. You have to go through C.\\~\\ \item Both Rust and C++ have RAII. C does not. You have to manually get your destructors right.\\~\\ \end{itemize} } \frame { \frametitle{Idea: Generate, Don't Decode} Then just link the Rust code to the C++ fuzzing harness. Easy.\\~\\ \begin{itemize} \item rust-simplicity and Elements both have copies of libsimplicity. They are not the same. And even if they were, the linker will still barf on them because C has no namespaces and C has poisoned everything.\\\ \item So use a shared library for the Rust code. Good luck getting the \texttt{LD\_LIBRARY\_PATH} var in the fuzzer Python script correct inside a Nix environment. \end{itemize} } \frame { \frametitle{How to Generate Simplicity Programs} Fuzzer-guided recursive type generation is tricky. \begin{itemize} \item You need to cap your sizes somehow. If it's possible to make arbitrarily-large things the fuzzer will figure it out.\\~\\ \item Simplicity has exponentially-sized types and exponentially-sized values.\\~\\ \item No problem, we have static analysis. Except rust-simplicity actually evaluates entire types before the static bounds are applied (rust-simplicity \#221 and \#222) \end{itemize} } \frame { \frametitle{How to Generate Simplicity Programs} Simplicity types are weird. \begin{itemize} \item During type inference, types are not fully specified and do not have exact sizes.\\~\\ \item They can also be infinitely sized, and checking for this is expensive so we defer it.\\~\\ \item But finalization can't happen until your whole program is built. \end{itemize} } \frame { \frametitle{How to Generate Simplicity Programs} Simplicity types are weird. \begin{itemize} \item Programs must be 1->1 (take no input, take no output).\\~\\ \item ``Inputs'' are witnesses, ``outputs'' are aborts.\\~\\ \item To glue two programs together, easiest is to pair them\ldots but this will blow up your type sizes if you are not careful. \end{itemize} } \frame { \frametitle{Misc} \begin{itemize} \item Anyway I got up to 91.5\% once but then stalled out, still not hitting all jets, and with some seed inputs \emph{very} slow.\\~\\ \item At one point honggfuzz started crashing because of something to do with memcmp and I had to update it.\\~\\ \item Also my keyboard broke. \end{itemize} } %% end \frame { \frametitle{Future Work} \begin{itemize} \item More \alert{covenant Script fragments} (scheduled payouts, dividends, bonds) \item Reissuance covenants \item \alert{Improving Script} for efficiency/expressivity \item Porting this all to \alert{Simplicity}, zero-knowledge, crossing chains, $\cdots$ \end{itemize} } \end{document}