
Can you hear me well? Look, they've learned how to clap. I don't know if you can hear me, so... OK, great. So today I'd like to tell you a little bit about reproducible builds: what they are, where the idea comes from, what the history was, and why it is so important for security, which is what interests us most. There will be a bit of code and a bit of theory, so I hope you'll like it. My name is Mariusz Zaborski, as I was introduced. I work at Fudo Security. We used to be called Wheel Systems, but we changed the name; I didn't change jobs, the company changed around me. After hours I'm also a developer in the FreeBSD project, where I mainly work on sandboxing processes. We also run a BSD users' group after hours; if you are interested in BSD systems, I invite you to our meetings. There are more than two of us, so it's fun. So, what are reproducible builds? Before we get to the topic, I'd like to apologize to all the language purists: I didn't want to keep switching into English for "reproducible builds", so I decided to translate the term, and we'll be talking about reproducible or repeatable building. If you have better ideas for the name, I'll gladly hear them after the presentation; maybe we'll hold a competition for a better name. So, what is a reproducible build?
You have surely downloaded software from the Internet and installed it on your computers, or used a package manager such as apt-get on Debian. Here we have an example with Firefox. Say we have the Firefox source code and we have audited the whole thing; it turned out great, we want to use it. But how can we be sure that the source code we audited is really what was used to build the package we are installing? Many things can happen at build time: the developer could change the code and ship us a different version, someone could break into the build server and swap the package, or a network error could corrupt it so that we again receive a different package. That is why the idea of reproducible builds was born. We have some source code, we build it, and we get a binary file. I say building, not compiling, because I mean the whole build process: not just translating source code into an object file, but also packaging, distribution, and so on. For a build to be reproducible, the exact same source code must yield the exact same binary: the binary has to match bit for bit. Here we have the definition of reproducible builds from reproducible-builds.org, a site that brings together the various projects that are built reproducibly or would like to be. It says that for the same code and the same build environment, which I will talk about later, building must produce exactly the same binary copy, bit for bit.
The history of reproducible builds starts around the year 2000, so it's an old idea, when Mr. Evans sent a question to a Debian mailing list asking whether Debian's autobuilders do this: do they produce the same output every time? In 2000 the idea was criticized by the Debian crowd. People said it was unnecessary extra work, extra development effort that nobody needed, that it doesn't improve security, so let's just ignore it. Seven years later a mail appeared on a Debian list again, asking: maybe we would like to do this? It's a cool idea, reproducible builds, maybe we'd like to bring it into our development process. The list replied once more: no, we are not interested, it is not necessary, why would we do it, it is just completely unneeded extra workload. And so it went until about 2012, when two projects kicked off the whole reproducible-builds movement. The first was Bitcoin. They were very concerned about security: they didn't want users' wallets to be stolen, and Bitcoin is quite valuable these days, as far as I've heard. So they wanted users to be able to check that they had installed the right software. The second project, which you certainly know,
is Tor and the Tor Browser. The Tor Browser, which people use to watch cats on the Internet anonymously, also started being built reproducibly. Then, in 2013, Mr. Snowden came forward and told us about many interesting projects. Suddenly everyone opened their eyes and said: "Wow, we have to be safe, we have to improve our security, and we would like our software to be verifiable somehow." That same year, at the DebConf13 conference, Debian finally came around and decided it was time for reproducible builds, and work on them began. Unlike Tor or Bitcoin, this is much more challenging, because it is not just the Linux kernel to compile but the whole world around Debian: all the packages, estimated at around 50,000, written in many different languages, because Debian is widely used and supports many languages, and several compilers too; we can compile with GCC or with Clang. Taking all this into account, the challenge was considerable. Currently, for AMD64, 94% of all packages build reproducibly. You can't see it here, but when they started gathering statistics around 2016, it took them almost a year, until 2017, to reach that level of 94%, and they have maintained it since. Other projects followed suit, such as coreboot, FreeBSD, Signal, and Qubes OS. All these projects either
are already built reproducibly or are working on getting there. Now the question arises: why build reproducibly? Are there real problems that reproducible builds solve? This is my favorite example, my favorite CVE, which I show very often: an OpenSSH CVE from 2002, where we have one wrong comparison. On one side the channel ID, on the other the number of channels allocated in SSH. That one wrong comparison allowed privilege escalation to root. One wrong comparison. If we look a little closer, it turns out to be a single assembler instruction. One assembly instruction separates a safe SSH from a dangerous one. Looking closer still, it turns out to be one byte. One byte between a safe and a dangerous SSH. And in the end, it turns out to be one bit. One bit can turn our safe SSH into an unsafe one that can be easily exploited. A bit flip like that can happen at any stage of building or distribution. It is really scary that one bit can change so much.
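To make that concrete, here is a hedged sketch in C of that class of bug; the names and the bound are illustrative, loosely modeled on the 2002 channel-code off-by-one rather than copied from OpenSSH:

```c
#include <stddef.h>

#define CHANNELS_ALLOC 16

int *channels[CHANNELS_ALLOC];

int *
channel_lookup(int id)
{
	/*
	 * Vulnerable: '>' lets id == CHANNELS_ALLOC through, indexing one
	 * slot past the end of the array.  The safe comparison is '>='.
	 * At the machine level that is typically a single conditional
	 * jump (e.g. jg, opcode 0x7F, versus jge, opcode 0x7D, on x86),
	 * and those opcodes differ by exactly one bit.
	 */
	if (id < 0 || id > CHANNELS_ALLOC)
		return NULL;
	return channels[id];	/* out of bounds when id == CHANNELS_ALLOC */
}
```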
Okay, now someone in the audience might say: "I understand, but I wrote the code myself, I trust my software." Or: "Fine, I won't download binaries from the Internet, I'll build everything myself; it's hard, but now I'll be safe." It turns out things are not that rosy. In 1983, Ken Thompson gave his famous lecture "Reflections on Trusting Trust". It says that if there were malware in the compiler, it could propagate into every binary that compiler builds. And of course, if the compiler builds a compiler, it can detect that it is building a compiler and reproduce the malware there. This way, every time we build new software, our software carries the malware. Years later, a response to Ken Thompson's idea appeared: the technique now known as diverse double-compiling, described by David A. Wheeler. The idea is this: if we want to check that our clang has no malware, we take the clang source code, some clang compiler binary, and some GCC compiler. Now we compile the clang sources twice, once with that clang and once with GCC. We get two new clang
compilers. They are completely different at the binary level; there is no way the two come out identical. Then we take some code that builds reproducibly, for example our Firefox, and build it with each new clang. Then we compare whether the two Firefox binaries match bit for bit. If they don't, one of the compilers probably added something. It doesn't matter how GCC and Clang compiled the new clang, because those binaries will differ anyway; but every mechanism implemented in the clang sources should behave the same, regardless of which compiler produced it.
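A minimal sketch of that procedure, compressed into a driver program; this is not Wheeler's actual harness, and the file names, the miniature compiler mini_cc.c, and the flags are hypothetical stand-ins:

```c
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	/*
	 * Stage 1: build the compiler under test from the same sources
	 * with two unrelated compilers.  The two results will differ
	 * bit-wise, and that is expected.
	 */
	if (system("clang -O2 -o stage2_a mini_cc.c") != 0 ||
	    system("gcc   -O2 -o stage2_b mini_cc.c") != 0)
		return 1;

	/*
	 * Stage 2: if mini_cc itself generates code deterministically,
	 * everything the two stage-2 compilers produce must be
	 * bit-identical, no matter which bootstrap compiler made them.
	 */
	system("./stage2_a -o out_a target.c");
	system("./stage2_b -o out_b target.c");

	/* cmp(1) exits non-zero when the files differ. */
	if (system("cmp -s out_a out_b") != 0) {
		puts("MISMATCH: a bootstrap compiler likely injected something");
		return 1;
	}
	puts("OK: outputs are bit-identical");
	return 0;
}
```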
Of course, you might say this is only theory: researchers wrote a couple of papers, it's hard, maybe someone will try it someday. Unfortunately, it did not stay theory. In 2015 there was a case called XcodeGhost. It happened in China, where Xcode was distributed with malware, and over 4,000 applications were infected. Xcode is Apple's development environment, and every time an application was compiled with the trojaned copy, malware was added to it. Because Internet freedom in China is a touchy subject, instead of downloading Xcode from official sources, people preferred to get it from their own mirrors and torrents, and one of those copies carried a surprise: 4,000 infected applications. And it wasn't a one-off. Documents pulled out by Mr. Snowden also mention an NSA program that discussed how such compilers can be infected, how malware can be planted in them. So Ken Thompson's idea turns out to be entirely practical.
So how would building our packages safely look? It could look like this: we have several machines in a cluster and we build our software on them, for example Debian's autobuilders. Two autobuilders build the same Firefox, and then we compare whether both produced exactly the same binary. If not, it may mean one of the compilers is compromised, and we can switch compilers or apply the countering-trusting-trust technique. If we keep the two environments separate from each other, we can also detect break-ins: that someone got into one environment and tried to replace our binaries. If the checksums match, if the binaries match bit for bit, we publish them for use.
We've talked about security, but Xen has found another interesting use for reproducible builds: live patching. Because Xen is built reproducibly, if we change a fragment of its code, for example to fix a bug like the SSH one where a single comparison needs changing, then rebuilding it and comparing against the previously built Xen yields a difference of just a few assembler instructions, a few bytes. Xen can then live-patch the hypervisor without rebooting the whole machine. Reproducible builds made this possible, and Xen took it very far: it can even translate internal hypervisor structures while live patching. If the length of the executable instructions has changed, it places the new code fragment somewhere else and, instead of executing the broken function, jumps to that place and runs the new, patched code.
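Xen's real implementation is of course far more involved, but the jump-redirect idea itself fits in a toy userspace program. A sketch assuming x86-64 Linux, with the hypervisor machinery (CPU quiescing, patch metadata derived from the reproducible build) left out:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

__attribute__((noinline)) int broken(void)  { return 0; }  /* the buggy code */
__attribute__((noinline)) int patched(void) { return 1; }  /* the fixed copy */

/* Overwrite the first 5 bytes of 'from' with "jmp rel32" to 'to'. */
static void
redirect(void *from, void *to)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	uintptr_t page = (uintptr_t)from & ~((uintptr_t)pagesz - 1);
	unsigned char *p = from;
	int32_t rel;

	/* Make the code writable; two pages in case the 5 bytes straddle
	 * a boundary.  Error handling elided for brevity. */
	mprotect((void *)page, 2 * pagesz, PROT_READ | PROT_WRITE | PROT_EXEC);

	/* E9 <rel32>: the offset is relative to the end of the instruction. */
	rel = (int32_t)((uintptr_t)to - ((uintptr_t)from + 5));
	p[0] = 0xE9;
	memcpy(p + 1, &rel, sizeof(rel));
}

int
main(void)
{
	int (*volatile fn)(void) = broken;	/* defeat constant folding */

	printf("before: %d\n", fn());		/* 0 */
	redirect((void *)broken, (void *)patched);
	printf("after:  %d\n", fn());		/* 1: execution jumps to patched() */
	return 0;
}
```

The reproducible build is what makes this safe: it guarantees that everything outside the patched function stayed bit-identical, so only this small redirect needs to be applied.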
But why all the fuss? Building is very, very simple, right? There's no magic: we run make, everything builds, we get binary files, everything works. Unfortunately, no. Building is very complicated, and a lot of variables feed into it. A great example is TrueCrypt. TrueCrypt was a project that published binaries and source code. Who here used TrueCrypt? And who built TrueCrypt? One person. Practically nobody built TrueCrypt; the source code was there, but nobody built it. At some point the question arose: is the published code actually what the TrueCrypt binaries were built from? TrueCrypt shared no information about how its build process worked, so some researchers decided to check: take the TrueCrypt sources, try to build them, and verify that the binaries really come from those sources. In their article they describe how they chose the compiler, trying different compilers and different compiler versions, because the generated code can change with each one, and how they had to test the various files, libraries, and headers used to build TrueCrypt.
A huge amount of work went into recreating the environment in which TrueCrypt was built, and in the end it turned out the files were not identical. So they went byte by byte, verifying what the differences meant and where they came from. If you don't like drama: it turned out that TrueCrypt really was built from those sources, so there was no plot twist. But the amount of work needed to recreate all this is terrifying. Many factors influence reproducibility. External libraries, for example, especially if we link statically: then the library itself must be reproducible too. Metadata of TAR files: timestamps and all the other information a TAR archive carries. If they are not the same, our results won't reproduce. If debug paths are baked into our software, it is also not reproducible: I build in my directory, someone else builds in theirs, and we automatically get differences that we then have to interpret ourselves. Optimization: which optimization flags were used for this software? One build may turn divisions into shifts, another may use extra processor instructions; all of this has an impact. Signatures and random values: we roll the dice, so to speak, and write the result into the output. If there are random values in our software, they prevent the project from building reproducibly.
And there are variables from the environment we built on: kernel version, user name, hostname. If such information lands in the output files, it automatically disqualifies us from reproducibility. Locales: if our machines have different locales set and we do not document it, sorting can work differently. At the top here the locale is set to UTF-8, and sorting some values orders them alphabetically; with the locale set to C, which is probably the standard in the Unix world, the result is ordered by ASCII codes. Two different results in the two cases, and our binary files differ again.
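The collation point is easy to demonstrate in a few lines of C; a small sketch, assuming a UTF-8 locale such as en_US.UTF-8 is installed on the machine (the locale name is an assumption, not something from the slide):

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* strcoll() honors LC_COLLATE; strcmp() would always compare raw bytes. */
static int
coll(const void *a, const void *b)
{
	return strcoll(*(const char *const *)a, *(const char *const *)b);
}

int
main(void)
{
	const char *words[] = { "Zebra", "apple", "Banana" };
	size_t n = sizeof(words) / sizeof(words[0]);

	setlocale(LC_COLLATE, "C");
	qsort(words, n, sizeof(words[0]), coll);
	printf("C:     %s %s %s\n", words[0], words[1], words[2]);
	/* byte order: Banana Zebra apple (uppercase sorts before lowercase) */

	setlocale(LC_COLLATE, "en_US.UTF-8");
	qsort(words, n, sizeof(words[0]), coll);
	printf("UTF-8: %s %s %s\n", words[0], words[1], words[2]);
	/* dictionary order: apple Banana Zebra */
	return 0;
}
```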
The same thing happens when we pack a directory. It turns out that TAR takes files in the order they sit on disk, not sorted by name or anything else, so the order of files inside the TAR archive can vary. Here we have two directories containing exactly the same files, and their order is essentially random. We need to add the right flags, such as GNU tar's --sort=name, so that entries are sorted alphabetically. And this depends on the TAR implementation: BSD tar, for example, doesn't have that option, and you have to do the sorting magic yourself.
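The underlying cause is visible from C as well: reading a directory returns entries in filesystem order. A short sketch of the deterministic fix at the API level, using POSIX scandir() with alphasort(); this illustrates the ordering problem generally, not tar's internals:

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	struct dirent **list;
	int i, n;

	/* alphasort pins the order down; plain readdir() would hand the
	 * entries back in whatever order the filesystem stores them. */
	n = scandir(".", &list, NULL, alphasort);
	if (n < 0) {
		perror("scandir");
		return 1;
	}
	for (i = 0; i < n; i++) {
		puts(list[i]->d_name);
		free(list[i]);
	}
	free(list);
	return 0;
}
```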
Here are a few examples, this one from FreeBSD, where the loader contained information about which user built it, on what host, and when. Quite useless information, and it actively kills reproducibility: built by someone else, the data would be completely different. Or uname, which you probably know from Unix systems: it shows the current kernel and environment, and it carries the build number, when it was built, who built it, the hostname, and the path to the kernel configuration file. Completely useless information, different every time, disqualifying us from reproducibility. And here we have an example from coreboot, which also worked on becoming reproducible and eventually succeeded: a difference in a binary file, random bytes that came out different on every build.
Every time coreboot was built, those values turned out different. What happened? There was one function generating this file that used a poorly initialized structure, and the padding inside that structure was written out to disk as well. As you can see, the structure was defined on the stack, so random bytes from the stack were simply written into the output.
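A sketch of that class of bug (the struct layout and field names here are illustrative, not coreboot's actual code):

```c
#include <stdint.h>
#include <stdio.h>

struct header {
	uint8_t  version;	/* 3 padding bytes follow, never assigned */
	uint32_t length;
	uint16_t flags;		/* 2 more padding bytes close the struct */
};

int
main(void)
{
	struct header h;	/* uninitialized stack memory */

	h.version = 1;
	h.length  = 4096;
	h.flags   = 0;

	/*
	 * fwrite() copies the padding bytes too, so two otherwise identical
	 * builds write different files.  The fix: memset(&h, 0, sizeof(h))
	 * before filling the fields, or write each field individually.
	 */
	FILE *f = fopen("header.bin", "wb");
	if (f == NULL)
		return 1;
	fwrite(&h, sizeof(h), 1, f);
	fclose(f);
	return 0;
}
```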
Let's go through the components of reproducible builds. One of them is a deterministic build system. We also need to be able to provide a repeatable environment to all users of our software, and to keep rebuilding and testing the results. The last three points are about controlling the process all the time: whenever we make changes, we want to know whether they still build reproducibly or whether they will break the whole ecosystem. A deterministic build system means taking the source files, the same input, and producing exactly the same output file, using as little information from the environment as possible. Sometimes that can't be avoided; for some reason a piece of environment information has to be there. Then it should be documented: we simply state that x, y, z must be filled in. Still, we try to take as little as possible from the build environment. A repeatable environment: we need the same compiler version, the same libraries. If anything changes, if we use different libraries, the code may no longer reproduce, because the compiler can expand things differently. We either have to document all of this or deliver it to users. If we depend on particular build paths, debug paths for example, we need to document that too. And distribution of the environment: we can document everything, which compiler version was used, which libraries, and so on, or we can hand users a ready build environment, say virtual machines or containers. Inside those virtual machines, though, something may alter the build: a newer library version may sneak in, and so on. We can also treat containers as documentation: we know what is in the container,
and we can try to unpack it. If something doesn't work, we can go back to the container and see what changed, what's missing, whether we're using the same library versions. Out of all this work on reproducible builds a tool called diffoscope was created. Here we have a fragment of C code: of course, the time gets replaced by the current time and the date by the current date on every build. If we build it twice and run diffoscope on the two binaries, diffoscope reports what changed and in exactly which section. A very useful tool that developers use to analyze the differences between binaries and to know where to look. If someone prefers HTML, diffoscope can generate such reports as well.
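The fragment on the slide was presumably along these lines (a reconstruction, not the original): __DATE__ and __TIME__ expand at compile time, so every build embeds a different string and the binaries can never match.

```c
#include <stdio.h>

int
main(void)
{
	/* Each compilation bakes in a fresh date/time string. */
	printf("built on %s at %s\n", __DATE__, __TIME__);
	return 0;
}
```

Compile it twice, run diffoscope on the two executables, and it points straight at the section holding the timestamp. The ecosystem's usual cure is the SOURCE_DATE_EPOCH environment variable, which recent compilers substitute for the current time.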
There is an old Chinese proverb: "Where trust begins, security ends." If we have to trust what we download from the Internet, if we have to trust some maintainers, then we are not really safe; there is always exposure, something can go wrong along the way. I like the philosophy of Qubes OS, which says they treat their environment as if it were under constant attack, as if everything they run had already been compromised or taken over, so they constantly work on verifying things. Thanks to reproducible builds we no longer have to trust: we can simply verify whether what we received is what we wanted to get. Thank you very much for your attention. If there are any questions, I will gladly answer.
Very nice presentation, but now a question. We can recompile a compiler to rule out malware there, we can compare binaries, so we eliminate one more factor. But it's hard for me to imagine analyzing the source code itself: we either have to trust it blindly or spend a lot of time analyzing it ourselves. How do you approach that?

When it comes to source code, the fact that the code is open and can be analyzed gives us the ability to verify it at all. Before reproducible builds we had no way to verify the build process; and the fact that the source is open already puts a certain check on the developers, who will be more careful, because someone may eventually notice that something is wrong. It's hard to imagine everyone suddenly building reproducibly and checking that software is built the way it claims. But if one person starts doing it and finds discrepancies, they will tell everyone else that something is wrong. The whole point of reproducible builds is that we have that possibility: we don't have to trust. They can tell us everything is fine, but we can verify it. It's similar with source code. It's hard to imagine that we'll all sit down and audit Firefox. As I mentioned, TrueCrypt is a very interesting
project when it comes to security research, because a community gathered, raised money, and decided: we will now audit this TrueCrypt. And thanks to initiatives like that, we can have some trust in the software we use.

Thank you very much. Are there any more questions?

I have a question. We are used to a semi-perfect world in which the software we use gets built somehow and reaches us through a distribution chain. We're used to getting a binary plus some MD5, a hash of that binary, or a binary signed with some key, and we verify that. What do we need in order to verify that something was built reproducibly?

We need a documented environment, so we need to know the specific compiler that was used and the specific libraries, or the software vendor simply has to provide us with virtual machines. I believe Qubes OS does something like this: they provide the builders from which they built their images. Signal, for example, publishes the list of tools it used to build. Having such an open environment, we can verify whether the code was really built from these sources.

Wouldn't it be better to standardize reproducible builds around virtual machines or a Docker image?

Reproducible building is a process. We need the environment, but we also need source code that builds reproducibly in the first place. If there is a date or time embedded in the source code, that alone takes the software out of reproducibility.
On the other hand, if we deliver such a virtual machine, the paths will be the same, so in that respect it is better. But we don't have to fear the paths; we just need to document them. If we are aware of such things, let's simply document them. For example, the Debian project... no, sorry, the coreboot project has a policy that if a change breaks the reproducible build, it is withdrawn immediately, without discussion, and the author reworks the patch until it builds reproducibly. So the world is heading in this direction: software should be reproducible, and there should be policies that push developers to keep it reproducible. But the source code is only the first step. The second step is the environment, which is also a very complicated matter, because everything we build with must be documented, and with something like Firefox or Chrome OS there are a lot of those factors.

A question about the 6% of Debian that is not reproducible: any interesting cases, anything controversial?

Nothing controversial. Some of those packages simply don't build at all: despite being in the repositories, they fail to build, so they can't build reproducibly either. Nothing off the top of my head that would cause a scandal.

Hi, do you know of any commercial takes on this? For example, Microsoft, Cisco, Juniper and the like doing it internally for their own projects?

That's a very interesting
question. Many firms do not share their source code, so the first step would be getting them to share it at all. I am not aware of any commercial case where someone boasts about building reproducibly. The interesting question, though, is whether a company must make its source code available at all to be verified and reproduced. In my opinion it can be done another way: by delivering the environment. Sandbox the process and publish that: define a clean IPC boundary between the sandbox and the process, and define what the process may do inside the sandbox with seccomp or something similar. That way we can have a process that is not auditable, or auditable only as a binary, only in assembler, and that implements our technological advantage, some incredible malware-detection algorithm or whatever it is. In a way, companies could provide reproducibility without giving away the intellectual property, simply by documenting very precisely, through the sandbox, what the process is allowed to do. But to answer your question directly: I don't think any company does this right now.

I wasn't asking about how users should verify it, but about audit requirements. For example, some time ago something was changed inside Juniper and a backdoor was planted. Do you think companies do this internally, in this way?

No, unfortunately, I don't think so.
OK, that's all? OK, so Mariusz, thank you very much for the presentation. Thank you very much.