← All talks

Jak źle użyć kryptografii

BSides Warsaw · 20171:00:0811K viewsPublished 2017-10Watch on YouTube ↗
Speakers
Tags
Mentioned in this talk
About this talk
Autor: Jarosław Jedynak
Show transcript [en]

Test? Ok, so everything works I guess. So, I will have a presentation about cryptography or how to do it and who I am and what I do here. I work in CERT, like Maciek from a while ago, you've already met him. I prefer to play CTFs in my P4 team. Well, it's a bit of mine because I co-founded it. I like cryptography and algorithmics, hence the topic of this presentation. I like cats, so I'll have a lot of pictures of cats, but only on the internet and photos. I did some cryptography, I even wrote some decryptors, they're on the CERT website, so maybe I know something, maybe not. If someone wants to find me on the internet, there are links below, or you can just talk after the

presentation. I took the topic from the fact that I saw on the website of the B-Sides, and the first two presentations that were sent were about cryptography. I thought to myself, "OK, so B-Sides are now a presentation about cryptography, they will also say something about cryptography." And that's it. The second thing is: what doesn't make a stackoverflow? Because it happened that half of the things I wanted to say that are bad with cryptography turned out to be because people were copying the code from the internet. And the third thing is that I laugh at innocent people. But I don't laugh at people, I laugh at their code, so it's not personal. Everyone made mistakes, so

I hope no one will be offended by this presentation. No one will see this presentation from the internet, so it's an agenda. First, there will be block cipher, I hope not everyone will fall asleep. Then there will be a few minutes of theory, then some modes, weird things. Then there will be the shortcut functions, what to do with them and what not to do with them. We will finish with our own cryptography, whether to do it or not, and maybe break various algorithms. I have a lot of slides so I'll be fast, so let's start with block symbols. What are block symbols? Block symbols are... yes, symbols. I hope everyone knows what they are. They

are something we don't want someone else to be able to decode. Let's say our enemy. Symbols are divided into symmetric and asymmetric. Asymmetric symbols are a bit of black magic, they are difficult and usually there are some mathematical operations and papers are written about them. They are difficult, but they also have good libraries that don't allow programmers to make mistakes. That's why they rarely happen in real apps, with symmetric cryptography. But it's the newest field of cryptography, it was created recently. That's why you can do different magical things that you couldn't do before. And we won't talk about them here. What's more interesting for us is the symmetric encryption, which is when we encrypt with the same key we decrypted.

And we also have the strumming encryption, which is basically someone takes a key and generates a lot of random bytes from it, and then plaintext, which is the data they want to encrypt, xor with those bytes and we get some ciphertext. And now, without this key, it's impossible to decrypt ciphertext, because someone can't guess which bytes were generated randomly. In theory, it doesn't have to be xor, but in practice, almost all string codes work like this, so it's a good simplification. On the right side is a picture, if someone likes pictures. I don't know if it's because I'm bad at pictures, but I like code, so on the left side is a code, a visual

one, how it might look. It's a bit different with block codes, because we don't code byte by byte or bit by bit. We just divide data into blocks and encrypt them with some magic. There is a function called AS which encrypts data with exactly 16 bytes. So let's give AS 16 bytes, it does some magic and it gets 16 bytes encrypted with some key. To encrypt something like that, you need to first divide the data into blocks of 16 bytes and then encrypt it. You can do it like in this picture, which is to encrypt each block separately and paste it together. It turns out that this is not a very good idea, but we'll

talk about it later. Examples are 3DS, AS or Blowfish. AS is the most popular one at the moment. And there is a problem that if we have 4 bytes that we want to encrypt, it's hard to do it with ASM because ASM wants 16 bytes. It doesn't want 4, 5, 20, it just wants 16. So we need to add these bytes and one of these solutions is PKCS7. So if we lack one byte, we add one byte of value 1. If we lack two bytes, we add two bytes of value 2. If we lack three bytes, we add three bytes of value 3, etc. to 16, to 15. Because if we lack 16 bytes, which

means we lack nothing, then we add 16 bytes of value 16. So it's unambiguous. Some libraries do it well, for example php-encrypt does it wrong, it adds zero, so it's impossible to remove it. But php, I'm not a fan of this language, I agree with people who program in php, there will be more about it. And now the biggest problem is that when we have these 16 byte blocks and we have some message that divides into 16, because we made a padding, What can we do next? One solution is obvious, like the previous one. We just encrypt everything separately. It's called ECB in science. Because in cryptography everything has to have a cryptonym and it's

called complication. But the point is that we encrypt everything separately. So the first block of text, the first 16 bytes, we encrypt and that's the first 16 bytes in the result. Then we encrypt the second 16 bytes and that's the second 16 bytes in the result. And so on, to the end. It sounds like a good idea at the beginning, but it turns out that it has a lot of problems in practice and it shouldn't be used at all. Unfortunately, it is still a default in most libraries, because it doesn't require any additional data. So it turned out that it is a default. And in Python, for example, which I respect, Python is a good

language, this crypto is still a default and many people use it, maybe not many, but it still happens. Let's imagine we have a page with cookies. I like this example, there will be a lot of things in this format. We have a page with cookies. Instead of keeping the information in the database, I'll save it on the database and keep the information from the user. But the user won't change it. After logging in, the app gives the user a cookie that looks like this. that the name of the user is in the box and the question is whether the user is an admin or not. Maybe it looks good at first glance, the box is encrypted, no one will modify it, that's the assumption, but it

doesn't work like that. Why doesn't it work? For example, if we encrypt the username and password of our user, it's my nickname, I don't know if I'm proud of it, but yes, so if we encrypt the username and password of our user, something like this comes out. We encrypt the data on the left, blue, I know you can't see it from the back or the middle, and we get the orange data, or some encrypted bytes that the attacker can't encrypt. But if the app is open source, the attacker doesn't have to decode them, because he knows what's in the data. So the only thing he can't do is generate his own cookie, which says "name:

msm has admin: true". But let's say we have a hacker user, and he first generates a user named "hacker", and it works, we get a cookie like this one, which you can see. And then I'll set up a hacker user, who calls himself "hacker true x". You can't see it from the back. And it looks like this at the bottom. I know, you can't see it, but the point is that the attacker can divide the ciphertext into blocks and he knows what's in each encrypted block. And he doesn't stop anything to take and paste pieces of these blocks. These are pasted pieces of cookies. And so the first piece is made of the first cookie, the second one of the second cookie, the third one of the second one,

the fourth one of the first one. And if you glue it together, it turns out that We have a cookie, it's a correct JSON on the left. So the attacker will paste these small pieces, not knowing what's there, meaning they can't decode it. And he sends it to the server, the server decodes it and it turns out blue. So a correct JSON decodes nicely. Oh no, we've been hacked. The hacker user got a cookie that says he's an admin. And maybe now, I don't know, he'll delete things from our page. And of course, this is a completely fictional case, no one does it. At least I thought so, but I checked if the internet was

right with me. It's a different thing that can be done wrong in the ECB mode, so the last byte can be brute-forced, but it's not practical. I checked if the internet was right with me, and it turned out that it wasn't, because the first result of my question about encryption in PHP was: how best to encrypt cookies? And the answer that Google says is: "No real reason not to go with AS" So there is no reason not to code with AS. First of all, use CBC, which doesn't help either, but we'll talk about it later. And even if you click on this link, you'll see a cryptography of some kind. A cryptography of some kind,

which is located under this link. And what does it do? First of all, it encrypts the cookies, which is stupid by itself. I don't recommend it to the worst enemy to write a code on the website. But it's okay, we've already decided that the question was how to write a code, so let's say. But there are more problems with it, for example, there is an "iv" used in it. And if someone noticed, they noticed that I haven't said the word "iv" so far. Because in the ECB mode there is no such thing as "iv". "Iv" is something we will talk about later, which only appears in the ECB/EC mode. Someone wrote a code that encrypts

ECB mode and adds an IV parameter. This is an optional parameter in PHP, and PHP is too nice to be used, because someone gives it unnecessary parameters, so it encrypts it. And of course, every encryption gives the same result, so the IV is completely ignored, but as you can see, there is no warning, nothing. PHP is happy. Yes, I don't like this language, but it's not the only problem with this code. We can go further, for example, it's not AS256 at all. It sounds a bit like AS256, because AS is Rindel, and 256 is 256, but in reality it's a encryption mode that is used almost exclusively in mCrypt and PHP, which is very hipster, which is not very well-researched, and which

a lot of people use because they think it's AS256. And it's a Rindel variant, which has 32-byte blocks, But almost nobody has ever studied it and used it. Another mistake, but it's not the end. First of all, why is the key called "solt"? It worries me a bit, because it means that nobody knows what it writes, what parameters it calls. and the second one is "trim" which means that if the data ends up on a walk, for example encrypted, then when we encrypt the data and decrypt it, the walk will be cut. This is not a big problem usually with cookies, but it means that someone probably didn't test the code properly and maybe there are more problems. So they are getting used to this code, but

no one uses it, right? Unfortunately not, because they searched it out of curiosity on GitHub and it turns out that there are almost 6000 results. All of them are using the 256-liter readl, all of them have some sort in the parameters, and all of them are almost transmitting IV to a function that doesn't need IV. I'm glad that people read Stack Overflow, but they could wonder if they are doing a good thing. It worries me a bit. The first example I'll give you is a website that does exactly what we showed before. It encrypts the user and sends information about him to the user. And it does it with the function that is in Stack Overflow.

Or another example, a function that is in a website from e-commerce. There are baskets, people click products, buy products. E-commerce is open source. but I don't think anyone uses it anymore, probably because of some code, because they can't find an operating instance. And the cool thing is that we have a parameter title, which we can set to any value we want, so we can add a transaction title. And the fact that we can add a transaction title means that we can encrypt any data that the user would like to encrypt. What is it useful for? As you can see here, I mean, you probably can't see it because it's at the bottom of the slide, The data sent to the user is serialized, so it's

thrown into a cookie. It's on the top, so you can see it. And the data looks like this. So we have a price, a quantity, a title. And we can make a test purchase. Or put it locally, of course. And make a test purchase. And the cookie will look like this. So we have some code called code123, we have a test title, a test quantity, a test price. The price here is 100. And 100 PLN is a bit too much, or dollars, or anything. So we would like to pay a bit less. How can we do that? I mean, the attacker would like to pay less. So we can, for example, think about it a

bit and come up with something called "Tag". I mean, give it a title. And then PHP will do it without any problem, it will look nice. everything will be well serialized, but in serialized data we will have something like this, I highlighted interesting elements in red, so we will have some blocks on the left side, and that's cool, PHP understands everything, but if we paste this first red block to the second, so the first encrypted data we paste to the second place, it turns out that, well, you can see what's going on, so PHP He will treat the data we pasted, as a fragment of the data we've already created, and treat it as a price. This way the attacker could change

the price in a basket from 100 PLN to 1 PLN, and he would give such a title and do some magic later. It's simple. If someone knows how the code looks, they can test it locally, without any problems, and know exactly what's going on. It took me a while, really. It's easy when you can see everything locally. If the code wasn't open source, it would be much harder. It doesn't mean it was safe. My favorite example is AbanteCard, which is a big PHP framework for e-commerce. Of course they are safe, but I also found them while searching for the ECB code. I also encrypted with Rindel256, because they probably copied from the internet. At least they had so

much credibility that they removed the IV that is not used. So, cool. But there are a lot of problems here. For example, if someone forgets to set the encryption key, then the entire encryption function, depending on the security of the application, will simply go out and ignore the entire encryption. Or if the server doesn't have an mCrypt installed, then some magical encryption will be used, which is equivalent to Cesar's encryption. And this is something that a medium-sized high school student would do on a piece of paper in 5 minutes. And that depends on the security of the store, so I wouldn't want something like that. Or I think the key is hashed using SHA-256, which

is generally not a good idea, because either I want to use a safe Password key derivation function. Or we want to have a key that is safe right away, so it has 32 bytes and is unbrutal. But the best thing is that the ECB algorithm was used, the Rindel algorithm in ECB mode, which allows attacks, as we have already seen. And finally, such a curiosity that it does some strange things in the form of replacing changing characters, which doesn't make sense, because it's not a function task, but I just wanted to complain. So this whole function, all the things I've attached to it, can be thrown away, because I haven't changed the security at all, and this whole function is reduced to something like this. And this is

basically a two times better version, which takes ten times less lines, and you can even write it down from memory. And of course, it's all used in a safe way, so we encrypt the user's data and send it. In this case, for example, we give the first user's name to the cookie, which is the name in Polish. We give the user's ID, which is interesting, because if we change the client's ID to, say, 1, we will probably become some admin or something interesting. And at the end there is the name of the script, i.e. the Pipname parameter, i.e. the place from which it was called. This is exactly unused to nothing, from what I've seen.

And it's cool that we showed you a few sites earlier that you can just break it. Because we're using paramscript firstname when we're doing the registration. There are no limits on what you can give. You can't give an apostrophe, because SQL injection or something, which is stupid, because some people have an apostrophe in their name. But the things we need for the attack are no longer blocked. And this is in AbanteSRC, which is or at least it was, because it's a 2016 code. At this point it's changed a bit, but we'll get back to that. For now, the conclusions from ECB part, because we're already leaving it. The conclusion is not to use ECB mode, but it's a very boring conclusion, because everyone has been

saying it for a long time. I think most people have heard it at least 10 times in their lives. So it's not very constructive. Another conclusion is not to use PHP, but it's optional, because I understand that if someone works in PHP, it's stupid to change language all the time. But I didn't want to talk about it. I wanted to show that even using good primitives can make big mistakes in cryptography, or the application can be dangerous. And of course, the second important thing is that encryption is not a superstition, it's a nice mantra. So we don't send encrypted cookies to the user, at least not to prevent them from being modified. As you can

see, the user can modify the cookies with it. That's it about the CB mode. There are also better modes. For example, the CBC mode, which is another cryptic name. It is based on the fact that the next blocks are xored with the previous ones. Nothing complicated, apparently. But it protects against all the attacks that have been shown so far. So you can't just stick another block to the previous one. Or you can't just change something. Or change order. Or you can't know what's encrypted by using anything. But there are other cool attacks. Here's a code if someone prefers code. I prefer it, so the next ciphertext blocks are xored plaintext blocks encrypted with the previous block. And here we have

an IV that wasn't in the ECB. And AbanteCard updated its code, it no longer uses Rindel256, nor Crypto, but uses OpenSSL. I'll check if OpenSSL exists, if not, I'll encrypt it with Cesar, so it's, I don't know, I'd rather the program not work when it's completely unsafe than to encrypt with Cesar, but who cares. And if there's OpenSSL, it's nice because it encrypts with OpenSSL, so it's safe. It sounds safe. In AS mode 256CBC, so CBC in Rindel 128, key size 256 bytes. Is it safe now? As I said, it's not safe because the assumption hasn't changed. The cookies are encrypted. But the only thing we can do is throw everything away and leave the two lines because they only make sense in this code. And

what can we do with something like that? If we look at this graph, this drawing, we can see that there is an IV, which is the only element which doesn't create any plaintext, but something that isn't encrypted. And we have direct control over it, because we're giving it to the server. So we can... the attacker, the person who has the cake, the person who has the ciphertext, can change the IV, and what could happen then? For example, if we have a message like this one, we have some IV, we have some data encrypted, and what is encrypted is this one at the bottom. So we have, as you can see, MSM equals two. MSM and admin equals zero. I purposefully cheated a bit to fit in one block.

And if we encrypt it, it's something at the top, and something like this comes out at the bottom. The key is probably 16 characters A, if I remember correctly. It's cool, but what's the result? If I change a sign in IV, the corresponding sign in the encrypted data changes. Exactly the same. So here, for example, they changed the first sign in IV and the first sign in the encrypted data changed. So when the attacker thinks, maybe 15 seconds, he will think that he can change the 12th or something sign and then change the 12th sign in the encrypted data. If it knows how to change this sign, it will change admin = 0 to admin = 1. And again, the same attack. The attacker changes the

code that is encrypted and gets the admin's right. Or some other one. Or gets what it wants from the website. And we go back to AbanteCard, which encrypts the cookies in CBC mode. And we have something like this. What was on the slide before. The username is encrypted, the client's ID. and two scripts. Let's assume we have a user named msm123, with ID 123, which is a default name. What happens then? The data is serialized, it's on top. As I said, we can only change the IV. This is a bit of a problem, because we can only change the first 16 bytes of this IV. The IV has 16 bytes, so we can only change the first 16 bytes of the encrypted

data. And this is a bit small, after all. We can change the name of the key first, and it won't help, there will be a mistake in the program. We can change the length of the key, which will probably cause something to fail. Or the size of the table, and everything will fail. But I think we can name the user as we like, so maybe we can come up with something. And so, going through the thinking process, because it wasn't that fast to come up with, But what if we named our user in this way? What could happen then? Nothing will happen. We will have a user whose name is like this. It's not a

convenient name, but who likes it? But now we can change the first byte in such a way that interesting things happen. Here are the fragments colored, I don't know if you can see it from afar. There's even a slide, so we can change the length of the first name to something longer. And now the first name suddenly changes the key to something like this, and the value becomes n. And php completely changes the interpretation of all the data we've encrypted. So by changing just one byte, we change all the keys and all the values, but somehow it gets properly deserialized and we get what we wanted, i.e. the user ID 1. And now we're a hacker, admin, whatever we want, on the server, in many stores, because it's

in a few places. In fact, ATT&CK has one more step that I haven't told you about, I don't want to say it, because someone else will do it, and it's going to be me. So instead of asking where is it used in this AbanteCard, for example, we have a link to when someone forgets a password. These are encrypted data, but someone forgot that they can change it, because we can change the IV. Or when someone sends an order, a request for a purchase, all the data is encrypted with the CBC on the client's side, so the client can influence it. So it's not a good security. The main problem is the same as the previous one. Someone used encryption

because he assumed that the user gets some random encrypted garbage. So he won't do anything with it. Because they are encrypted, so you can't influence them. This is a bad assumption, as you can see. I hope. So I don't recommend doing it if you want your pages to work. There is only one exception. Sometimes it is. Sometimes encryption is a verification, but only when we don't use ECB, CBC, CTR or any other type of encryption. We use some kind of a hipster type of encryption, for example GCM or OCB, CCM. They are very scientific nowadays. The best type of them is OCB, but it has a big problem that it is patented. There are a lot of problems to

use it. And specifically, the creator almost denied all rights, because at first he wanted money for using non-commercial products. Now he gives it almost to everyone for free, but still he is rebelling against it. So people made some other modes that are worse. and I'm trying to use them, GCM is probably the most popular one now and it's objectively a better solution than encryption and signature because we gain the speed of encryption and security at the same time, so no one can modify the data. But as I said, there's a problem with patents, either there are patents or there are worse algorithms. So, it's something for something. Meanwhile, as I'm standing with time, and I'm

even standing well with time, to the second topic, which is signing. If we want to sign some data, we should use a signature, which is a function that allows us to sign some data and no one can modify it. If they do, we can check it and prove it. And then we can reject the request. A lot of people use hashes for this. Why is it a bad idea? We'll get to that in a moment. What are hashes? Cryptographic hashes, or in Polish, a shortcut function, because some people don't like to call them hashes, are functions that take some data, do something, and return some data with a constant length. For example, they take a peak of a Neo Gigabyte, do some magic

on it, and then it turns out 128 bits, like in Fn5. What are the functions we know? For example MD5, SHA1, SHA2 and SHA3. These are the most popular names. MD5 is dead for a long time and should not be used anywhere. SHA1 is dead for a little bit shorter. It is dead for a little bit shorter because Google has released a public collision. There are two functions worth using: SHA2 and SHA3. There is a bit of a mess because SHA2 has six names, it is basically a family. So, for example, SHA-256 is a SHA-2 variant, but SHA-3 is newer than SHA-256. So, good naming. We also have SHA-512/256, which means SHA-512 but cut to 256 bytes. And it also has its

own name, to make it more chaotic. And the same goes for SHA-3. So we have another family of four hashing functions. In addition, three more hashing functions were defined, and SHA-3 itself is called differently, i.e. kecak. There's also Kangaroo 12, which is connected, but different. So there's a lot of it, but if you don't have any weird requirements, you can just use SHA-256 and SHA-512. And if you want to be resistant to NSA or quantum computers, then SHA-3, 256 and 512. And you don't need anything else. What are the hashing functions for? They have two interesting properties. or cryptographic shortcut functions, because they don't need to have a simple shortcut. I have two interesting features that can't be done with two things.

At least two important attacks are on the shortcut function. First, if we have a hash, we can't generate any data from it that gives us this hash. So, for example, a database has leaked and we have a hash of the hash from this database, and it's impossible to generate a hash from this hash of some hash that matches this database. Of course, you can do it with the help of brute force, but unfortunately it can't be stopped. But the fact that this is the only method is a result of the rule: "Premium Tree Resistance" - resistance to hash return. And the second important thing that a hash should be resistant to is "Collision Resistance" -

resistance to collision generation. So it should be very difficult, I mean impossible in practice, to generate two hashes, two messages that give the hash. And this is what was broken on the 5th, in Psh 1. And in almost all of the non-joke functions of the cryptographic shortcut, the "premier" or "resistance" was not broken, but the "collision resistance" was broken sooner or later. And there is nothing to be verified here. So, hashing is not used to verify, either, in the first place, but specifically to sign a message. So it's not a message signing. This is not how you should sign a message, and yet many people like to sign messages. which we'll get to in a

moment. There are a million of codes on Stack Overflow, but I didn't want to look for them. I'll look for them somewhere else. Just to make sure, this is how you sign a message. So if someone wants to sign a message, you should do it like this. Use a structure called HMAC, or Hash Based Message Authentication Code. There's even an authentication in the name, so you can see it's a good method. It's basically the same in the code, except we use chmack instead of SHA-256. So there's no difference for programmers, but there's a big difference in terms of security. What's the difference in terms of security? Let's imagine we have a discount for an apple, for example. Why not an apple? And there's 10% discount. And

it's described as a secret, so no one else can generate a discount for an apple. And the user gets the discount for the apple plus the signature, the one you can see at the bottom, 8b something. But it's signed with MD5, which is a shortcut, so it's a big NO, you shouldn't do that. It's an MD5, but it will work for SHA256 or SHA1, everything but SHA3. How can we break it? We can break it in about 5 minutes by calculating the installation of the tool. So we install a tool called hashpump, we give it 4 parameters, i.e. the data we have, the hash that was, the data we want to add, and it spits out new data, what they should be, and the

new hash that will be. So let's say we can add to it amount = 100, i.e. discount = 100%, and it spits out some garbage, and the discount equals 100. In this case, I assume that the data is parsed by the "url" code, which is quite common in practice, in my opinion. And it's characteristic that the previous parameters are usually overwritten by the following ones. So we have the parameters "name" equals the discount on the apple, then "amount" equals 10 and a lot of trash, and then "amount" equals 100 and after decoding something sensible will come out of it, so "amount" equals 100 and "name" equals the discount on the apple. So we were able

to buy apples, big apples for winter, endless apples. And this whole attack takes more than a minute, if someone has already installed this tool, and if not, or has to google it, it takes 5 minutes. That's why I think that using hashes to verify data is a bad idea. With a map it would be impossible. So I know, you already know, or you knew that this is not the way to do it, but there are a lot of people who don't know that this is the way to do it. And for example, MasterCard had a problem, not really related to it, because they were even more active, and they were taking data, for example, they

wanted to sign, because it looked like there was data on the store's website, so the store sent a request, on the client's website, in the store, there was signed data, and this data was sent to MasterCard, so the client could modify this data, But the signature would not match. We sign the data, but always by hash. So nothing can be changed. But he did it cleverly. He took the price, pasted the card number, the user name, and all the parameters sent to the server were pasted together. And then the signature was verified. This would not be a problem for the company. But they did one interesting thing. They took all the parameters from the store, even those that weren't ordered. So you could add

your own parameters, for example vpc_b and write whatever you want and it will be changed when you count the signature. And the interesting thing is that these two messages on the left and right, after gluing all the parameters together, will give the same result. Which means that if you don't think about it, when you sign them, they will be the same. So on the left side we have something that buys 1000 items for a shop card, and on the right side we have something that buys 1 item for a shop card. The trick is that we convince the shop that we buy 1000 items, and then we pay for 1 item. And we're rich, we

have 1000 something, for example, I don't know, apples. And in this case the signature wouldn't help, but I wonder why something as big as MasterCard uses hashes, MD5 for data signature. But ok, not to say that this is the only place where you can make a mistake, or that only MasterCard is bad because it makes mistakes, a very similar situation took place in Poland two years earlier, but it ended a bit differently. So there was also a lack of payment, a very similar mistake, a mistake in signature, and the server's background, but it's mixed with that, but it ended a bit differently, because The researcher who reported it got an answer from Helpdesk that it was really a problem in 2 hours. Helpdesk wrote that the president will call,

there will be golden trains and rewards, all customers will learn and so on. In fact, it was two years ago, the president did not call, there were no golden trains, and the customers still did not know that there was a problem. I did not give the name because I do not know if I can. Let's move on, so I don't get annoyed over one store. We have dotpay, I don't have any errors here. But it's interesting that we sign everything with SHA1 and we create mastercards, so we scale all the parameters. And maybe there is no mistake in that, maybe there is a mistake in that. I'm not convinced that if someone doesn't do something, it will make them lose money. We just stick all the parameters together

and hope that we won't be fooled. And there are 24-bit switches, which are the best of the previous three, in terms of protocol. The whole secret is that it has 16 16-bit characters, which gives 8 bytes, which gives 64 bits. And that's not a lot, apparently. It's something that you can't do with a normal laptop, but if you have a good cluster, you can easily break it. and make any transfers from that store to infinity. So first you have to spend a bit on electricity and then you can go crazy with purchases from 24/7 transfers. So it's so problematic that it can't be fixed. I mean, every client would have to generate a new secret and set it up in the panel. And people usually

tend to not want to do new things in their store. How does it work? Why do you need to move? And that's the second part. I would like to say something about the third one. As far as I know, some of you are most interested in it. As far as I know, the previous things are... If you can read about it on the Internet, it's a part that is specially prepared for the conference. I didn't plan to talk about it at the beginning, but somehow it happened that I will talk about it. Why? In a moment. So if using cryptographic primitives is difficult, Using your own cryptographic primitives and creating them must be very difficult. I answered my question. Creating primitives is

difficult. Here is an example of a shortcut function used in the history of the world. You can see that there is a trend that a shortcut function is created, it is good, green for a few years, sometimes 10, sometimes less, sometimes more, and then someone finds a problem in it and breaks it. And it doesn't mean that the one who invented the function didn't know how to use cryptography, but it's just that it's hard. To attack cryptography, you just need to make one mistake, and to create something unbreakable, you need to predict all possible attacks. And new attacks are constantly being made. That's why it's harder to defend than to attack. And that's why I was interested in the second topic I showed here. Because it

turned out that at this conference, the day after me, there will be a presentation cryptography of my own hash functions. Which I really liked, because it was a bit under my control. So I decided to analyze this function and see what can be done with it. Because I like challenges. And that's why instead of making slides I wrote a weird code in Python. But it turns out that I didn't know this function, but many people knew this function. There were some papers about it. Some of them I saw, some didn't. But what's going on with it? More about it tomorrow, I'm sure. I will summarize it in a few words. This is a picture I tried to make. I don't know how well it

turned out, because I'm not good at making pictures. But generally, we start with four fixed ones. We have C0, C1, C2 and C3. These are four state elements. And at the beginning, we xor the first element of this state with bytes. We xor with the first block of data that we hash. And we encrypt the S round. Then the same with the second element of the state, the third and the fourth. Then we repeat the same thing, we xor the first element with the second block of hashed data and we encrypt it with the AS round. And the same with the second element, the third and so on. And at the end there is a

strong finalization. And since I prefer code, I don't know if you can see anything here, but in the form of code it is a bit more readable, there are no weird squares. We have an update function that does 4x update rounds and a finalize function that can't be written down. It's basically a strong AS encryption. But if we wanted to generate a collision, we wouldn't be interested in finalizing. The only thing we need to do is generate 4 states that will be equal before the finalization. If something is equal before the finalization, it will be equal after the finalization. So we put... How can we do this? Or how to model it? First, let's see

what happens if we encrypt messages with four blocks. This is something like you can see on the slide. So, encrypting four blocks of messages is based on the fact that we make an update, we call this function update on the first block, second block, third block, fourth block and finally we do the finalization. And finalization is unnecessary for us. So if before finalization we have two messages for which the result will be equal, then we will have a collision. And it can be done like this. Here, attention, there will be a lot of math or some weird explanations how to approach something like this. So at the beginning, very high-level, the definition of collision sounds

more like this: So we have stream_hash = stream_hash , which are two messages that give equal things after hashing. And you can develop it at the beginning, so use the update function. I'll put it at the bottom of the slide, unfortunately, so I don't know if you can see it. But after these four updates, for a1, a2, a3 and a0, the result of the function will be equal to the result of the function after four updates for b1, b2, b3 and b0. And now, attention, because here we start making a lot of ASs, it's exactly the same as before, but it's written out. And now we don't need to understand what's going on here, to understand what I want to say, because it's considered

that this is equivalent to the previous side, so this one. So if we find a solution for this equation, we have an equation for collisions. This is a simple mathematical equation, even though there are some as. As is a good mathematical function like any other, so we can use simple techniques. For example, we can make variables, to make it a bit shorter. And then we can make even more variables, to make it even shorter. And now it looks a bit better. And now we see that in each of these four equations we have some as encryption. AS is a good mathematical function, so we can decrypt the equation with AS on both sides. We didn't do that in high school, but we can do it here,

no one will forbid us, we are free. So we decrypt all the blocks with AS on both sides, and we get something like this at the bottom, so we lose AS, the first one. But there are still two AS, so what can we do? So what can we do? We can move this XOR from A3 to one side, so that It was a better comparison. We have something xor something, something xor something, and we want xor only on one side. So we move it, the result is as below. And then we have a lot of as again, but we won't go further without a deeper dive into as, because unfortunately we don't have to make any tricks to decrypt something. But AS is

not the bottom line yet, because we can still split AS into 4 parts. AS round consists of 4 operations. Specifically, adding the round key, mixing columns, moving verses, and finally, sub bytes, which is an operation that changes bytes in the result. At this point, it looks complicated again, but what is important is that if we divide it equally at the bottom, we will have solution for the whole hash collision. So we can remove the addRoundKey at the beginning, because it's just an XORing. It's not magic. AddRoundKey is with parameter x, it's just XOR with X. It's called smarter, because cryptographers like names. So we can change it to XORing. And then we see that we XOR both sides with the same value, so we can

delete it. And one thing disappears. Now we can move parameters to one side again. We already have them, but we can use them from an AS property that is not taught in high school. Specifically, if we mix columns and XOR with something, we can put this XOR inside when we do the reverse operation on it. It sounds complicated and it's not important. It's important that it can be done quite easily on the card. and put it inside the AS. And now we have an operation where there is a mix columns on both sides, so we can mix the columns on both sides. And now we have an analogous situation, but with a ShiftRows operation. So we can take both sides and put the XOR's reverse

inside and remove ShiftRows. And we get to something that is very simple, meaning there is no AS anymore, only subbytes remained. which is unfortunately not a very good function for cryptanalysis. It's basically just a variable of the value of the table elements, but it doesn't compose anything. So we need to use something else. At the beginning we can move all the elements to one side. And now we see that on one side we have some operations, like INV_MX, INV_SR. It's not really that important what it is, because these are some constant mathematical operations. So if we just give x for this, and we notice that on the right side we have something else. If we have x = a, x = b, x = c, x = d, it

means that a = b, a = c, d = d. So the whole equation of equations can be reduced to one equation, which looks like this. It's not beautiful, but it fits in one line. And a reminder that all the operations we did were reversible, so we can take and if we find a solution for this equation, or evenness, we have again a collision equation. So how can we approach something like this? Here is a complicated picture of what SB does, although it's not that important, it's just a picture that takes some random elements from the table. Not really random, but... And here is a complicated chart that shows an interesting thing. Specifically, the first

element of the result It depends only on the first element b0, the first byte in the result, it depends on the first byte b0, the first big byte b0, the first byte a0, the first big byte e0, the first big byte a0, and the same with the second byte in the result, the third byte in the result, the fourth byte in the result. And why is it interesting? Because it means that we can, maybe, brute-force something, byte by byte, which is somehow two to a greater power faster than brute-forcing everything at once. Can we do something like that? If we bruteforce... Let's assume we have only two equations. And we can do something like that. If we only had two state elements, the attack would be almost trivial. So

we bruteforce... I set as constant B1 and A1 and B1,1 and A1,1. These are the variables we set at the beginning. It doesn't matter what is in the rule. And we can bruteforce A2 and B2. and A2 is enough, because we have 256 possibilities, and the chance of luck being 1 to 256 is 1, so statistically, and in 100% of cases, we will succeed. The same applies to three state elements, so when we have three state elements, we ignore the fourth equation, it doesn't have to be filled in, so we can do the same and brute force both A2 and B2. And this attack costs us 2 to 16 operations, which is a penny. This is 60 000 operations, 65 536 exactly. And this

is to be done on a calculator in less than a second. But unfortunately there is a problem if we want to attack a full streamhush, that... Oh, here is a code if someone wanted to break 3 rounds of streamhush. And basically it's a few lines that brute force the equation we've just introduced. The problem is if we want to...

The disease hurts. Am I alive? I think not. I'm sorry, I'm a bit sick. So now I'll only get flashes. But we'll finish it, I'll live somehow. So now there's a problem, but it turns out that if we go back to the AES round, we can bruteforce not by bytes, but by blocks. So we can bruteforce 4 bytes and then we have a chance of 1 to 34 billion that we'll succeed. which is not a lot, but the computer still manages it. So if we connect this with the previous attack, it gives us the complexity of 2 to 48, which takes a lot of time, but if it starts for a long period of time, for example a few hours, it will do.

And one change that it requires in the code, I had to rewrite it to C, because it didn't work, it's something like this, here is the mod parameter change in the loop, with which we build all the possibilities and put it in the right place. If we do something like that, it turns out that it even works. And we can ignore two files that cause collisions. So it means that StreamHash in the current version should not be used, at least in practical applications. Because these are different files, for example, SHA-512 gives a different result. And what I wanted to convey through this? I wanted to convey that your own cryptography is difficult. I wanted to tell you that even if cryptographers do algorithms that can be broken, like MD5

or SHA1 or StreamHash, the chance that some gray person like me or most of you will come up with something else is almost zero. So why is it happening? I already said that attacking is always easier than defending. It's enough to predict all the possibilities of an attack apart from one and someone will find it sooner or later. People like me, who defend systems more than attack, have a hard life. Sooner or later someone breaks down and then you ask yourself why you didn't expect it. Unfortunately, it's impossible. That's it. I have an epilogue. Maybe three more minutes of voice. If someone is interested in the topic and would like something with less quotes and more content, or something more

interesting, the first article fascinated me. is about encryption and why you shouldn't use AS directly. The second article is an advertisement, written to programmers about something similar. It's wiser than this one. There's more cryptography, and less overflowing code. And maybe there will be a third article. If someone is rich, they can buy this programmer. This programmer, something. It came out this month. And if not, I have permission to release it to the Internet in half a year, so you can wait. And if someone likes to do some weird stuff in some cool place, it turns out that there is such a place in the world and we even recruit. And you don't even have to know

a lot. You want to know something, for example, what is a computer or something. And maybe we'll send a CV, it won't be a problem. Or maybe we'll leave someone on the pulpit. We'll definitely call them. And that's it. Thank you all. Anyone has any questions? Would it be possible to make it more difficult for AI to generate a encryption algorithm? For the artificial intelligence to generate a encryption algorithm? I mean, it would be possible to do it, but what would the results be? Generally, artificial intelligence, I'm not an expert, but it works like this, it learns from many examples. Exactly, but we have a lot of these computer phones here, so they can transfer billions of data. I think it wouldn't

work... I mean, sometimes there's a new algorithm, a new attack method, for example, SquareAttack was invented for the use of Square's mania, and it turned out to be good for S, and it's about the fact that a new attack method for numbers has been created, which couldn't be predicted before. So artificial intelligence wouldn't be able to to predict these new attack methods to defend themselves from them. This is the first thing. And secondly, the artificial intelligence works so heuristically, so the code it would create would probably look like a huge mass of spaghetti. And maybe it would be good, maybe not. But there is a problem that, for example, devices like routers or something, they

also want to encrypt and decrypt. And this could work very slowly on them. For example, AS was created so that it is very fast to implement in hardware. - I would like to thank you for your critical view of my idea. Tomorrow I will tell you more about it. I have written with Michał before. He is a very good man. Privately. These algorithms that we compete with, created by professors of various universities, were watched by their students. We, who work here, somehow, plus or minus, independently, like me, we do not have the comfort to get a fair assessment before publishing our work. That's why it's very important for me that someone could look at it

critically. It's a possibility to improve and create a fifth version of the algorithm, which will be better. But more on that tomorrow. More on that tomorrow at the last session. One small note about the content. When it comes to message digest using the shortcut function, you don't need to use HMAC, just put the secret at the end to prevent the attack from being length-extended. - But of course, HMAC was designed to prevent, and also to enable, to ensure the security of message digest even if the shortcut function is not ideal. Yes, of course. But I had 160 slides and I'm surprised I managed to get them in an hour. So I'll try to... I don't think so, but it's

a good thing. I hope it will be tomorrow. There was another question via the web. People asked if you could share the presentation so they could read it and click on it. Yes, it's in HTML so you can click a lot. And on my blog, tailcall.net, it will be posted. Okay, thanks. Anyone else? That's it. Thank you.