
SE Radio 684: Dan Bergh Johnsson and Daniel Deogun on Secure By Design

Daniel Deogun and Dan Bergh Johnsson — two of the co-authors of the book, Secure by Design — discuss the intersection of good software design and security with host Sam Taggart. They describe how following certain software design principles can help developers create secure software without needing to become security experts. They talked about how this is the continuation of developers taking on more responsibilities: Agile asked developers to become responsible for testing their code. DevOps asked developers to work together with operations in deploying their code. Secure by Design asks developers to incorporate security into their designs.

Brought to you by IEEE Computer Society and IEEE Software magazine.




Transcript

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Sam Taggart 00:00:19 This is Sam Taggart for SE Radio. I am here today with Dan Bergh Johnsson and Daniel Deogun. Daniel Deogun is a frequent conference speaker on cybersecurity. He’s worked on cybersecurity in a wide variety of domains and is currently the Chief Academy Officer for Omegapoint in Sweden. Dan Bergh Johnsson is a partner at Omegapoint and also a frequent conference speaker. Along with another Dan, Dan Sawano, Dan and Daniel authored a book called Secure by Design. We’ve talked previously about security in many episodes such as 584, Charles Weir on Ruthless Security for Busy Developers. 453, Aaron Rinehart on Security Chaos Engineering. 418, Vladimir Khorikov on Functional Programming in Enterprise Applications and 405 Yevgeniy Brikman on Infrastructure as Code Best Practices. Welcome Dan and Daniel.

Dan Bergh Johnsson 00:01:06 Hey, thanks Sam. Thank you so much. Thanks for having us.

Sam Taggart 00:01:08 Thank you. Let’s start by talking about your book. My main takeaway from the book was that instead of training engineers to be security professionals, it’s easier to emphasize good design practices and that those good design practices just so happen to address a lot of security concerns as well. Is that the intended takeaway?

Daniel Deogun 00:01:26 Yeah, I would say the purpose when we wrote the book was that as engineers, as developers, you’re so focused on delivering new features, keeping the timelines. You have a tech lead or some sort of product lead that’s on your back all the time. So keeping your focus on security as well, that becomes a great burden or what do you say, Dan?

Dan Bergh Johnsson 00:01:47 I would say that captures 95% of the intent and meaning. You get a lot of security from focusing on good design patterns, which, as Daniel alluded to, is easier for developers to just keep at the back of their head all the time than explicitly thinking about security. Whereas there is a remaining 5% where you really should be aware of security practices and special procedures as well. So you can’t ignore them. But definitely I think that assessment captured 95% of the intent.

Daniel Deogun 00:02:22 So in general as a developer, you would then start writing secure software simply by applying good design practices and get that security, I wouldn’t say for free, but for a lot cheaper than you would do otherwise.

Sam Taggart 00:02:37 So that brings me to a question. Is there some baseline level of generic security awareness training that you recommend for developers or is your recommendation more just to let developers focus on good design?

Dan Bergh Johnsson 00:02:49 I think that to be able to apply the kinds of ideas that Secure by Design is about, that you should use design patterns in a deliberate way to get an improvement in security, you need to have a basic awareness of what things you are addressing. For example, there are a lot of practices to protect yourself against cross-site scripting or SQL injection, stuff like that. But if you don’t know about SQL injection at all, you don’t really know what good designs to look for. So you need familiarity with quite a breadth of security things, but you don’t need specialized knowledge to get a lot of benefit from that familiarity.

Sam Taggart 00:03:31 Can you describe the traditional approach to security and some of the problems with it?

Daniel Deogun 00:03:36 I would say one of the challenges with the traditional view is that you have to almost be an expert to understand what to do. And if you misconfigure something, it could be a framework, it could be a library or whatnot, you might miss something and if you make that mistake suddenly you’re vulnerable or you allow something. And that’s a big problem for many developers that they don’t feel educated enough on that topic in order to make those type of decisions.

Dan Bergh Johnsson 00:04:03 Yeah, we often see that there’s kind of a gap between the traditional development role and the traditional security role, which is actually hurtful because it makes developers shy away from security. We have met really senior architects that say, oh, I take responsibility for the uptime of the system. I take responsibility for the maintainability of the system. I take responsibility for this and this and that. But when it comes to security, they go, oh security, no, I’m not a security person. There are specialists that will take care of that. That’s not really helpful at all.

Daniel Deogun 00:04:42 And the fact that you have that mindset points to an organization where you have us and them, where security is somebody else’s problem and not really your thing. But in reality, security is a concern. You need to design your software in a way that is secure in every aspect.

Dan Bergh Johnsson 00:05:01 And we have seen a similar play out before security. If we look at the early days of Agile, there was this gulf between developers, that is coders, programmers, and testers, where testing was a very separate activity and the collaboration between developers and testers was not always hearty, but perhaps even antagonistic, and it didn’t help either camp. Over a few years we have learned to include coding and testing in the same cycle, and I think we see the same kind of progress now. Security has been a very separate field, but now security is embraced within coding and testing as, as Daniel said, one of many concerns, and a really important concern, but not one that should be treated separately.

Daniel Deogun 00:05:56 Writing secure software today is more of a hygiene factor, I would say. It is an expectation nowadays of almost any client you interact with. They expect the software to be secure. I mean, if you design something that’s insecure, it is not going to pass as good software. So having the skill to apply certain patterns, having the mindset and so forth, that becomes something that everybody needs to do now, and it’s so important because of that.

Sam Taggart 00:06:26 So your comment on Agile and developers and testers, before you came up with that, I was thinking of the DevOps idea of combining developers and operations, which is also very similar. So there seems to be a theme there.

Dan Bergh Johnsson 00:06:37 Definitely. I see that these are three very consecutive steps. First, developers embracing testing and testers, giving us the dev-test culture. And then we’ve got the DevOps culture and now we’ve got this security. So we are rolling more and more roles into this fruitful collaboration.

Sam Taggart 00:06:57 So do you still see separate roles for security practitioners as well? Is it about combining development and security together or is it about both overlapping a little bit? How does that work?

Daniel Deogun 00:07:08 I would say that you can compare this to physicians, right? I mean you have experts, those that are really good at doing surgery or something, and then you have the general practitioner that has a broad understanding. Most developers will be those who have a broad understanding, whereas you will still have experts in certain areas, right? I mean it could be an expert in authentication, authorization, login flows, stuff like that, right. Whereas some others might be experts in cryptology and stuff like that.

Dan Bergh Johnsson 00:07:39 And if we jump the fence a little bit and put ourselves on the other side and ask the dual question to the security professionals, they say that of course we have all got experts that are really good offensive security testers that can try to get into your systems. But those experts become a lot better if they know a little bit of coding or if they know a little bit about infrastructure or sysadmin work or whatever. So I think we’re seeing the same thing on both sides of the fence: we will both retain our deep professional knowledge and expertise, but we’ll start to learn a little bit more about each other, and the cooperation will get a lot better.

Sam Taggart 00:08:23 The book specifically talks about domain-driven design. I realize that’s a very large topic with a lot of books written about it, but can you briefly describe the main ideas behind domain-driven design, the ones that at least apply to security?

Dan Bergh Johnsson 00:08:37 Well, if we pick out the part that we think gives us the most leverage, it is the message of domain-driven design that a conceptual idea of what you are building is really essential. An application or system is not just a lot of code that happens to work. It encapsulates and encodes a certain set of ideas and how they work together. So if you work on a finance system, you should know what an account is in this kind of context, as opposed to a transaction, which is something else. In different solutions they might fit together differently and have slightly different meanings and different boundaries between them. But for the precise application we’re building, we need to have a sharp understanding of what we mean by the different things, and that kind of deep understanding, which eliminates a lot of misconceptions and thus bugs, is what we build the security upon.

Daniel Deogun 00:09:35 As you say that, Dan, the specificity of domain-driven design and knowing exactly what you need is what you can also use from a security aspect. One of the most common mistakes that are made is that you use too, how do I say, generic or broad definitions of your data types in your software. So if you instead choose something that’s very specific, then you can narrow the input to only match that definition and then it becomes very difficult to inject something that violates that contract.

Dan Bergh Johnsson 00:10:07 For example, if you are inputting an address, a zip code throughout the world might come in almost any conceivable format, but if you’re just writing software for France or for the US or for some other limited domain, well then you can narrow it down very, very harshly and get a tight definition of it.

Daniel Deogun 00:10:32 Which makes injection attacks a bit more difficult to succeed.
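The narrowing described here can be sketched as a small type that only ever holds a value matching one tight definition. This is a minimal illustration, not code from the book: it assumes a system that only handles US ZIP codes (five digits, optionally followed by a hyphen and four more), and the name `UsZipCode` is hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: this system only serves US addresses, so the definition can be tight.
_US_ZIP = re.compile(r"[0-9]{5}(-[0-9]{4})?")
_MAX_LENGTH = 10  # "12345-6789" is the longest acceptable form


@dataclass(frozen=True)
class UsZipCode:
    """Hypothetical domain type: a string that is known to be a US ZIP code."""

    value: str

    def __post_init__(self) -> None:
        if not isinstance(self.value, str):
            raise TypeError("zip code must be a string")
        # Cheap length check first, then the full format check.
        if len(self.value) > _MAX_LENGTH:
            raise ValueError("zip code is too long")
        if not _US_ZIP.fullmatch(self.value):
            raise ValueError("zip code has the wrong format")


home = UsZipCode("90210")
work = UsZipCode("90210-1234")
```

Anything that does not fit the definition, such as `"90210'; DROP TABLE orders"`, is rejected before it reaches the rest of the system.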

Sam Taggart 00:10:36 Yes, there are several different directions we could go with this. You hit on several things that I want to talk about. The first thing I think that’s worth talking about is domain language because I think that goes to your specificity part. I remember working on a system where we had both order numbers and order IDs and…

Dan Bergh Johnsson 00:10:53 Oh, that sounds fun.

Sam Taggart 00:10:54 Trying to remember which one was which and when to use which one was quite difficult. And we didn’t really have a good dictionary of all the terms. So on a project, would you have like a domain dictionary that says these are what each thing represents?

Daniel Deogun 00:11:08 I guess a dictionary, but what you would have I guess is a model, right? I mean you have a conceptual model where your terms are defined, so to say, along with the relationships between these terms. So that is what you should have, and we encourage everyone to define their ubiquitous language, as we say in domain-driven design, right? To have one definition of each word and no synonyms and so forth. And that helps you to understand exactly what you need to fulfill for a certain concept in your domain. So in that sense we have a dictionary, but it’s expressed mostly as a model, I would say.

Dan Bergh Johnsson 00:11:45 I would usually say that the dictionary is an excellent starting point for building a model. If you haven’t got the dictionary, you should definitely start there and collect not all of the words that you use, but make sure to define really well some 15 or 20 that are really important terms. But as Daniel said, a language does not stop with the dictionary. It’s also how you construct a sentence, like what verbs and nouns go well together. Will you say that you register an order or do you say that you put a purchase request or do you say that you get a warrant form, refill formulation or whatever. So watch out to see whether you have got a lot of synonyms that might not be precisely synonyms, and in that case, try to prune your language. You have also got the idiomatic side. What kind of phrasing goes well together?

Dan Bergh Johnsson 00:12:45 And I think there’s a lot to do with that: you really need to speak that language out in the open, because then you’ll catch yourself when you start drifting off that terminology and start using other words. That’s a sign that here you’ve got something that you need to specify a little bit better, because sooner or later you’ll have to write it in code. And if you can’t speak it clearly, you are in a worse position when it comes to coding. So yes, the dictionary is a starting point, but it’s more about books and literature and how you express yourself in sentences.

Daniel Deogun 00:13:20 I would also like to add that having your language shouldn’t just be written down in a model or as a dictionary. It should also be reflected in your requirements in your tests and so forth. And in your code, in

Dan Bergh Johnsson 00:13:32 Your code, in your Durex.

Daniel Deogun 00:13:36 So it is truly a ubiquitous language that you have created for your context, for the context you operate in.

Sam Taggart 00:13:43 Do you have any specific examples you’ve seen where miscommunication contributed to a security issue?

Daniel Deogun 00:13:49 Let’s say it this way: you have a shopping cart, and in that shopping cart you can add items or products, books perhaps.

Dan Bergh Johnsson 00:13:58 Like four copies of Secure by Design.

Daniel Deogun 00:14:01 Yeah. So with four copies of Secure by Design, we say you have a quantity of four, right?

Dan Bergh Johnsson 00:14:06 So quantity is just an integer.

Daniel Deogun 00:14:09 Yeah. The problem now is that if you have defined quantity as an integer, what you automatically accept in your software is all the capabilities of an integer. It can be added, it can be negative and it can be extremely big, right? But from a business perspective, I assume you won’t accept a negative quantity of books or a billion quantity of books, right? I mean it doesn’t make sense from the business perspective, but from the software perspective that works since you used an integer.

Dan Bergh Johnsson 00:14:42 And we’ve seen this. There have been some really public bugs around this where a negative amount multiplied by a positive price ends up as a negative contribution to the total sum, basically giving a discount. This has been both in the public domain and something we’ve seen over and over and over again in our assessments, where we’ve gone in to customers and clients and worked with them. So you mean that I can create myself a discount by just adding minus 10 of order number, blah blah blah? Oh, that was not intended.

Daniel Deogun 00:15:17 It is a discount feature, I would say, but it’s probably not what the business would like. The interesting part is if you analyze why this happens, and usually it is the developer’s tendency to fall back on language primitives. The language primitive types that you have might be a string, an integer, a float or something like that. So they use that to represent a very specific concept in your business domain. But you use a generic concept to do that specific work. String is also very common: you accept a string, but you expected it to be an order number or what you mentioned before.
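The negative-quantity discount can be sketched in a few lines. The example below is a hypothetical illustration: the class name `Quantity`, the upper bound of 200, and prices held as integer cents are all assumptions, not details from the episode or the book.

```python
from dataclasses import dataclass

MAX_QUANTITY = 200  # assumed business limit; the real limit is a business decision


def naive_line_total(unit_price_cents: int, quantity: int) -> int:
    # A bare int accepts everything an int can be, including negatives.
    return unit_price_cents * quantity


@dataclass(frozen=True)
class Quantity:
    """Hypothetical domain type: a number of items that makes business sense."""

    value: int

    def __post_init__(self) -> None:
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError("quantity must be an integer")
        if not 1 <= self.value <= MAX_QUANTITY:
            raise ValueError(f"quantity must be between 1 and {MAX_QUANTITY}")


def line_total(unit_price_cents: int, quantity: Quantity) -> int:
    return unit_price_cents * quantity.value


accidental_discount = naive_line_total(3999, -10)  # a negative contribution
four_books = line_total(3999, Quantity(4))
```

With the primitive, `-10` flows straight into the total. With the domain type, `Quantity(-10)` raises before any arithmetic happens.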

Dan Bergh Johnsson 00:15:58 Now we’re talking about just a range of sensible quantities, like from one or zero up to 200 something or whatever it is. But it might also be the operations you can do on them. For example, we have had a client where they, let’s obfuscate this a little bit to protect them, had bus numbers, and at one point there was an addition where one bus number was added to another bus number. Of course that was not the intent; it was meant to be two other things, but someone had just mistaken two parameters for each other and ended up adding two bus numbers, coming up with something that looked like a bus number but in the end made no sense at all and led to very strange behavior a few steps down the line. That is also something that is preventable: you shouldn’t be able to add two bus numbers to each other.
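One way to make that mistake impossible is to give the concept its own type that simply has no addition. The sketch below is illustrative only; `BusNumber` and its 1 to 999 range are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusNumber:
    """Hypothetical domain type: identifies a bus line; it is not a quantity."""

    value: int

    def __post_init__(self) -> None:
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError("bus number must be an integer")
        if not 1 <= self.value <= 999:  # assumed range
            raise ValueError("bus number out of range")


# Plain ints can be added, producing something that still looks like a bus number.
looks_like_a_bus = 12 + 30

# The domain type defines no __add__, so the same mistake fails loudly.
try:
    BusNumber(12) + BusNumber(30)
    addition_allowed = True
except TypeError:
    addition_allowed = False
```

Equality and hashing still work, because those are operations that make sense for an identifier.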

Sam Taggart 00:16:50 Yeah, I think the other place I’ve seen this is in something that maybe takes in two integers, but they represent two different things and it’s very easy to swap them if you’re not careful. Oh yes. And especially in like a strictly typed language, if you swap them then hopefully the compiler would catch that and break that if they were different types.

Dan Bergh Johnsson 00:17:07 Like longitude and latitude, because which is long, which is lat? No one ever remembers that. There’s also some more intricate stuff, like we were talking about. Some people in the online gambling industry hand out money for free spins and stuff like that. And that is represented as dollars. But then people are very good at finding ways of registering something, getting an amount of dollars for free and then withdrawing them. So often they end up adding a lot of very strange business rules about what dollars you can withdraw after having done this or that much gambling. And at one place, which we had discussions with, they were actually doing explicit modeling of that. So they didn’t have money; they had withdrawable money, they had deposit money, they had free spin money, and then they had rules for how to convert them to each other. So that’s a good example where they had actively made a modeling decision to separate different things.
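The explicit modeling described here might look roughly like the sketch below. The type names, the integer-cents representation, and the tenfold wagering rule are all invented for illustration; the actual rules of the company mentioned are not known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WithdrawableMoney:
    cents: int


@dataclass(frozen=True)
class FreeSpinMoney:
    cents: int


def withdraw(amount: WithdrawableMoney) -> int:
    """Pay out money; only the withdrawable kind is accepted."""
    if not isinstance(amount, WithdrawableMoney):
        raise TypeError("only withdrawable money can be withdrawn")
    return amount.cents


def convert(bonus: FreeSpinMoney, wagered_cents: int,
            wagering_multiplier: int = 10) -> WithdrawableMoney:
    """Assumed rule: bonus money converts once it has been wagered N times over."""
    if wagered_cents < bonus.cents * wagering_multiplier:
        raise ValueError("wagering requirement not met")
    return WithdrawableMoney(bonus.cents)


bonus = FreeSpinMoney(500)
paid_out = withdraw(convert(bonus, wagered_cents=5000))
```

The conversion rule lives in one place, and "register, collect, withdraw" fails because free spin money is never the type that `withdraw` accepts.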

Daniel Deogun 00:18:13 I think it all comes down to if you are careful and start modeling your concepts and add, how should I say, specific rules to it, then it becomes much, much harder to inject something that violates that. Like we see for every injection attack that you inject something that the software didn’t expect, right? So you can access the database and it behaves in a way you didn’t anticipate and so forth, right? So yeah, be specific. That’s our rule for.

Dan Bergh Johnsson 00:18:45 Either to inject or to confuse the state, so that you end up in a state that was not anticipated and not desired, but that enables the user to do stuff that they shouldn’t be able to do.

Sam Taggart 00:18:58 We’ve been talking around the idea of domain objects, which I don’t think we actually specified. So in domain driven design there’s this idea of domain objects. Do you want to just say what that is?

Daniel Deogun 00:19:08 So a domain object is most likely a concept that’s native in that particular domain, right? I mean if you talk about hospitality then of course your room number is very common. If you’re at the hotel and say, hey, put this on my room and things like that, that’s a concept that needs to be expressed in that domain. So a domain object is something that’s very familiar in that context.

Dan Bergh Johnsson 00:19:32 I think it might be valuable also to point out that in modeling we often see these domain objects coming in two main flavors. Modeling wise, the most usual, which people tend to be familiar with from object orientation 101 at universities and stuff like that, are of course the entities. An entity is a concept that models something which has got a lifespan and an identity, and that can change. So a stay at a hotel is something that might be prolonged, and it might have a tab associated with it, which can grow. A person might be someone who can change their name and grow older or change their relationships and states in different ways. The other big modeling tool is, of course, the value object, which models a specific aspect of something, like the color red or the number four or the duration of 24 hours. So it’s a little bit more like writing something down on a piece of paper, and then you are able to share it but you do not change it. A duration of 24 hours cannot be 25 hours because then it’s another duration. So those two tools are very fruitful to keep apart.

Sam Taggart 00:20:57 So I want to make sure I understand correctly. When you’re talking about value objects, generally they are immutable? Would that be a correct term to use?

Dan Bergh Johnsson 00:21:06 Absolutely. A value object that is not immutable would be a very strange thing.
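An immutable value object can be sketched with a frozen dataclass. The `Duration` type below is a hypothetical example that follows the "24 hours cannot become 25 hours" remark: changing it is an error, and "adding an hour" produces a different value.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Duration:
    """Hypothetical value object: defined entirely by its value."""

    hours: int

    def plus_hours(self, extra: int) -> "Duration":
        # Returns a new value instead of changing this one.
        return Duration(self.hours + extra)


one_day = Duration(24)
longer = one_day.plus_hours(1)

try:
    one_day.hours = 25  # assignment on a frozen dataclass raises
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Two `Duration(24)` instances compare equal, which is the other half of what makes it a value rather than an entity.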

Sam Taggart 00:21:10 The next questions I have are around validations, but something occurred to me while you were talking Daniel about room numbers and oh, I want to charge something to my room. It seems to me there actually are two validations there. One is that it’s actually a valid room number, but the other one is that there’s actually somebody staying in that room.

Daniel Deogun 00:21:27 Yeah.

Sam Taggart 00:21:28 And so in the book you all make a very strong point about the order in which you validate stuff and how that contributes to security. Can you talk a little bit about that?

Daniel Deogun 00:21:38 Yeah. So the general principle here is that if you look at the relationship between an attacker and the attackee, I guess, then you want the relationship to be asymmetric in the sense that it should be really expensive for the attacker to craft the malicious input and it should be really cheap for the recipient, the server, right? So we should be able to reject something quickly and at a low cost. Because of that we have created what we call an order of validation. Basically it’s a five-step order where we do the cheapest operations first. The first thing to check would just be origin. Does this come from a valid place? It could be that it comes from something that you trust or that it has a correct token or something like that, so it’s actually coming from a valid origin. It’s okay to have data from that point. The second thing to check would be size or length, stuff like that. That would be constant execution time, basically, so it is very cheap to validate: is this room number indeed of proper length?

Dan Bergh Johnsson 00:22:47 And it might be just roughly order of magnitude. If we expect something of a hundred bytes, well, check that it is no more than a thousand bytes.

Daniel Deogun 00:22:58 So that means that, simply by doing that, you have successfully limited the input vectors to those that meet that requirement. It can’t be larger than your limit. If it is a string, it can’t have more characters than you expect, and it can’t be bigger in size if it is kilobytes or bytes or something. The next thing you need to do, if it fits those criteria, is invest a little bit more. In this case we want to see: does this input contain the proper characters? We don’t care about the order, we care that it has valid characters. In a room number, it might be just letters, or it could be a specific subset of characters. You can say that a room number could start with either A or B but no other character, and then there are going to be four digits. So you would know that it can contain A or B or the digits zero to nine, that’s it. You don’t care which order they are in, but you say this input data does indeed contain the proper characters.

Dan Bergh Johnsson 00:24:10 But what we do at that point is that we actually open up the package and look inside it. And that’s why the first two origin and size are so powerful because you can throw away a lot of attacks without even looking on the inside.

Daniel Deogun 00:24:24 So the idea here is that we have to invest a little bit. Like Dan said, we have to open up the package, we have to look at the content, and if the characters match, then we say, well, the likelihood now that it is indeed a room number is quite high, but there is a chance that somebody injected something that contains only valid characters but with the A or B at the end or in the middle. So we need to make sure that the order of the characters is also indeed correct. And in that sense, we would have to parse the data.

Dan Bergh Johnsson 00:25:00 Like we could have JSON, but we need to check that it’s well-formed JSON and that it’s not incredibly deep.

Daniel Deogun 00:25:08 Yeah. So doing that, we will then of course send that data to, it could be, a regular expression engine checking that. And the reason why we’re doing this so late in the order of validation is that the backtracking algorithm used in the RegEx engine is susceptible to bad input that could sort of halt your system.

Dan Bergh Johnsson 00:25:30 And that of course goes for a JSON parser and an XML parser as well. Classical things to try to attack if you want to DoS someone.

Daniel Deogun 00:25:37 So basically we move up the ladder of both complexity and cost from a recipient’s point of view. And if it passes all that, then we can say yes, this is indeed a true room number in this case and we can then instantiate a domain object for that.
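The first four rungs described so far (origin, size, lexical content, syntax) can be sketched for the room number example. The format "A or B followed by four digits" comes from the conversation; the origin names and every identifier in the code are assumptions for illustration.

```python
import re
from dataclasses import dataclass

TRUSTED_ORIGINS = frozenset({"front-desk", "restaurant-pos"})  # hypothetical callers
ROOM_LENGTH = 5                                # one letter plus four digits
ALLOWED_CHARS = frozenset("AB0123456789")
ROOM_PATTERN = re.compile(r"[AB][0-9]{4}")


@dataclass(frozen=True)
class RoomNumber:
    value: str


def parse_room_number(raw: str, origin: str) -> RoomNumber:
    # 1. Origin: is this caller on the allow list at all?
    if origin not in TRUSTED_ORIGINS:
        raise ValueError("origin")
    # 2. Size: constant-time check, without looking inside the data.
    if not isinstance(raw, str) or len(raw) != ROOM_LENGTH:
        raise ValueError("size")
    # 3. Lexical content: only allowed characters, in any order.
    if not set(raw) <= ALLOWED_CHARS:
        raise ValueError("content")
    # 4. Syntax: the characters are in the right order.
    if not ROOM_PATTERN.fullmatch(raw):
        raise ValueError("syntax")
    return RoomNumber(raw)


room = parse_room_number("A1234", origin="front-desk")
```

A million-character input is rejected at step 2, so it never reaches the set comparison or the regular expression. The fifth rung, whether the room number makes sense in the current state of the system, needs data from elsewhere and is deliberately not part of this function.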

Dan Bergh Johnsson 00:25:56 And then we end up at the fifth step, which I think you alluded to or touched upon in the question, which is: now that we’ve got this input, does it make sense in the system at this point in time? And to continue Daniel’s explanation of what valuable resources we protect, what we’re protecting here is of course the database. Because to check whether something makes sense at this point in time, we probably have to fetch some data from the database, which is a pretty expensive operation.

Daniel Deogun 00:26:28 So this fifth step is actually something we call the semantic validation; it’s a semantic check. And the best way to explain that is if you go back to your order object that you talked about before. An order object would probably have a method add or something like it, so you can add your products to your order, but of course you can’t add things after a certain time. For instance, after you have paid the order, you shouldn’t be allowed to add new products to the order. And because of that you need to make sure that the add method is protected in the sense of: does it make sense to add something at this point in time? Many developers tend to forget that. What they do is simply not call the add method. That’s how they protect against illegal adds, which is poor.
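A semantic check of this kind belongs inside the method itself rather than in the discipline of its callers. The following is a minimal sketch with a hypothetical `Order` class; the rule "no adding after payment" is the one from the conversation, everything else is assumed.

```python
class Order:
    """Hypothetical entity: its state decides which operations make sense."""

    def __init__(self) -> None:
        self._items: list[str] = []
        self._paid = False

    def add(self, product: str) -> None:
        # Semantic validation: the input may be well formed, but does the
        # operation make sense for this order at this point in time?
        if self._paid:
            raise RuntimeError("cannot add products to a paid order")
        self._items.append(product)

    def pay(self) -> None:
        if not self._items:
            raise RuntimeError("cannot pay for an empty order")
        self._paid = True

    @property
    def items(self) -> tuple[str, ...]:
        return tuple(self._items)


order = Order()
order.add("Secure by Design")
order.pay()
```

After `pay()`, a further `order.add(...)` raises, no matter which code path tries it.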

Dan Bergh Johnsson 00:27:16 And we must say, this five-rung ladder of input validation is a very central thing to us, and we build a lot on top of it. Our design of domain primitives is of course built on top of that ladder. When we talk about consistency of states and state evolution, that of course hangs on this ladder as well. And just to give credit where credit is due, we have to make a shout-out to security researcher Dr. John Wilander, who was, to our knowledge, the one who phrased this specific ladder. Unfortunately he hasn’t published it in a peer-reviewed paper, otherwise we could have pointed to it, and it would’ve been worth a paper of its own. He has just mentioned it in conference presentations and in his writing on a blog and stuff like that. But we really take it to heart and say this is a design principle that makes sense from a developer’s perspective but adds a lot of security, especially when you combine it with other patterns.

Sam Taggart 00:28:21 We will touch on some of those other patterns in a second, but before we leave the validation, how does the idea of allow list versus block list play into this discussion?

Daniel Deogun 00:28:32 Once again, I would say you can compare it to being specific or generic in a sense. You know, by doing allow listing you’re very specific. You’re saying only these are allowed to enter, whereas block listing, or a disallow list, would be the opposite. It’s much harder to get right, and it becomes easier to inject something past that type of guard if you go that route.

Dan Bergh Johnsson 00:28:55 And you can see these five checks on the five-rung ladder as five consecutive allow lists. The first allow list checks for allowed origins: you have to have a claim in your token or you have to have an API key or you must come from a well-known IP address or whatever. Second one, you have to have an allowed size. Third one, you have to have allowed characters, et cetera, et cetera, et cetera.

Sam Taggart 00:29:26 I remember reading a book called the Bug Bounty Hunter’s Guide or something similar and it was very eye-opening because it showed here’s how a developer might normally filter this thing. And then it was immediately like, here’s how you get around it. And there was another, oh well if they thought of this then here’s another way around it. And it just seems like a lot of that was based on this idea of like blocking specific things as opposed to like that whole specific general.

Dan Bergh Johnsson 00:29:49 I think that’s the blocking idea, and there’s another idea that I’ve seen come into play in many places. It’s where developers try to be helpful and fix input that is coming in and seems a little bit broken. Like, oh, there was no end of line at the end of this post, it was a null character, so then I help the input by replacing that null character with a line feed, or whatever it might be, replacing or taking away quotes or double quotes or stuff like that. And then you often create a very intricate machine whose effects you do not fully know, and that is what the attacker is looking for. Huh, look here, if I’m sending in this kind of stuff, they help rewrite that into some other format. I wonder what I can do with that.
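A small, well-known illustration of why blocking or repairing input is fragile: a filter that strips a forbidden token in a single pass can be used to assemble that very token. The functions and the display-name rule below are hypothetical examples, not taken from the book mentioned.

```python
import re


def strip_script_tags(text: str) -> str:
    """Block-list style 'repair': remove the forbidden token and carry on."""
    return text.replace("<script>", "")


# The attacker wraps the token around itself; removing the inner copy
# leaves the outer pieces joined into exactly what was supposed to be blocked.
crafted = "<scr<script>ipt>alert(1)"
repaired = strip_script_tags(crafted)

# Allow-list style: say what a display name IS and reject everything else.
_DISPLAY_NAME = re.compile(r"[A-Za-z0-9 ]{1,50}")  # assumed definition


def is_valid_display_name(text: str) -> bool:
    return len(text) <= 50 and _DISPLAY_NAME.fullmatch(text) is not None
```

The allow-list version never rewrites anything, so there is no rewriting machinery for an attacker to steer.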

Sam Taggart 00:30:41 I’m curious, what role does input or unit testing play with all of this input validation?

Daniel Deogun 00:30:46 So with your unit tests, of course, your intention is to see that the behavior is correct. But I would say from a security perspective, you can write unit tests that have more of a, let’s call it, negative characteristic. You basically can test the boundaries that you have defined, right? I mean length boundaries, character types and so forth. So you can see that you are truly rejecting things that are outside of your boundary. But what you can also write tests for is to make sure that you reject things at a very cheap cost. For instance, say you were expecting a zip code, which obviously is a very limited set of characters or numbers. If you write a test that tries to input, I don’t know, a million characters, and you have done it incorrectly, your RegEx validation might halt, even though you know that a million characters isn’t a zip code and you should have rejected it immediately. So simply by adding those checks you can very easily verify that your code meets those criteria using unit tests.
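Negative tests of this kind might look like the sketch below, written with the standard `unittest` module. The five-digit zip code rule and the function `validate_zip_code` are assumptions for the example. Instead of measuring time, the last test checks that oversized input is rejected by the length check, which is the cheap step.

```python
import re
import unittest

ZIP_LENGTH = 5  # assumed format: exactly five digits
_ZIP_PATTERN = re.compile(r"[0-9]{5}")


def validate_zip_code(raw: str) -> str:
    # Cheap check first: length, without inspecting the content.
    if len(raw) != ZIP_LENGTH:
        raise ValueError("bad length")
    if not _ZIP_PATTERN.fullmatch(raw):
        raise ValueError("bad format")
    return raw


class ZipCodeTests(unittest.TestCase):
    def test_accepts_valid_zip_code(self):
        self.assertEqual(validate_zip_code("75001"), "75001")

    def test_rejects_values_just_outside_the_boundaries(self):
        for bad in ["", "7500", "750011", "7500a", "75 01"]:
            with self.subTest(bad=bad):
                with self.assertRaises(ValueError):
                    validate_zip_code(bad)

    def test_huge_input_is_rejected_by_the_length_check(self):
        # A million characters must never reach the regular expression.
        with self.assertRaisesRegex(ValueError, "bad length"):
            validate_zip_code("1" * 1_000_000)
```

Saved in a module, these can be run with `python -m unittest`.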

Dan Bergh Johnsson 00:31:52 Apart from that, I would also like to add something on the consistent usage of input validation. If you’ve got a large system where you’ve got a lot of order identities, order numbers, whatever the difference is, then you need to ensure that you add that small if statement with a RegEx on every single input field. And if you forget to do it in a few places, or the checks become inconsistent in a few places, then you’ve got subtle inconsistencies that an attacker might find interesting. Whereas if you put that into something that is a unit-testable unit, a domain primitive value object capturing the essence of the order number, and you have the unit tests for that, then you automatically get that validation check in all the places where that domain object is used. So if you consistently use the order number domain primitive in all the input fields, then you get the same enforced input validation in all the places. This is why I think domain modeling and using design patterns kind of comes naturally: programmers think it’s nice, good-looking code, so you use that drive to write good code and actually piggyback security on the back of that drive.

Sam Taggart 00:33:22 I just want to reinforce what you said, because it was a very big point in the book too. Instead of passing around the primitives of the language, strings and numbers, you pass around the domain objects, and you put the validation check, which is a very specific thing, checking for a correct order number, in the constructor for the object. And therefore, if it’s in that constructor, then everywhere that you’re constructing that object, you are making sure that that input is validated, and everywhere you’re using it, you’re only using objects that can only be valid, because you can only instantiate them through the constructor. Is that kind of the general idea?

Dan Bergh Johnsson 00:33:52 Indeed, if you’ve got an order number in your hand, you know that it has passed through that validation check. It’s a little bit like you have wrapped it and put a sealed stamp of approval on it.
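
A minimal sketch of a domain primitive along the lines discussed above. The order number format ("ORD-" followed by eight digits), the class layout, and the function `cancel_order` are assumptions made for illustration, not the book’s code. Validation lives in one place, and the frozen dataclass keeps the value immutable once it has passed.

```python
import re
from dataclasses import dataclass

# Assumption: order numbers look like "ORD-" plus eight digits.
_ORDER_NUMBER_LENGTH = 12
_ORDER_NUMBER_PATTERN = re.compile(r"ORD-[0-9]{8}")


@dataclass(frozen=True)
class OrderNumber:
    value: str

    def __post_init__(self) -> None:
        # Runs on every construction, so an OrderNumber cannot exist unvalidated.
        if not isinstance(self.value, str):
            raise ValueError("order number must be a string")
        if len(self.value) != _ORDER_NUMBER_LENGTH:
            raise ValueError("order number has the wrong length")
        if _ORDER_NUMBER_PATTERN.fullmatch(self.value) is None:
            raise ValueError("order number has the wrong format")


def cancel_order(order_number: OrderNumber) -> str:
    # No re-validation deep down in the system: holding an OrderNumber is the proof.
    return f"cancelled {order_number.value}"
```

Code that accepts an `OrderNumber` rather than a string has no reason to repeat the pattern match, which is what removes the five or six slightly different RegEx checks described in the conversation.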

Sam Taggart 00:34:05 This is as opposed to checking it at every single method every time you want to use it, which would get very…

Dan Bergh Johnsson 00:34:10 Exactly. Has anyone ever, deep, deep down in a system, seen a small if statement saying “if order number matches” and then a RegEx, and you look at it and say, what the flying… we are deep, deep down in the system, this should have been input validated like a gazillion times before. And then you start backtracking that code and you realize that it has actually passed through that specific pattern matching like five or six times already. But deep, deep down there in the system, the programmer was still not confident enough to say that this string actually has got the right format.

Sam Taggart 00:34:49 The immediate next question that raises is: were all those RegExes the same?

Daniel Deogun 00:34:54 Exactly.

Dan Bergh Johnsson 00:34:55 Most probably not. Subtle differences.

Daniel Deogun 00:34:59 So I would say that what we touch upon right now also is the complexity of software. If you choose the design that we sort of encourage, like what we have called the domain primitive, with this validation and immutability of objects, as a building block that you can use wherever it’s defined, then you reduce complexity. You don’t have to have those double checks, you don’t have to have all these RegExes everywhere, and so forth, which reduces overall complexity. And by reducing overall complexity, you also decrease the risk of failure, which in turn reduces the risk of having a security vulnerability. So it sort of all goes hand in hand: by using code constructs that help you design better code, simpler solutions so to say, you also reduce the risk of error, and yeah, it makes it better and more secure.

Sam Taggart 00:35:56 Let’s talk a little more about failure. That wasn’t exactly in my list, but I remember talking about that in the book. There were some very specific points about like how to fail well. Was there a concept of bulkhead or something like the different compartments in a boat? Is there some, do I remember that correctly?

Daniel Deogun 00:36:09 Oh right, right. Yeah. Right. So if you look at it from an architectural perspective, one very common mistake is that you have cascading failures, right? So using proper bulkheading is a very good design pattern that allows you to limit the blast radius to a certain area of your design and make sure that it doesn’t propagate to your other areas. Because otherwise an attacker can simply attack the weak spot in your architecture, and then after a while the problem propagates to the next one and the next one and the next one. So by using good design to make sure that your bulkheading is good, you protect yourself against that type of availability error.

Dan Bergh Johnsson 00:36:48 So one example of bulkheading can be to compartmentalize different customer regions into different databases that are sharded. It could be that you do not put everything into the same database, the same database instance, all of it. Because if one of the services crashes because of the database, then all services that depend on the database crash. It’s like having a big microservice architecture standing on the same tray. If the tray tips over, you’ve still got a single point of failure. So I think this bulkheading idea is very potent. It can be reused both in the infrastructure and in the construction of the code and also in the modeling.

Sam Taggart 00:37:32 That reminds me of the idea of an attacker breaking into a network and then moving laterally. Very similar idea. They find the one weak point and then that happens to be connected to another point and there’s a gateway there and then they just spread.

Dan Bergh Johnsson 00:37:44 And this is basically what the new zero trust, I am not going to say fad, but the attention to zero trust architectures is very much about: not having perimeter security but having security in depth. What we often like to talk about is not only in depth, but having interlocking practices and interlocking patterns that grow stronger together than if you just practice one of them on its own.

Sam Taggart 00:38:13 Speaking of other patterns, what I think struck me, because I had not really thought about it a whole lot, was the idea of logging and being very careful about how and what you log. And from a developer’s point of view, not thinking about security, I always want to log as much information as possible because I want to be able to debug the issue. But there are very serious issues with that. Can you talk about that a little bit?

Daniel Deogun 00:38:34 There’s so many different types of attacks and one of them would be like a second order attack and the easiest way to explain that in terms of logging is that the attacker injects an attack vector that isn’t really targeting your primary system.

Dan Bergh Johnsson 00:38:50 So Daniel is sending it to me. Yeah, but it’s not really hurting me.

Daniel Deogun 00:38:54 Exactly. And that attack vector is logged. So the log data then is interpreted by someone with, I guess, a tool that’s perhaps web-based or something, and there’s a weakness in that tool. So that attack vector is for that tool, meaning that the attacker uses your system as a jump host in order to attack that other system because of this logging. And our tendency as developers to want to help all the time, right, we want to log so much so we can really help you in debugging things, can be a bit risky from the security perspective, right? So instead we should log things that describe the problem, right? Saying I got an invalid input, it was too long or too big or something like that.

Sam Taggart 00:39:43 Without reproducing the actual input would be one of the keys there.

Daniel Deogun 00:39:47 Exactly. Yeah. And if you really, really need to get your hands on that input, well, then you have to reach out to whoever sent you that input, right? Then you can probably figure it out.
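
A minimal sketch of logging a description of the problem and the sender instead of the input itself. The logger name, the length limit, and the functions `describe_rejection` and `reject` are hypothetical. The sketch also assumes that `sender_id` comes from a trusted source such as an authenticated session, since otherwise it would be attacker-controlled text too.

```python
import logging

logger = logging.getLogger("orders.input")  # hypothetical logger name
MAX_COMMENT_LENGTH = 200  # assumed limit for a hypothetical comment field


def describe_rejection(raw: str, sender_id: str) -> str:
    # Describe what was wrong and who sent it; never echo the raw input.
    if len(raw) > MAX_COMMENT_LENGTH:
        reason = f"too long ({len(raw)} chars, max {MAX_COMMENT_LENGTH})"
    elif not raw.isprintable():
        reason = "contains non-printable characters"
    else:
        reason = "did not match the expected format"
    return f"rejected input from sender={sender_id}: {reason}"


def reject(raw: str, sender_id: str) -> None:
    # The log line carries only derived facts, so a log viewer never renders the payload.
    logger.warning(describe_rejection(raw, sender_id))
```

With this shape, a script payload sent as input produces a log line about its length and origin, and whoever needs the payload itself has the sender to ask.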

Dan Bergh Johnsson 00:39:57 So for example, why should you log input verbatim instead of tracking who sent you that, and then asking them, why did you send this?

Sam Taggart 00:40:07 I can think of an interesting reason to do that though. And that’s not from a developer perspective, from a security researcher’s perspective, right? If I’m the security engineer for the team and somebody sends me a thing and I get a thing that says, oh it was invalid input, I am very interested in, well what was that input? Because I want to know what tripped what they were trying to do, but I don’t know how to do that.

Dan Bergh Johnsson 00:40:23 We do acknowledge that situation and say that of course you can sometimes track things, but then you should put them in a separate database. It should be stamped with “piping hot.” You’re only allowed to open this while wearing gloves and breathing protection, and you have to have your certification to do it. Basically saying that this is dangerous stuff, these are potentially attack vectors in there. What we think is a very valuable way of thinking about it alludes to that: developers often think about logging as debugging, but if you look at a running system, well, debugging is valuable, but what we really want logging for is operational insight. So the question is really, what would we log to get insight into what the system is actually doing while running in production? You should be able to look at the system and say, oh, it’s doing this, it’s going well. And then you should capture interesting things. For example, it could be extremely interesting to sample those attack strings and put them in a really, really well protected place. It might be interesting to look at the rate of 200s versus 400s, so we need to do some kind of logging or metric around that. But that’s not really the same thing as debugging, and we think that’s a really important distinction between those two.

Daniel Deogun 00:41:54 I would like to add also that another thing many forget, when we talk about logging and how you set it up in your system and so forth, is that you tend not to limit the privileges of your log-writing process. Usually you only write data to your logs, but from a privilege perspective you can still read the data. So you forget the principle of least privilege here. What you should do in this case is limit it, so your system can only write data to your logs, whereas a human or a process that analyzes your logs can only read them. They can never write to them. By doing that type of separation, you actually create an extra security layer, so to say.

Dan Bergh Johnsson 00:42:38 And we do see a movement here. Over recent years, people have gone more and more towards having centralized logging systems where you gather the logs somewhere, instead of having one log file for each node of the server, because then you’ve got a real correlation nightmare if you want to understand what’s going on. So this is also something that is secure by design, through design. Those centralized logging systems were mostly made to make life easier for sysadmins, for observability, et cetera. But they also have very interesting security side effects. Which is, I think, a really nice example of something that has that secure by design characteristic: a design that makes things more secure. But we’re stepping away from the earliest years of secure by design, which was domain-driven security, domain-driven design driving security. And then we reiterate that idea of having some kind of design that gives a security benefit, but in other fields.

Sam Taggart 00:43:49 Yeah, well I imagine the logging, the separation of reading and writing that also helps with things like attackers getting in deleting logs and or like auditing and people manipulating logs to make things look better or whatever their agenda might be.

Daniel Deogun 00:44:03 Exactly. Exactly. And you get that sort of, what should we call it, security benefit for free, so to say, even though you didn’t think of it that way. You said, all right, well, let’s just make sure that we can only write here. Then you have protected yourself against that type of action, so to say.

Dan Bergh Johnsson 00:44:20 Taking that one step further, of course you can also split your log into different places depending on the different purposes. For example, you want to have some kind of metrics log just to keep track of the performance of your running system. You might have some other kind of log for sysadmin maintenance, and you might have a third log, which is for auditability, for when the auditors come in and say, we want to have a look at that, we want to see what’s been happening in your systems the last seven years. And by doing what Daniel said, you can get the separation, so you can treat each of those log sinks, for different purposes, separately.
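
A minimal sketch of splitting logging by purpose with Python’s standard `logging` module. The logger names and the helper `make_sink` are hypothetical, and in-memory streams stand in for real destinations (for example, a cheap short-lived metrics store and a long-retention audit store). Write-only permissions for the application would be enforced by the storage behind each sink, which this sketch does not show.

```python
import io
import logging


def make_sink(name: str, stream: io.StringIO) -> logging.Logger:
    # One logger and one handler per purpose; nothing is shared between sinks.
    sink = logging.getLogger(name)
    sink.setLevel(logging.INFO)
    sink.propagate = False  # keep records out of the root logger's handlers
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(message)s"))
    sink.addHandler(handler)
    return sink


# Assumption: StringIO streams stand in for separately managed destinations.
metrics_stream, audit_stream = io.StringIO(), io.StringIO()
metrics_log = make_sink("sketch.metrics", metrics_stream)
audit_log = make_sink("sketch.audit", audit_stream)

metrics_log.info("http_status=200 latency_ms=12")
audit_log.info("user=alice action=order_cancelled")
```

Because each purpose has its own sink, retention, cost, and access rules can differ per sink, which is the point made about metrics versus audit logs.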

Daniel Deogun 00:45:00 You can also make your CFO very happy because then of course you can have much cheaper solution for your metrics compared to your audit log, right? That you need to maintain for years. Whereas the metrics can go away tomorrow, right? I mean it doesn’t really matter. So that also comes from using that type of design.

Dan Bergh Johnsson 00:45:18 But we think that log systems are so interesting and have so many concerns in and of themselves that developers spontaneously don’t take them seriously enough. For a lot of developers, the log is just a file on your log system, but it’s actually a subsystem of its own, with its own characteristics and runtime environments and financial implications and compliance implications. So we think it’s really worth spelling that out and making the logging a first-order citizen, a subsystem: it merits an API of its own, and it needs maintenance and management of its own, et cetera.

Sam Taggart 00:45:58 Have you found that the recent Log4j bug brought logging to more people’s attention, at least in the C-suite perhaps?

Dan Bergh Johnsson 00:46:05 Well, guilty as charged, me personally, definitely.

Daniel Deogun 00:46:09 I was quite surprised when I talked to many people around that time. Many treated Log4j as console output almost, right? I mean, like, if I input something, I expect that something to be output, you know, and they didn’t see Log4j as, I would say, a library or a thing that was its own system.

Dan Bergh Johnsson 00:46:34 It was a handy way to write to the log file.

Daniel Deogun 00:46:36 Yeah, exactly. It was just a bridge to that. And they expected data to be output exactly the same way as you input it. I think that at least put logging on the agenda.

Dan Bergh Johnsson 00:46:49 Yeah.

Daniel Deogun 00:46:50 Yeah.

Dan Bergh Johnsson 00:46:50 I would say to be honest, we actually had a few presentations at conferences thereafter under titles like What Log4J Taught us about Secure by Design and stuff like that.

Sam Taggart 00:47:03 There is one more topic I want to talk about briefly, and that is the three Rs that feature in your book. Can you talk about that a little bit?

Daniel Deogun 00:47:11 Sure. The three Rs stand for Rotate, Repave, and Repair. Dan, so what’s your favorite R here?

Dan Bergh Johnsson 00:47:18 I think we should just briefly do the same thing and say where this design comes from and where it ends up security-wise. So basically this takes a springboard off work in the reliability field for cloud, for example the 12-factor manifesto, the Rugged Manifesto, et cetera. And it’s a realm of designs that make it easier for you to survive in a cloud environment, but that also come with interesting security benefits. So there are three Rs. My favorite, I would say, is the one that is strangely named repave, but which I understand as replace. And it’s a routine of replacing your running instances in a structured and scheduled manner. For example, if you have a cluster of five nodes, then you spin up a new one, commission a new one, and then decommission another one just because it’s an hour old, not because it’s broken or something like that. So you constantly replace everything that you’ve got on your cluster with fresh instances.

Daniel Deogun 00:48:36 From a security aspect, what that means is that if you as an attacker are able to install something on those nodes, then you have to reinstall it every time a new image is commissioned into your cluster. And then you might say, oh well, that shouldn’t be too bad for an attacker. They could probably script this or, you know, keep installing their stuff on those nodes. But that also creates traffic on your network, right? I mean, it increases the chances of detecting this type of activity. So that’s only one R: basically, in a random fashion, we could repave our cluster with, you know, our images, for instance, if that’s the case.

Dan Bergh Johnsson 00:49:15 And the kind of threat actors that we want to make life difficult for, it’s not the script kiddies that do drive-by hacking, because they’re not that dangerous. Those that we want to shield off with this kind of scheme are, of course, the persistent threat actors that do advanced persistent threats, because they’ve got patience and they’ve got resources, and they install something and they use that as a sniffer. And by doing this kind of routine that Daniel talks about, they get lost. Everything they’ve installed just gets lost, and they need to reinstall instead, and then they increase the chances of getting detected.
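
A minimal sketch of the repave routine as a pure function over a list of nodes, assuming the one-hour age limit from the example in the conversation. `Node`, `launch`, and `repave` are hypothetical names; a real system would call its cloud or orchestration API where `launch` is used here.

```python
from dataclasses import dataclass
from itertools import count
from typing import List

MAX_AGE_SECONDS = 3600  # "just because it's an hour old"
_ids = count(1)


@dataclass(frozen=True)
class Node:
    node_id: int
    started_at: float


def launch(now: float) -> Node:
    # Stand-in for commissioning a fresh instance from a known-good image.
    return Node(next(_ids), now)


def repave(cluster: List[Node], now: float) -> List[Node]:
    # Replace every node that has reached the age limit, healthy or not.
    # Younger nodes are kept, so the cluster size stays the same.
    fresh = []
    for node in cluster:
        if now - node.started_at >= MAX_AGE_SECONDS:
            fresh.append(launch(now))
        else:
            fresh.append(node)
    return fresh
```

Anything an attacker installed on an old node disappears with that node, so persistence requires repeated reinstallation, and each reinstallation is another chance to be noticed.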

Daniel Deogun 00:49:52 Another R is of course rotation, and that’s rotation of secrets, for instance. By designing your software to get your secrets from your environment, like a key vault or some other system that contains your secret, you could design it so that the secret is rotated on, I guess, a random schedule. You can be really offensive if you want to. You can go as far as changing secrets after each call. I guess that would be crazy, but in theory you can have a very, very aggressive rotation scheme.

Dan Bergh Johnsson 00:50:26 I think this becomes a little bit clearer if you see the anti-pattern that we are trying to counteract. Daniel, what is the record we’ve seen of people having the same production database password?

Daniel Deogun 00:50:41 I think at least the record I think you’re referring to is one client, well they hadn’t changed their password in production for the past 18 years. The downside was that they also used the same password in their other environments as well because it was much easier.

Sam Taggart 00:51:00 My immediate question is how many employees left during those 18 years who still know that password? 18 years.

Dan Bergh Johnsson 00:51:06 They were really aware that it was too many. Yeah. And I think there’s something really interesting here, because it speaks indirectly to what we are talking about. The reason for not changing it for 18 years was that it was too complicated: to be able to do it, you would basically need to have all the developers and all the operations people in the house at the same time. You had to tear down the whole system, recompile everything from scratch, and get it up and running again. So the application design was putting a stop to doing rotation of passwords, and that is what Secure by Design is about, of course. It’s about making designs that make the systems operable, so that it’s possible to do these kinds of things.

Daniel Deogun 00:51:54 Well, I would say that if you have designed the software so that you either, I mean, I guess you hard code your secrets in your code, or you supply them at deploy time, perhaps that could also be the case, then you create a mess where every system that depends on that secret needs to be either redeployed or recompiled or rebuilt, which makes it very difficult for someone to decide that, oh, we should change this password now, or we should, you know, rotate this. Instead, if you design the software in such a way that you fetch the secret upon need, then you can simply rotate it in between those fetches, and also create a, how should I say, resilient retry mechanism against the resource that you’re trying to access, and make sure that, well, if my key or whatever I have doesn’t work anymore, I need to go and fetch a new one. And that sort of emulates the problem of a very flaky network or something like that. Right. Or service.
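
A minimal sketch of fetching a secret upon need and retrying when it has been rotated in between. The callables `fetch_secret` and `call_resource` and the exception `AuthenticationFailed` are hypothetical stand-ins for a real vault client and a real resource; no specific vault API is implied.

```python
from typing import Callable


class AuthenticationFailed(Exception):
    """Raised by the (hypothetical) resource when a credential is rejected."""


def call_with_fresh_secret(
    fetch_secret: Callable[[], str],
    call_resource: Callable[[str], str],
    max_attempts: int = 3,
) -> str:
    # Fetch the secret at the moment of need, not at build or deploy time.
    last_error = None
    for _ in range(max_attempts):
        secret = fetch_secret()
        try:
            return call_resource(secret)
        except AuthenticationFailed as error:
            # The secret may have been rotated between fetch and use: fetch again.
            last_error = error
    raise RuntimeError("credential still rejected after retries") from last_error
```

From the caller’s side a rotation then looks like one transient failure followed by a successful retry, much like the flaky network mentioned in the conversation, and no redeploy or rebuild is involved.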

Dan Bergh Johnsson 00:52:54 I think it’s interesting to see that this is also an instance where a lot of interlocking designs together give a very strong security benefit. Because under the hood, for this to work, you have the design of making your configuration, or configuration parameters, external, which in turn makes it possible to make immutable builds, where you actually have the same build artifact in the test environments and in production environments and in multiple instances of the production environments, which also gives a lot of other security benefits. But it’s all these designs together that make it possible to have this kind of light-handed ability to change the production database password without even having downtime.

Sam Taggart 00:53:41 So we need to wrap up here, but I do have one last question that I wanted to talk to and actually you kind of changed my thinking about. I was going to make the statement that it seems that security is more of a cultural than a technical problem. But I think what we just talked about hits on the fact that sometimes there are technical barriers to changing the culture and I’m curious to get your thoughts on that as a last statement.

Daniel Deogun 00:54:00 Well, actually, first I want to point out for the audience here that we actually just covered two Rs. So I want to mention the third R, which is actually very important, and that’s the repair R. And basically what that means is that you rebuild your software and your images, upgrade them to the latest versions of your dependencies and whatnot, and then you can deploy them in your normal repave cycle. But then let’s try to address Sam’s question here.

Dan Bergh Johnsson 00:54:26 To start with, I think there’s a lot of merit to your original point that there’s a lot of culture going on, even if that culture is technical. I think what is worth pointing out is that there will be a lot of technical work, but for that technical work to be fruitful, you need to have a no-blame, high-cooperation, psychologically safe culture. Because if you haven’t got that and someone says, I think our application design does not really make it easy to change passwords at runtime, then it might be perceived as you trying to level criticism at the Chief Architect, and that Chief Architect might get, et cetera, et cetera, and things get sour. And this has also been studied in the research: organizations that are able to promote a high-trust environment, where you can voice your concerns and get them listened to and have an open discussion, do display a lot higher security. So I think of culture as the fundamental thing needed to be able to do technical work at all in a high-quality fashion. And then you get to do all that technical stuff that gives security. But technical excellence without a culture is not possible. And cultural excellence without any technical knowledge at all will not yield security. So

Sam Taggart 00:55:50 It’s a both/and.

Dan Bergh Johnsson 00:55:52 Yes, it’s definitely both/and.

Sam Taggart 00:55:53 Great. Well thank you very much. I think it’s been a great conversation. For SE Radio, this is Sam Taggart. Thanks for joining us.

[End of Audio]
