Computer Scientists Develop 'Mathematical Jigsaw Puzzles' To Encrypt Software

Posted by samzenpus on Thursday August 01, 2013 @12:32AM from the looking-for-edges dept.

another random user writes "The claim here is that the encrypted software can be executed, but not reverse-engineered. To quote from the article: 'UCLA computer science professor Amit Sahai and a team of researchers have designed a system to encrypt software so that it only allows someone to use a program as intended while preventing any deciphering of the code behind it. According to Sahai, previously developed techniques for obfuscation presented only a "speed bump," forcing an attacker to spend some effort, perhaps a few days, trying to reverse-engineer the software. The new system, he said, puts up an "iron wall," making it impossible for an adversary to reverse-engineer the software without solving mathematical problems that take hundreds of years to work out on today's computers — a game-change in the field of cryptography.'"

This discussion has been archived. No new comments can be posted.


I Call BS (Score:5, Insightful)

by MightyMartian ( 840721 ) writes: on Thursday August 01, 2013 @12:36AM (#44443399) Journal

I'm sure they can further obfuscate the actual code, but at the end of the day the processor is going to have to run machine code, and one way or the other you can tap the processor's activity to read the "decrypted" code. Beyond that, I imagine the performance penalties involved will be monstrous. Even normal obfuscation techniques have pretty heavy penalties.
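
As a rough illustration of what "tapping" execution looks like, here is a minimal Python sketch that uses the interpreter's tracing hook to record which lines of a function run for a given input. The `check_license` function and its key rule are hypothetical stand-ins, not anything from the paper, and a CPU-level trace of a real binary would be far noisier.

```python
import sys


def check_license(key: int) -> str:
    # Hypothetical stand-in for a protected program with two branches.
    if key % 7 == 3:
        return "unlocked"
    return "denied"


def trace_lines(func, *args):
    """Run func(*args) and record the line offsets (relative to its def) that execute."""
    executed = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only record 'line' events that belong to the function under study.
        if frame.f_code is code and event == "line":
            executed.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, executed


denied_result, denied_path = trace_lines(check_license, 1)
unlocked_result, unlocked_path = trace_lines(check_license, 3)
print(denied_result, denied_path)
print(unlocked_result, unlocked_path)
```

Each trace exposes only the path taken for that one input. The two recorded paths differ, so a single run does not reveal the whole program.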

- Re: (Score:2)
  
  by shentino ( 1139071 ) writes:
  
  TPM could put a stop to that by requiring auditing of voltages.
  MS already requires this of PVP drivers for vista and will revoke them if you allow copyrighted content to leak.
- Re:I Call BS (Score:5, Informative)
  
  by phantomfive ( 622387 ) writes: on Thursday August 01, 2013 @01:10AM (#44443599) Journal
  
  The title is wrong. It's not talking about encrypting the software, it's talking about obfuscating the software. They put the compiled code through a function that obfuscates it thoroughly. Their function is complex enough that de-obfuscating the code (that is, returning it to the original form) would be computationally expensive. The paper also talks about encryption, but that is a different part of the paper.
  
  Which isn't to say you can't look at some variable in RAM, realize it is the boolean for 'authorized,' and then overwrite it. So it's essentially a complex obfuscation technique, which may or may not be useful. (That is my understanding of the article and abstract, if I am wrong please correct me).
  
  - Re:I Call BS (Score:5, Informative)
    
    by msobkow ( 48369 ) writes: on Thursday August 01, 2013 @06:51AM (#44444835) Homepage Journal
    
    I know a fellow who worked on a system that would take IBM 360 machine code, convert it into graph format, and re-engineer it as C/C++ code. They made a killing on Y2K, because so many companies didn't have the source code for some of their software any more.
    So yes, I call BS as well. Computationally complex, yes, as graph theory often is. But far from impossible to reverse engineer.
    If it can execute, it can be reverse engineered. Only the most obtuse and purposefully warped hand-written assembler can even approach the ability to hide an algorithm from deep inspection, and even that can be largely overcome by applying graph theory to restructure the code graphs.
    
    - Re: (Score:2)
      
      by msobkow ( 48369 ) writes:
      
      You can contact professor Thomas Dean at Queen's University in Kingston for details of how such graph theory applies -- he was involved with the development of this reverse engineering tool.
  - Re: (Score:2)
    
    by solidraven ( 1633185 ) writes:
    
    Yeah, but the real problem I see with this method is that once you found an entry point and a set of conditions to one specific "tile of the jigsaw" you might be able to build up an entire tree of solutions quite quickly and solve it. It obviously requires some intelligence, hundreds of years is a brute force, but nobody said you can't use a smart attack vector on this one either. And the moment you have a point you can latch onto most obfuscation techniques tend to fail quite quickly. At least if they do n
  - - Re: (Score:2)
      
      by gigaherz ( 2653757 ) writes:
      
      No, you have an unnamed function with unnamed parameters that takes a series of unknown numbers, and returns another unknown number.
      The idea of obfuscation is that it makes the job hard to make those unknowns known, so that you can name the unnamed.
      I didn't read the article, but it looks to me like at some point, the code is going to call some OS API functions, or call the UI library, or similar. And you can always start the naming from there. But of course if the WHOLE PROGRAM is obfuscated, so that the in
      - Re: (Score:2)
        
        by Half-pint HAL ( 718102 ) writes:
        
        ...then it may be impractical to figure it out. Not impossible though.
        Everything in cryptography relies on the impracticality of brute-force attacks, which are never impossible. That's why we talk about security in terms of hundreds of years.
    - Re:I Call BS (Score:5, Informative)
      
      by greg1104 ( 461138 ) writes: <gsmith@gregsmith.com> on Thursday August 01, 2013 @04:46AM (#44444441) Homepage
      
      You can make the algorithm as complex as you'd like, but at the end of the day, you have an input and output(s). You may decide to take a long time to get there, but at the end of the day, I know what you did when the code ran.
      This is referred to as "black-box" reverse engineering in the paper. You know what the code did for one input. And if you injected every possible input to the program, at the end you'd have worked out a complete specification for the function. But how long will that take? It's not always "the end of the day". For functions with a wide range of inputs, it could take the life of the universe to map them all unless you know what you're looking for in advance.
      Right now, obfuscation approaches for software usually have some small choke point to attack. It might be an encryption wrapper around the main binary. Bust that with a debugger, and you can get to unobfuscated code for the main system, and then really start to work your way through the program.
      But if you have to fight this every step of the way, where all you do is inject inputs and get an output to figure out what the program does, it will take you forever to reverse engineer things. That's the claim of the paper. The code itself is so obfuscated that you can't read it straight and understand it, ever. It looks like random junk. All you can do is run it with an input and see what comes back, and from that reverse engineer what it does. Assemble enough of those and you can see how the program really works. But that black box teardown process--determining possible inputs, injecting them, collecting outputs, and then deriving the function behavior--is time intensive enough that it may not be practical to actually do it anymore. You don't learn anything from reversing any component that speeds handling the next; you have to attack them all like this.
      There's a great line from the seminal paper on this subject, On the (Im)possibility of Obfuscating Programs [weizmann.ac.il]: "Any programmer knows that total unintelligibility is the natural state of computer programs (and one must work hard in order to keep a program from deteriorating into this state)." Extracting meaning from source code, being able to predict what some lines of code will do, is hard. Ideally you'll just be able to read the code, make sense of it, and then reverse from there. Most systems that are thoroughly cracked have some sections where it's hard, but once you get those the remaining code is straightforward to read. And in fact, it's impossible to make something where you cannot reverse it. The question though is how long it will take.
      If no sense is ever made of the code, you can only apply the "black-box" reverse engineering, where you inject inputs, watch outputs, and from there determine what the code does. You can't prevent that, but you can make the box so big that such work is impossible to do. That's what this technique tries to accomplish. You never find an easy part to the code you can read; all you ever find are ones where you have to map every input into an output to figure out what the code does.
      
      - Re: (Score:2)
        
        by HJED ( 1304957 ) writes:
        
        But surely you only need the useful component of the algorithm; there is only going to be a certain range of useful inputs.
- Re: (Score:2)
  
  by tlhIngan ( 30335 ) writes:
  
  I'm sure they can further obfuscate the actual code, but at the end of the day the processor is going to have to run machine code, and one way or the other you can tap the processor's activity to read the "decrypted" code. Beyond that, I imagine the performance penalties involved will be monstrous. Even normal obfuscation techniques have pretty heavy penalties.
  Not really. I've seen memory encryption units that ensure that all data hitting memory is encrypted, and it's possible to have the startup code also
- Just doesn't work... (Score:2)
  
  by TiggertheMad ( 556308 ) writes:
  
  Yeah I agree, there is something fundamentally wrong with the claims being made. If I have byte code, I can rebuild loops and conditionals with a decompiler. Sure, I don't have comments or var names, but those can be guessed or worked out in something less than 'several hundred years'.
  
  Suppose you build something that throws in a lot of crazy random jmp calls to make this harder, and I can't be bothered to reconstruct a program I want to steal. At some point a single Boolean decision hits the call stack
  - Re: (Score:3)
    
    by superwiz ( 655733 ) writes:
    
    You are missing the point of code being data. What it means in this context is that there is a duality between functions and the data they operate on. In other words, you can think of data as operating on the code instead of code operating on data. So your data becomes your maps (in the mathematical sense of the word map) and your functions become the mapped objects.
  - Re: (Score:2)
    
    by greg1104 ( 461138 ) writes:
    
    You're focused on a crack of one function in this sort of program. Given enough time, it is always possible to map all the inputs and outputs of a function, and therefore replace it with another that does the same thing--but with a change like a crack installed. The question, though, is how much time will it take?
    There's not an easy to spot boolean on the line here; we're past when idiots built these things now. Let's say the output from the DRM checker is a 1024 bit key that unlocks the next function in
  - - Re: (Score:3)
      
      by epine ( 68316 ) writes:
      
      Great, so all you have to do is replace that conditional so it always evaluates to true, no? When you actually do this, the program happily writes an answer to the screen every time. The only problem is, if you provided an invalid security key at the beginning, the answer it writes is complete nonsense. You see, it's secretly already tested the security key, and if it was wrong, the answer ends up being wrong too.
      I implemented exactly this circa 1990 to protect a small database of disambiguation rules struc
      - Re: (Score:2)
        
        by epine ( 68316 ) writes:
        
        One extra detail: the alphabet of 50 characters was the effective entropy over a much larger space of symbols. I described the tree in entropy space, because that is what mattered to its performance profile. The naive view is that the symbol set contained 8000 symbols and that four character strings could be selected from a set of 8000^4 members.
        I ignored this detail because conventional reverse engineering would very quickly determine that we only go to the hash table for a much smaller nucleus of t
        
        Re: (Score:2)
        
        by rgbatduke ( 1231380 ) writes:
        
        Wow, beautiful reply. The bottom line message for me was that it was a one-off, in a sense security by obscurity (or, as you put it, by expensive obfuscation), and that to make it work you had to hand code it at a key point in a suitable application. And it doesn't make functional reverse engineering more difficult (seeing what the application does and trying to duplicate it functionally), it just makes disassembling the code into semantically meaningful functional units that can similarly work together t
- Re: (Score:2)
  
  by physicsphairy ( 720718 ) writes:
  
  "at the end of the day the processor is going to have to run machine code, and one way or the other you can tap the processor's activity to read the "decrypted" code"
  Sure, and one way to reconstruct the program is to provide it every possible input and map the outputs. For that matter, one way to reconstruct the program is simply to load it up, see what it does, and code your own version that does the same thing.
  But the question is how deeply you can inspect the algorithms based on what you see happening i
- Re: (Score:2)
  
  by jhol13 ( 1087781 ) writes:
  
  I do not think this is the aim. The aim is to hide the "high level" algorithm so that you cannot find it out. The obfuscation seems to be done by transforming the source, not the binary.
  This way if you have some "interesting" high level algorithm (to solve some real life problem) you can make reverse-engineering it not only a tad hard, but extremely difficult. For example, suppose I can solve NP in polynomial time. You can see from the virtual machine traces how one case of the problem is solved, but you cannot get
- - Re:I Call BS (Score:5, Insightful)
    
    by MightyMartian ( 840721 ) writes: on Thursday August 01, 2013 @12:46AM (#44443459) Journal
    
    Unless the software can magically tell that it's running in a debugger or a sandbox and refuse to execute, its activity, from stack activity to system calls to memory allocations, can be traced.
    
    - Re:I Call BS (Score:5, Informative)
      
      by 0123456 ( 636235 ) writes: on Thursday August 01, 2013 @12:52AM (#44443493)
      
      When I was writing Windows device drivers, we so, really loved stupid games that detected a debugger and refused to run.
      Fortunately most developers stopped doing that once they realised that their customers would have to wait for bug-fixes until they'd reported it to us and then we'd contacted the developers and they'd sent us a special version without the retarded code that stopped us debugging the problem.
      
      - Re: (Score:2)
        
        by MightyMartian ( 840721 ) writes:
        
        I see no reason you can't use a VM to accomplish the same damned thing.
        
        Re: (Score:2)
        
        by ls671 ( 1122017 ) writes:
        
        Cause the game detects it runs in a VM?
        http://communities.vmware.com/thread/273480?start=0&tstart=0 [vmware.com]
        
        Re:I Call BS (Score:5, Insightful)
        
        by MightyMartian ( 840721 ) writes: on Thursday August 01, 2013 @01:44AM (#44443757) Journal
        
        That's because VMs really don't try to hide themselves from the guest. While it might be pretty hard to build a VM that did a good enough job to essentially fool software attempting to identify whether it's physical or virtualized hardware that it's running on, we have the source for a number of VMs (e.g. KVM and Xen), and if this kind of obfuscated software started showing up on the market, I'm sure there would be a much greater push to make a rock-solid virtualized environment that mimicked physical hardware with much more fidelity.
        
    - Re: (Score:3)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
    - Re:I Call BS (Score:4, Insightful)
      
      by MikeBabcock ( 65886 ) writes: <mtb-slashdot@mikebabcock.ca> on Thursday August 01, 2013 @01:59AM (#44443823) Homepage Journal
      
      This exactly -- virus writers have been at the forefront of code that hides and obfuscates itself and VM type systems were developed to essentially run the code to determine its effects without actually running the code.
      So long as the code can be executed, it can be reverse-engineered.
      
    - Re: (Score:2)
      
      by Dunbal ( 464142 ) * writes:
      
      Yup, it's the basic principle of reverse-engineering anything. If your CPU can read it, then you can read it.
      - Re:I Call BS (Score:5, Informative)
        
        by Half-pint HAL ( 718102 ) writes: on Thursday August 01, 2013 @04:42AM (#44444419)
        
        Not exactly. Reverse engineering starts with the analysis of a running program with the goal of obtaining enough data about the locations of executable code vs data, major variable locations/identifiers etc to allow you to start running automated disassembly of the total code. All you can get out of a run-time analysis is a specific execution path, which does not embody the full code.
        Now yes, "if your CPU can read it, then you can read it", but unfortunately in this situation, your CPU can only read it in certain circumstances, so you'll only be able to read it in those same circumstances: execution or simulated execution, leading back to the situation where you're stuck looking at traces of specific executions...
        
        
        Re: (Score:2)
        
        by Dunbal ( 464142 ) * writes:
        
        No, if you really really want to, you can get the data from the hardware bus. Faster than farting around in someone's obfuscated software. That's the way some consoles and Blu-ray were broken, if I recall. And of course it only takes one person to figure out the algorithm for encryption and publish it, and the lock is broken for everyone on the planet.
    - Re: (Score:2)
      
      by solidraven ( 1633185 ) writes:
      
      That's why you use virtualisation software to begin with when doing this sort of stuff. Then you can simply access anything without the software knowing about it. Memory isolation works both ways in terms of security.
  - Re:I Call BS (Score:5, Interesting)
    
    by Anonymous Coward writes: on Thursday August 01, 2013 @01:04AM (#44443567)
    
    Jigsawing is not encryption; it's splitting things up into shared object libraries (.dll's) that the linker then has to "reassemble" as the program is loaded into memory, where it is executable, i.e. the program counter can be pointed at it and let run. No, the best obfuscation I have yet seen is X86 assembler, and even that doesn't run without a ton of overhead, but... disassemblers exist for it. /. is right to scoff: A) the write-up is contradictory; B) the article makes no sense, because in order to run the code the processor has to solve the puzzle, so you just stick it in an emulator and hit record; C) this looks like a school and/or corporate overlords hyping an unpublished paper, which many of us recognize as one way peer reviewers are pressured into signing off on snake oil; D) who claims that this is the first attempt at obfuscating code? This thing: http://www.ioccc.org/ is in its 22nd year.
    
    - misunderstanding (Score:3, Interesting)
      
      by Anonymous Coward writes:
      
      either you misunderstand or i do, but my understanding was that this system allows a function to be expressed so that the code will only execute with certain parameters, hence the function can only compute certain evaluations (lets say one for simplicity) of the original function, and the true function can be hidden. so that the function evaluated for other parameters cannot be computed with this expression.
      basically a really complex way of designing a look up table ?
      now as you say, you can never hide the v
- - Re: (Score:2)
    
    by Anubis IV ( 1279820 ) writes:
    
    So long as we're able to see the instructions being given to the processor (hint: we pretty much always can), you can reverse engineer the code. Performance penalties or not, the entire thing is pointless if it doesn't achieve what it's set out to do, and you don't have to be an expert in this area to see the hole that's big enough to steer a panamax-class cargo carrier through. Despite that, the problem went wholly unaddressed in the summary and I'm not seeing any comments here indicating otherwise, which
    - Re: (Score:3)
      
      by TheLink ( 130905 ) writes:
      
      OK as an example what if you have worked out a way to transform an arbitrary program in to a huge bunch of "if else", goto statements with "magic numbers" or worse (e.g. setaddress(mod(magic1+sha256memoizer(magic2+parm1+parm2+...),dataend))=parm3; linenumber=mod(magic3+sha256memoizer(magic4+parm5+parm6+...),maxlines); goto linenumber;) and some stores and loads so that it's still equivalent and does the same thing but just a bit slower. So when you disassemble it - it's still a big mess that makes it hard to
    - Re: (Score:2)
      
      by Half-pint HAL ( 718102 ) writes:
      
      So long as we're able to see the instructions being given to the processor (hint: we pretty much always can), you can reverse engineer the code.
      True. As long as you are able to follow every possible path of execution of the code. Which puts us in the realms of "theoretically possible, but unfeasible in practice".
      How do we normally reverse engineer code? We run the code through a debugger/logger to identify contiguous areas of code and data, and to highlight common destination points for branch conditions and data addresses frequently read from/written to. That information is used to inform the automated stages of disassembly. Probabilistic models
    - Re: (Score:2)
      
      by somersault ( 912633 ) writes:
      
      You're claiming to be able to reverse engineer an entire cookbook just by chemically analysing a cookie from one recipe in that book. You'd have to make every single recipe in the book and analyse it before you had a complete representation of all the recipes.
      - Re: (Score:2)
        
        by stdarg ( 456557 ) writes:
        
        No, what you're describing is looking at a finished product and inferring the process of how it was created. That's very difficult.
        What GP is referring to is watching someone follow the recipe (without seeing the recipe itself), and then inferring the recipe. Much easier.
    - - Re: (Score:2)
        
        by gl4ss ( 559668 ) writes:
        
        yeah well release a working proof and make 5 billions.
        I'm not shitting you, if you can really do even a simple proof of concept that does that then you could bag a lot of money next week.
        I can accept that you could do it with this system so that it will spit out right answers for right predetermined inputs but nothing else, and even then you could see it do its thing with that input while it might not provide you the outputs for other inputs, but that's pointless as a program for most any uses a computer progra
  - Re: (Score:2)
    
    by Dunbal ( 464142 ) * writes:
    
    I call BS on the notion that my CPU is smarter than me. Article claims that it takes "hundreds of years" to break this. Obviously then it must take the program "hundreds of years" to run. Otherwise the CPU is using a short-cut. All I have to do is figure out what the short cut is (hint: figure out what the CPU is doing, where it got its instructions from) and it's cracked. If your computer can run it, you can figure it out. It's that simple.
    - Re: (Score:3)
      
      by Half-pint HAL ( 718102 ) writes:
      
      I call BS on the notion that my CPU is smarter than me. Article claims that it takes "hundreds of years" to break this. Obviously then it must take the program "hundreds of years" to run. Otherwise the CPU is using a short-cut. All I have to do is figure out what the short cut is (hint: figure out what the CPU is doing, where it got its instructions from) and it's cracked. If your computer can run it, you can figure it out. It's that simple.
      By this argument, PGP, RSA et al must be worthless too, because if it takes hundreds of years to break a 256-bit cypher it must take hundreds of years to run a 256-bit cypher. You know, if your computer can run it, you can figure it out.
      I hope you can see the flaw in your argument now.
      - Re: (Score:2)
        
        by stdarg ( 456557 ) writes:
        
        By GP's argument, for the CPU to decrypt the message so quickly, it must be using a shortcut. In your example, the shortcut is that the CPU knows the secret key.
        Do you honestly think that someone watching the CPU decrypt the message can't access the message themselves? http://en.wikipedia.org/wiki/DeCSS [wikipedia.org]
      - Re: (Score:2)
        
        by Half-pint HAL ( 718102 ) writes:
        
        How is this different? The public key for a machine-readable PGP message isn't "baked into the CPU", but that doesn't make PGP trivially easy to break -- far from it.
        
        Re: (Score:3)
        
        by Dunbal ( 464142 ) * writes:
        
        Besides, the NSA is holding all the keys anyway lol.
- - Re: (Score:3)
    
    by MightyMartian ( 840721 ) writes:
    
    Yeah, but once they add the GUI interface using VB, it'll be uncrackable!
  - Re: (Score:2)
    
    by IamTheRealMike ( 537420 ) writes:
    
    And this is a Computer Scientist? Are they sure they haven't accidentally hired the actor who played Charlie Eppes in "Numb3rs"?
    At some point this program will have to be executed by the CPU, but somehow even a disassembler would throw up its hands and declare defeat when presented with this "encrypted" code. In other news, Mr Sahai's found a way to turn your grocery list into a set of numbers that will make it impossible for anyone else to see what you want to buy. All they can do is turn it over to the cle
    - Re: (Score:3)
      
      by rgbatduke ( 1231380 ) writes:
      
      Even better than the grocery list is Searle's Chinese Room. A man sits in a sealed room. He doesn't speak or read Chinese, but every now and then somebody pokes a slip of paper in under the door with some Chinese on it. The man picks it up and goes to a vast filing cabinet, matches the symbols on the paper, and pulls out a paper with other symbols on it which he pushes out the door. In this way the entire room can act like a "speaker of Chinese" even though the man in the room doesn't recognize a word o
- - Re: (Score:2)
    
    by MightyMartian ( 840721 ) writes:
    
    In other words it's going to do malloc() a big pile of memory, then throw the actual code all over the place, linked together with a helluva lot of jmps and indexes to keep track of them. Yes, I'm sure if every second or third clock cycle is spent leaping over the process's malloced space to fetch the next instruction or the next byte from the symbol table, it may be too complex to actually decompile to reproduce a reasonable representation of the original source code. But as others have pointed out, that l
Deciphering != Reverse Engineering (Score:4, Informative)

by Dynedain ( 141758 ) writes: <(slashdot2) (at) (anthonymclin.com)> on Thursday August 01, 2013 @12:37AM (#44443405) Homepage

Deciphering/Decrypting is not the same thing as Reverse Engineering.
Reverse Engineering is duplicating the functionality without seeing the source code. That should still be possible if you have the ability to run the program.
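
A minimal sketch of that idea in Python, assuming a tiny boolean function as the "program": the cloner only calls the function, never reads its body, and rebuilds it as a lookup table. The names `secret` and `clone_by_table` are made up for this illustration.

```python
from itertools import product


def secret(a: bool, b: bool, c: bool) -> bool:
    # Hypothetical "unknown" program; the cloner below never inspects this body.
    return (a and not b) or (b != c)


def clone_by_table(func, n_inputs: int):
    """Query func on all 2**n_inputs boolean inputs and return a table-backed clone."""
    table = {bits: func(*bits) for bits in product((False, True), repeat=n_inputs)}
    return (lambda *bits: table[bits]), len(table)


clone, queries = clone_by_table(secret, 3)

# The clone agrees with the original on every possible input.
matches = all(
    clone(*bits) == secret(*bits) for bits in product((False, True), repeat=3)
)

# The same approach on a single 64-bit input would need this many queries.
queries_for_64_bits = 2 ** 64
print(queries, matches, queries_for_64_bits)
```

The table doubles with every extra input bit, so this only duplicates functionality in practice when the input space is small or you already know which inputs matter.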

- Re:Deciphering != Reverse Engineering (Score:5, Insightful)
  
  by wierd_w ( 1375923 ) writes: on Thursday August 01, 2013 @12:52AM (#44443501)
  
  One way around this (for reverse engineering) would simply be to run it inside a VM with a built in stack trace debugger, like Bochs.
  You can peek at the raw instructions percolating through the VM's emulated CPU that way. The application itself is then the decryption key, since the system has to be able to run the application.
  PITA, but I don't see how this is anything more than just a "bigger" speedbump, at least as far as disassembly goes. Now, making said decryption require nasty bits of hardware to thwart using a VM, and clinging tightly to treacherous computing with a TPM and a hardware hypervisor on top of this? That's more dicey.
  The question here is... why do this? Is preventing Cracker Jack from the warez scene from simply disabling your "authenticity checks" so horrifically important? That's the major application for this that I really see, besides making epically horrific malware. (Though if you ask me, they are both the same thing.)
  Seriously, other than making personal computing become something from communist russia, what is the benefit of this?
  
  - Re: (Score:2)
    
    by MightyMartian ( 840721 ) writes:
    
    Yes, the only way this works is with hardware to back it up. But if we're back to effectively using some sort of dongle to execute, well that road has been well-traveled as well.
    - Re: (Score:2)
      
      by wierd_w ( 1375923 ) writes:
      
      I was thinking more along the lines of "special processor instructions", that make use of quirks of real silicon. Given how intel and amd both have taken to cramming lots of extra devices onto the die, a small "inside the CPU" black box designed for this very application would do the trick just fine, and would likewise ensure its presence on modern rigs.
      A virtual machine halting execution would be detected by the running software, because it wouldn't get the proper responses from said black box. You'd have
      - Re: (Score:2)
        
        by MightyMartian ( 840721 ) writes:
        
        Yup, and if you can run the code, no matter how obfuscated, in a controlled environment like a virtual machine, no matter how complicated it may be, its interaction with the virtualized hardware is going to be observable. I guess you could start writing software that sniffs out that it's in a VM, either by looking for paravirtualization or simply by looking for some subset of bugs or limits in the virtualized hardware that one wouldn't find on actual physical hardware, but that's just the better mouse trap
    - Re: (Score:2)
      
      by MikeBabcock ( 65886 ) writes:
      
      Hardware that can't be emulated you mean.
      Even high grade encryption hardware that refuses to allow debugging can be emulated, if slowly, once you have keys.
  - Re: (Score:2)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    Seriously, other than making personal computing become something from communist russia, what is the benefit of this?
    Wait, isn't that considered to be feature enough?
  - Re: (Score:2)
    
    by bondsbw ( 888959 ) writes:
    
    Seriously, other than making personal computing become something from communist russia, what is the benefit of this?
    Security. One major benefit of obfuscation is making it much more difficult to find local data store encryption keys, service endpoints, etc. It makes it harder to find bugs/exploits such as SQL injection.
    Remember that not all attacks are aimed at the software in general. Many, many attacks on medical/banking/government systems are aimed at finding specific data on specific computers, and the attacker isn't running it on a VM. These attacks rely on perhaps a trojan or backdoor. The harder it is to buil
  - I don't Think That's The Point (Score:2)
    
    by The Real Nem ( 793299 ) writes:
    
    You don't need to emulate the hardware to see a program's output, you can just look at your screen. There is little point in hiding the output of a program from the user who's running it so I don't see that as the point (if one cannot observe the output of a program, it is of little utility).
    Which instructions a given program executes depends on the inputs to said program. For any given input, most programs only execute a tiny portion of their code. Therefore, in order to completely reverse engineer a pr
    - Re: (Score:2)
      
      by wierd_w ( 1375923 ) writes:
      
      Agreed, for each nested branch, complexity increases at LEAST geometrically. (A branch always has at least 2 paths.) However, one may not NEED to know *all* branches.
      Say for instance, the behavior you want to modify (we will assume you are a cracker making a memory poking drm defeat patch) is rigidly defined in where it gets called from (by this, I mean what code branches spawn the check). You don't really need to look at the other execution paths, just the ones that trigger the drm check, or better still, t
  - - Re:Deciphering != Reverse Engineering (Score:4, Informative)
      
      by IamTheRealMike ( 537420 ) writes: on Thursday August 01, 2013 @06:30AM (#44444773)
      
      Yes, it is robust. I read the paper a few days ago.
      All these comments about how you can "just look at the CPU instructions" are made by people who haven't been following developments in the field. The program never gets decrypted into CPU instructions. Heck, it was never even compiled into CPU instructions in the first place. It gets compiled into a form of boolean circuit, a mathematical equivalent of an electronic circuit that is composed of AND, NOT, OR, XOR gates and wires between them. Then that circuit is itself again transformed into a series of matrices and at that point I hit the limit of what I could understand without needing to read some of the cited papers.
      This is a very, very complicated technique that builds upon decades of cryptographic research. If they say it's secure in the cryptographic sense, I think it's very likely to be so.
      
      - Re: (Score:2)
        
        by pjt33 ( 739471 ) writes:
        
        The thing is that "secure in the cryptographic sense" means "provably secure against a defined threat model", and the threat model may not be relevant to real world applications. My understanding of their proof of security is that the threat model is black-box evaluation, which doesn't seem very relevant to real obfuscation.
- - Re: (Score:2)
    
    by superwiz ( 655733 ) writes:
    
    Obfuscation is a woefully poor way to stop decomposition. Any program is a graph. Any program analysis tool will traverse obfuscated code just as easily as it will traverse non-obfuscated code. In fact, it won't even know the difference between obfuscated and non-obfuscated code. Obfuscation only stops eyeballing analysis. But if eyeballing is the only tool you use, you are an amateur. You might be a very adroit amateur, but the difference between an adroit amateur and a decent professional is that the
  - Re: (Score:2)
    
    by AK Marc ( 707885 ) writes:
    
    How can a CPU execute code when it doesn't know what that code is?
Gotta love articles without details (Score:2, Funny)

by Anonymous Coward writes:

Clearly the journalist had no idea what they were writing about.
- Re: (Score:2)
  
  by MrEricSir ( 398214 ) writes:
  
  You mean a link to the paper [iacr.org], like the one in the third paragraph of the article?
  - Re: (Score:2)
    
    by sydneyfong ( 410107 ) writes:
    
    When you get to that, then it's the GP's turn to have no idea what the article is writing about.
    I took a glance at it. I probably won't really understand it unless I spend a week or two seriously reading it and its cited articles...
Victory for virus writers (Score:4, Insightful)

by technosean ( 169328 ) writes: on Thursday August 01, 2013 @12:38AM (#44443417)

Who gains from this? Most of all? Virus writers.
Perhaps the real solution is to have the OS never execute code looking anything like this.

- Re: (Score:3)
  
  by MightyMartian ( 840721 ) writes:
  
  Heh. No fucking shit. It would be a malware writer's holy grail; software that has no identifiable signature.
  I still think it's BS, at least on any kind of processor or operating system we're running now.
  - Re: (Score:2)
    
    by ls671 ( 1122017 ) writes:
    
    Heh. No fucking shit. It would be a malware writer's holy grail; software that has no identifiable signature.
    I don't think anti-virus software needs to understand, reverse-engineer or decipher a virus program in order to identify its signature. That's not how it works. That's why there are false positives with programs that do not do what the anti-virus software thinks they do.
    - Re: (Score:2)
      
      by Opportunist ( 166417 ) writes:
      
      As long as the code is always identical, you're right. It is likely, though, that this machine is able to produce vastly different results from minuscule changes in code, resulting in a very easy way to generate an arbitrary number of copies of your virus code, none of which looks anything like the others to a scanner.
    - - Re: (Score:3)
        
        by TheLink ( 130905 ) writes:
        
        100% AV/malware detection is arguably harder than solving the halting problem - since it's not certain you're given the full program and its inputs ;). In practice malware is not so obfuscated and while it seems to be a losing battle, I don't have to outrun the "bear", I just have to outrun the average user...
        
        That said it would be safer if OS makers solved it by better sandboxing and better sandboxing infra/UI. Sandboxing is like "solving" the halting problem by setting a time limit.
        
        In fact you're still in
- Hurry up, Europe is hungry for your fines (Score:5, Interesting)
  
  by c0lo ( 1497653 ) writes: on Thursday August 01, 2013 @02:59AM (#44444021)
  
  Sell a program protected like this in Europe [europa.eu] and you may end up paying hundreds of millions [forbes.com]:
  (14) A person having a right to use a computer program should not be prevented from performing acts necessary to observe, study or test the functioning of the program, provided that those acts do not infringe the copyright in the program.
  (15) [...] Nevertheless, circumstances may exist when such a reproduction of the code and translation of its form are indispensable to obtain the necessary information to achieve the interoperability of an independently created program with other programs.
  It has therefore to be considered that, in these limited circumstances only, performance of the acts of reproduction and translation by or on behalf of a person having a right to use a copy of the program is legitimate and compatible with fair practice and must therefore be deemed not to require the authorisation of the rightholder. An objective of this exception is to make it possible to connect all components of a computer system, including those of different manufacturers, so that they can work together. [...].
  
make a crackme (Score:3, Insightful)

by ZeroNullVoid ( 886675 ) writes: on Thursday August 01, 2013 @12:41AM (#44443433)

If they really think it is so good, then they should put their money where their mouth is.
Make it into a crackme, issue a large award for solving it.
Post it online. I give it a few weeks max, if that.
And who is to say it can't still be manipulated once running?
Think of the performance cost.
Either way, I have no faith in an article with so little detail.

- - Re: (Score:2)
    
    by Opportunist ( 166417 ) writes:
    
    Well, DUH!
    Be honest. Imagine you have something like that. Where would you present it?
    At a tech conference, where techs are present who can't throw money at you but can crack it?
    Or at a management conference where the second part of the above statement is inverted?
Performance hit (Score:2)

by Pinhedd ( 1661735 ) writes:

and programs will continue to run like it's 1987
A giant step backward for maintainability? (Score:2)

by __aaltlg1547 ( 2541114 ) writes:

Or is there an original unscrambled source code retained somewhere?
The program will have to DO something (Score:3)

by Sean ( 422 ) writes: on Thursday August 01, 2013 @12:53AM (#44443503)

Call the kernel to access files, sockets, etc.
Also, unless the developer is super 31337 and likes to write everything from scratch, I expect shared library calls too.
By watching calls to those interfaces we can figure out what it does.

- Re: (Score:2)
  
  by MightyMartian ( 840721 ) writes:
  
  Even if all the libraries are statically compiled, it's still going to be making system calls. Yes, its behavior could be very hard to determine, but unless it's a standalone operating system that runs entirely on its own, it's going to be traceable. Even then, one could still run it in a VM. Perhaps harder to get useful information out of, but still...
- Re: (Score:2)
  
  by TheLink ( 130905 ) writes:
  
  It could be a custom/modified cryptographic hash used to verify and create signatures. In which case it won't make that many syscalls.
  
  Reverse engineering something like that will be hard if it's obfuscated.
  
  If you can't reverse engineer it you can't make a compatible version without breaking copyright by copying the hash module as is.
The original paper (Score:5, Informative)

by HatofPig ( 904660 ) writes: <clintonthegeek&gmail,com> on Thursday August 01, 2013 @12:56AM (#44443517) Homepage

The original paper is available here [iacr.org].
Cryptology ePrint Archive: Report 2013/451
Candidate Indistinguishability Obfuscation and Functional Encryption for all circuits
Sanjam Garg and Craig Gentry and Shai Halevi and Mariana Raykova and Amit Sahai and Brent Waters
Abstract: In this work, we study indistinguishability obfuscation and functional encryption for general circuits:
Indistinguishability obfuscation requires that given any two equivalent circuits C_0 and C_1 of similar size, the obfuscations of C_0 and C_1 should be computationally indistinguishable.
In functional encryption, ciphertexts encrypt inputs x and keys are issued for circuits C. Using the key SK_C to decrypt a ciphertext CT_x = Enc(x), yields the value C(x) but does not reveal anything else about x. Furthermore, no collusion of secret key holders should be able to learn anything more than the union of what they can each learn individually.
We give constructions for indistinguishability obfuscation and functional encryption that supports all polynomial-size circuits. We accomplish this goal in three steps:
- We describe a candidate construction for indistinguishability obfuscation for NC1 circuits. The security of this construction is based on a new algebraic hardness assumption. The candidate and assumption use a simplified variant of multilinear maps, which we call Multilinear Jigsaw Puzzles.
- We show how to use indistinguishability obfuscation for NC1 together with Fully Homomorphic Encryption (with decryption in NC1) to achieve indistinguishability obfuscation for all circuits.
- Finally, we show how to use indistinguishability obfuscation for circuits, public-key encryption, and non-interactive zero knowledge to achieve functional encryption for all circuits. The functional encryption scheme we construct also enjoys succinct ciphertexts, which enables several other applications.
Category / Keywords: public-key cryptography / Obfuscation, Functional Encryption, Multilinear Maps
Date: received 20 Jul 2013, last revised 21 Jul 2013
Contact author: amitsahai at gmail com
Available format(s): PDF [iacr.org] | BibTeX Citation [iacr.org]
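To make "two equivalent circuits C_0 and C_1" from the abstract a bit more concrete, here is a minimal Python sketch. It is not taken from the paper: the names c0, c1 and equivalent are made up for illustration, and it only shows what "equivalent" means (same output on every input), not how the obfuscation itself is built.

```python
from itertools import product

def c0(a: bool, b: bool) -> bool:
    # XOR written as "at least one input is set, but not both"
    return (a or b) and not (a and b)

def c1(a: bool, b: bool) -> bool:
    # XOR written as "exactly one input is set"
    return (a and not b) or (not a and b)

def equivalent(f, g, n_inputs: int) -> bool:
    """Brute-force check that f and g agree on every n-bit input."""
    return all(
        f(*bits) == g(*bits)
        for bits in product([False, True], repeat=n_inputs)
    )

print(equivalent(c0, c1, 2))  # True: different structure, same function
```

As I read the abstract, the requirement is that obfuscating c0 and obfuscating c1 should give results nobody can efficiently tell apart, even though the two originals are wired differently.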

- - Re: (Score:2)
    
    by IamTheRealMike ( 537420 ) writes:
    
    I already read the paper some days ago when it was first uploaded to the IACR pre-print archives. Yes, the paper is the one being referred to. It's a very interesting result, although not really impactful at the moment for things like game DRM.
    The confusion arises from terminology. The technique applies (presently) to pure functions. You can write those functions in, for example, a subset of C because there exist compilers that transform such programs into boolean circuits, and circuit form is what they obf
Wish I had thought of this myself (Score:2)

by vikingpower ( 768921 ) writes:

I just scanned the original paper, reserving its detailed lecture for another moment. But it is one of those things that make me think: "Damn ! Wish I had thought of this myself..."
- - Re: (Score:2)
    
    by vikingpower ( 768921 ) writes:
    
    This reply is typical of a growing percentage of replies on /. : abuse & insults, under the cloak of Cowardness and Anonymity. Well done, I am impressed by how you immediately detected my grammatical error.
    I regret to inform you, though, that the correct word for what you speculate me to be is dumbfuck
  - Re: (Score:2)
    
    by andy.ruddock ( 821066 ) writes:
    
    At dictionary.com,
    
    1. to glance at or over or read hastily: to scan a page.
    
    And even if you had been correct, it was an unnecessarily harsh comment, but do please feel free to hide behind the mantle of the Anonymous Coward.
like DVDs and all the others (Score:2)

by raymorris ( 2726007 ) writes:

The encryption used for DVDs was believed to be unbreakable, until it was broken. How many companies have released encryption schemes, for DRM or otherwise, only to see them cracked within hours of actual release? More than once, Microsoft encryption has been cracked before its official release.

Though I haven't studied this particular one, my general impression is that it was not designed by cryptography experts and then vetted the way well known cryptographic algorithms are vetted by other experts.
- Re: (Score:2)
  
  by Meneth ( 872868 ) writes:
  
  Aye, it's easy to construct an encryption scheme that you yourself can't break.
- Re: (Score:2)
  
  by IamTheRealMike ( 537420 ) writes:
  
  Actually the encryption itself on these schemes was not broken. Rather, player emulators became good enough that the industry could not revoke hacked players fast enough to keep up.
This actually has a name. (Score:2, Flamebait)

by eclectro ( 227083 ) writes:

It's called Forth [wikipedia.org]
What this means. I think. (Score:5, Interesting)

by Animats ( 122034 ) writes: on Thursday August 01, 2013 @02:22AM (#44443915) Homepage

This is fascinating, but hard to understand. It's not clear how broad a result this is.
This seems to apply to "circuits", which are loop-free arrangements of AND, OR and NOT gates. These take in some bit string with a fixed number of bits and emit another bit string with a fixed (but possibly different) number of bits. Many computations can be encoded into this form, but the "circuits" are typically very deep, since this is sort of like loop unwinding. Although this seems kind of limited, you could, for example, make a reasonably sized "circuit" to compute DES, which is a static pattern of Boolean operations.
"Obfuscation" here means taking in some "circuit" and producing a bit for bit equivalent "circuit" from which little information can be extracted. The obfuscated circuit may be (will be?) more complex than the original, because additional logic has been inserted in a somewhat random way.
The contribution in this paper seems to be that this might be useful for "functional encryption". This is where you have some values, such as A and B, which are encrypted and are to be protected, but the result of some function f(A,B) is not protected. The idea is to have an implementation of f which combines the decryption and the desired function, an implementation which cannot be reverse engineered to return decrypted versions of A or B. While this has been suggested as a possible feature for databases, it's hard to apply to useful functions, and so far, it's mostly a research idea there. It's been suggested for some complex access control schemes involving mutual mistrust, but that too is a research topic.
Anyway, this doesn't mean you're going to be able to run your big executable program through some kind of magic obfuscator that will prevent someone from figuring out how it works. Not yet, anyway.
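For anyone who hasn't seen the "circuit" model before, here is a rough Python sketch of what such a thing looks like as data. It assumes a simple gate-list encoding I made up for illustration (OPS, evaluate and HALF_ADDER are not from the paper): a fixed number of input bits, a loop-free list of AND/OR/NOT gates, and a fixed number of output bits.

```python
# Wires 0..n-1 hold the inputs; every gate appends one new wire.
# A gate may only read wires that already exist, so there are no loops.
OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "NOT": lambda x: 1 - x,
}

def evaluate(gates, outputs, inputs):
    """Evaluate a gate list on a tuple of input bits, return the output bits."""
    wires = list(inputs)
    for op, args in gates:
        wires.append(OPS[op](*(wires[i] for i in args)))
    return [wires[i] for i in outputs]

# Half adder: inputs a (wire 0) and b (wire 1).
HALF_ADDER = [
    ("OR", (0, 1)),   # wire 2 = a OR b
    ("AND", (0, 1)),  # wire 3 = a AND b  (the carry bit)
    ("NOT", (3,)),    # wire 4 = NOT (a AND b)
    ("AND", (2, 4)),  # wire 5 = a XOR b  (the sum bit)
]
OUTPUTS = (5, 3)  # (sum, carry)

print(evaluate(HALF_ADDER, OUTPUTS, (1, 1)))  # [0, 1]
```

Anything with loops has to be unrolled into a longer gate list of this kind, which is why the circuits get so deep.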

Other aspects of the paper - health data (Score:4, Interesting)

by Badge 17 ( 613974 ) writes: on Thursday August 01, 2013 @02:27AM (#44443929)

I can't really comment on the slashdot summary, but take a look at the actual abstract: http://eprint.iacr.org/2013/451 [iacr.org]
"In functional encryption, ciphertexts encrypt inputs x and keys are issued for circuits C. Using the key SK_C to decrypt a ciphertext CT_x = Enc(x), yields the value C(x) but does not reveal anything else about x. Furthermore, no collusion of secret key holders should be able to learn anything more than the union of what they can each learn individually."
In other words, it seems that their technique allows you to encrypt some secret data, then (provably) only release the result of some arbitrary function of that data. It sounds like this means you could (in principle) release health data in encrypted form, then allow researchers to study some ensemble properties of it by giving out the appropriate keys. This aspect of it certainly seems pretty cool.

Nice try (Score:2)

by superwiz ( 655733 ) writes:

But UCLA doesn't get to claim the credit. MIT was first to present homomorphic encryption: http://web.mit.edu/newsoffice/2013/algorithm-solves-homomorphic-encryption-problem-0610.html [mit.edu]
Roughly speaking, a homomorphism is a map which preserves the properties pertinent to a category. Now think of data as acting on code instead of code acting on data. Since "acting" is a mapping, acting in a homomorphic way would produce program results which are equivalent but without the decrypting step.
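A toy illustration of the "structure-preserving map" idea, in Python. To be clear about the assumptions: this is textbook RSA with tiny primes, which is insecure and is NOT the scheme from the MIT article or the UCLA paper. It is only here because it has one simple homomorphic property: multiplying two ciphertexts gives a ciphertext of the product of the plaintexts.

```python
# Textbook RSA with toy parameters: for illustration only, never for real use.
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)

def enc(m: int) -> int:
    return pow(m, E, N)

def dec(c: int) -> int:
    return pow(c, D, N)

a, b = 7, 12
# Work on the ciphertexts only; nothing is decrypted along the way.
product_ct = (enc(a) * enc(b)) % N

print(dec(product_ct), (a * b) % N)  # 84 84
```

Fully homomorphic schemes extend this so that both addition and multiplication, and therefore arbitrary circuits, can be evaluated on encrypted data.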
Malware in 3, 2, 1... (Score:3)

by Opportunist ( 166417 ) writes: on Thursday August 01, 2013 @03:20AM (#44444075)

Remember, kids, everything that can be applied for good can be applied for ill. And code that is impossible to decipher and analyze is the holy grail of malware.

Encrypting the algorithm, not the code? (Score:3)

by Kazoo the Clown ( 644526 ) writes: on Thursday August 01, 2013 @04:09AM (#44444287)

It sounds like what they are trying to do here is to refactor the algorithm such that f(x)=y produces the same answer but it's not practical to modify because the new function is mathematically scrambled: the effect of each component of the code is so obscured that it's not so easy to tell how it contributes to the results. It's like using a neural net to implement your solution; the contribution of each neural node to the overall result can be significantly obscure. Won't stop someone from stealing the algorithm as is, but hacking it to produce an alternate result set will be out of the question, as long as they can't just build a truth table of inputs to outputs...
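On the truth table caveat, a quick Python sketch of the attack, assuming you can only call the function as a black box. The names truth_table and secret are hypothetical, and the "secret" function is just a stand-in (majority vote over the input bits).

```python
from itertools import product

def truth_table(f, n_bits: int) -> dict:
    """Query a black-box function on every n-bit input and record the answers."""
    return {bits: f(bits) for bits in product((0, 1), repeat=n_bits)}

def secret(bits) -> int:
    # Hypothetical stand-in for the obfuscated function:
    # outputs 1 if at least two input bits are set.
    return int(sum(bits) >= 2)

table = truth_table(secret, 3)
print(len(table))        # 8 rows for 3 input bits
print(table[(1, 1, 0)])  # 1
```

The table has 2**n rows, so this recovers the function completely for a handful of input bits and is hopeless for something like a 64-bit input, which is presumably the regime any such scheme has to live in.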

Poor RTFA (Score:2)

by ThePhilips ( 752041 ) writes:

RTFA is shallow on details, but it sounds very much like the "phantom" self-mutating MS-DOS viruses from the late '90s. (The ones which forced anti-virus vendors to effectively implement a debugger/an interpreter for the code, to detect malicious code not by how it looks but by the actual work it does.)
Otherwise, the main problem with such intrusive obfuscations is that code is never 100% bug free and bugs sometimes trigger undefined/unspecified behaviors. It might work before obfuscation - but would break
This is poorly described, but is a breakthrough (Score:2)

by Wierdy1024 ( 902573 ) writes:

Normally obfuscation is bad in cryptography - it means that the system is theoretically broken, but that the way to break it is quite well hidden.
This refers to cryptographically secure obfuscation. This is an entirely new field, and hasn't been possible till now. This paper doesn't prove how to do it, but proves that it is possible for a certain subset of operations.
Basically it boils down to the fact that it is possible to make a computer program that, for a given set of inputs a) generates a set of outputs b)
- Re:Seems improbable (Score:4, Insightful)
  
  by MightyMartian ( 840721 ) writes: on Thursday August 01, 2013 @12:44AM (#44443447) Journal
  
  Yup. At the end of the day, if this is at all useful on the hardware and OSs out there now, it's still going to have to execute, and if it executes, you can run it through a debugger and watch it.
  
  - Re: (Score:2)
    
    by shentino ( 1139071 ) writes:
    
    Not on a protected OS you can't.
    I imagine that if this is implemented it will be patented much like Blu-ray, and the only way to get a license is to swallow the DRM.
- - Re: (Score:2)
    
    by wierd_w ( 1375923 ) writes:
    
    Some VMs have "save state" features.
    Let's say we have a program that branches many times, as you describe above. The VM is controlled with a script. On the first pass, it looks for and records all instruction branches on the first program run along with the data compared against for each branch to execute, then reloads the saved state at initial program load, and begins systematically walking and triggering code.
    On each pass, it spits out what the CPU did, with what inputs.
    Since it is a given that new path
- Re: (Score:2)
  
  by MightyMartian ( 840721 ) writes:
  
  If crappy performance is no obstacle, I'm sure you could produce compiled code that is insanely obfuscated. Still, as I said elsewhere, if you can execute it, it can be executed in a debugger and you can watch what it does, and even if it's doing mad long jumps all over its allocated memory, you're going to be able to trace execution. It will betray its functionality.
- Re: (Score:2)
  
  by Opportunist ( 166417 ) writes:
  
  As if that has ever been a concern of software makers. "Our Software runs too slowly? Well, get a better machine, your ancient 1 year old crate of course cannot run our superspecialawesome piece of artificially deoptimized code".
- Re: (Score:2)
  
  by Dunbal ( 464142 ) * writes:
  
  I give it a month before someone cracks it.
- Re: (Score:2)
  
  by vikingpower ( 768921 ) writes:
  
  I do see a major difference. The patent you cite is based upon DES; the authors' work is not, as they develop their own encoding technique on the basis of multilinear maps. Moreover, in the patent you cite it is assumed that one function can have n outputs, whereas in the authors' work, each function has only 1 output. I am still going over both the original peer-reviewed paper and the US patent you cite. The latter seems too amateurish to be taken seriously by any corporation willing to implement an obfus
