
A Group of Ex-NSA and Amazon Engineers Are Building a 'GitHub For Data' (techcrunch.com) 21

A group of engineers and developers with backgrounds from the National Security Agency, Google, and Amazon Web Services are working on Gretel, an early-stage startup that aims to help developers safely share and collaborate with sensitive data in real time. TechCrunch reports: It's not as niche of a problem as you might think, said Alex Watson, one of the co-founders. Developers can face this problem at any company, he said. Often, developers don't need full access to a bank of user data -- they just need a portion or a sample to work with. In many cases, developers could suffice with data that looks like real user data. "It starts with making data safe to share," Watson said. "There's all these really cool use cases that people have been able to do with data." He said companies like GitHub, a widely used source code sharing platform, helped to make source code accessible and collaboration easy. "But there's no GitHub equivalent for data," he said.

And that's how Watson and his co-founders, John Myers, Ali Golshan and Laszlo Bock came up with Gretel. "We're building right now software that enables developers to automatically check out an anonymized version of the data set," said Watson. This so-called "synthetic data" is essentially artificial data that looks and works just like regular sensitive user data. Gretel uses machine learning to categorize the data -- like names, addresses and other customer identifiers -- and attach as many labels to the data as possible. Once that data is labeled, access policies can be applied to it. Then, the platform applies differential privacy -- a technique used to anonymize vast amounts of data -- so that it's no longer tied to customer information. "It's an entirely fake data set that was generated by machine learning," said Watson.
The startup has already raised $3.5 million in seed funding. "Gretel said it will charge customers based on consumption -- a similar structure to how Amazon prices access to its cloud computing services," adds TechCrunch.
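
For readers unfamiliar with differential privacy, here is a minimal sketch of the textbook Laplace mechanism applied to a counting query. It is not Gretel's method: the summary does not say how the platform works internally, and generating a full synthetic data set involves far more than this. The names `laplace_noise`, `private_count`, and the sample records are hypothetical and exist only for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon gives epsilon-differential privacy for this one query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical sample data, not taken from any real data set.
records = [{"age": 34}, {"age": 51}, {"age": 29}, {"age": 62}]
noisy = private_count(records, lambda r: r["age"] > 40, epsilon=0.5)
print(f"noisy count of users over 40: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives answers closer to the true count. Each query answered this way spends part of the privacy budget, so repeated queries against the same data weaken the overall guarantee.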
This discussion has been archived. No new comments can be posted.

  • have a code of conduct?
    A policy on facial recognition?
    The support of/for the US mil?
    The ability to talk about DRM and crypto?
    • It's all good as long as it doesn't use algorithms, because everyone knows those are horribly racist and sexist by now.
    • by schwit1 ( 797399 )

      "The support of/for the US mil?"

      If as a citizen I wish to enjoy the benefits and protection of the British empire it would be wrong of me not to help its defense.
      Gandhi

      China, FBI, IRS, Facebook and Google are greater threats than the US military.

      • by AHuxley ( 892839 )
        Seems that was lost on all the big US tech brands so very happy to fully support Communist China.
      • Won't surprise me if Skynet will be built and/or funded by the US military. In that sense it'd be the gravest threat to humanity.
    • This could be a great service... as long as you could take it and install it internally with no connection to the company supplying it.

      It's not clear how they're planning for it to work, but based on the comparisons they're making, the idea that large companies with sensitive data are going to want to push all that data to an outside provider who will store it out of their control in order to allow their internal folks to get an anonymized version of it sort of misses the point of only allowing testing with

      • by AHuxley ( 892839 )
        Make the wrong kind of AI out of the data sets?
        Use code in the wrong political/mil/nation/gov/city/police way years later?
        Some US brands' CoC will follow projects, code and resulting productive work around looking for political issues.
  • Facebook will acquire it.

  • by thedarb ( 181754 ) on Thursday February 20, 2020 @09:27PM (#59748972)

    1. Once NSA, always NSA. Shouldn't be allowed to work in the private sector. They're spying on you when they do.
    2. Fake data my butt. How many times have we seen "anonymized" data get de-anonymized?

    How about no.

  • ..oh wait... we already got that.

  • Ex-NSA guys want you to upload sensitive secret data to their totally non-NSA accessible server. The NSA, I suspect, is like the CIA. No one ever completely leaves the agency (agencies).
    • Close. But who will pay for all the XOR operations to blank memory against garbage collection and system dumps that a bluescreen may get. For Intel, wiping undocumented buffers is now another CVE headache.
  • NSA/Google/Amazon product named after a German broad that was sneaky and cooked an old person in an oven sounds like a wonderful thing to trust your most sensitive data to. If they got Equifax, Comcast, Electronic Arts and Monsanto in on this project it could really be a win-win.
  • DB2 and SQL already do this. The DBAs are the data admins, and there is a data and field name repository. Usually the problem is that old hands have been let go, data treasure maps on ABC Flowcharter (ditched with the Win 10 purge) and a data scientist (not a real one) says let's make a data lake and fix this, so duplicating storage costs overnight. DB2 has table permissions - no AI needed. I understand Rust is pretty cool too. I've been told SQL is obsolete, and managers want pre-canned reports so that t
  • by sad_ ( 7868 )

    one of the reasons why git is so successful and popular is because it's open source.
    i don't see this taking over the industry by storm, unless they make it open source too.
    the product isn't even unique, i've had several product sales pitches from multiple vendors that promise this kind of functionality.
