Open Source

Fintech OpenBB Aims To Be More Than an 'Open Source Bloomberg Terminal' (techcrunch.com) 7

TechCrunch's Paul Sawers reports: Fledgling fintech startup OpenBB is revealing the next step in its plans to take on the heavyweights of the investment research world. The company is launching a new, free version of a product that will open its arsenal of data and financial tooling to more users. OpenBB is the handiwork of software engineer Didier Lopes, who launched the Python-based platform back in 2021 as a way for amateur investors and enthusiasts to do investment research using different datasets for free, via a command line interface (CLI). The company went on to raise $8.5 million in seed funding from OSS Capital and angel investors such as Ram Shriram, an early backer of Google. While the community-based, open source project has amassed some 50,000 users, OpenBB has also been building an enterprise incarnation called Terminal Pro. This paid version gives teams access to an interface, pre-built database integrations, an Excel add-in, and various security and support bolt-ons that would appeal to larger businesses. [...]

The all-new OpenBB Terminal -- not to be confused with the previous CLI-based OpenBB Terminal that the startup sunsetted in March -- is a full-fledged web app, though it strips out many of the premium features of Terminal Pro. It's fully customizable, can run on any operating system or platform, and provides access to an AI-enabled OpenBB copilot. Like the previous OpenBB Terminal, the all-new web app is also free to use. OpenBB Terminal is perhaps something of a middle ground between the CLI centricity of the open source project and the bells-and-whistles feature set of the enterprise product.

The OpenBB Terminal serves as a single end point for accessing financial information from some 100 data sources, spanning equity, options, forex, the macro economy, and more. Users can also throw all their new data into the mix -- the community has previously contributed financial datasets such as historical currency exchange rates and crypto pricing data. There are also a slew of extensions and toolkits to bring more functionality to OpenBB -- such as an AI stock analysis agent. Users are free to incorporate their own AI systems and large language models (LLMs), which might be particularly important for security and compliance use cases. But with the OpenBB Copilot, categorized as a "compound AI system," users can run natural-language queries about their data out of the box.
While OpenBB has been likened to an "open-source Bloomberg," TechCrunch notes that it's not a direct competitor due to Bloomberg's massive data resources and built-in chat functionality. OpenBB, however, offers flexibility with its open-source platform and customization options.

OpenBB filed for a trademark, but Bloomberg has requested an extension to potentially oppose it, despite the company asserting there's no link between OpenBB and Bloomberg's abbreviation "BBG". Lopes says the name originates from BlackBerry stock, where the founders had lost money during the meme stock craze.
Privacy

License Plate Readers Are Creating a US-Wide Database of More Than Just Cars (wired.com) 109

Wired reports on "AI-powered cameras mounted on cars and trucks, initially designed to capture license plates, but which are now photographing political lawn signs outside private homes, individuals wearing T-shirts with text, and vehicles displaying pro-abortion bumper stickers — all while recordi00ng the precise locations of these observations..."

The detailed photographs all surfaced in search results produced by the systems of DRN Data, a license-plate-recognition (LPR) company owned by Motorola Solutions. The LPR system can be used by private investigators, repossession agents, and insurance companies; a related Motorola business, called Vigilant, gives cops access to the same LPR data. However, files shared with WIRED by artist Julia Weist, who is documenting restricted datasets as part of her work, show how those with access to the LPR system can search for common phrases or names, such as those of politicians, and be served with photographs where the search term is present, even if it is not displayed on license plates... Beyond highlighting the far-reaching nature of LPR technology, which has collected billions of images of license plates, the research also shows how people's personal political views and their homes can be recorded into vast databases that can be queried.

"It really reveals the extent to which surveillance is happening on a mass scale in the quiet streets of America," says Jay Stanley, a senior policy analyst at the American Civil Liberties Union. "That surveillance is not limited just to license plates, but also to a lot of other potentially very revealing information about people."

DRN, in a statement issued to WIRED, said it complies with "all applicable laws and regulations...." Over more than a decade, DRN has amassed more than 15 billion "vehicle sightings" across the United States, and it claims in its marketing materials that it amasses more than 250 million sightings per month. Images in DRN's commercial database are shared with police using its Vigilant system, but images captured by law enforcement are not shared back into the wider database. The system is partly fueled by DRN "affiliates" who install cameras in their vehicles, such as repossession trucks, and capture license plates as they drive around. Each vehicle can have up to four cameras attached to it, capturing images in all angles. These affiliates earn monthly bonuses and can also receive free cameras and search credits...

"License plate recognition (LPR) technology supports public safety and community services, from helping to find abducted children and stolen vehicles to automating toll collection and lowering insurance premiums by mitigating insurance fraud," Jeremiah Wheeler, the president of DRN, says in a statement... Wheeler did not respond to WIRED's questions about whether there are limits on what can be searched in license plate databases, why images of homes with lawn signs but no vehicles in sight appeared in search results, or if filters are used to reduce such images.

Privacy experts shared their reactions with Wired
  • "Perhaps [people] want to express themselves in their communities, to their neighbors, but they don't necessarily want to be logged into a nationwide database that's accessible to police authorities." — Jay Stanley, a senior policy analyst at the American Civil Liberties Union
  • "When government or private companies promote license plate readers, they make it sound like the technology is only looking for lawbreakers or people suspected of stealing a car or involved in an amber alert, but that's just not how the technology works. The technology collects everyone's data and stores that data often for immense periods of time." — Dave Maass, an EFF director of investigations
  • "The way that the country is set up was to protect citizens from government overreach, but there's not a lot put in place to protect us from private actors who are engaged in business meant to make money." — Nicole McConlogue, associate law professor at Mitchell Hamline School of Law (who has researched license-plate-surveillance systems)

Thanks to long-time Slashdot reader schwit1 for sharing the article.


Biotech

23andMe Is On the Brink. What Happens To All Its DNA Data? (npr.org) 60

The one-and-done nature of 23andMe is "indicative of a core business problem with the once high-flying biotech company that is now teetering on the brink of collapse," reports NPR. As 23andMe struggles for survival, many of its 15 million customers are left wondering what the company plans to do with all the data it has collected since it was founded in 2006. An anonymous reader shares an excerpt from the report: Andy Kill, a spokesperson for 23andMe, would not comment on what the company might do with its trove of genetic data beyond general pronouncements about its commitment to privacy. "For our customers, our focus continues to be on transparency and choice over how they want their data to be managed," he said. When signing up for the service, about 80% of 23andMe's customers have opted in to having their genetic data analyzed for medical research. "This rate has held steady for many years," Kill added. The company has an agreement with pharmaceutical giant GlaxoSmithKline, or GSK, that allows the drugmaker to tap the tech company's customer data to develop new treatments for disease. Anya Prince, a law professor at the University of Iowa's College of Law who focuses on genetic privacy, said those worried about their sensitive DNA information may not realize just how few federal protections exist. For instance, the Health Insurance Portability and Accountability Act, also known as HIPAA, does not apply to 23andMe since it is a company outside of the health care realm. "HIPAA does not protect data that's held by direct-to-consumer companies like 23andMe," she said.

Although DNA data has no federal safeguards, some states, like California and Florida, do give consumers rights over their genetic information. "If customers are really worried, they could ask for their samples to be withdrawn from these databases under those laws," said Prince. According to the company, all of its genetic data is anonymized, meaning there is no way for GSK, or any other third party, to connect the sample to a real person. That, however, could make it nearly impossible for a customer to renege on their decision to allow researchers to access their DNA data. "I couldn't go to GSK and say, 'Hey, my sample was given to you -- I want that taken out -- if it was anonymized, right? Because they're not going to re-identify it just to pull it out of the database," Prince said.

Vera Eidelman, a staff attorney with the American Civil Liberties Union who specializes in privacy and technology policy, said the patchwork of state laws governing DNA data makes the generic data of millions potentially vulnerable to being sold off, or even mined by law enforcement. "Having to rely on a private company's terms of service or bottom line to protect that kind of information is troubling -- particularly given the level of interest we've seen from government actors in accessing such information during criminal investigations," Eidelman said. She points to how investigators used a genealogy website to identify the man known as the Golden State Killer, and how police homed in on an Idaho murder suspect by turning to similar databases of genetic profiles. "This has happened without people's knowledge, much less their express consent," Eidelman said.

Neither case relied on 23andMe, and spokesperson Kill said the company does not allow law enforcement to search its database. The company has, however, received subpoenas to access its genetic information. According to 23andMe's transparency report, authorities have sought genetic data on 15 individuals since 2015, but the company has resisted the requests and never produced data for investigators. "We treat law enforcement inquiries, such as a valid subpoena or court order, with the utmost seriousness. We use all legal measures to resist any and all requests in order to protect our customers' privacy," Kill said. [...] In a September filing to financial regulators, [23andMe CEO Anne Wojcicki] wrote: "I remain committed to our customers' privacy and pledge," meaning the company's rules requiring consent for DNA to be used for research would remain in place, as well as allowing customers to delete their data. Wojcicki added that she is no longer considering offers to buy the company after previously saying she was.

AI

'Forget ChatGPT: Why Researchers Now Run Small AIs On Their Laptops' (nature.com) 48

Nature published an introduction to running an LLM locally, starting with the example of a bioinformatician who's using AI to generate readable summaries for his database of immune-system protein structures. "But he doesn't use ChatGPT, or any other web-based LLM." He just runs the AI on his Mac... Two more recent trends have blossomed. First, organizations are making 'open weights' versions of LLMs, in which the weights and biases used to train a model are publicly available, so that users can download and run them locally, if they have the computing power. Second, technology firms are making scaled-down versions that can be run on consumer hardware — and that rival the performance of older, larger models. Researchers might use such tools to save money, protect the confidentiality of patients or corporations, or ensure reproducibility... As computers get faster and models become more efficient, people will increasingly have AIs running on their laptops or mobile devices for all but the most intensive needs. Scientists will finally have AI assistants at their fingertips — but the actual algorithms, not just remote access to them.
The article's list of small open-weights models includes Meta's Llama, Google DeepMind's Gemma, Alibaba's Qwen, Apple's DCLM, Mistral's NeMo, and OLMo from the Allen Institute for AI. And then there's Microsoft: Although the California tech firm OpenAI hasn't open-weighted its current GPT models, its partner Microsoft in Redmond, Washington, has been on a spree, releasing the small language models Phi-1, Phi-1.5 and Phi-2 in 2023, then four versions of Phi-3 and three versions of Phi-3.5 this year. The Phi-3 and Phi-3.5 models have between 3.8 billion and 14 billion active parameters, and two models (Phi-3-vision and Phi-3.5-vision) handle images1. By some benchmarks, even the smallest Phi model outperforms OpenAI's GPT-3.5 Turbo from 2023, rumoured to have 20 billion parameters... Microsoft used LLMs to write millions of short stories and textbooks in which one thing builds on another. The result of training on this text, says Sébastien Bubeck, Microsoft's vice-president for generative AI, is a model that fits on a mobile phone but has the power of the initial 2022 version of ChatGPT. "If you are able to craft a data set that is very rich in those reasoning tokens, then the signal will be much richer," he says...

Sharon Machlis, a former editor at the website InfoWorld, who lives in Framingham, Massachusetts, wrote a guide to using LLMs locally, covering a dozen options.

The bioinformatician shares another benefit: you don't have to worry about the company updating their models (leading to different outputs). "In most of science, you want things that are reproducible. And it's always a worry if you're not in control of the reproducibility of what you're generating."

And finally, the article reminds readers that "Researchers can build on these tools to create custom applications..." Whichever approach you choose, local LLMs should soon be good enough for most applications, says Stephen Hood, who heads open-source AI at the tech firm Mozilla in San Francisco. "The rate of progress on those over the past year has been astounding," he says. As for what those applications might be, that's for users to decide. "Don't be afraid to get your hands dirty," Zakka says. "You might be pleasantly surprised by the results."
AI

Ellison Declares Oracle 'All In' On AI Mass Surveillance 114

Oracle cofounder Larry Ellison envisions AI as the backbone of a new era of mass surveillance, positioning Oracle as a key player in AI infrastructure through its unique networking architecture and partnerships with AWS and Microsoft. The Register reports: Ellison made the comments near the end of an hour-long chat at the Oracle financial analyst meeting last week during a question and answer session in which he painted Oracle as the AI infrastructure player to beat in light of its recent deals with AWS and Microsoft. Many companies, Ellison touted, build AI models at Oracle because of its "unique networking architecture," which dates back to the database era.

"AI is hot, and databases are not," he said, making Oracle's part of the puzzle less sexy, but no less important, at least according to the man himself - AI systems have to have well-organized data, or else they won't be that valuable. The fact that some of the biggest names in cloud computing (and Elon Musk's Grok) have turned to Oracle to run their AI infrastructure means it's clear that Oracle is doing something right, claimed now-CTO Ellison. "If Elon and Satya [Nadella] want to pick us, that's a good sign - we have tech that's valuable and differentiated," Ellison said, adding: One of the ideal uses of that differentiated offering? Maximizing AI's pubic security capabilities.

"The police will be on their best behavior because we're constantly watching and recording everything that's going on," Ellison told analysts. He described police body cameras that were constantly on, with no ability for officers to disable the feed to Oracle. Even requesting privacy for a bathroom break or a meal only meant sections of recording would require a subpoena to view - not that the video feed was ever stopped. AI would be trained to monitor officer feeds for anything untoward, which Ellison said could prevent abuse of police power and save lives. [...] "Citizens will be on their best behavior because we're constantly recording and reporting," Ellison added, though it's not clear what he sees as the source of those recordings - police body cams or publicly placed security cameras. "There are so many opportunities to exploit AI," he said.
Programming

The Rust Foundation is Reviewing and Improving Rust's Security (i-programmer.info) 22

The Rust foundation is making "considerable progress" on a complete security audit of the Rust ecosystem, according to the coding news site I Programmer, citing a newly-released report from the nonprofit Rust foundation: The foundation is investigating the development of a Public Key Infrastructure (PKI) model for the Rust language, including the design and implementation for a PKI CA and a resilient Quorum model for the project to implement, and the report says that language updates suggested by members of the Project were nearly ready for implementation.

Following the XZ backdoor vulnerability, the Security Initiative has focused on supply chain security, including work on provenance-tracking, verifying that a given crate is actually associated with the repository it claims to be. The top 5,000 crates by download count have been checked and verified.

Threat modeling has now been completed on the Crates ecosystem. Rust Infrastructure, crates.io and the Rust Project.

Two open source security tools, Painter and Typomania, have been developed and released. Painter can be used to build a graph database of dependencies and invocations between all crates within the crates.io ecosystem, including the ability to obtain 'unsafe' statistics, better call graph pruning, and FFI boundary mapping. Typomania ports typogard to Rust, and can be used to detect potential typosquatting as a reusable library that can be adapted to any registry.

They've also tightened admin privileges for Rust's package registry, according to the article. And "In addition to the work on the Security Initiative, the Foundation has also been working on improving interoperability between Rust and C++, supported by a $1 million contribution from Google."

According to the Rust foundation's technology director, they've made "impressive technical strides and developed new strategies to reinforce the safety, security, and longevity of the Rust programming language." And the director says the new report "paints a clear picture of the impact of our technical projects like the Security Initiative, Safety-Critical Rust Consortium, infrastructure and crates.io support, Interop Initiative, and much more."
United States

'The IRS Says There's Always Next Year' (msn.com) 131

The tax agency again delays a vital software upgrade, at the cost of billions. WSJ's Editorial Board: Taxpayers endure drudgery to file on time each year, but the tax collectors seem less concerned with deadlines. A new Internal Revenue Service database, more than a decade in the making, will be delayed another year. And its cost is billions of dollars and climbing. The IRS told the press this week that it won't replace its Individual Master File until the 2026 tax year, at the earliest. That falls short of Commissioner Danny Werfel's goal of launching a new system in time for 2025 taxes, and the delay could mean another year of grief for countless taxpayers. The file is the digital silo in which more than 154 million tax files are held, and keeping it up-to-date helps to enable speedy, accurate refunds.

The code that powers the database was written in the 1960s by IBM engineers at the same time their colleagues worked on the Apollo program. The system runs on a nearly extinct computer language known as Cobol, and though it retains its basic functionality, maintaining it requires bespoke service. By 2018 the IRS had only 17 remaining developers considered to be experts on the system. The agency has sought and failed to overhaul or replace the database since the 1980s. It spent $4 billion over 14 years to devise upgrades, but it canceled that effort in 2000 "without receiving expected benefits," according to the Government Accountability Office.

The costs continue to mount. IRS spending on operating and maintaining its IT systems has risen 35% in the past four years, to $2.7 billion last year from $2 billion in 2019. These costs will "likely continue to increase until a majority of legacy systems are decommissioned," according to a report last month by the agency's inspector general. Each year major upgrades are pushed back adds a larger sum to the final tab. The IRS usually pleads poverty as an excuse for failing to stay up-to-date. Yet Congress gave the agency billions of extra dollars through the Inflation Reduction Act to fund a speedy database overhaul. Since 2022 it has spent $1.3 billion beyond its ordinary budget to modernize its business systems. Taxpayers will have to wait at least another year to see if that investment has paid off.

Oracle

'Oracle's Missteps in Cloud Computing Are Paying Dividends in AI' (msn.com) 26

Oracle missed the tech industry's move to cloud computing last decade and ended up an also-ran. Now the AI boom has given it another shot. WSJ: The 47-year-old company that made its name on relational database software has emerged as an attractive cloud-computing provider for AI developers such as OpenAI, sending its long-stagnant stock to new heights. Oracle shares are up 34% since January, well outpacing the Nasdaq's 14% rise and those of bigger competitors Microsoft, Amazon.com and Google.

It is a surprising revitalization for a company many in the tech industry had dismissed as a dinosaur of a bygone, precloud era. Oracle appears to be successfully making a case to investors that it has become a strong fourth-place player in a cloud market surging thanks to AI. Its lateness to the game may have played to its advantage, as a number of its 162 data centers were built in recent years and are designed for the development of AI models, known as training.

In addition, Oracle isn't developing its own large AI models that compete with potential clients. The company is considered such a neutral and unthreatening player that it now has partnerships with Microsoft, Google and Amazon, all of which let Oracle's databases run in their clouds. Microsoft is also running its Bing AI chatbot on Oracle's servers.

The Courts

Clearview AI Fined $33.7 Million Over 'Illegal Database' of Faces (apnews.com) 40

An anonymous reader quotes a report from the Associated Press: The Dutch data protection watchdog on Tuesday issued facial recognition startup Clearview AI with a fine of $33.7 million over its creation of what the agency called an "illegal database" of billion of photos of faces. The Netherlands' Data Protection Agency, or DPA, also warned Dutch companies that using Clearview's services is also banned. The data agency said that New York-based Clearview "has not objected to this decision and is therefore unable to appeal against the fine."

But in a statement emailed to The Associated Press, Clearview's chief legal officer, Jack Mulcaire, said that the decision is "unlawful, devoid of due process and is unenforceable." The Dutch agency said that building the database and insufficiently informing people whose images appear in the database amounted to serious breaches of the European Union's General Data Protection Regulation, or GDPR. "Facial recognition is a highly intrusive technology, that you cannot simply unleash on anyone in the world," DPA chairman Aleid Wolfsen said in a statement. "If there is a photo of you on the Internet -- and doesn't that apply to all of us? -- then you can end up in the database of Clearview and be tracked. This is not a doom scenario from a scary film. Nor is it something that could only be done in China," he said. DPA said that if Clearview doesn't halt the breaches of the regulation, it faces noncompliance penalties of up to $5.6 million on top of the fine.
Mulcaire said Clearview doesn't fall under EU data protection regulations. "Clearview AI does not have a place of business in the Netherlands or the EU, it does not have any customers in the Netherlands or the EU, and does not undertake any activities that would otherwise mean it is subject to the GDPR," he said.
United States

Investigation Finds 'Little Oversight' Over Crucial Supply Chain for US Election Software (politico.com) 94

Politico reports U.S. states have no uniform way of policing the use of overseas subcontractors in election technology, "let alone to understand which individual software components make up a piece of code."

For example, to replace New Hampshire's old voter registration database, state election officials "turned to one of the best — and only — choices on the market," Politico: "a small, Connecticut-based IT firm that was just getting into election software." But last fall, as the new company, WSD Digital, raced to complete the project, New Hampshire officials made an unsettling discovery: The firm had offshored part of the work. That meant unknown coders outside the U.S. had access to the software that would determine which New Hampshirites would be welcome at the polls this November.

The revelation prompted the state to take a precaution that is rare among election officials: It hired a forensic firm to scour the technology for signs that hackers had hidden malware deep inside the coding supply chain. The probe unearthed some unwelcome surprises: software misconfigured to connect to servers in Russia ["probably by accident," they write later] and the use of open-source code — which is freely available online — overseen by a Russian computer engineer convicted of manslaughter, according to a person familiar with the examination and granted anonymity because they were not authorized to speak about it... New Hampshire officials say the scan revealed another issue: A programmer had hard-coded the Ukrainian national anthem into the database, in an apparent gesture of solidarity with Kyiv.

None of the findings amounted to evidence of wrongdoing, the officials said, and the company resolved the issues before the new database came into use ahead of the presidential vote this spring. This was "a disaster averted," said the person familiar with the probe, citing the risk that hackers could have exploited the first two issues to surreptitiously edit the state's voter rolls, or use them and the presence of the Ukrainian national anthem to stoke election conspiracies. [Though WSD only maintains one other state's voter registration database — Vermont] the supply-chain scare in New Hampshire — which has not been reported before — underscores a broader vulnerability in the U.S. election system, POLITICO found during a six-month-long investigation: There is little oversight of the supply chain that produces crucial election software, leaving financially strapped state and county offices to do the best they can with scant resources and expertise.

The technology vendors who build software used on Election Day face razor-thin profit margins in a market that is unforgiving commercially and toxic politically. That provides little room for needed investments in security, POLITICO found. It also leaves states with minimal leverage over underperforming vendors, who provide them with everything from software to check in Americans at their polling stations to voting machines and election night reporting systems. Many states lack a uniform or rigorous system to verify what goes into software used on Election Day and whether it is secure.

The article also points out that many state and federal election officials "insist there has been significant progress" since 2016, with more regular state-federal communication. "The Cybersecurity and Infrastructure Security Agency, now the lead federal agency on election security, didn't even exist back then.

"Perhaps most importantly, more than 95% of U.S. voters now vote by hand or on machines that leave some type of paper trail, which officials can audit after Election Day."
Open Source

Open Source Redis Fork 'Valkey' Has Momentum, Improvements, and Speed, Says Dirk Hohndel (thenewstack.io) 16

"Dirk Hohndel, a Linux kernel developer and long-time open source leader, wanted his audience at KubeCon + CloudNativeCon + Open Source Summit China 2024 Summit China to know he's not a Valkey developer," writes Steven J. Vaughan-Nichols. "He's a Valkey user and fan." [Hohndel] opened his speech by recalling how the open source, high-performance key/value datastore Valkey had been forked from Redis... Hohndel emphasized that "forks are good. Forks are one of the key things that open source licenses are for. So, if the maintainer starts doing things you don't like, you can fork the code under the same license and do better..." In this case, though, Redis had done a "bait-and-switch" with the Redis code, Hohndale argued. This was because they had made an all-too-common business failure: They hadn't realized that "open source is not a business model...."

While the licensing change is what prompted the fork, Hohndel sees leadership and technical reasons why the Valkey fork is likely to succeed. First, two-thirds of the formerly top Redis maintainers and developers have switched to Valkey. In addition, AWS, Google Cloud, and Oracle, under the Linux Foundation's auspices, all support Valkey. When both the technical and money people agree, good things can happen.

The other reason is that Valkey already looks like it will be the better technical choice. That's because the recently announced Valkey 8.0, which builds upon the last open source version of Redis, 7.2.4, introduces serious speed improvements and new features that Redis users have wanted for some time. As [AWS principal engineer Madelyn] Olson said at Open Source Summit North America earlier this year, "Redis really didn't want to break anything." Valkey wants to move a bit faster. How much faster? A lot. Valkey 8.0 overhauls Redis's single-threaded event loop threading model with a more sophisticated multithreaded approach to I/O operations. Hohndel reported that on his small Valkey-powered aircraft tracking system, "I see roughly a threefold improvement in performance, and I stream a lot of data, 60 million data points a day."

The article notes that Valkey is already being supported by major Linux distros including AlmaLinux, Fedora, and Alpine.
Idle

How a Group of Teenagers Pranked 'One Million Checkboxes' (kottke.org) 20

After game developer Nolen Royalty launched his short-lived viral site "One Million Checkboxes" in June. (Any visitor could check or uncheck a box in the grid — which would change how it displayed for every other visitor to the site, in near real-time.) "Within days there were half a million people on the site," he says in a new video, "and people checked over 650 million boxes in the two weeks that I kept the site online."

But he also explains how what happened next was even more amazing: He'd stored the state of his one million checkboxes in a million-bit database — 125 kilobytes — and got a surprise after rewriting the backend in Go. Looking at the raw bytes (converted into their value in the 256-character ASCII table)... they spelled out a URL.

Had someone hacked into his database? No, the answer was even stranger. Somebody was writing me a message in binary."

"Someone was sitting there, checking and unchecking boxes to form numbers that formed letters that spelled out this URL. And they were probably doing this with a bot, to make sure those boxes remained checked and unchecked in exactly the way that they wanted them to." The URL led to a Discord channel, where he found himself talking to the orchestrators of the elaborate prank. And it was then that they asked him: "Have you seen your checkboxes as a 1,000 x 1,000 image yet?" It turns out they'd also input two alternate versions of the same message — one in base64, and one drawn out as a fully-functional QR code. (And some drawings....)

"The Discord was full of very sharp teens, and they were writing this message in secret — with tens of thousands of people on the web site — to gather other very sharp teens. And it totally worked. There were 15 people when I joined, over 60 people in the Discord by the time that i left.

"I tried to make it hard for them to draw, but... no problem. They found a way. And they started drawing some very cool things. They put a Windows blue-screen-of-death on the site. They put sexy Jake Gyllenhaal gifs on the site. At the end I removed all my rate limits for an hour as a treat, and they did a real-time [animated] Rickroll across the entire site."

The video ends with the webmaster explaining why he thought their project was so cool. "As I kid, I spent a lot of time doing dum stuff on the computer, and I didn't get into too much trouble when I, for example, repeatedly crashed my high school mail server. There is no way that I would be doing what I do now without the encouragement of people back then. So providing a playground like this, getting to see what they were doing, provide some encouragement and say, 'Hey this is amazing' — was so special for me.

"The people in that Discord are so extraordinarily talented, so creative, so cool, I cannot wait to see what they go on to make."

Link via Kottke.org
Television

ESPN's 'Where To Watch' Tries To Solve Sports' Most Frustrating Problem (arstechnica.com) 67

An anonymous reader quotes a report from Ars Technica: Too often, new tech product or service launches seem like solutions in search of a problem, but not this one: ESPN is launching software that lets you figure out just where you can watch the specific game you want to see amid an overcomplicated web of streaming services, cable channels, and arcane licensing agreements. Every sports fan is all too familiar with today's convoluted streaming schedules. Launching today on ESPN.com and the various ESPN mobile and streaming device apps, the new guide offers various views, including one that lists all the sporting events in a single day and a search function, among other things. You can also flag favorite sports or teams to customize those views.

"At the core of Where to Watch is an event database created and managed by the ESPN Stats and Information Group (SIG), which aggregates ESPN and partner data feeds along with originally sourced information and programming details from more than 250 media sources, including television networks and streaming platforms," ESPN's press release says. ESPN previously offered browsable lists of games like this, but it didn't identify where you could actually watch all the games. There's no guarantee that you'll have access to the services needed to watch the games in the list, though. Those of us who cut the cable cord long ago know that some games -- especially those local to your city -- are unavailable without cable.

The Internet

Ikea Takes On Craigslist With Classifieds Site For Its Used Furniture (arstechnica.com) 40

An anonymous reader quotes a report from the Financial Times: Ikea is taking on the likes of eBay, Craigslist, and Gumtree with a peer-to-peer marketplace for customers to sell secondhand furniture to each other. Ikea Preowned will be tested in Madrid and Oslo until the end of the year with the aim of rolling out the buying and selling platform globally, according to Jesper Brodin, chief executive of Ingka, the main operator of Ikea stores. [...] Ikea has had a small offering under which it buys used furniture from customers and resells it in store. But the new platform is more ambitious, aiming to tackle the secondhand market for customers selling directly to each other -- an area where Brodin estimates Ikea has a higher market share than in new furniture sales. Customers enter their product, their own pictures, and a selling price, while Ikea's own artificial intelligence-enabled database brings in its own promotional images and measurements. The buyer collects the furniture directly from the seller, who has the option of receiving money or a voucher from Ikea with a 15 percent bonus.

"Very often there is a monopoly or oligopoly on platforms that operate," said Brodin, talking about eBay or digital classified ad services such as Gumtree in the UK and Finn in Norway. Finn has 8,700 items from Ikea listed in Oslo alone. Early offerings on Ikea Preowned include large items such as sofas for up to $670 (600 euros) and wardrobes for $500 (450 euros) as well as smaller items such as a toilet roll holder for $4.50 (4 euros). Listings are free, but Brodin said Ikea could eventually charge "a symbolic fee, a humble fee." He added: "We're going to verify the full scope including the economics. If a lot of people use the offer to get a discount with Ikea -- it's a good way to reconnect with customers. I am very curious. I think it makes business sense." Ikea has previously tested selling its new furniture on third-party platforms such as Alibaba's Tmall in China, but the Preowned platform marks its first foray into secondhand marketplaces. It also dovetails with the retailer's wish to become "circular and climate positive" by 2030.

Microsoft

Microsoft Will Try the Data-Scraping Windows Recall Feature Again in October (arstechnica.com) 62

Microsoft will begin sending a revised version of its controversial Recall feature to Windows Insider PCs beginning in October, according to an update published to the company's original blog post about the Recall controversy. From a report: The company didn't elaborate further on specific changes it's making to Recall beyond what it already announced in June.

For those unfamiliar, Recall is a Windows service that runs in the background on compatible PCs, continuously taking screenshots of user activity, scanning those screenshots with optical character recognition (OCR), and saving the OCR text and the screenshots to a giant searchable database on your PC. The goal, according to Microsoft, is to help users retrace their steps and dig up information about things they had used their PCs to find or do in the past.

Privacy

National Public Data Published Its Own Passwords (krebsonsecurity.com) 35

Security researcher Brian Krebs writes: New details are emerging about a breach at National Public Data (NPD), a consumer data broker that recently spilled hundreds of millions of Americans' Social Security Numbers, addresses, and phone numbers online. KrebsOnSecurity has learned that another NPD data broker which shares access to the same consumer records inadvertently published the passwords to its back-end database in a file that was freely available from its homepage until today. In April, a cybercriminal named USDoD began selling data stolen from NPD. In July, someone leaked what was taken, including the names, addresses, phone numbers and in some cases email addresses for more than 272 million people (including many who are now deceased). NPD acknowledged the intrusion on Aug. 12, saying it dates back to a security incident in December 2023. In an interview last week, USDoD blamed the July data leak on another malicious hacker who also had access to the company's database, which they claimed has been floating around the underground since December 2023.

Following last week's story on the breadth of the NPD breach, a reader alerted KrebsOnSecurity that a sister NPD property -- the background search service recordscheck.net -- was hosting an archive that included the usernames and password for the site's administrator. A review of that archive, which was available from the Records Check website until just before publication this morning (August 19), shows it includes the source code and plain text usernames and passwords for different components of recordscheck.net, which is visually similar to nationalpublicdata.com and features identical login pages. The exposed archive, which was named "members.zip," indicates RecordsCheck users were all initially assigned the same six-character password and instructed to change it, but many did not. According to the breach tracking service Constella Intelligence, the passwords included in the source code archive are identical to credentials exposed in previous data breaches that involved email accounts belonging to NPD's founder, an actor and retired sheriff's deputy from Florida named Salvatore "Sal" Verini.

Reached via email, Mr. Verini said the exposed archive (a .zip file) containing recordscheck.net credentials has been removed from the company's website, and that the site is slated to cease operations "in the next week or so." "Regarding the zip, it has been removed but was an old version of the site with non-working code and passwords," Verini told KrebsOnSecurity. "Regarding your question, it is an active investigation, in which we cannot comment on at this point. But once we can, we will [be] with you, as we follow your blog. Very informative." The leaked recordscheck.net source code indicates the website was created by a web development firm based in Lahore, Pakistan called creationnext.com, which did not return messages seeking comment. CreationNext.com's homepage features a positive testimonial from Sal Verini.

Programming

GitHub Promises 'Additional Guardrails' After Wednesday's Update Triggers Short Outage (githubstatus.com) 12

Wednesday GitHub "broke itself," reports the Register, writing that "the Microsoft-owned code-hosting outfit says it made a change involving its database infrastructure, which sparked a global outage of its various services."

Or, as the Verge puts it, GitHub experienced "some major issues" which apparently lasted for 36 minutes: When we first published this story, navigating to the main GitHub website showed an error message that said "no server is currently available to service your request," but the website was working again soon after. (The error message also featured an image of an angry unicorn.) GitHub's report of the incident also listed problems with things like pull requests, GitHub Pages, Copilot, and the GitHub API.
GitHub attributed the downtime to "an erroneous configuration change rolled out to all GitHub.com databases that impacted the ability of the database to respond to health check pings from the routing service. As a result, the routing service could not detect healthy databases to route application traffic to. This led to widespread impact on GitHub.com starting at 23:02 UTC." (Downdetector showed "more than 10,000 user reports of problems," according to the Verge, "and that the problems were reported quite suddenly.")

GitHub's incident report adds that "Given the severity of this incident, follow-up items are the highest priority work for teams at this time." To prevent recurrence we are implementing additional guardrails in our database change management process. We are also prioritizing several repair items such as faster rollback functionality and more resilience to dependency failures.
Privacy

National Public Data Confirms Breach Exposing Social Security Numbers (bleepingcomputer.com) 56

BleepingComputer's Ionut Ilascu reports: Background check service National Public Data confirms that hackers breached its systems after threat actors leaked a stolen database with millions of social security numbers and other sensitive personal information. The company states that the breached data may include names, email addresses, phone numbers, social security numbers (SSNs), and postal addresses.

In the statement disclosing the security incident, National Public Data says that "the information that was suspected of being breached contained name, email address, phone number, social security number, and mailing address(es)." The company acknowledges the "leaks of certain data in April 2024 and summer 2024" and believes the breach is associated with a threat actor "that was trying to hack into data in late December 2023." NPD says they investigated the incident, cooperated with law enforcement, and reviewed the potentially affected records. If significant developments occur, the company "will try to notify" the impacted individuals.

Programming

'The Best, Worst Codebase' 29

Jimmy Miller, programmer and co-host of the future of coding podcast, writes in a blog: When I started programming as a kid, I didn't know people were paid to program. Even as I graduated high school, I assumed that the world of "professional development" looked quite different from the code I wrote in my spare time. When I lucked my way into my first software job, I quickly learned just how wrong and how right I had been. My first job was a trial by fire, to this day, that codebase remains the worst and the best codebase I ever had the pleasure of working in. While the codebase will forever remain locked by proprietary walls of that particular company, I hope I can share with you some of its most fun and scary stories.

[...] Every morning at 7:15 the employees table was dropped. All the data completely gone. Then a csv from adp was uploaded into the table. During this time you couldn't login to the system. Sometimes this process failed. But this wasn't the end of the process. The data needed to be replicated to headquarters. So an email was sent to a man, who every day would push a button to copy the data.

[...] But what is a database without a codebase. And what a magnificent codebase it was. When I joined everything was in Team Foundation Server. If you aren't familiar, this was a Microsoft-made centralized source control system. The main codebase I worked in was half VB, half C#. It ran on IIS and used session state for everything. What did this mean in practice? If you navigated to a page via Path A or Path B you'd see very different things on that page. But to describe this codebase as merely half VB, half C# would be to do it a disservice. Every javascript framework that existed at the time was checked into this repository. Typically, with some custom changes the author believed needed to be made. Most notably, knockout, backbone, and marionette. But of course, there was a smattering of jquery and jquery plugins.
Security

Thousands of Corporate Secrets Were Left Exposed. This Guy Found Them All 7

Security researcher Bill Demirkapi unveiled a massive trove of leaked developer secrets and website vulnerabilities at the Defcon conference in Las Vegas. Using unconventional data sources, Demirkapi identified over 15,000 exposed secrets, including credentials for Nebraska's Supreme Court IT systems and Stanford University's Slack channels.

The researcher also discovered 66,000 websites with dangling subdomain issues, making them vulnerable to attacks. Among the affected sites was a New York Times development domain. Demirkapi's tack involved scanning VirusTotal's database and passive DNS replication data to identify vulnerabilities at scale. He developed an automated method to revoke exposed secrets, working with companies like OpenAI to implement self-service deactivation of compromised API keys.

Slashdot Top Deals