




The Future of Databases
gManZboy writes "Ever wonder where database technology is going? This is something that Turing award winner Jim Gray from Microsoft has given a lot of thought to. He recently published an article in which he looks at the many forces pushing database technologies forward, and what those new technologies will look like. Gray writes, 'the greatest of these [research challenges] will have to do with the unification of approximate and exact reasoning. Most of us come from the exact-reasoning world -- but most of our clients are now asking questions that require approximate or probabilistic answers.'"
Turing award winner? (Score:5, Funny)
Re:Turing award winner? (Score:2, Insightful)
Re:Turing award winner? (Score:3, Funny)
--
He must not be a slashdot user then.
--
Why do you say He must not be a slashdot user then.?
Re:Turing award winner? (Score:3, Funny)
approximate answers.. (Score:5, Funny)
Re:approximate answers.. (Score:2, Offtopic)
>
> * 42..ish
Larry: Of course, he only had the two arms and the one head, and he called himself Jim Gray.
Melinda: But you must admit, he did turn out to be from another planet.
Larry: By my yacht! Melinda Gates!
it.slashdot.org: Infinity minus 1. Improbability sum now complete.
Larry: What are you doing here?
Melinda: With a degree in human-computer interaction and another i
LIKE '%approximate answers..%' (Score:2)
[1] http://lucene.apache.org/ [apache.org]
[2] http://incubator.apache.org/lucene4c/ [apache.org]
Why complicate things so much? (Score:5, Interesting)
In my opinion, the future of databases is nothing so complicated as pitched here -- but rather a move to simpler, more reliable back ends where the filesystem is the database. This is certainly the vision pitched by Hans Reiser and reiserfs [namesys.com], which aims to put more database like intelligence within the filesystem. So you eliminate extra unnecessary layers that just eat up resources and create fragile databases.
Re:Why complicate things so much? (Score:4, Informative)
Certainly that's not going to lead to more crashes.
Certainly it's a better idea than, for example, distributing the databases and using load balancing and regularly scheduled backups to ameliorate the loss of the least reliable portion of a database's design - the hard drives.
When you've only got a hammer, everything seems like a nail...what does Hans Reiser do? He could be right. Microsoft is jumping on the filesystem-database wagon with their new filesystem, and we all know that if anyone knows and cares about reliability it's Microsoft.
Re:Why complicate things so much? (Score:4, Informative)
No, they're not. WinFS is *not* a filesystem, it's a DB layer that sits on top of the filesystem.
And when you consider NTFS *on its own* (like BFS) has the capabilities to do most of what WinFS is supposed to achieve, WinFS just looks sillier and sillier...
Re:Why complicate things so much? (Score:5, Insightful)
Re:Why complicate things so much? (Score:2)
Re:Why complicate things so much? (Score:2)
Re:Why complicate things so much? (Score:3, Interesting)
That's not very big. It's downright small, in fact.
These figures, on one of many systems I manage, are about 30 minutes old. And they don't include index space, rollforward logs, etc, etc.
Names have been changed for privacy, of course.
TABLE_NAME  CARDINALITY  TOT_BYTES
TABLE_1     850,719,662  195,665,522,260
TABLE_2     756,309,106  223,867,495,376
TABLE_3     317,181,446  72,951,732,580
TABLE_4     179,099,344  11,462,358,0
Re:Why complicate things so much? (Score:3, Insightful)
I must be missing something.
Jason.
Re:Why complicate things so much? (Score:2)
GP thinks that 150GB is a large database.
Atomicity in filestores is a great benefit (Score:2, Informative)
Databases require a mechanism for atomicity to create their transactions, and because no common operating system has ever provided such, they need to implement it themselves at application level. It's like the bad old days before PCs provided networking, and you had to run up your own networking stack if your application needed comms.
Well reiserfs has the goal of pro
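For anyone who hasn't had to do it, the application-level workaround for a single file usually looks something like the sketch below: write a temporary copy, then rename it over the original. The function name `atomic_write` is made up for illustration, and this only covers one file at a time; it is nowhere near the multi-file transactions a filesystem with real atomicity support could offer.

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` so readers see either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        # Atomic on POSIX when both paths are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Every application that cares has to carry its own version of this, which is the point being made above.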
Re:Atomicity in filestores is a great benefit (Score:3, Informative)
Re:Atomicity in filestores is a great benefit (Score:3, Interesting)
Re:Why complicate things so much? (Score:3, Interesting)
Do we evolve the file system into a database (Reiser approach) or evolve a database into a file system (Microsoft WinFS approach)?
Re:Why complicate things so much? (Score:2)
Note: huge sites don't go down because of 1% performance loss due to an extra filesystem layer.
And I doubt very much that applications that use filebased storage instead of a r
Re:Why complicate things so much? (Score:2)
Well, I suppose we could go back to the original UNIX way of doing things:
Everything is a string of bytes on the hard disk.
No file system at all.
Somehow I don't think that's going to take off. Grep is great - but it's not THAT great.
Re:Why complicate things so much? (Score:3, Insightful)
How many times? Not all that many, in my experience. How many times when the sites were running off a hardy RDBMS like Oracle, rather than something in the MySQL range? Even fewer.
Of course, "websites going down" is not exactly the best indicator of database reliability in the first place...
While you're proposing making database
Re:Why complicate things so much? (Score:2)
Works for search, not quite so well for concurrent updates, and the security controls would sort of suck.
You'd be loading all the locks back into the applications layer. Google model works fairly well for lookup, distributed db leafs across multiple (many tuples) systems. However I suspect the lock mechanism behind the search, the population of the database, would be fairly arcane, and not general-purpose at all.
A real problem comes full circle (Score:5, Interesting)
Add code to it, and you have data+code.
Of course, code is data, and thus data can be treated as code, and handled by other code. LISP does this moderately well.
But you can't avoid the fact that, as it stands, databases are just engines for keeping your data structures outside your code, or when you add code to them, engines for reading your data structure for you so that you don't have to think about how to do it.
I'm getting rather tired of the fad that databases should be tacked on to everything, ranging from a shopping list to guidance systems. When did adding overhead become the mark of skill?
Re:A real problem comes full circle (Score:4, Insightful)
If you want to be able to ask probabilistic type queries of a database, you need to add some code between you and the database.
More to the point, the fuzzier your logic is, the higher the probability that your database will not contain all of the answers on its own, and you will have to cross reference your data to the data owned by someone else or gathered from a different disparate source.
It sounds like M$ is going to try to re-invent data warehousing? and then of course, patent it.
Trying to make the database do everything is not right and simply doesn't make sense. The code that accesses the data for you needs to do the fuzzy probabilistic stuff.
P.S. I have no faith that M$ (no matter who they hire) can effectively provide the code required to make it work in the idealistic manner spoken of... mostly because they would have to patent accessing other people's data before they could do it.
Just my thoughts
Re:A real problem comes full circle (Score:3, Insightful)
To my mind databases are broken beyond belief.
Well let us know when you think of a
Re:A real problem comes full circle (Score:2, Insightful)
The second it became profitable to market it as such.
KFG
Thank you, Wikipedia (Score:2)
. . . I no longer have to debunk the sweeping claims of an AC--I can simply provide a link [wikipedia.org].
Great Article (Score:5, Insightful)
Now that data mining is a $[insert large number here]million industry, databases are being asked to do a lot more processing with this data than before. For example: old database query = get these attributes from tuples that match this pattern. New database query = determine how likely a user who has accessed 30 or more times this last month is to subscribe to the second-level pay service within the next ninety days, with or without an email advertising said service.
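To make the contrast concrete, here is a rough sketch using Python's built-in `sqlite3`. The `users` table, its columns, and the ten rows are all invented for illustration, and a plain observed frequency is the crudest possible stand-in for a real predictive model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: subscribed = 1 means the user took the pay service within ninety days.
conn.execute(
    "CREATE TABLE users (id INTEGER, logins_last_month INTEGER, "
    "got_email INTEGER, subscribed INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [  # made-up toy rows, not real data
        (1, 35, 1, 1), (2, 40, 1, 1), (3, 31, 1, 0), (4, 50, 1, 1),
        (5, 33, 0, 0), (6, 45, 0, 1), (7, 30, 0, 0), (8, 38, 0, 0),
        (9, 5, 1, 0), (10, 12, 0, 0),
    ],
)

# Old-style query: exact pattern match, exact answer.
heavy_users = [row[0] for row in conn.execute(
    "SELECT id FROM users WHERE logins_last_month >= 30 ORDER BY id")]

# New-style question: how likely is a heavy user to subscribe,
# with or without the email? Here it is just an observed frequency.
likelihood = conn.execute(
    "SELECT got_email, AVG(subscribed) FROM users "
    "WHERE logins_last_month >= 30 GROUP BY got_email ORDER BY got_email"
).fetchall()

print(heavy_users)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(likelihood)   # [(0, 0.25), (1, 0.75)]
```

The first answer is a set of rows; the second is an estimate that ought to come with error bars, which is where the exact-reasoning tools start to run out.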
In other words ... (Score:5, Insightful)
Fortunately, Microsoft will be there to take their money.
Re:In other words ... (Score:4, Funny)
If I had to pick between a magic glowy box and an MBA to show signs of intelligence, I'm definitely going with the magic glowy box.
Re:In other words ... (Score:2)
Re:In other words ... (Score:2)
Re:In other words ... (Score:2, Interesting)
Jeremy
Re:In other words ... (Score:5, Funny)
[MBA tool]"I want to come in in the morning, push a button, and have the program distribute all my stuff."
[me]"If I could make it do that, I could make it push its own button, and the company wouldn't need you anymore."
[MBA tool]"Oh."
$article_title by $blowhard. (Score:5, Funny)
Re:$clever_title (Score:3, Funny)
Accountable bitemporal DBs (Score:4, Informative)
In contrast, a secure bitemporal DB would record not only the date of what the data refers to (e.g., the purchase order was entered on March 3rd, 2004) but also the date(s) of any modifications of the data (the quantity and total were changed on December 31, 2004, Uh-Oh!).
This is more than just securing the DB with a hierarchy of privileges, it means that no one can overwrite the old data or change any data without creating an audit trail. This, of course, also means changes in the DB, OS or file system to make critical data only accessible through a secure DB layer that tracks changes (e.g., no accessible plain-text DB data structures). These same concepts could be used (probably are, for all I know) for OSS version control to track who did what and when to the code.
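A minimal sketch of the idea, using Python's `sqlite3`: the table is append-only, every correction is a new row, and each row carries both the date the fact refers to and the date it was recorded. The table and column names are assumptions for illustration, and triggers like these are not real security, since anyone who can alter the schema can drop them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE po_history (
    po_id       INTEGER NOT NULL,
    quantity    INTEGER NOT NULL,
    valid_date  TEXT NOT NULL,   -- when the fact applies in the real world
    recorded_at TEXT NOT NULL,   -- when this version was written to the DB
    recorded_by TEXT NOT NULL
);
-- Reject in-place changes: corrections must arrive as new rows.
CREATE TRIGGER po_no_update BEFORE UPDATE ON po_history
BEGIN SELECT RAISE(ABORT, 'po_history is append-only'); END;
CREATE TRIGGER po_no_delete BEFORE DELETE ON po_history
BEGIN SELECT RAISE(ABORT, 'po_history is append-only'); END;
""")

def record(po_id, quantity, valid_date, recorded_at, recorded_by):
    """Append one version of a purchase order."""
    conn.execute(
        "INSERT INTO po_history VALUES (?, ?, ?, ?, ?)",
        (po_id, quantity, valid_date, recorded_at, recorded_by),
    )

def as_recorded_on(po_id, as_of):
    """Return the quantity the database held on `as_of` (ISO date string)."""
    row = conn.execute(
        "SELECT quantity FROM po_history WHERE po_id = ? AND recorded_at <= ? "
        "ORDER BY recorded_at DESC LIMIT 1",
        (po_id, as_of),
    ).fetchone()
    return row[0] if row else None

record(1, 100, "2004-03-03", "2004-03-03", "clerk")
record(1, 40, "2004-03-03", "2004-12-31", "dba")  # the year-end change leaves a trail
```

Asking `as_recorded_on(1, "2004-06-30")` gives the original quantity, while asking as of 2005 gives the changed one, and both versions stay visible to an auditor.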
Re:Accountable bitemporal DBs (Score:2)
Re:Accountable bitemporal DBs (Score:4, Insightful)
Yeah, I've heard that one too. Reality has a way of factoring out the ambiguity of such abstract, open-ended claims.
One way to deal with the problem of DBAs and their ability to access/modify financial data is to register them with the exchange, just like the finance and executive types. Now they're Sarbanes-Oxley insider compliant! That's what has been done where I earn my living.
Thus, we may dispense with elaborate schemes of secure data version control using unspecified, hypothetical systems, paid for with budgets that don't exist. Next!
Until some future revision of Sarbanes-Oxley begins to specify the design and implementation of electronic finance systems, no one can claim a database is more or less susceptible to malfeasance than a locked filing cabinet. That's why the auditors stop once they've concluded you're changing your password with adequate frequency.
Re:Accountable bitemporal DBs (Score:2)
You mean your RDBMS doesn't have full auditing capabilities?
What are you using, SQL Server?
Any "enterprise" RDBMS worth its salt has had such features for 20 years.
Of course, before you enable full auditing, you'd better double your IO capacity, as well as increase your CPU capacity.
Re:Accountable bitemporal DBs (Score:2)
You misspelled 'unholy birth'.
I predict... (Score:4, Interesting)
That's... about... it...
Object relational was the "new thing" that didn't really take off as well as they'd hoped.
Hell, I work with people who still can't handle compound keys and joins well...
Re:I predict... (Score:2)
The worst is every oracle db runs on some expensive production environment hardware. Every MySQL db runs on a cheap PC. Until this is changed, oracle is stuck as the industry standard.
Re:I predict... (Score:2)
What are you smoking? I've seen oracle running on compaq proliant pentium pro systems, and i've worked with mysql installations running on sun enterprise 5000's. Oracle is not the industry standard because it runs on bigger hardware. Oracle is the industry standard because even though it is the biggest pain in the ass to
Re:I predict... (Score:2)
You're right. They should use PostgreSQL instead.
But seriously, an Oracle DBA (who's not dependent on GUIs) would feel at home with PostgreSQL 8.0.
You're thinking of Object Databases (Score:2)
You're thinking of Object databases, which indeed did not take off at all.
However Object-Relational systems are EVERYWHERE. There's hardly a big database anymore that doesn't have several object-relational mapping systems between it and code...
Object->Relational mappers have taken off in a big way, which is good in a way since the databases can remain the nice placid solid systems they've always been and you ca
I want clustered databases for high-availability (Score:5, Interesting)
Re:I want clustered databases for high-availabilit (Score:2)
Re:I want clustered databases for high-availabilit (Score:5, Informative)
>them act as a single database server. Our database server is the most expensive item in our datacenter
>because it's an N-way IBM server.
lol, IBM has supported *exactly* what you are talking about for at least five years.
That is, you can spread your db2 database across 10, 100, or 1000+ linux commodity boxes (ideally blades). Or you can use windows, or aix, or solaris, or hp-ux, etc. Of course, those individual boxes can be SMPs in their own right - so a thousand 8-way aix boxes is certainly possible, if not cheap.
Oracle is now in this game as well - oracle 10g can certainly support 32, and maybe 64 individual linux boxes in a cluster. The techniques are different between the two - oracle might be better at transactional systems. db2 is definitely better at data warehousing, data mining, etc.
Of course, there are still benefits to a big smp: a single P570 16-way will cost you $250k. But each of those 16 cpus is multi-core (and far faster than intel or amd), and with its micro-partitioning it can run at least 150 linux or aix lpars (logical partitions). These lpars can grow or shrink as they need - so you aren't always over-buying for size, buying new hardly-used hardware, or having to colocate apps on a busy server when a different os would be preferable. Not to say everyone should go this way - but there are definite benefits.
Re:I want clustered databases for high-availabilit (Score:2)
Re:I want clustered databases for high-availabilit (Score:2)
> from Digital was the cluster support. However it
> seems they took a long time to get the technology
> into their own RDMS.
I've got fond memories of an 800 gbyte billing & customer data warehouse on rdb around 1995 - giving sub-second access and running on a vms quad. That was such a slow system compared to what we've got now - but it sure handled a ton of data well.
On the other hand, I don't re
Re:I want clustered databases for high-availabilit (Score:3, Informative)
2 servers acting as a single database server has been available for many years...e.g., Oracle 9i RAC, Oracle 10g, DB/2's something or other, etc.
Re:I want clustered databases for high-availabilit (Score:2)
The "next great advancement" in databases will be when I can set up 2 or more linux servers and have them act as a single database server.
i.e. Out of the box, I can set up a database cluster. I'm talking about costs for HA. If I can dump the big iron for 2- or 4-way x86 servers, I'd save money. But if I need to pay a lot for support for Oracle's RAC, or a custom setup/installation, etc., then I'm not saving money. It all comes down to the bottom line.
Re:I want clustered databases for high-availabilit (Score:3, Funny)
Re:I want clustered databases for high-availabilit (Score:2)
Why would you want to?
With shared storage (hello, VAXcluster 1984!), you still have access to all of your data as long as one of the nodes stays up.
Re:I want clustered databases for high-availabilit (Score:2)
This actually works too. Stably too, go figure.
--chitlenz
Re:I want clustered databases for high-availabilit (Score:2)
Well, since Microsoft recommends running a separate server for every server function, I imagine they'd say if you want to run two SQL Server databases, you'd best use two SQL Server engines running on two separate Windows Servers on two separate hardware systems - for which of course, you pay for two licenses (and two more for the Windows Servers).
Of course, Oracle with their database layout basically says the same thing - except they want you to put your indexes, your tablespaces, your logs and everything
It warms my heart... (Score:3, Interesting)
I can see they've no hope of being any competition at all come the real db revolution.
Re:It warms my heart... (Score:2)
Thanks.
How it this news? (Score:2)
this is almost the same content as his SIGMOD 2004 speech,
which is available since April 2004
http://research.microsoft.com/research/pubs/view.aspx?tr_id=735 [microsoft.com]
How is the refurbishing of a one-year-old article news?
(And, BTW, I find the keynote speech better structured than the refurbishment)
Re:How it this news? (Score:2)
Personally, despite some of the crap on slashdot, I still stick around, mostly
That is what SAS is for... (Score:4, Insightful)
What are my chances of getting laid tonight...
What are the odds of my winning the lottery...
What are the chances that my boss will find out about that phony dinner receipt...
Seriously, SAS stat analysis software does exactly what this numbskull is talking about. You don't need a new kind of database, merely somebody with training in stats.
Re:That is what SAS is for... (Score:2)
Re:That is what SAS is for... (Score:2)
Over the course of my lifetime I have had the opportunity of working with a number of people who have won Nobel Prizes. Here are some clues:
- The Turing Prize is NOT an equivalent to the Nobel Prize. Not even close.
- Computer Science is NOT a science.
- Aliasing two disparate problem domains and then calling for a fusion to combine them results in a design that solves neither problem well.
Hmmm Databases (Score:5, Interesting)
Next, there was some inane reference to reiserfs above, which clearly ignores what a database fundamentally both is, and is becoming. It really began (and I hate to admit this as a former Solaris/Oracle admin) with SQL Server 7 and Oracle 8, and the concept that a database should be object programmable. Reiser is not going to be streaming still frames of image data fast enough to a remote client to rebuild seamlessly into a movie, for instance. Or recalculate all of a company's business logic for point of sale systems so that, for instance, the wrong type of credit card gets rejected, or so a supply chain gets populated; the list is endless. Reiser, and for that matter VFS and the myriad other database-enhanced filesystems, are tools. Good ones, but tools...
It's interesting to note that MS has finally figured out that the "n-tier" was a dumb idea. It's almost like, well you take all this shit, then sell it through a middle man, but expect to not have to pay him anything for brokering. Like, duh. We actively benchmarked this process, in fact, and discovered that it does, not surprisingly, take time to pass data through an extra server.
Workflow is life. It's what makes this page exist (SD is I believe run in MySQL). The idea of publishing-subscribers with atomic transactions is hardly new, but I agree with the authors that this is the direction of the market, simply because businesses now are getting spread all over. Read - If your job just went to India, learn to be a DBA, cuz when all that shit they sent over there comes back, you can bet it's going to be a mess (and is a mess actually already, which is why, in particular, people in ERP fields that intertwine with mine (as a DBA) demand and receive very large salaries, $200 US an hour is not unusual). The reason this particular ramble is relevant is that lots of global companies are either looking at, or are already implementing, the idea of data grids, where all the data servers inside a global network stay in sync. Suzy the secretary checks out a document in Baltimore, and that document flags as in use in Madrid through transactional replication within a kind of database trust-relationship network. It's a very very good way for companies with lots of data to keep it all together, but today it's still a pain in the ass to manage.
Vertical partitioning is pretty much worthless except to data warehousing installations, most of whom are probably running on strong equipment already (to have that much data). Not to mention, I believe (I'd have to check, since it's not a feature I'd really use) Oracle's 10G product allows for this already if you really want it. Materialized views are another point here that raises my hackles. This guy is writing about the wonder of materialized views and column partitions, which ARE a cool performance cheat in large systems, but make no mistake that by the time you get to this point, you are probably rearranging deck chairs on the Titanic anyway. Essentially, materialized views precache SQL resultsets into a temporary table which gets constantly updated so it can always provide a full resultset without having to parse the parent table. This is processor and space expensive. Vertical par
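For anyone who hasn't seen what "precache and constantly update" costs, here is a rough emulation in Python's `sqlite3`. SQLite has no materialized views, so this sketch fakes one with a summary table kept current by a trigger; the `sales` and `sales_by_region` names are invented, and only inserts are handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT NOT NULL, amount INTEGER NOT NULL);
-- The stand-in for a materialized view: a stored copy of
-- SELECT region, SUM(amount) FROM sales GROUP BY region
CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total INTEGER NOT NULL);
-- Every insert into the parent table now pays for extra writes here.
CREATE TRIGGER sales_ai AFTER INSERT ON sales
BEGIN
    INSERT OR IGNORE INTO sales_by_region (region, total) VALUES (NEW.region, 0);
    UPDATE sales_by_region SET total = total + NEW.amount WHERE region = NEW.region;
END;
""")

conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("west", 5), ("east", 7)],
)

# Reading the precomputed result never touches the parent table.
precomputed = dict(conn.execute("SELECT region, total FROM sales_by_region"))
# The same answer, recomputed the slow way by scanning the parent table.
recomputed = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
```

Both dictionaries come out the same, but the first one was paid for at write time, in extra storage and extra work per insert, which is the processor and space expense described above. Updates and deletes would need their own triggers.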
Re:Hmmm Databases (Score:2)
agreed - as a human-readable way of persisting data it stinks. as a way of persisting data it stinks. But it isn't bad as an over-the-wire protocol.
> vertical partitioning...
Hmmm, don't see much vertical partitioning in data warehouses any more. Used to on oracle years ago, but can't even remember why. But I am finding that both mean range & hash partitioning work well together. The range is cheaper & easier to implement but only gives a performance benefit when
Re:Hmmm Databases (Score:4, Interesting)
protocol.
Yes, it is. I've worked on a project that allowed offline modification of a database by replicating a copy to user's PCs, and it originally used XML as the format for data transfer. We got a 30% speedup by switching to tab-separated variables with a line of metadata at the start of each chunk of the stream. Any technology that costs that much in overhead and provides little or no perceivable benefit is a waste of time. (Of course, if your data isn't relational, this is probably not much use to you, but then... what are you storing it in? XML documents?)
The only justification for XML is that there are a lot of tools out there that work with it -- I use it as an intermediate interchange format between different environments because the libraries available make it easy with just about anything I want to access the data with.
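A quick way to see the size overhead for yourself, sketched in Python with only the standard library. The column names and rows are made up, and this compares payload bytes only; it does not reproduce the 30% speedup from the project described above.

```python
import csv
import io
import xml.etree.ElementTree as ET

columns = ["id", "name", "qty"]
rows = [(i, f"item{i}", i * 3) for i in range(1000)]  # invented sample data

def to_xml(columns, rows) -> bytes:
    """One <row> element per record, one child element per column."""
    root = ET.Element("rows")
    for row in rows:
        rec = ET.SubElement(root, "row")
        for col, value in zip(columns, row):
            ET.SubElement(rec, col).text = str(value)
    return ET.tostring(root, encoding="utf-8")

def to_tsv(columns, rows) -> bytes:
    """A single line of metadata (the column names), then tab-separated rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")

def from_tsv(payload: bytes):
    """Parse the stream back into (header, rows); all values come back as strings."""
    reader = csv.reader(io.StringIO(payload.decode("utf-8")), delimiter="\t")
    header = next(reader)
    return header, [tuple(r) for r in reader]

xml_bytes = to_xml(columns, rows)
tsv_bytes = to_tsv(columns, rows)
print(len(xml_bytes), len(tsv_bytes))
```

The XML version repeats every column name twice per record, which is where most of the extra bytes come from when the data is flat and relational.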
Unfortunately... (Score:2)
The big end of the database world has always seemed strange to me. Your post provides some interesting views on that area.
time to retire, perhaps (Score:2)
On the other hand, those databases have already been pushed far beyond their limits: people have been using them inappropriately in many applications. Much of the "hot" recent stuff Gray mentions is not new technology: smart people have been proposing it and using it for years, only to be beaten down in the market by the relentless push behind relational te
Google is a good example... (Score:3, Insightful)
I give Gray a lot of respect in most cases because he's a really smart guy. But the math and computationally-intensive parts should be focused in the probabilistic searches.
In one sense, though, Gray is quite right. And this is the direction of speech recognition. I might add that the Speech Server beta out by Microsoft is quite good...even at this stage.
The Future Of Databases? (Score:2, Interesting)
The future of databases is... no Database at all!! (Score:3, Interesting)
Crazy idea, huh? What if I said that this can be as fast as 8000 times faster than Oracle? And 3000 times faster than MySQL!
Crash recovery? No big deal, keep a serialized version of your in-memory-objects, and a transaction log and you're set!
Read more at:
http://www.prevayler.org/ [prevayler.org]
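For the curious, the general idea (not Prevayler's actual Java API) can be sketched in a few lines of Python. The class name `PrevalentStore` and the JSON command format are made up for illustration, and the periodic serialized snapshot is left out: this toy recovers by replaying the whole log.

```python
import json
import os

class PrevalentStore:
    """Toy key/value store: state lives in memory, durability comes from a log."""

    def __init__(self, journal_path: str):
        self.journal_path = journal_path
        self.data = {}
        # Crash recovery: rebuild the in-memory state by replaying every logged command.
        if os.path.exists(journal_path):
            with open(journal_path, "r", encoding="utf-8") as f:
                for line in f:
                    self._apply(json.loads(line))
        self._journal = open(journal_path, "a", encoding="utf-8")

    def _apply(self, command: dict) -> None:
        if command["op"] == "set":
            self.data[command["key"]] = command["value"]
        elif command["op"] == "delete":
            self.data.pop(command["key"], None)

    def execute(self, command: dict) -> None:
        # Log first, then change memory, so an acknowledged change survives a crash.
        self._journal.write(json.dumps(command) + "\n")
        self._journal.flush()
        os.fsync(self._journal.fileno())
        self._apply(command)

    def close(self) -> None:
        self._journal.close()
```

Reads are plain dictionary lookups on `store.data`, which is where the speed claims come from. Everything a real database gives you on top of that (queries, concurrency control, data bigger than RAM) is left as an exercise.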
Re:The future of databases is... no Database at al (Score:3, Informative)
> instances of your data in-memory for near instant access?
because a *well-tuned* relational database with a 1:4 ratio of memory to disk is almost as fast as an in-memory database - due to efficient caching
because some queries require an enormous amount of temp space. supporting them can easily double your space requirements - which have to be purchased in memory.
because if you just want to run your datab
Re:The future of databases is... no Database at al (Score:5, Insightful)
http://www.prevayler.org/ [prevayler.org]
Oh my dear god. You've never actually used Prevayler have you? Prevayler isn't nearly as useful on actual data problems as Prevayler's worshippers would have you believe.
I know this because I tried to use it. If you'd ever tried to use it, you'd know how unbelievably poorly it performed when attempting to implement real world queries. You have to implement every query in Java, and Java is a particularly poor implementation choice for creating complex queries.
What if I said that this can be as fast as 8000 times faster than Oracle?
This "performance comparison" that the Prevayler group trots out is particularly funny as their test uses a single ArrayList of objects as in-memory "storage" and then "queries" it by index. Not exactly a realistic problem. Try a query across four classes with a few million instances of each class and you'll quickly discover what relational databases are good for.
Regards,
Ross
Bullshit artist... (Score:2, Informative)
Database as file system (Score:2, Interesting)
The use of a database as a file system will require radical new technological advances in database theory as the current methods break down under the new requirements. The functionality of the file sys
Re:Database as file system (Score:2)
Like, one assumes, one would pray for a sick friend?
What is now considered the traditional file system API is not well designed for databases, but there have been other ones that might be better used in the past: an API that does for databases what the UNIX API (after all, virtually all file system APIs these days are based on it) did for files is needed.
What ever happened to OODBs? (Score:2, Interesting)
regex (Score:2)
Regex! As processors get faster, memory gets cheaper.... I wouldn't be surprised to see more better, faster, etc. implementations of regex that allow doing what full blown databases do today. Of course that's in a read/only context, but I've implemented full blown "database" applications centered around the regex. And some will point out regex doesn't deal with integrity and data management issues, I would point out many databases are implemented in overkill mode where data integrity and management are h
One (At Least) Problem I Have With The Article (Score:2)
This notion of "active databases" seems to me to be interesting but fraught with problems.
Not least of which is the old bugaboo - documentation. How do you document a system composed of myriad triggers scattered on myriad tables in myriad databases communicating over the Net?
All I know from trying to decipher ONE Oracle Forms application at City College of San Francisco is that it is nearly impossible to get a handle on what happens where when. There appears to have been NO effort made by Oracle to enab
To answer the question... (Score:3, Funny)
Yeah, all the time.
You mean in the last 25 years? Nowhere. (Score:4, Insightful)
On top of that - and this is the worst part - what we call databases today is nothing much more than a historically grown apocalyptic chaos. With one of the crappiest programming languages ever as a cornerstone of its technology. A weedy mumbo jumbo of wanna-be virtual machines, wanna-be server daemons, makeshift security layers, obtrusive user management and pseudo operating systems and a bazillion proprietary variants of said programming language. With features bolted on left, right and center. This basically is the case with any current DB in widespread use, be it MySQL, Oracle or anything in between.
And if you look at the core of database technology and how long it has been that way, there isn't much hope that DBs will go anywhere anytime soon.
Then again, if you want to get a glimpse of a possibly brighter future, I'd actually recommend Zope [zope.org]. I consider its object relational DB a working proof of avant-garde "database" concepts and a prototype of what DBs generally could look like in the future if anyone were interested.
Most of our clients... (Score:3, Insightful)
I suspect that may translate as "most of our clients want to be given easy answers to difficult questions".
I'm sure there'd be a big market for a database system that stored flight bookings and could answer the question "which of our customers is a terrorist?". You don't address that market with new technology, though, but by developing new sources of snake oil.
Re:Umm, Yep! (Score:5, Interesting)
Bioinformaticists (and spies) use this a lot (Score:5, Informative)
most of our clients are now asking questions that require approximate or probabilistic answers
Bioinformatics databases are a good example of this. DNA and protein sequence databases are often searched by approximate string-matching algorithms based on "dynamic programming" to hidden Markov models [ox.ac.uk] and other stochastic grammars [wustl.edu].
Historically, drug target-hunters in Big Pharma created a market for accelerated hardware [timelogic.com] to facilitate dynamic programming searches, some of which (e.g. Paracel's Fast Data Finder chip) was originally marketed to government agencies [geocities.com] who, um, shared an interest in approximate string-matching ;)
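For readers who haven't met the dynamic-programming approach, here is a bare-bones Python sketch: it finds the substring of a text closest to a pattern under unit-cost edit distance. Real sequence search uses biological scoring matrices and gap penalties (or the HMMs linked above), so treat this as the simplest member of the family; the name `best_match` is made up.

```python
def best_match(pattern: str, text: str):
    """Return (edit_distance, end_index) for the substring of `text` closest to `pattern`.

    `end_index` is the position in `text` just past the first best-scoring substring.
    """
    m = len(pattern)
    # prev[i]: cheapest way to align pattern[:i] with a substring ending at the previous text position.
    prev = list(range(m + 1))
    best = (prev[m], 0)  # aligning the whole pattern against nothing costs m
    for j, ch in enumerate(text, start=1):
        cur = [0]  # a match may start at any text position for free
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur.append(min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # text character not in the pattern
                cur[i - 1] + 1,      # pattern character missing from the text
            ))
        if cur[m] < best[0]:
            best = (cur[m], j)
        prev = cur
    return best

print(best_match("GATTACA", "CCGATTACACC"))  # (0, 9): exact hit ending at text[9]
```

The work is proportional to pattern length times text length for every query, which goes some way to explaining why people bought dedicated hardware for it.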
Re:Generalizing too much? (Score:2)
Taken out of context that's a pretty funny statement. It makes us sound like pathetic, capitalist pigs. To wonder, the first thing that comes to our minds is our credit rating. I hope it's only a troll.
Re:Generalizing too much? (Score:2)
Your credit rating is a probability rating of your ability to pay your future debts. I mean, I know that you're kidding, but I think both of your examples show an actual probability. It would be thought that there is an X% chance that you might be Al Qaeda, so therefore you won't be granted the ability to purchase a ticket on this flight.
Sure... (Score:5, Funny)
Could someone summarize it without using the letter 'e'?
Sure.
Th Futur of Databass
Postd by timothy on Monday May 02, @08:12PM
from th your-flight-status-is-'mayb' dpt.
gManZboy writs "vr wondr whr databas tchnology is going? This is somthing that Turing award winnr Jim Gray from Microsoft has givn a lot of thought to. H rcntly publishd an articl in which h looks at th many forcs pushing databas tchnologis forward, and what thos nw tchnologis will look lik. Gray writs, 'th gratst of ths [rsarch challngs] will hav to do with th unification of approximat and xact rasoning. Most of us com from th xact-rasoning world -- but most of our clints ar now asking qustions that rquir approximat or probabilistic answrs.'"
Hmmm, I kind of like 'databass'.
Re:The clowns down the hall (Score:2)
Re:The clowns down the hall (Score:2)
Ah, not so fast, cowboy...
Just last week my boss at City College of San Francisco was "fixing" something in the production database and managed to delete a few thousand records he shouldn't have. He got it back from elsewhere okay, but it shouldn't have happened.
There's a reason developers get development and test databases and DBAs don't allow them to touch production databases.
In defense of my boss, we don't have a snapshot of the production database every night - which you need if you expect your sys
A difference between "DBA" and "clown" (Score:5, Interesting)
The problem is that in a lot of corporations (e.g., the one I work for), they -- and all other admins -- have been taken and put in a different building. And more importantly they don't actually have to cooperate with any team.
Their job's goal is no longer the same as the developers': to get a program done by a deadline. They've been turned into a bureaucracy whose only job is to see that the servers run. No more.
That's an _awful_ job description, because it directly makes the developers their enemy. I'm not even talking "slippery slope", but direct cause-effect. Instead of being "the other half of the team that will make this program work", developers just become "those assholes who crash our servers."
It's not hard to get from that point of view to pathological cases like the admin who limited our production servers to 3 connections per server. He kept his own servers running perfectly (which is his job description) at the expense of making the company's production programs grind to a halt (hey, it's not in his job description to care about those.)
That's the problem with that kind of internal organization. As one BOFH-wannabe once said "The source of the problems on my network are the users. Would you prefer that I cut your access? Then there wouldn't be any problems any more." Another one threw a hissy fit that we dared ask that he do his job, during work hours. Yeah, how dare we bother him by asking if he could please reboot the test server he's managing.
That's the underlying problem. Instead of providing a service _to_ the users, a whole caste has been created whose job is to serve the computer, and the users are just those pesky assholes disturbing his majesty the computer. That's a very unproductive situation to create.
Worse yet, a bunch of companies invented the devastating practice of internal invoices. The admins in one department won't even go to the toilet unless they can send a bill to another department for it.
They won't even talk to each other (e.g., the WebSphere admin telling the DBA and the Unix admin that he needs a Solaris patch and a newer version of Oracle for the "transactionBranchesLooselyCoupled" setting.) No, you have to personally talk to all three of them, because otherwise they can't send three bills for it.
And predictably, they'll do _nothing_ more than the bare minimum that was requested and billed. E.g., you have to tell the DBA explicitly to set this and that, to this and that value, because she won't do that on her own. Which basically means you already need to have all the knowledge of a DBA, and she is just acting as a proxy over the phone... and sending you a bill for it.
Basically if you're not that kind of a DBA, you have my respect. All I'm saying is that when you read about "teams of clowns" or about people who'd rather invent their own storage than deal with a DBA... well, they're not necessarily avoiding _your_ kind, but the kind of clown I've described above.
Re:A difference between "DBA" and "clown" (Score:4, Informative)
My sympathy, however, does indeed go out to the poor devs who get stuck with some tool that doesn't really understand, or even want to understand, his position as an admin. Too many people slipped into the field with dollars in their eyes in the 90s, and it's led to some truly spectacular screwups. Essentially, in my mind, almost every single failed ERP implementation could and should be blamed on insufficient database administration, and there are LOTS of flameouts there.
The upside
--chitlenz
Re:moving past relational model? I thinketh not (Score:2, Interesting)
I will admit I was around before relational databases. Back then there were good old hierarchical databases, and they did a damn good job of what a relational database does 50% of the time these days. The problem was the other 50% they couldn't do. So along came relational databases. Now to think that there is nothing beyond relatio
Re:moving past relational model? I thinketh not (Score:3, Insightful)
a couple of thoughts on that -
1. relational databases are really quite wonderful for analytical apps. Need to store two years of firewall/sales/whatever data - then churn away analysis? Great - no problem. And it's easy enough to do either through hand-written sql or via a tool. There's plenty that requires third-party tools (and data stores), but even in this scenario the staging area is almost
Re:moving past relational model? I thinketh not (Score:2)
On #7
Re:moving past relational model? I thinketh not (Score:2)
yep
> So in order to have effective heuristic or "fuzzy logic" queries, somebody will need to work
> out indexes and hashes for each fuzzy logic matching operator, or write an algorithm that
> figures out how to make those indexes and hashes. And that's ummm... rather more difficult.
> So until then, we have to catch as catch can with analyzers and ag
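For what it's worth, here's a rough sketch of what "an index for a fuzzy matching operator" could look like. It's my own illustration, not anything from the article or the parent post: it assumes character trigrams as the index key and Jaccard similarity as the fuzzy operator, and the names (TrigramIndex, trigrams, the 0.3 threshold) are made up for the example. The index narrows the search down to rows sharing at least one trigram with the query, and only those candidates get scored.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of character trigrams of text (lowercased, space-padded)."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class TrigramIndex:
    """Hypothetical inverted index: trigram -> ids of rows containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.rows = {}

    def add(self, row_id, text):
        self.rows[row_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(row_id)

    def search(self, query, threshold=0.3):
        """Return row ids whose Jaccard trigram similarity to query meets threshold."""
        query_grams = trigrams(query)
        # Use the index to collect candidates instead of scanning every row.
        candidates = set()
        for gram in query_grams:
            candidates |= self.postings.get(gram, set())
        scored = []
        for row_id in candidates:
            row_grams = trigrams(self.rows[row_id])
            score = len(query_grams & row_grams) / len(query_grams | row_grams)
            if score >= threshold:
                scored.append((score, row_id))
        # Best match first; ties broken by row id for a stable order.
        return [row_id for score, row_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


index = TrigramIndex()
index.add(1, "database")
index.add(2, "databass")
index.add(3, "filesystem")
print(index.search("databse"))
```

The hard part the parent is pointing at is that every different fuzzy operator (edit distance, phonetic match, numeric "close to") would need its own index structure along these lines; this only covers one of them.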
Re:moving past relational model? I thinketh not (Score:2)
Not being an expert on the current interpretation of the relational model, I wouldn't assume that the needs of analysis necessarily "break" that model.
My limited understanding is that the current crop of SQL languages and database implementations falls short (how much I'm not sure) of what the relational model is capable of. Perhaps that's where we need to start looking for improvements.
In any event, as I've said numerous times before, without some adequate simulation of conceptual processing, it's not lik
Re:moving past relational model? I thinketh not (Score:2)
If you think that the only kind of DBMS is the Relational DBMS, you must have flunked your Database Theory class, or gone to a piss-poor school.