MySQL Beats Commercial Databases in Labs Test 419
An anonymous reader writes "Many of the big players now offer free or 'light' versions of their databases, some would call them crippleware. Builder AU compared databases from Oracle, IBM, Microsoft and MySQL, and the open source offering came out on top."
Re:Correctness isn't negotiable (Score:1, Informative)
MySQL solves this by counting inserts and maintaining this in a separate segment in each BDB table. If you don't do a lot of DELETE or ROLLBACK:s this number should be accurate enough for the MySQL optimizer, but as MySQL only store the number on close, it may be wrong if MySQL dies unexpectedly. It should not be fatal even if this number is not 100 % correct.
Comment removed (Score:3, Informative)
Re:Correctness isn't negotiable (Score:3, Informative)
Pretty fast(I never benched it, but its never been an issue), very portable, works in all major languages, overall a very nice tool to have.
Re:Obligatory.... (Score:5, Informative)
Indeed. In fact, tests in the past showed postgres was a better choice over mysql. But don't take my word, compare it yourself with those in the article...
http://www.postgresql.org/ [postgresql.org]
It makes them look bad (Score:3, Informative)
BTW, from what I read of the article it doesn't look like they used the free version of Oracle. It listed a unlimited cpu lic. fee of $19K.
I was impressed with some of the features of MySQL. Since I looked at it years ago it looks like it has come a long way. However I now work at a big company with a site lic. for Oracle, its unlikely I'll use it professionally.
Re:Correctness isn't negotiable (Score:5, Informative)
If there's nothing going on in the database, then the 'summary' value that MySQL keeps is probably spot on accurate. But, if there's lots of simultaneous inserts and deletes, then it's really going to be very approximate. Until things are all flushed, the summary may include all the inserts and none of the deletes, or vice versa. If you wanted to make the summary information accurate, then you'd have to establish locks and the like around that summary value, and THAT will slow the database down. As it stands, inserts and deletes can be executed with ZERO regard to each other.
Postgresql has a similar problem, except instead of offering a summary value and informed that it's an estimate, whenever you do a count(*) it actually scans the entire table file looking for 'valid' rows. Ie, count(*) is not instantaneous. I think they were going to address this issue in a later release (or perhaps it's in 8.1 already), but it's NOT a simple thing. However, if you wanted instant answers in Postgresql NOW, you can do it by setting up a trigger on insert and delete, and maintaining your own summaries. This is a performance hit, of course... but you'd get the same, or a similar hit, if the database was maintaining for you.
What the 'big guys' do, I don't know. But... don't knock MySQL for doing something weird
Postgresql, for comparison, will give you an 'accurate' value, but it actually has to create rows: it can't rely on summary information.
Re:Correctness isn't negotiable (Score:5, Informative)
Re:Correctness isn't negotiable (Score:2, Informative)
It states that SHOW TABLE STATUS differs from SELECT COUNT(*) because SHOW TABLE STATUS guesses.
SHOW TABLE STATUS is just a misc admin command.
Re:Two things... (Score:5, Informative)
Comment removed (Score:5, Informative)
Re:Not surprising (Score:5, Informative)
Re:Correctness isn't negotiable (Score:4, Informative)
To be fair to MySQL, that's only for tables of type InnoDB. MyISAM and other storage engines do return an accurate count. From here [mysql.com]:
It should be noted, though, that you have to use InnoDB tables for all those "modern" database features like transactional support* and foreign key constraints.
It may be a bit of a bother, but it's not that hard to create the "counter table" for whatever it is you need to count. All the major DBs have something that's a pain in the ass...at least with MySQL you didn't have to pay for the pain.
*BDB and NDB Cluster are apparently transaction safe as well, but I have no experience with them; and for whatever reason, they don't seem to be popularly used.
OMG (Score:5, Informative)
It's things like that where you just ultimately conclude that the writer(s) of this article just does not know what the hell he's talking about and doesn't have a basic understanding of the concepts or products under review. It's just more OSS nonsensical propaganda in my opinion. And don't fool yourselves into thinking that an article like this is going to change any IT manager's mind about what DBMS he's going to deploy in his enterprise.
Crippled Versions (Score:5, Informative)
SHOOT!! you want to see MySQL get its bum kicked on performance? Run a test on a filesystem against MySQL.
Comparing performance among databases is only meaningful if all of the candidates have the features of which you need. MySQL has come a long way, and I use it in production every day, but this is kind of a silly comparison. The free versions of the big DB's are meant to provide an easy migration path to more feature-complete versions; if you use Sequel Server Express and want to upgrade to something that that supports clustering and log shipping, you may your money and get your features. With MySQL, if you outgrow it, you either need to start writing code, migrating to something else, or sitting on your hands waiting for it to get there.
Recap, for those who won't RTFC and want to slag me: I like MySQL. I use it for mission critical purposes in production environments. However, comparing a simpler product's performance to (crippled versions of) more robust products is silly.
Cheers
-AC
Re:SQL Server Express Is Mostly for Developers (Score:3, Informative)
Which is why you use it for personal sites, not for "real" applications, just as I said. The people who would use SQL Express are the same people who used Access databases for their sites... but this is far better than Access in almost every way.
By the time you find Express won't cut it anymore, it'd probably take quite a while to migrate from it.
Well, perhaps from a financial point of view, but that was poor planning on your part. From a technical point of view, SQL Server Express is a subset of SQL Server, and any application that runs on Express will run on SQL Server after running it though the upgrade wizard, which takes all of 5 seconds and 2 clicks of the mouse button.
An example for Express: no backup, no replication. How are you going to backup your site's DB, take the server down, detach the database and make a copy?
No, you copy/paste the
The writer doesn't know his head from his ass (Score:5, Informative)
This is utterly false.
MSDE is based off SQL Server 2000, which itself a revision to SQL Server 7. MS Access has NOTHING to do with SQL Server (excpet proving nice single DB front ends via ADPs). When your dishonest (or just stupid) so early in a article, you loose your reader.
Re:Maybe not in Australia (Score:5, Informative)
That's how it should be everywhere.
Lab test? (Score:5, Informative)
Re:like having my grandma rate sports cars (Score:3, Informative)
The partitioning implementation in MySQL 5.1 is still very new (pre-alpha quality) and is not production-ready at this time. Much the same is true of this chapter: Some of the features described herein are not yet actually implemented (partitioning maintenance and repartitioning commands), and others might not yet function exactly as described (for example, the DATA DIRECTORY and INDEX DIRECTORY options for partitions are adversely affected by Bug #13520). We have attempted to note these discrepancies in this chapter. Before filing bug reports, we encourage you to check the following..
From MySql 5.1 Manual
Re:like having my grandma rate sports cars (Score:3, Informative)
> compliant mode and the standard seem like slim pickings for a critic.
Yeah, they're covering their butts in the documentation much better than the old days in which they blatantly stated that transactions and pk/fk constraints were bad. Now, it's much harder to find the things that they are embarressed about - like their old licensing faq, or compatibility issues. The url you provided only shows those compatibility problems that they have worked out. It does not list the existing ones for which there is no good solution. And I'm not in the mood to go through their documentation to find it all.
> Also, partitioning has been in there forever
Well, there's this: http://dev.mysql.com/doc/refman/5.0/en/partitionin g.html [mysql.com]
Which says only:
"Chapter 15. MySQL Partitioning
This is the beginning of the Partitioning chapter. "
Maybe you're thinking of their separate clustering product? Which being limited to your available memory, isn't really the samething. Partitioning splits your data by either hash key or ranges into multiple physical tables. These tables are kept on either the same server (range partitioning via oracle, mdc via db2) or on multiple servers (hash partitioning via db2, think beowulf). In either case your application is unaware, the database handles the details. You just get speed. MySQL doesn't do this.
> as has memory tuning.
Not really. And keeping in mind that memory tuning is one of the most critical features of a database - one that allows you to eke out the maximum benefit from your expensive hardware...
The mysql documentation barely mentions memory tuning at all. It does give you a large set of pools, heaps, etc to tweak (about 28):
- http://dev.mysql.com/doc/refman/5.0/en/server-para meters.html [mysql.com]
But this looks more like a haphazzard collection that has grown over time, rather than anything methodical. I couldn't actually find any recommendations for how to tune these in the documentation. Maybe it was hidden somewhere, but searches on 'memory' didn't turn it up. Moreover:
1. it fails to give you the flexibility you need: it appears to still be impossible to set up multiple buffer pools for read caching that are shared between sets of like tables. This is typically the *first* thing you should plan & tune. It allows you to ensure that some large tables get 99.9% of their hits from cache rather than memory - without having to move them into the memory storage engine in mysql - which you may not have sufficient memory to do.
2. it inefficiently uses memory: buffers can't be dedicated to a functional purpose, they often have to be dedicated to a storage engine. This means that if you are using both innodb & myisam you will pay a penalty in your memory usage.
MySQL does offer a lot of knobs to twist - really as many as db2 or oracle. It just doesn't seem to document what they do or how to analyze their result. And they aren't well thought out or complete, they're bizarre. Maybe duct-table and bailing wire to catch problems like:
"Note that if you use any other option to ALTER TABLE than RENAME, MySQL always creates a
temporary table, even if the data wouldn't strictly need to be copied (such as when you change the
name of a column). For MyISAM tables, you can speed up the index re-creation operation (which is the
slowest part of the alteration process) by setting the myisam_sort_buffer_size
system variable to a high value."
Here you can see that mysql has some mysterious problem doing ALTER TABLE - in which i
Horses for Courses (Score:5, Informative)
If you want to distribute MySQL with your application to a customer, you have to pay a license fee. That means that for many people, MS SQL Express may be better.
If I wanted to do some complex database logic, I'd probably consider MS SQL Express, as stored procedures on MySQL haven't been out there for long.
If you are building a database to go on low-cost LAMP hosting, MySQL does the job well.
For a piece of shareware requiring a small database, something like SQLLite is probably better than these options.
Re:I like MySQL, but... (Score:5, Informative)
Re:Isn't this an EULA violation? (Score:3, Informative)
i guess mysql would have beaten the rest in simple queries and perhaps would have gotten it's ass kicked when it goes down to many subselects that depend on indexes to join up, mysql had some bug or not yet include optimization lack there some time ago, dunno where it is now.
but i was very surprised how "deep" the comparision between features was
stored procedures: mysql has them all right, but the pointers are not yet fully implemented in 5.0 they miss some features that were promised to come in 5.1
stored procedures in custom languages: mysql promised it in 5.1 (afaik postgres and oracle already support this).
jdbc and java support: the last time i saw mysql's jdbc driver, it literally sucked, the whole idea of resultsets was misinterpreted and the whole query results were fetched into the client machine at once which cause terrible overheads in statistic software that was ported to jdbc/mysql. perhaps they have fixed it now, but 2 years back this driver was literally unusable in such cases where you needed processing of huge amounts of data.
there are probably zillion other things that weren't quite closely examined in the review, i guess who ever uses what he has to with mysql can see some points that definitely need impovement compared to the other mature rdb-s.
my car can have a sledge attached to it, but this doesn't mean that it's a snowmobile and ready to conquer the north pole.
i use mysql every day here, i like it's simplicity. i hope it makes good progress in near future and will become even more compilant with other databases. but i don't like if people give software the "make-up" to make it look really ready for some applications and it turns out later that it wasn't really not so ready, sometimes not ready at all.
if i need lots of features for free, i still choose postgres. but mysql is gaining up and may become the one choice in some years.
ps. people should start to get worried about mysql gaining lots of features too, lots of more features always mean some slowdown in the whole application and sometimes this really isn't what we need.
Re:Except the license might make MySQL cost $$$... (Score:3, Informative)
Re:I like MySQL, but... (Score:3, Informative)
http://www.mysql.com/why-mysql/case-studies/ [mysql.com]
then there's this story
http://xooglers.blogspot.com/2005/12/lets-get-rea
hoping that "you get what you pay for" will get a swarm of professional bloodsuckers that will make sure to sell you the most expensive plan, even you could get the same quality for much less.
and, adiitionally, i don't hear much "you can trust big systems to ms products" nowadays
usually stories are quite opposite...
While everyone seems to agree (Score:5, Informative)
Have you checked the licenses on Oracle for instance. If I remember correctly, the commercial license prevents publications of benchmarks without approval from Oracle.
Having said that, if *I* were supreme overload of database comparisons, here's what I would do:
- Decide on a reference hardware platform in both 32 and 64 bit. I would also include a non-x86_64 hardware platform such as pSeries. Of course this will limit the SQL Server tests but that's Microsoft's own choice.
- Also decide on a common disk layout for the databases. Many commercial databases and even PostgreSQL will perform poorly out of the box on a flat disk layout. Seperate index, data and logs on unique volumes. If you decide to go RAID5 for any LUN, stick at least 6 disks under that LUN. RAID1 for log files. You also need to decide on which filesystem you want to use. This all of course determines which OS you use. I'm assuming Linux in this scenario. Most PostgreSQL recommendations I've seen recommend XFS on RAID10 but RHEL and SUSE don't include XFS support without going unsupported with the vendor in a kernel recompile.
- Bring in a skilled DBA for each product. It shouldn't be too hard to find someone who wants to get published in his respective product.
- Provide no OS tuning except the defaults recommended by the manufacturer of the database. OS tuning varies from vendor to vendor. Some suggest SHMMAX to be one setting while others suggest another number. You can't compare apples to apples when you've tuned I/O at 64k blocks for DB2 and 128K for Oracle (not that you would for either).
- Test all workloads. You may notice that some vendors provide a different product configuration for DSS, OLTP and OLAP. Some vendors even provide a different version of the product for a specific workload.
- Use the same DDL where possible. Really think about this for a moment. Alot of tests I've seen determine raw select, raw insert and raw update speeds but don't take into account the complex DDL that most business have. Take our layout for instance:
1) We have an OLTP system.
2) It also has a schema for OLAP that is populated by triggers from the OLTP tables.
3) We load our warehouse off of the denormalized tables and also provide the OLAP functions within our application from those tables. (Our warehouse is updated each morning but we have a requirement in the application for realtime data for the current business day)
Now with those above requirements, INSERT and UPDATE are going to perform much slower than what a raw benchmark would tell me and IMHO is much more indicative of real world design.
- Note which "levers" you have available to pull. With DB2, I can put specific tables on different LUNs via tablespaces. I can also assign tables and indexes to different bufferpools. Quite honestly, I can't do any of that with MySQL (well with InnoDB I can via some symlink madness). I can accomplish the tablespaces option with PostgreSQL but not the unique bufferpools for certain tablespaces or indexspaces.
- Also note what maintenance is required to actually keep the database performing. REORGs in DB2. VACs in PGSQL. I can update and insert 10mil rows to DB2/MYSQL/PGSQL but what happens when I need to go back and select out those rows? This leads to the next test:
- Test the optimizer! This is probably the biggest thing for me. How does the optimizer determine which access path to take? What factors influence that? I would not intentionally write shitty SQL but developers aren't DBAs. They don't normally concern themselves with the BEST path or even the quickest path to the data as long as they get the data they need. Don't talk to me about OR functions or LEFT OUTER JOINs that I've seen spit out by ORM products or worse yet SELECT * and doing the logic in the application. Run EXPLAIN plans on all queries you're testing. In the end, the optimizer is the biggest factor in the database per
Re:I like MySQL, but... (Score:2, Informative)
What version of postgres are you running? We run 8.0.3 in production and do 1000 transactions a second with ease.
Re:I like MySQL, but... (Score:3, Informative)
Re:Isn't this an EULA violation? (Score:5, Informative)
And I'm an avowed MySQL hater. I think it's a shitty hack of a database and there is no problem domain where there isn't at least one other product is a better solution. But this "no support" line is almost as wrong as it is stupid.
By the way, the kind of support you're claiming you need (24/7 on call support with access to someone who can provide a patch for a new problem) will be enormously expensive, where it's even available - there's no support plan Microsoft provides that makes those sort of promises, for example. You might get a patch in a half hour if it's a known problem and they have one ready for it, but you'll probably spend more than that half hour just convincing the rep that you have that specific problem. IBM or some of the other more service oriented companies may provide that kind of support. It's going to be expensive though - expensive enough that if you're worried about how much it's going to cost, it's probaby more than your company is worth.