Forgot your password?
typodupeerror
Data Storage Databases

Enthusiasts Convene To Say No To SQL, Hash Out New DB Breed 423

Posted by timothy
from the sql-like-a-pig dept.
ericatcw writes "The inaugural NoSQL meet-up in San Francisco during last month's Yahoo! Apache Hadoop Summit had a whiff of revolution about it, like a latter-day techie version of the American Patriots planning the Boston Tea Party. Like the Patriots, who rebelled against Britain's heavy taxes, NoSQLers came to share how they had overthrown the tyranny of burdensome, expensive relational databases in favor of more efficient and cheaper ways of managing data, reports Computerworld."
This discussion has been archived. No new comments can be posted.

Enthusiasts Convene To Say No To SQL, Hash Out New DB Breed

Comments Filter:
  • by Anonymous Coward on Thursday July 02, 2009 @06:38PM (#28565539)

    select * from customers c, bids b where c.customer_id=b.fk_customer_id order by c.customer_id, b.bid_date

    Seems pretty simple. What's wrong with an inner join? Your getting exactly the number of rows that you need to answer your question, no more no less.

    A cartesian product would be more like: select * from customers c, bids b. But that's not what you want.

    As for hierarchical structures, Oracle db has ways to do this, although I admit the syntax isn't that straight forward: http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/queries003.htm

  • Re:Flat Earth (Score:4, Informative)

    by Vellmont (569020) on Thursday July 02, 2009 @06:54PM (#28565713)

    And so you're saying this is all the fault of the relational database, and would all be solved by using some sort of object based database? That's the topic at hand here, not developers dealing with legacy systems patched together.

  • by Strudelkugel (594414) on Thursday July 02, 2009 @07:09PM (#28565861)
    OLAP [wikipedia.org] was designed to answer that type of question. MDX [wikipedia.org] is the language used to perform multi-dimensional queries.
  • Re:Flat Earth (Score:2, Informative)

    by Threni (635302) on Thursday July 02, 2009 @07:25PM (#28566045)

    > Second, SQL hasn't caused people to stop using spreadsheets or Access databases

    If if weren't for SQL there wouldn't be any Access databases...

  • by julesh (229690) on Thursday July 02, 2009 @08:11PM (#28566523)

    SQL is great for financial data.

    Actually, this isn't true either. See this article [dbmsmag.com] for pointers to some of the failings of SQL in dealing with financial data, particularly time series (e.g. sales figures, share prices, etc.). Here's another take on the problem [nyu.edu], which essentially is that SQL doesn't recognise that there can be relationships between the rows of a table (e.g., "this happened after this").

  • by GryMor (88799) on Thursday July 02, 2009 @08:19PM (#28566597)

    Yah, good luck with that. Unless the index already exists to do nearly exactly what you want, queries against multi terabyte production oracle tables have this bad tenancy of never completing, if you are lucky, and (effectively, from the perspective of the app running on top of it) taking down the database if you are not so lucky.

    If the index doesn't exist, good luck adding in less than a week or two.

    For the most part, novel trend and behavior information is trivial to instrument in the service layer as a side effect of the apps normal operation, at which point you record it in your query logs and build reports based on the logs.

  • by diamondmagic (877411) on Thursday July 02, 2009 @08:58PM (#28566911) Homepage

    Design an efficient table relating a tree structure.

    Huh? Tree structures are best handled by relational databases, as it is far faster then recursion. Give row a unique ID and a parent ID, and in addition, a left hand and right hand number, the root node having a left-hand value of 1 and a right hand value of (number rows * 2), the first child node has a left-hand value of one more than the parent's, the right-hand value is one less then the left-hand of a younger sibling.

    Then design queries to answer questions such as:
    * Find the nodes in the subtree under B.

    SELECT * FROM rows WHERE left > [left hand value of B] AND right < [right hand value of B]

    * Find all ancesters of G

    SELECT * FROM rows WHERE left < [left hand value of G] AND right > [right hand value of G]

    * Find the nearest common ancestor of D and H

    SELECT * FROM rows WHERE left < [lowest left hand value from D,H] AND right > [highest right hand value from D,H] ORDER BY right LIMIT 1

    Trees is a wellknown problem of SQL, but the fact is that SQL can't handle most datastructures and complex relations, only very simple one dimensional ones.

    Are you saying trees are easy or hard? And for more complex systems, that is what JOINs are for. SQL is by far the most powerful way and often the fastest way to manipulate data that I know of. The only time I can recall that I had to use a non-SQL solution that was faster then the SQL solution was a matrix operation.

  • by g2devi (898503) on Thursday July 02, 2009 @09:08PM (#28566995)

    Sure. There are several.

    If you do clinical work, you're fairly familiar with EAV databases:
    http://en.wikipedia.org/wiki/Entity-attribute-value_model [wikipedia.org]

    and The Associative Model of Data:
    http://www.lazysoft.com/docs/other_docs/AMD.pdf [lazysoft.com]

    These data models are best when either your schema is inherently hazy (e.g. in case of patient information) of where the schema is so big that it's impossible to manage (e.g. enterprise data warehousing).

  • by allenthelee (159279) on Thursday July 02, 2009 @09:13PM (#28567037)

    That's true, but you should mention that this data representation comes at a tradeoff for update efficiency - insertion of new nodes force you to update the entire subtree's left and right hand values.

    I wouldn't call SQL the natural way to model a tree data structure. It's possible but it comes with a price.

  • by raddan (519638) * on Thursday July 02, 2009 @09:25PM (#28567135)
    A basic premise of the relational model is that there is no relationship between rows. So it isn't surprising that SQL can't see any. Maybe you need to organize that data differently? You can solve a lot of problems in SQL using triggers, temporary tables, and the built-in aggregate and sorting functions.
  • by lawpoop (604919) on Thursday July 02, 2009 @10:43PM (#28567643) Homepage Journal
    For anyone wondering, parent is talking about a preorder tree traversal algorithm:
    Link 1 [mysql.com]
    Link 2 [sitepoint.com]

    And parent it right. I was doing an adjacency list in MySQL for a while, because I thought that preorder trees were just a little too complicated, but they are *way* easier and more intuitive.
  • Re:Quit Whining (Score:2, Informative)

    by Anonymous Coward on Thursday July 02, 2009 @11:50PM (#28568101)

    The horrible lag I get when using address completion in Firefox 3 makes me wish less people thought that way!

    fixed that for ya. your address completion would probably not lag if there was an efficient data structure behind it, like a btree, like a rdbms would use

  • by Kjella (173770) on Friday July 03, 2009 @01:39AM (#28568637) Homepage

    1) 1996 called, they want their arguments back. For example, most RDBMS have ranking functions now.
    2) Even in 1996, he doesn't know SQL worth shit

    SELECT (prev.sales+now.sales+next.sales)/3 three_day_average
    FROM sales prev,
                  sales now,
                  sales next
    WHERE prev.day_number = now.day_number-1
    AND next.day_number = now.day_number+1

    Easy as pie making most of the calculations he wants. Maybe he should ask someone knowledgable in SQL?

  • by Rich0 (548339) on Friday July 03, 2009 @07:03AM (#28570025) Homepage

    I hate to nitpick, but:

    1. Your database needs normalization. Almost all of that data should have been in one table, using fields to indicate what kind of statements they were.

    2. Your use of random() isn't going to work - unless the random number generator just happens to generate a number that just happens to be the ID of one of your records. The typical way to do what you're doing is to order by random() and limit the query to one record.

  • by lawpoop (604919) on Friday July 03, 2009 @10:37AM (#28571561) Homepage Journal

    That's true, but you should mention that this data representation comes at a tradeoff for update efficiency - insertion of new nodes force you to update the entire subtree's left and right hand values.

    This is not entirely correct. If you want to insert a new node, you need only update the values of the nodes to the right of the insertion point. This covers more and more nodes the closer you get to the left side of the tree. If you wanted to add a new node onto the right of the tree, you need update only one value, the rgt value of the root node.

  • Re:Quit Whining (Score:3, Informative)

    by Chelloveck (14643) on Friday July 03, 2009 @12:03PM (#28572457) Homepage

    The horrible lag I get when using address completion in Firefox 3 makes me wish fewer people thought that way!

    fixed that for ya.

    Fixed that [quickanddirtytips.com] for ya!

Our informal mission is to improve the love life of operators worldwide. -- Peter Behrendt, president of Exabyte

Working...