SQLite or PostgreSQL? It's Complicated! (twilio.com) 101

Miguel Grinberg, a Principal Software Engineer for Technical Content at Twilio, writes in a blog post: We take blogging very seriously at Twilio. To help us understand what content works well and what doesn't on our blog, we have a dashboard that combines the metadata that we maintain for each article such as author, team, product, publication date, etc., with traffic information from Google Analytics. Users can interactively request charts and tables while filtering and grouping the data in many different ways. I chose SQLite for the database that supports this dashboard, which in early 2021 when I built this system, seemed like a perfect choice for what I thought would be a small, niche application that my teammates and I can use to improve our blogging. But almost a year and a half later, this application tracks daily traffic for close to 8000 articles across the Twilio and SendGrid blogs, with about 6.5 million individual daily traffic records, and with a user base that grew to over 200 employees.

At some point I realized that some queries were taking a few seconds to produce results, so I started to wonder if a more robust database such as PostgreSQL would provide better performance. Having publicly professed my dislike of performance benchmarks, I resisted the urge to look up any comparisons online, and instead embarked on a series of experiments to accurately measure the performance of these two databases for the specific use cases of this application. What follows is a detailed account of my effort, the results of my testing (including a surprising twist!), and my analysis and final decision, which ended up being more involved than I expected. [...] If you are going to take one thing away from this article, I hope it is that the only benchmarks that are valuable are those that run on your own platform, with your own stack, with your own data, and with your own software. And even then, you may need to add custom optimizations to get the best performance.

  • by Antique Geekmeister ( 740220 ) on Monday July 04, 2022 @05:15PM (#62673354)

    Was the analysis itself worth its cost, given the delays before starting the work and that the tested configurations were out of date 20 minutes after the tests completed? Or would you have been vastly further along simply rolling a die at the start to pick a technology and spending your time optimizing for the designated technology?

    I've seen this occur far, far too often with designs that attempt to solve problems before they occur.

    • by Anonymous Coward

      No, what you need is a database that uses its incredible powers of self-optimization to suddenly screw over the plans for queries that have been running fine on virtually the same data every effing day for a couple of years. Unless of course you put in place all kinds of defensive countermeasures. Some expensive database that starts with an 'O'.

      • Interesting comment, given that Oracle RDBMS actually allows you to pin the query plan for a query so it doesn't change ("SQL plan management"). Of course, that means you don't get any improvements in optimiser technology - but people who statically link their code to avoid improving anything have always been with us.

        The other database from the big red O company, MySQL, does recompute the query plan every time it runs the query. You have a tiny chance of regression if there's some bug in a new release, and

        • by Lothsahn ( 221388 ) <Lothsahn@@@SPAM_ ... tardsgooglmailcm> on Monday July 04, 2022 @10:49PM (#62673890)
          As someone who's worked on both the database from the big red O company and MSSQL, I can assure you that it doesn't require a new release to cause a query plan regression. A simple change in the data going over some threshold, causing the heuristics to differ, can suddenly cause a massive performance issue in a production database that has been running successfully for years.

          Especially for OLTP workloads, the optimal query plan is often fairly easy to compute and there often isn't "a better one in the future". Having the plan unexpectedly change isn't a feature, it's a bug. Plan pinning and SQL plan management are actually great features; unfortunately, they're nearly impossible without the enterprise version, which is literally $$millions$$.

          When you're running a web application serving millions of users, consistent performance is key. A 10% performance gain is great, but avoiding hangs, deadlocks, system overload, or 1000x performance regressions matters more. Unfortunately, relational databases have tons of ways to encounter the latter. For instance, in Oracle you absolutely can end up with a deadlock just from inserting rows too quickly--never reading or altering them. Theoretically this shouldn't be able to cause a deadlock, but on Oracle it can (note: it also can on MSSQL, but for totally different reasons, depending on isolation level). Hint: it's caused by the ITL (interested transaction list).

          Having random deadlocks, hangs, performance issues, spins, or system overloads caused by the database misbehaving only under certain workloads is one of the most frustrating things to deal with in OLTP at scale. If you can't tell, I hate dealing with databases for their unpredictability (and am a huge fan of plan management and other designs to provide better predictability).
      • In my experience, Oracle's RDBMS gets a lot of bad press, but I've never seen any complaints about it for poor self-optimization. Stuff like reorganizing the database is one thing, but in my experience, it is remarkably good at handling really crappy queries (devs that do "select * from whatever" and pipe it into grep as part of a shell script, as opposed to having a WHERE clause). Oracle's RDBMS seems to do a good job with crappy design, especially developers who don't understand the concept of normalizing their

        • In my experience, Oracle's RDBMS gets a lot of bad press, but I've never seen any complaints about it for poor self-optimization.

          Oracle's RDBMS seems to do a good job with crappy design, especially developers who don't understand the concept of normalizing their DB schema.

          Try creating a table whose primary key is not the clustered index. Oracle allocates an entire page for each and every row unless you manually specify otherwise.

          Between incorrect null handling; the lack of basic features such as upserts, procedures returning result sets, local temp tables, decent syntax, and autonumber fields that don't require worthless triggers; and my favorite, "mutation" errors that reflect absolutely nothing except arbitrary limitations of the RDBMS. It deserves all of the bad press wit

      • by DontBeAMoran ( 4843879 ) on Tuesday July 05, 2022 @07:27AM (#62674410)

        Ah yes, the databases services from Opple, Oogle, Omazon and Oicrosoft.

    • What surprises me is that the title wasn't postgres vs mariadb vs sqlite!

      mariadb is a fork of mysql made by the creator of mysql after he sold the rights to mysql to Oracle, while keeping the right to fork it if he wished. Maria is the name of his baby girl.

      He's added a bunch of functionality to mariadb since then. I have switched most of my mysql instances to mariadb transparently.

      Of course, I use postgres as well, although I have less expertise in it, for example with replication.

      I am also aware that postgr

      • By msql, I meant MS SQL Server.

        msql used to mean mini-SQL, a very efficient database engine that was unfortunately single-threaded query-wise!

        A lot of fun discovering that fact on the busiest site in my country in 1998,

        Another funny thing: in 2000, I discovered that MS-SQL Server was also single-threaded with regard to logins, so it could easily be DoSed by telnetting into the port and just sleeping there until timeout.

        MS was grateful that I reported it. They flew me to Redmond, had me visit the campus

        • Cool story. Must have been a pretty nice experience.

  • by nicolaiplum ( 169077 ) on Monday July 04, 2022 @05:20PM (#62673362)

    Grinberg has, as many before him, produced a system perfectly tuned to run his application today.
    Assuming his application has any ongoing development, it will not be the same application tomorrow, or the day after. A few months or even years in the future, it will be a very different application - with different query patterns and different loads imposed on the database.

    It is usual that if you strongly optimise for something, you pessimise for something else - particularly in the world of relational databases. A simple example is whether you choose to index a column or not. Indexing all columns takes space and slows inserts, so you just pessimised for size and insert speed if you did that.
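The indexing trade-off described above is easy to demonstrate with Python's stdlib sqlite3 module. A minimal sketch (the table and column names are invented for illustration, loosely echoing the dashboard in the article):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traffic (article_id INTEGER, day TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO traffic VALUES (?, ?, ?)",
    [(i % 100, "2022-07-01", i) for i in range(10_000)],
)

query = "SELECT SUM(views) FROM traffic WHERE article_id = 42"

# Without an index, the WHERE clause forces a full table scan.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]

# The index turns the scan into a B-tree search, but every future
# INSERT now also pays to maintain the index, and the file grows.
conn.execute("CREATE INDEX idx_article ON traffic (article_id)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]

print(plan_before)  # e.g. "SCAN traffic"
print(plan_after)   # e.g. "SEARCH traffic USING INDEX idx_article (article_id=?)"
```

The exact plan strings vary between SQLite versions, but the scan-vs-search distinction is the size/insert-speed trade-off in a nutshell.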

    For true long-term success you must optimise your database so that it not only works well today, but is stable in the face of changes to the query mix and even the types of queries being submitted. It must not have very bad cases with slight changes in queries, or you could have your app performance face-plant with a minor code release tomorrow.

    Grinberg does not address this problem at all.

    (See the presentation by Dr Daniel Seybold at Percona Live 2022 for one way to do this. I have no connection with Dr Seybold.)

    • by edwdig ( 47888 ) on Monday July 04, 2022 @09:45PM (#62673828)

      They made a simple system using ORM tools and a database not really meant for use on a server. They realized they outgrew it and swapped the database for a more appropriate one. They thought it would be interesting to benchmark the performance of the two systems.

      They barely did any work here. The extent of the "optimization" they did was tweaking a config file to say PostgreSQL was allowed to use more memory per query, allowing their large queries to fit in memory.
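For context, the per-query memory setting described here is PostgreSQL's work_mem, which defaults to 4MB; sorts and hash joins larger than that spill to temporary disk files. A hedged sketch of the change (the exact value is whatever your queries need, not a recommendation):

```sql
-- In postgresql.conf, or per session:
SET work_mem = '16MB';

-- Verify the active value:
SHOW work_mem;
```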

      It sounds like they've got a fairly simple problem to solve, with a pretty straightforward solution in place. They swapped out the weak link in the toolset, which was pretty obviously a weak point, and got the expected result. It sounds like they're keeping things pretty simple and not over-engineering anything.

    • We take blogging very seriously at Twilio.

      Ugh. The template "we take X very seriously" can only ever be used in the sentence "we take security very seriously", where everyone knows it means "we didn't even know about security until we got pwned earlier on". Other than that, "we take X very seriously" should never be used in a sentence.

      • Blogs but not databases, apparently.

        I get that he's just learning what a DBMS is - the error is the neophyte offering advice to the public. Just a dumb choice for the audience here.

  • I'm suspicious of anyone who 'embarks' on anything except a wooden boa - why is everything a 'journey' these days? And only corporate puff pieces use so many exclamation points and other exciting words for normal computer thingies.
    • I meant to say 'boat'. not 'boa', as in constrictor. I just got back from a hashicorp conference so I'm a little burned out on artificial marketing excitement about normal computer thingies.
  • Memory (Score:4, Insightful)

    by stabiesoft ( 733417 ) on Monday July 04, 2022 @05:22PM (#62673376) Homepage
    I read through the article. I was somewhat surprised he did not benchmark doubling the memory on the cloud box to see how much improvement it gave. The test Mac box had 16G vs his cloud production server with 2G.
  • by nixer ( 692046 )
    This could have been so much more. Is this what a "Principal Engineer" does these days? I kept thinking that pre-computed tables would have solved the issue better (the results being more stable and consistent) and would have been faster to implement.
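The pre-computed-table idea the parent suggests can be sketched with stdlib sqlite3 (the schema is invented for illustration): a periodic job rolls the big per-day table up into a small summary table that the dashboard reads instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_traffic (article_id INTEGER, day TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO daily_traffic VALUES (?, ?, ?)",
    [(1, "2022-07-01", 100), (1, "2022-07-02", 150), (2, "2022-07-01", 30)],
)

# Roll the big per-day table up once (say, from a nightly cron job) so
# dashboard queries hit a tiny summary table instead of re-aggregating
# millions of rows on every request.
conn.execute(
    """CREATE TABLE article_totals AS
       SELECT article_id, SUM(views) AS total_views
       FROM daily_traffic
       GROUP BY article_id"""
)
totals = dict(conn.execute("SELECT article_id, total_views FROM article_totals"))
print(totals)  # {1: 250, 2: 30}
```

The result is also more stable, as the parent notes: every user sees the same numbers until the next refresh.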
    • Apparently so. I learned all this the "hard way" (actually, the fun way) 20 years ago building a website for a MUD. I was hoping for something more in depth than just changing backends, discovering the schema was messed up -- a mistake you'd hope a principal engineer would not be making -- and then making one of the most basic tweaks to your server configuration possible.

      On one hand, good for these guys who are making a lot of money. But on the other hand, I'd never even get an interview for that role despi

  • by devslash0 ( 4203435 ) on Monday July 04, 2022 @05:27PM (#62673396)

    SQLite is a simpleton, whereas PostgreSQL is a fully-featured database engine. I won't even try to list all the critical differences because there are so many. It kind of makes you wonder if this guy is in one of those "principal" roles which don't do any hands-on work.

    In my experience, performance issues are caused by poorly designed database schemas or just leaving everything to your ORM and hoping for the best. It's all as slow as your slowest link.

    • by seth_hartbecke ( 27500 ) on Monday July 04, 2022 @05:48PM (#62673464) Homepage

      I think the true performance benefit of the larger / older database engines isn't so much on the read side as the write side.

      A lot of new / simple database engines have very large locking domains. I.e.: when you insert a row, does the engine lock just the blocks the rows are going into? Or the whole table? Or the ENTIRE DATABASE (yes, early MongoDB, I'm looking at you)?

      And who is blocked by that lock? Do writers block writers or do they block readers too? (and can readers prevent a writer?)

      That's where Postgres is going to shine over simpler DB engines. Not so much in how fast they can read the database for 1-5 users, but in what happens when you have hundreds of inserts a second from multiple users.
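SQLite sits at the coarse end of that locking spectrum: its write lock covers the whole database file. A minimal stdlib sketch of two connections colliding:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the stdlib driver in autocommit mode so we
# control transactions explicitly.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE t (x INTEGER)")

# The first writer takes the write lock for the whole database file...
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO t VALUES (1)")

# ...so a second writer is locked out of the ENTIRE database, not just
# the rows or pages being written. timeout=0 means fail instead of wait.
other = sqlite3.connect(path, timeout=0, isolation_level=None)
try:
    other.execute("BEGIN IMMEDIATE")
    locked = False
except sqlite3.OperationalError:
    locked = True
writer.execute("COMMIT")
print(locked)  # True
```

A server engine like Postgres would let both transactions proceed, with row-level locking resolving any actual conflicts.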

    • performance issues are caused by poorly designed database schemas or just leaving everything to your ORM and hoping for the best.

      This is true, assuming you are starting with a mature, full-featured, well-engineered database engine. There are some deficiencies that you just can't overcome with good schemas or proper indexing.

      It is certainly true that if you need optimal performance, you're not going to get it from an ORM. Like any architecture decision, using an ORM is a decision focused on ease of programming, not robustness or performance.

    • by Antique Geekmeister ( 740220 ) on Monday July 04, 2022 @07:42PM (#62673654)

      He may have the kind of "principal" role where he blows through 2 years of salary and testing environments and sophisticated plans before producing anything of any use whatsoever. I've encountered a few of those in my career. They're often responsible for "big picture" projects doomed to failure.

    • Comment removed based on user account deletion
      • Which probably means the next step is falling into the NoSQL camp, and once a man is lost to NoSQL, then... *shudders* *draws deep drag of pipe* That way requires you keep your wits about you, or else ye be lost to the NoSQL demons forever

        Sounds like an idiot who does not know what "NoSQL" is.

        Hint: before SQL, everything was no SQL. And during the rise of SQL, everything else was NoSQL.

        Perhaps you want to get rid of your ignorance and google at least what a Graph-database is, or what an object oriented dat

        • Object-oriented databases are more relevant to today's data; if you're still using SQL-anything in 2022, your data is extremely specific.

        • Comment removed based on user account deletion
          • No idea what your rant is about.

            SQL is not hard.

            But the things modern NoSQL databases do are hard to do with SQL databases.

            I suggest reading a bit about it.

            AGAIN: before SQL, everything was no SQL. Nevertheless they were databases. And during the rise of SQL / relational DBs, plenty of non-SQL databases were kept in use till the late 1990s. And then "NoSQL" emerged as a new/alternative paradigm, e.g. Hadoop, Cassandra etc.

            Sorry: you have no clue what you are talking about. But, to your relief: there are book

        • Perhaps you want to get rid of your ignorance and google at least what a Graph-database is, or what an object oriented database is.

          They are things that tell anyone who wants to write a report to go fuck themselves.

    • Have you ever used SQLite? Because your statement suggests you haven't. It's quite a good engine for certain projects - Firefox springs to mind. Read the article. In some situations SQLite was offering better performance, until PostgreSQL was tuned a bit. That's a compelling reason to consider it for some projects.
      • Yep.

        They even tell you this: https://www.sqlite.org/whentou... [sqlite.org]

      • SQLite is a one-use, one-process, one-thread database.
        For idiots who can only speak SQL and are too stupid to write a program that writes to/reads from a set of files.

        SQLite has absolutely nothing to do with any real database, SQL driven or not. It is just a file which you can access with SQL syntax. And that is it.

          SQLite is a one-use, one-process, one-thread database.

          From what I remember, the SQLite people themselves described it as a replacement for fopen(). It's an amazing system for what it is. The API, syntax and features are quite good. I have a lot of respect for the developers.

    • This is also all I could think. Postgres and SQLite are completely different technologies done in completely different ways, which happen to share a generally similar interface (SQL queries).

      SQLite has been great for me to do quick DB tests or very compact deployment of microservices which need simple local persistent storage, but I'd never compare it against a full SQL server option like Postgres, MySQL, or MS SQL for performance or stability. It also sounds like the guy in the article was using wildly
    • I was thinking the same thing too. He didn't bat an eyelid about using a UUID in string format as a key. Really look at your schema if you want to make big performance improvements. Then look at how you run your queries.

      Funny: somebody where I work was having performance problems with an internal web site. He wanted a fast SSD to improve the speed of the database. Never mind he was storing JSON data in cells because of development convenience and then parsing them in Python in the web server f

  • by fahrbot-bot ( 874524 ) on Monday July 04, 2022 @05:30PM (#62673408)

    Having publicly professed my dislike of performance benchmarks, I resisted the urge to look up any comparisons online, and instead embarked on a series of experiments to accurately measure the performance of these two databases for the specific use cases of this application.

    So... because you dislike performance benchmarks, you decided to run your own performance benchmarks? Even if your "specific use cases" never change, some more general performance information may be useful to consider. They may at least give you insights into things you haven't thought of.

    For example, a *while* ago, as a sysadmin for several large HP-UX systems, I helped the DB admins increase performance of a large Oracle database by tweaking the LVM fail-over configuration so both paths were in use simultaneously -- half using "a" failing over to "b" and half using "b" failing over to "a" -- instead of the fail-over paths being idle.

    • by computer_tot ( 5285731 ) on Monday July 04, 2022 @09:35PM (#62673806)
      The author doesn't like other people's benchmarks because they run tests not relevant to his situation. Which is really quite reasonable, and I wish more people thought that way. It's so frustrating when people claim project A is better than project B for a task because they read a benchmark from five years ago that tested something completely different. Admins should do their own testing in their own environment; otherwise the benchmarks are pretty useless.
  • by darkain ( 749283 ) on Monday July 04, 2022 @05:32PM (#62673416) Homepage

    What's more important: performance or reliability? SQLite is limited to a single node and no networking capabilities. If it dies, it's dead, game over. Backup support is limited. PostgreSQL and other databases support complete backup solutions, replication, and better vertical and horizontal scalability. If your only metric on which DB to choose is JUST "performance", you're already doing it wrong.

    • Sure, but it was like a lot of applications where it started as a "Quick and Dirty" and now it's used to an extent that was never envisioned in the beginning.

    • Backup support is limited? "cp database.sql database-backup.sql" is pretty easy.
    • What's more important: performance or reliability? SQLite is limited to a single node and no networking capabilities. If it dies, its dead, game over. Backup support is limited.

      Huh?

      To backup an SQLite database you just copy a single file. You can do it once per hour on cron if you want to.

      Try that with one of the others.

      • by Bert64 ( 520050 )

        Works quite well if the file is small, causes problems if the file is large especially if any writes occur while you're copying the file.

        • Whatever. I find it's way easier to back up a SQLite database than any of the others.

      • Yeah, idiot.
        And if that file is in the middle of a transaction, the backup will be corrupt and unusable in another DB.
        (* facepalm *)
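Both sides of this sub-thread have a point: a plain file copy taken mid-transaction can indeed hand you a corrupt snapshot, but SQLite ships an Online Backup API for exactly this case, and Python's stdlib exposes it as Connection.backup() (3.7+). A minimal sketch:

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x INTEGER)")
src.execute("INSERT INTO t VALUES (42)")
src.commit()

# Connection.backup() drives SQLite's Online Backup API: it produces a
# consistent snapshot even if other connections are writing, unlike a
# plain file copy taken mid-transaction.
dst = sqlite3.connect(":memory:")
src.backup(dst)
print(dst.execute("SELECT x FROM t").fetchone())  # (42,)
```

The sqlite3 command-line shell's .backup command does the same thing, so "backup is just a file copy" only needs a small correction to be safe.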

    • by Bert64 ( 520050 )

      Well, that really depends on what the data is and how important it is.
      If the data is unimportant or temporary, then perhaps performance is more important than reliability. Perhaps it's read from somewhere, processed, and then exported somewhere else, such that the intermediate stage isn't really important - i.e. you can just run it again using the original source data. In instances like this I would want the database to be fast, and if it gets lost, e.g. to a power failure, I'd just restart it with a blank database a

  • by RandomDisposibleAcc ( 9476493 ) on Monday July 04, 2022 @05:34PM (#62673422)
    It's amazing what qualifies as newsworthy - there's no thought like defective thought, and I'm astonished that a reasonably established firm like Twilio thought an architectural misadventure was worth sharing. Most shops would have pulled out the duffle-bag to dispose of the body here.

    TLDR for people - They used an inappropriate technology initially (SQLite) and switched to a more traditional one (Postgres) when it became readily apparent that, once you exceed a certain volume of data or want to start using the system for ad-hoc analytics and dashboarding, the embedded-oriented solution starts to show strain. It took much analysis and review for them to arrive at the same conclusion anyone with half a brain would have had at the outset.

    In the process of this journey of self-aggrandising re-discovery, there are a few good sledges at "other benchmarks" to try to discourage anyone from using established benchmarks to make basic decisions - which is itself a slightly nuts position to take. The most important part of using any benchmark for a decision is understanding what's actually been tested.
    • by Ecuador ( 740021 )

      Exactly, this was like a complete beginner using databases for the first time. Oh, look postgres is faster than sqlite if I do this specific thing on my laptop. Oh, look now it is slower if I do it on this VM, oh, wait it is faster again if I actually touch the configuration. Weee!

      I mean, the fact that he compares postgres to sqlite is a huge WTF by itself. Even comparing Postgres to MariaDB/MySQL/Percona is much more than their performance. OMG, what if he finds out about those and writes another blog post

  • by DidgetMaster ( 2739009 ) on Monday July 04, 2022 @06:39PM (#62673566) Homepage
    Like many startups, they seem to have gone with the simplest solution that worked for the size of the data and the nature of their initial queries at the time; and then were surprised when the solution started to bog down once the size of the data grew beyond their estimates. This happens quite often in small businesses that still use spreadsheets to manage their data instead of a relational database. Once your data reaches a certain size or the scope of your queries expands significantly, the old solution may no longer work properly.

    When I was developing my own database engine, I found it really hard to find reliable benchmarks for other systems. Each benchmark can be affected by a hundred different factors like configuration settings, indexes, or query plans. Like this blogger, I had to run a bunch of different kinds of queries using a variety of data sets against other DBs like Postgres, MySQL, and SQL Server before I could determine that my own solution was faster in most instances. I tried to make the competition as fast as possible by creating indexes and tweaking settings. Even then, I had to wonder if I was missing some important setting in each of those other DBs that made my benchmark tests 'unfair'.
  • I'm curious when SQLite's use case tops out, and one needs to go for a true RDBMS like PostgreSQL. In a previous job, I used Zetetic's SQLCipher and SQLite's SEE (a licensed package) with good results, to ensure that stuff is encrypted. However, this application was designed to run for a single user, with encryption present for a data at rest layer.

    I'm guessing if you need to have more than just single user data, you move to PostgreSQL, but if it is just a single user app, perhaps single threaded, SQLite

  • I'm not going to knock someone who actually did their homework and measured something and reported the results dispassionately. Even if it was on a smallish and not-too-complex problem. Too many people just won't do that, will make wild assumptions, and will just fiddle with things until they sort of work.

    And the way you learn how to do large optimization problems is by practicing on smaller ones.

  • I thought PostgreSQL was multi-user and SQLite was single-user?
  • SQLite, WTF? (Score:5, Insightful)

    by dskoll ( 99328 ) on Monday July 04, 2022 @07:10PM (#62673604) Homepage

    Don't get me wrong; I love SQLite. It's great for these use cases:

    • A single-user application that wants to store some data and use SQL to retrieve it. For example, Firefox uses SQLite to store history, cookies, etc. It's great for that!
    • As a test database for unit and regression tests in a big project. Rather than spinning up a real database for all your tests, running them against SQLite can be pretty easy, and having SQL that works with both SQLite and your real RDBMS engine can help a lot with cross-database portability.
    • As an embedded database in a small system that has no networking.
    • But using it for a multi-user production web app? YIIIKES!!!!
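The unit-test use case from the list above, sketched with the stdlib (the function and table names are invented for illustration):

```python
import sqlite3
import unittest

def save_user(conn, name):
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

class SaveUserTest(unittest.TestCase):
    def setUp(self):
        # A throwaway in-memory database per test: nothing to install,
        # no server to start, and it vanishes when the connection closes.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (name TEXT)")

    def test_save(self):
        save_user(self.conn, "alice")
        row = self.conn.execute("SELECT name FROM users").fetchone()
        self.assertEqual(row, ("alice",))

suite = unittest.TestLoader().loadTestsFromTestCase(SaveUserTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The cross-database caveat from the list still applies: SQL that passes against SQLite may behave differently on your production engine, so these tests complement rather than replace integration tests.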

    • SQLite is actually really good in a multi-user web app situation. It's super light, portable, user friendly. Not saying it's always the best tool or scales well if you need multiple nodes. But there is a lot going in its favour.
  • by geekopus ( 130194 ) on Monday July 04, 2022 @08:37PM (#62673720)

    This is, without a doubt, one of the dumbest articles I've ever seen on Slashdot. Anyone, much less someone with the title "Principal Software Engineer", could have predicted that even a slightly tuned Postgres install would be faster than SQLite.

    I admit, I am waiting with bated breath for the amazed reaction when said engineer discovers what indexes and multiple cores can do.

    Ridiculous. Been reading Slashdot since the late 90s and man, this one takes the cake.

    • This is, without a doubt, one of the dumbest articles I've ever seen on Slashdot. Anyone, much less someone with the title "Principal Software Engineer", could have predicted that even a slightly tuned Postgres install would be faster than SQLite.

      Yep, and when the "fix" is to change the allowed memory usage from 4MB to 16MB on a machine with gigabytes of RAM, then... (insert Picard facepalm)

      (which raises the question of why it's only 4MB by default in 2022)

      • Because it is called: SQLite.
        And "lite" is jargon for light.

        If you are using SQLite you are not supposed to do anything data-heavy. It is a poor man's DB engine for people who do not know how to store and retrieve data unless it is via SQL, e.g. in browser local storage etc.

    • Now I know to ignore their requests on LinkedIn.

      Could you imagine reporting to a principal engineer that thought increasing base memory was a surprising twist, or something even worth writing about? What a revelation!

      The real lesson here is that the type of person that is good at writing blogs and making pointless graphs is the type of person to get senior positions, because pointy haired bosses like that stuff, even if it wastes resources.

      • by nadass ( 3963991 )
        I 100% agree with all of the guffaws they're getting, but I'll offer up a bit of sympathy for the person...

        They're @ Twilio/SendGrid. They're on the chopping block in this wave of corporate downsizing. Before year's end, they'll be working elsewhere. Their only hope of remaining employed in their current capacity is writing up puff pieces on their public tech blog about banal topics like "Hey! RTFM before deploying to production!" and other headline grabbing quips like "Hey! I don't like benchmarks! Hers
  • I mean, this IS a shim between SQLite and Postgres. The guy says he hates benchmarks but then goes and benchmarks a database+shim as a way to decide which database is better. As a long-time DBA, I can tell you that Postgres scales dramatically better with concurrency than SQLite. SQLAlchemy literally queues and batches queries, and if run incorrectly could be, and likely is, creating individual connections instead of a single one per execution.

    So the guy doesn't like benchmarks, so he makes a worse benchmark to show ho

    • As a long time DBA, I can tell you that postgres scales dramatically better with concurrency than sqlite.

      You don't need to be a "long time DBA" to know that, it says so right there on the SQLite web site:

      https://www.sqlite.org/whentou... [sqlite.org]

    • In matters of SQLite I always refer people to the SQLite code of ethics. It's either something you don't like or something you like.

      https://sqlite.org/codeofethics.html
      • I went in expecting the usual politicised diversity stuff, but this is hilarious and refreshingly unpretentious. Or rather a middle finger to corporate bureaucracy.

  • While it's fun to do, was the work worth the performance boost? Probably not.

    In general, the problem wasn't that they picked the wrong tool for the job, it's they didn't understand the limits of the tool they picked.

    Every tool is incorrect once you get to a certain point in time. The important part is understanding that and designing for a migration to the next tool.

    • by edwdig ( 47888 )

      They used an ORM layer. None of their code directly touches the database, it all goes thru the ORM.

      The migration was just changing a few database URLs in the code and running a tool to copy the data. The only complication was fixing a few errors caused by SQLite not enforcing the declared datatype for a column.

      Sounds like they did everything right, and they probably spent more time putting together the blog post than doing the work.
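The datatype complication mentioned above is easy to reproduce with the stdlib: SQLite's type affinity will happily store a string in an INTEGER column, whereas PostgreSQL rejects such a row at insert time. A minimal sketch (column names invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER, views INTEGER)")

# SQLite's type affinity stores the string as-is when it can't be
# coerced to an integer; PostgreSQL would reject this INSERT outright.
conn.execute("INSERT INTO articles VALUES (1, 'not a number')")
row = conn.execute("SELECT views, typeof(views) FROM articles").fetchone()
print(row)  # ('not a number', 'text')
```

Rows like this sail silently into SQLite and then fail when copied into a strictly-typed engine, which matches the migration hiccup described above. (Recent SQLite also offers STRICT tables to opt out of this behavior.)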

  • by pchasco ( 651819 ) on Monday July 04, 2022 @09:54PM (#62673838)
    What is actually sad is how much time this PRINCIPAL engineer wasted performing this experiment. Relying on an ORM to produce optimized queries for different SQL engines is naive. I expect that if the author had taken a few hours to do some research and write some optimized views instead of relying on the ORM, he would have seen 2x or better response times over SQLite.
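For example, a hand-written aggregate view of the sort the parent suggests might look like this (schema and names invented for illustration; shown against stdlib sqlite3, but the same SQL works on PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traffic (article_id INTEGER, views INTEGER)")
conn.executemany("INSERT INTO traffic VALUES (?, ?)",
                 [(1, 100), (1, 50), (2, 70)])

# A hand-written view keeps the aggregation inside the database,
# instead of letting the ORM assemble (possibly suboptimal) SQL.
conn.execute("""CREATE VIEW article_totals AS
                SELECT article_id, SUM(views) AS total
                FROM traffic GROUP BY article_id""")

print(conn.execute("SELECT * FROM article_totals ORDER BY article_id").fetchall())
# [(1, 150), (2, 70)]
```

An ORM can still map such a view as a read-only model, so using one does not preclude hand-optimized SQL underneath.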
  • My comments, as a full stack developer with experience using many different types of databases.

    What I see is a design problem not related with what type of database is used

    SQLite is a very fast embedded relational database that works for many different use cases; it is not a toy, and it has a really large capacity. BUT, it locks the entire database when a change is made to it.

    Network databases such as PostgreSQL or MySQL have row-level locking and better mechanisms for dealing with concurrency, but they are more complex to install and administer.
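The whole-database write lock is easy to see with two connections (stdlib sqlite3; a zero busy-timeout makes the conflict fail immediately instead of waiting):

```python
import sqlite3, tempfile, os

# Two connections to the same database file; timeout=0 means "don't wait
# for locks", and isolation_level=None gives us explicit transaction control.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
a = sqlite3.connect(path, timeout=0, isolation_level=None)
b = sqlite3.connect(path, timeout=0, isolation_level=None)

a.execute("CREATE TABLE t (x)")
a.execute("BEGIN IMMEDIATE")        # first writer takes the write lock
a.execute("INSERT INTO t VALUES (1)")

try:
    b.execute("INSERT INTO t VALUES (2)")   # second writer is shut out
except sqlite3.OperationalError as e:
    print(e)                        # "database is locked"

a.execute("COMMIT")
```

WAL mode lets readers proceed during a write, but writers still serialize one at a time.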

  • You're probably better off with MongoDB

  • SQLite is useful for its niche but, as others have said, it was a total error to have chosen it in the first place. Secondly, the most common problem we see nowadays (and I’ve been in web apps since 1996) is that while ORMs are really great abstraction models, developers really do still need to know how SQL works. As most of us here know, good index modelling can reduce most queries to near-zero response times. None of this was mentioned in the blog. The developer is using naive SQL data models inherited from the ORM.
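The index point is easy to check with EXPLAIN QUERY PLAN (invented schema, stdlib sqlite3; the plan text varies slightly across SQLite versions):

```python
import sqlite3

# Same query, before and after adding an index on the filter column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traffic (article_id INTEGER, day TEXT, views INTEGER)")

query = "EXPLAIN QUERY PLAN SELECT SUM(views) FROM traffic WHERE article_id = 42"
print(conn.execute(query).fetchone()[-1])   # reports a full table scan

conn.execute("CREATE INDEX idx_article ON traffic (article_id)")
print(conn.execute(query).fetchone()[-1])   # reports SEARCH ... USING INDEX idx_article
```

The same check against PostgreSQL is `EXPLAIN` in front of the query, and the payoff of a matching index is the same.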
  • Seriously, how is any of this complicated?

    We all know what SQLite is, and we all know what PostgreSQL is. We know what we can expect from either and what their strengths and weaknesses are.

    Running your own benchmarks - sure, go ahead. Other people's benchmarks are still informative, and it's much faster to read three benchmarks and understand that if they all agree on something, your own test is likely to produce the same result. If they don't agree, or if your use case isn't covered by them, you go run your own.

  • And why didn't he also try mariadb or mysql?

    • "And why didn't he also try mariadb or mysql?"

      Because he wasn't trying to do a comprehensive down-select?

      He was just satisfying his own curiosity, and sharing the results.

  • by Elixon ( 832904 ) on Tuesday July 05, 2022 @04:59AM (#62674246) Homepage Journal

    "Having publicly professed my dislike of performance benchmarks" ... so he publicly put out his own performance analysis... hmm.

    Would he read this analysis if his own name were not underneath it?

  • What happens when you port a database over and rely on defaults instead of tuning for performance? You get random performance vs the other database.

    This kind of lazy thinking is why so much code out there is poorly-designed low-performing nonsense. The author sounds like a web programmer who doesn't know a damn thing about optimization or what actually causes databases to perform well vs poorly.

    I don't know why he bothered to write a blog post about it. The lack of thought it demonstrates is not a good look.

  • No, it's not. Really. Nothing about this effort sounds complicated...or deep...at all. This is the type of analysis I'd expect my new hire straight out of college to do. In fact, it even has that "college class project write up" feel to it.

    You wrote some little Python scripts to make some calls and throw some data up in Tableau for visualization (or whichever visualizer you're using... doesn't matter) and got expected results. The only "unexpected twist" you got was when you needed to bump the memory limit.

  • How is the author a principal software engineer

  • One of the things that makes SQLite popular is you don't need to install it as a daemon process, create user and admin accounts, you can basically just dive in without RTFM. The reason Postgres listens to a port (5432 by default) and requires you to RTFM is because it's a proper client/server database, meaning it's scalable etc. SQLite is a great learning tool. But its developers should stress that it doesn't do all the things proper databases are supposed to do.
  • Unless things have changed much in recent years, the default PostgreSQL configuration is designed to run out of the box on almost any hardware - in other words, it is not even remotely tuned for real-world workloads. Granted, 1 vCPU and 2GB of RAM are pretty limited resources for a database, and performance of concurrent queries is going to be impacted by the lack of multiple cores, but there was surely much more performance to be recovered from Postgres than simply increasing work_mem, never mind what would be possible on more capable hardware.
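For reference, these are the settings most tuning guides reach for first; the values below are illustrative starting points for a small 1 vCPU / 2 GB machine, not recommendations:

```ini
# postgresql.conf - illustrative values, not tuned recommendations
shared_buffers = 512MB          # default is only 128MB; ~25% of RAM is a common rule of thumb
work_mem = 16MB                 # per-sort/per-hash memory (the one knob the article touched)
effective_cache_size = 1536MB   # planner hint for how much data the OS cache can hold
maintenance_work_mem = 128MB    # used by VACUUM and CREATE INDEX
```

None of these take effect until the server is reloaded or restarted, which is one more reason defaults tend to survive untouched.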

  • I mean, I think this might be more of trying to save face for selecting SQLite for anything
