somat 2 days ago

Here is my rather naive take on the same subject. But I had a very different motivation than the author. See I actually quite like SQL and enjoy programming in it, but what I don't like is mixing sql and python. So one night in a flash of inspiration or perhaps a fever dream I wrote this thing that lets you have stand alone parameterized sql queries and you call them like a python function or generator. It is one of those overly clever things where I sort of hate the magic, but I find myself using it more and more which I will probably regret one day.

https://nl1.outband.net/fossil/query/file?name=query.py&ci=t...

In short you have your query in file sql/dept_personal.sql and you call it like

    for row in q.dept_personal(db_cursor, department='manpower'):
  • atombender 21 hours ago

    That's basically the idea behind Sqlc [1]. By letting SQL be SQL, you avoid the many awkward mechanisms ORMs need to integrate SQLisms into the native language, and you define the query only in terms of its inputs and outputs, which can be made type-safe since they're declaratively defined.

    The downside is that parameterized queries are a bit of a chore; for example, if a query should support an optional filter on user_id, you need to craft it like this:

        WHERE ...
          AND CASE
            WHEN sqlc.narg('user_id') IS NOT NULL THEN sqlc.narg('user_id')
            ELSE true
          END
    
    This is not too bad, though, and the conditionals get optimized away by the database planner.

    [1] https://sqlc.dev/

  • truculent 2 days ago

    You may enjoy https://pugsql.org

    • getnormality 2 days ago

      This is a nice package, and a great illustration of how languages other than R suffer from the lack of an aesthetically elegant way to select list elements with bare words, like R's $ operator.

      Because their lists don't have selection by bare words, they have to go one of several other specialized, distinct, built-in Abstract Data Types to get it. They have to create whole so-called "Classes" and "Modules", when all they really needed was a list whose elements can be accessed with a dot and a bare word.

      The pandas package for tabular data manipulation requires even more complicated workarounds. It has a DataFrame Class composed of objects of Column Class. Then it makes an arbitrary bunch of common functions, so common that many are built into Python itself, Methods of said Columns. (In R, a table is just a list of vectors, and no Methods are needed.)

      So now you've got a thing that's supposedly a real Class, but it's really just a container of completely arbitrary fields and data types. These fields are themselves instances of another Class that is supposedly specific to pandas, but is really just a vector, and a vector doesn't necessarily have anything to do with being part of a table. And that Class has some random methods that give you additional ways to do basic things the language already does, and are often not the functions you actually need to work with the data therein.

      All that just so that we can write stuff like df.col.max(), and... gosh, what is that even supposed to mean? Can we all just admit that we like writing code in chains separated by dots, and stop tying that capability to hierarchies of Official Abstract Data Types?

      These non-R languages make you utter such strange incantations just to put something in a key-value container and access that thing with nice-looking code. I feel like this makes it harder to realize that very often this is the best way of doing things.

      R has a bit more varied and sometimes mildly ugly syntax than other languages, but once you get used to the building blocks it gives you, it has all these powers to do very dynamic things in very easy ways, without a bunch of ponderous specialized concepts.

      • cluckindan 2 days ago

        What do you mean? Many languages allow accessing named properties like that. Even JavaScript :)

        The strange thing here seems to be R’s use of ”list” as a name for a map-like key-value structure. The word ”list” is commonly understood to refer to a data structure which needs to be linearly (linked list) or partially (skiplist) iterated through to access a value at a particular index.

        • nevertoolate 2 days ago

          Total nitpick - you say list is commonly understood to be linearly iterated. I’d expect a list to refer to an ordered sequence - default implementation of access and mutation varies wildly between languages. E.g. java code usually defaults to ArrayList, lisps to cons cells, C++ doubly linked list, etc.

          Sql has “tuples” for the rows of a result-set which are neither tuples nor lists in the “general sense” and are of a “record” type - names with values.

          So what is a list? Depends on the context.

        • getnormality 2 days ago

          I guess I don't know enough about enough other languages to make broad generalizations. Oh well, it's too late to edit now.

          My impression is that JavaScript is another language like R that values flexibility a lot.

          And yeah, I agree that R is rather casual about lists vs maps. It doesn't really care that maps are a great data structure in their own right. It just wants to slap names on list elements when it's convenient to access elements of the list by name.

      • getnormality 2 days ago

        Correction, the Column object is called Series, but the same points apply.

  • digdugdirk 2 days ago

    I honestly have no opinion in this discussion, but I will 100% upvote the first Fossil repository I've seen shared on here! How do you find developing on the Fossil platform?

    • somat 2 days ago

      It has a sane ui/syntax, all the additional infrastructure you might want is built in (webpage, wiki, forum, tickets) and is trivial to set up. It is nearly the perfect version control system. Especially for the small independent amateur developer.

Rajni07 6 days ago

Really like the idea of keeping SQL explicit while still getting ORM conveniences. The @sqlfunc syntax feels clean, and the no-session approach makes cross DB work simpler. Curious how it handles async or pooling. Seems like a solid middle ground between raw SQL and heavy ORMs.

itopaloglu83 2 days ago

A common problem I found myself in is that I have to develop the query in one file and frequently run it to verify the data accuracy. I define the variables at top reminding me types and limitations etc. int vs varchar(10) vs varchar(50)

So I made a very simple module that takes those sql files and turns them into SQLAlchemy text objects with variables in them.

Would it be possible to add something like this to the project or does it require many sql parsing libraries etc. to ensure sql validity to find variables in the sql file?

tcdent 2 days ago

Kudos for making the leap.

Your pattern of re-interpreting __doc__ is kinda weird though. Why not just add a `return` statement?

  • duncanfwalker 2 days ago

    I guess it's more clear that it should be a to statically readable value? eg you shouldn't do things like use arguments to build the str

    • jcz_nz a day ago

      I would def use this if there was a return “select …” option. There are heaps of scenarios where sql is modified based on parameters. If no doc string just use the return value maybe?

      Our queries are typically large, not 3-5 liners.

      (Filter view queries where you might add additional CTA’s to provide the necessary filter conditions, but aren’t desirable if particular filter parameter is nill, etc.)

      • tcdent 12 hours ago

        Just keep in mind best practice is to use the built-in parameter interpolation that comes with your db library, since it handles escaping SQL injection for you.

        Be very careful if you ever use bare string formatting to construct your queries.

harvey9 2 days ago

The author says "the name isn't great", but I like it. Very close to 'squirm' or even 'slurm'.

whinvik 2 days ago

I find the following way of expressing args awkward.

with engine: fn(args)

I would rather have

with engine: fn(args, engine)

This makes testing way easier.

develatio 2 days ago

I think this is very similar to Django’s ORM.

JodieBenitez 2 days ago

> However, I've always felt some of the design choices didn't fit how I like to use an ORM. Notably:

I feel the same, hence why I prefer a Django-like ORM to SQLAlchemy in spite of all the praises it gets. The author says "SQLAlchemy is the best. I don't like the API or codebase of the others", but actually what he describes feels like the Django ORM (or Tortoise, or many others).

Also, sometimes just a thin layer above SQL is fine. For small personal projects I use my own wrapper above sqlite like so:

    import oora
    from dataclasses import dataclass

    db = oora.DB(
        db_path=":memory:",  # or /path/to/your/db.sqlite3
        # migrations are just pairs of key=>val where key is an arbitrary (but unique) label and val is a SQL script or a callable.
        # If val is a callable, it must take a sqlite3.Cursor as first parameter.
        # migrations are executed in order
        migrations={
            # here's an initial migration:
            "0000": "CREATE TABLE IF NOT EXISTS user(id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);",
            # simulating a schema evolution, let's add a field:
            "0001": "ALTER TABLE user ADD COLUMN email TEXT NULL;",
        },
    )
    db.migrate()

    db.insert("user", {"name": "John"})
    db.insert("user", {"name": "Jack"})
    db.insert("user", {"name": "Jill"})


    # dataclasses are perfect to represent rows
    # while still allowing custom behaviour
    @dataclass
    class User:
        id: int
        name: str
        email: str

        def __str__(self):
            return self.name


    # fetch a random instance
    user = db.hydrate(User, db.execute("select * from user ORDER BY RANDOM() limit 1").fetchone())
    print(f"User(id {user.id}), original name: {user}")

    # change name and email
    user.name = "Richard"
    user.email = "richard@acme.tld"
    db.save(user) # name of table is infered from the dataclass name
    print(f"User(id {user.id}), updated name: {user} <{user.email}>")

    # persist changes
    db.commit()import oora
    from dataclasses import dataclass

    db = oora.DB(
        db_path=":memory:",  # or /path/to/your/db.sqlite3
        # migrations are just pairs of key=>val where key is an arbitrary (but unique) label and val is a SQL script or a callable.
        # If val is a callable, it must take a sqlite3.Cursor as first parameter.
        # migrations are executed in order
        migrations={
            # here's an initial migration:
            "0000": "CREATE TABLE IF NOT EXISTS user(id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);",
            # simulating a schema evolution, let's add a field:
            "0001": "ALTER TABLE user ADD COLUMN email TEXT NULL;",
        },
    )
    db.migrate()

    db.insert("user", {"name": "John"})
    db.insert("user", {"name": "Jack"})
    db.insert("user", {"name": "Jill"})


    # dataclasses are perfect to represent rows
    # while still allowing custom behaviour
    @dataclass
    class User:
        id: int
        name: str
        email: str

        def __str__(self):
            return self.name


    # fetch a random instance
    user = db.hydrate(User, db.execute("select * from user ORDER BY RANDOM() limit 1").fetchone())
    print(f"User(id {user.id}), original name: {user}")

    # change name and email
    user.name = "Richard"
    user.email = "richard@acme.tld"
    db.save(user) # name of table is infered from the dataclass name
    print(f"User(id {user.id}), updated name: {user} <{user.email}>")

    # persist changes
    db.commit()