Support for calling stored procedures #35

Open

komu opened this issue Mar 29, 2017 · 26 comments

@komu
Member

komu commented Mar 29, 2017

Currently there is no special assistance for calling stored procedures, but this could probably be made simpler.

@Conrad-T-Pino
Contributor

Upvote; I bury all transactions in MariaDB procedures.

@komu
Member Author

komu commented Aug 17, 2024

Do you have any suggestions for what the API should look like? What would be the simplest thing that would cover your use cases?

@Conrad-T-Pino
Contributor

Conrad-T-Pino commented Aug 25, 2024

Yes, I have ideas but I'm in project crunch time at the moment. I'll mark as unread and return later.

@Conrad-T-Pino
Contributor

My schedule loosens up in the middle of next week. What's your preferred format for questions and design discussions?

@komu
Member Author

komu commented Sep 2, 2024

I suppose we could have the discussion right here in this issue. Nothing too formal, just some concrete API and call examples to chew on, making sure that there are no big gotchas.

Most of Dalesbred's API follows the practice that there are very general abstractions on the bottom and then series of higher level things built on top of that, for example:

  • <T> T executeQuery(@NotNull ResultSetProcessor<T> processor, @NotNull SqlQuery query) - execute the query, but let the caller handle all the details of result set processing.
  • <T> List<T> findXXX(@NotNull RowMapper<T> rowMapper, @NotNull SqlQuery query) - let user process each row of the result set manually
  • <T> @NotNull List<T> findAll(@NotNull Class<T> cl, @NotNull SqlQuery query) - go through all the default row-mapping machinery

Furthermore, for each method that takes SqlQuery as an argument there's also a convenience overload that takes @Sql String sql, Object... args.

So I suppose we'd want something similar: a way to create calls to stored procedures quite flexibly and then on top of that a simpler way to handle 95% of the usual calls.
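
To make the layering concrete, here's a sketch of the same kind of query at each level (the Employee class and the queries are illustrative only, not from the codebase):

// Highest level: default row-mapping machinery picks the rows apart.
List<Employee> a = db.findAll(Employee.class,
        "select id, name from employee where department = ?", "sales");

// Middle level: map each row manually with a RowMapper.
List<Employee> b = db.findAll(
        rs -> new Employee(rs.getInt("id"), rs.getString("name")),
        SqlQuery.query("select id, name from employee where department = ?", "sales"));

// Lowest level: process the whole ResultSet yourself.
int count = db.executeQuery(rs -> {
    rs.next();
    return rs.getInt(1);
}, SqlQuery.query("select count(*) from employee"));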

@Conrad-T-Pino
Contributor

Thank you; IMO very useful.

  • Do we have an API object life-cycle strategy?
  • In particular are "Statements" reusable?
  • Across "Connection" changes?

JDBC specifies that Connection.close() handles its related offspring, but JDBC history has seen broken drivers and connection pools. I worked on a project whose connection pool "wrapped" key JDBC interfaces to track child resources to close or reset when a Connection returned to the pool; naive connection pools are server-resource-leak train wrecks without such tracking.

For performance, I prefer to define once and reuse safely, especially if definition is expensive relative to reuse.
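
For illustration, the classic plain-JDBC shape of that preference (the procedure, the Event type, and the connection are hypothetical, not Dalesbred code):

// Defined once...
try (PreparedStatement ps = connection.prepareStatement("call log_event(?, ?)")) {
    // ...reused many times; each reuse is cheap relative to the definition.
    for (Event e : events) {
        ps.setString(1, e.type);
        ps.setString(2, e.payload);
        ps.execute();
    }
}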

However the smartest choice is continuing the existing pattern; hence my life-cycle strategy questions.

JDBC stored procedures and functions are so related that I propose we just do both.

@Conrad-T-Pino
Contributor

This package enumeration suggests adding a package name to encompass the combined stored procedure and function call implementations. I'd prefer that others choose a name consistent with prior practice.

@Conrad-T-Pino
Contributor

I am Apache Ant proficient and Apache Maven intermediate.
What can I expect trying to build on headless (shell, no GUI) vanilla Debian?

@komu
Member Author

komu commented Sep 6, 2024

Do we have an API object life-cycle strategy?
In particular are "Statements" reusable?
Across "Connection" changes?

In general, Dalesbred tries to provide a simple solution for the 90% case and limit the lifecycle of all resources to a lexical scope. When you make a call, you can be sure that once the call has finished, all the resources have been closed. Only when you use the lower-level APIs and open resources manually are you responsible for performing any cleanup. For example:

db.withTransaction(tx -> {
    // Since I create the Statement manually, I'm responsible for closing it. The connection itself
    // was provided by Dalesbred, so it will be closed automatically if withTransaction had to open it.
    try (PreparedStatement stmt = tx.getConnection().prepareStatement(...)) {
        // ...
    }
});

So there are no cases where the API returns an object that you need to close. That certainly limits the use cases, but if the API tried to cover everything that JDBC does, you might as well use JDBC to begin with. Personally, I use Hibernate for most database access, reach for Dalesbred when the solution calls for raw SQL, and drop to raw JDBC when I really need more advanced functionality.

JDBC specifies that Connection.close() handles its related offspring, but JDBC history has seen broken drivers and connection pools. I worked on a project whose connection pool "wrapped" key JDBC interfaces to track child resources to close or reset when a Connection returned to the pool; naive connection pools are server-resource-leak train wrecks without such tracking.

Yeah, Dalesbred does not try to hold on to resources, and it does not rely on the connection pool to close ResultSets and other objects when connections are released back to the pool. If it opens a resource, it will close it before returning from the call.

For performance, I prefer to define once and reuse safely, especially if definition is expensive relative to reuse.

However the smartest choice is continuing the existing pattern; hence my life-cycle strategy questions.

Right, the thing to remember is that Dalesbred does not try to be the most performant or flexible API. It tries to provide an easy-to-use and hard-to-misuse API that is good enough most of the time.

Of course this doesn't mean that performance is of no concern in the implementation — just that it shouldn't affect the interface adversely.

JDBC stored procedures and functions are so related that I propose we just do both.

Yes, that was my implicit assumption as well, but it was good that you made that explicit.

This package enumeration suggests adding a package name to encompass the combined stored procedure and function call implementations. I'd prefer that others choose a name consistent with prior practice.

I'm not entirely sure if this warrants a new package, but I'm not opposed to one either. Off the top of my head, I think that the objects representing the calls might find a good home in org.dalesbred.query. I would think that the API for results would stay the same, but if it needs new concepts, those probably fit in org.dalesbred.result. As far as the internal packages go, it's not that important, because those can always be changed later without breaking client code, but I would think that most of the implementation would go into org.dalesbred.internal.jdbc.

Anyway, I think that we don't need to go there yet. At this point I think it would be good to come up with the simplest possible API for the simplest useful thing that we can think of and then figure out if there are problems with that. Is it too simple to be useful? How to make it more general? If there's a simpler and more general method, is there still value in keeping the simple version as a convenience method?

Without putting too much thought into this, I guess that the most basic and most common functionality would be just to call a function with positional input arguments, getting a ResultSet back and processing it with Dalesbred's normal machinery.

So this would suggest a family of methods like:

List<MyObject> xs = db.findAllByCall(MyObject.class, "my_function", param1, param2)
MyObject x = db.findUniqueByCall(MyObject.class, "my_function", param1, param2)
...

(I'm not very satisfied with the names, but something like that.)

And just as the normal db.findAll(MyObject.class, "select ...", param1, param2) is syntax sugar for db.findAll(MyObject.class, SqlQuery.query("select ...", param1, param2)), these could be syntax sugar for something along the lines of:

List<MyObject> xs = db.findAllByCall(MyObject.class, SqlCall.call("my_function", param1, param2))
MyObject x = db.findUniqueByCall(MyObject.class, SqlCall.call("my_function", param1, param2))
...

This way support for more complex calls (named parameters etc.) might be added (perhaps not yet, but later) by allowing construction of more complex SqlCall-objects.
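
For example (pure speculation, no such API exists today), a more complex call might eventually be built like:

SqlCall call = SqlCall.call("my_function")
    .withNamedArgument("start_date", startDate)
    .withNamedArgument("max_rows", 100);

List<MyObject> xs = db.findAllByCall(MyObject.class, call);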

The key thing to remember is that you can always add more stuff to an API later, but once something has been added, you can't really take it away without breaking callers. And I really, really don't like breaking callers. So instead of thinking of all the things we could add, we should start with the question "what's the simplest thing we could add that could be useful?"

So at this point we don't have to be more formal than the examples above. My examples are just strawmen and you don't need to worry about them too much, but do you have suggestions on how the API should look and feel?

I am Apache Ant proficient and Apache Maven intermediate.
What can I expect trying to build on headless (shell, no GUI) vanilla Debian?

As long as you have Java installed, everything should be fine. The CI build runs against LTS releases (8, 11, 17 and 21), but any JDK from 8 onwards should work.

The Gradle wrapper, which downloads and runs the rest of Gradle, is committed to the project, so just running ./gradlew build should run the tests and produce the artifacts. (You can also run just the tests with ./gradlew test, or ./gradlew assemble to build the artifacts without running tests.) Gradle is flexible, but the default project structure and tasks provided by the Java plugin mimic Maven, so it should feel familiar.

Most of the tests use an in-memory HsqlDb and don't need any configuration, but if you wish, you can copy dalesbred/src/test/resources/connection-example.properties to postgresql-connection.properties or mysql-connection.properties and then you get extra PostgreSQL/MySQL-specific tests. But you don't need to bother with those if you don't want to; mostly they cover some advanced PostgreSQL support. CI runs those tests anyway if you don't.

@Conrad-T-Pino
Contributor

I read your response in depth and found it useful and motivating; it provoked many ideas. Expect a focused response soon.

@Conrad-T-Pino
Contributor

  • IMO Google Docs is useful but if found to be an encumbrance, I will adjust.
  • I encountered an unexpected conclusion of ./gradlew build; the attempt is detailed in Dalesbred Build Session 1.
  • Please bear in mind this is my first Gradle project.

@Conrad-T-Pino
Contributor

What are the Gradle equivalents of these Maven commands?

  • mvn --batch-mode clean compile
  • mvn --batch-mode clean package

@Conrad-T-Pino
Contributor

Do you have a preference for reviewing Java code examples?

  1. Attach TGZ file
  2. Attach ZIP file
  3. Branch on this project
  4. Branch on forked project

@Conrad-T-Pino
Contributor

Conrad-T-Pino commented Sep 7, 2024

Take nothing herein as criticism or a change request, even though I may use sharp language. I do so because a clear understanding of boundary conditions is the key to correct programming.

  • The Dalesbred upper API commendably manages JDBC ugliness tastefully out of view, but at the risk of JVM heap exhaustion.
  • I agree the upper API covers 90% of use cases, in particular human-driven interactive transactional web applications.
  • I prefer the smallest heap possible and therefore elect to focus on the lower-level API supporting callbacks.
  • IMO streaming tables with large BLOB fields and millions of records out over a single HTTP request is feasible.

@komu
Member Author

komu commented Sep 9, 2024

Regarding your build problems, what Java version do you have? What does java -version print? How about $JAVA_HOME/bin/java -version? And in general, to debug build issues, it's useful to have a link to a build scan (run ./gradlew --scan build) containing the details of the build.

Looking at the code, the problem seems to be in this javadoc:

@throws IllegalArgumentException if size is < 0

Historically javadoc has accepted this without any complaints, but it's being lenient: the markup should be valid HTML, and therefore < should be written as &lt;. So there is a problem with the code (and I fixed it in 13d1d51), but I'm really surprised to see that javadoc's doclint seems to have mismatches between versions on what it accepts. So I'd appreciate it if you could tell me your Java version so I can try to reproduce the problem. I tried a few different versions, but couldn't reproduce the build failure.
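
For reference, the escaped form that doclint accepts:

@throws IllegalArgumentException if size is &lt; 0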

What are the Gradle equivalents of these Maven commands?

  • mvn --batch-mode clean compile
  • mvn --batch-mode clean package

Those would be:

./gradlew --console=plain clean classes
./gradlew --console=plain clean assemble

Now, it's good to note that even if clean deletes the output folders, the next build might still use the build cache. If you really want to make sure that everything is built from scratch, you can pass --no-build-cache. But this should not be necessary unless you're debugging problems with some third-party plugins that have failed to define cache dependencies correctly.

And instead of passing --console=plain on every invocation, you can add the line org.gradle.console=plain to $HOME/.gradle/gradle.properties.
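
That is, a one-line addition:

# $HOME/.gradle/gradle.properties
org.gradle.console=plain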

Do you have a preference for reviewing Java code examples?

As long as we're not really into implementation mode, my preferred mode for reviewing code examples is this issue, in a similar format that I used above.

Alternatively, if there are longer examples, a Gist or a fork of the repo with permalinks are probably best. They are inlined into the conversation like this:

try (PreparedStatement ps = prepareStatement(tx.getConnection(), sql, columnNames)) {
    for (List<?> arguments : argumentLists) {
        bindArguments(ps, arguments);
        ps.addBatch();
    }
}

Once we start reviewing actual code to merge, a pull request on this repository (coming from a forked repository) will be fine. But it might be easier if I just write the code. I'm mostly interested in feedback on use cases.

Take nothing herein as criticism or a change request, even though I may use sharp language. I do so because a clear understanding of boundary conditions is the key to correct programming.

  • The Dalesbred upper API commendably manages JDBC ugliness tastefully out of view, but at the risk of JVM heap exhaustion.
  • I agree the upper API covers 90% of use cases, in particular human-driven interactive transactional web applications.
  • I prefer the smallest heap possible and therefore elect to focus on the lower-level API supporting callbacks.
  • IMO streaming tables with large BLOB fields and millions of records out over a single HTTP request is feasible.

Yes, all of this is correct and pretty obvious. It's very much by design and you would not want to use Dalesbred (or similar queries using JPA or most other high-level APIs) in those cases.

Fetching all the rows of the result is such a common operation that e.g. PostgreSQL's JDBC driver does it by default, and so does libpq. In most cases you want to limit the number of results in the query itself anyway. And when you don't, you typically want to move the computation into the database instead of transferring huge amounts of data over the network.

Even when you do want to process huge amounts of data in the caller, you often want to split the data using some key and handle it in smaller batches: you get better progress tracking, the possibility to continue interrupted jobs, fewer locks on the database (or old row versions kept alive in MVCC), etc.
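
For example, a sketch of key-based batching using Dalesbred's own findAll (the table, the Row class and the process function are made up):

long lastId = 0;
while (true) {
    List<Row> batch = db.findAll(Row.class,
        "select id, payload from big_table where id > ? order by id limit 1000", lastId);
    if (batch.isEmpty())
        break;

    process(batch);                          // checkpoint or commit per batch
    lastId = batch.get(batch.size() - 1).id; // continue after the last seen key
}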

But of course there are cases where you really want to stream lots of data, and it might be nice if Dalesbred could help you there as well. The normal RowMapper stuff would of course be applicable in such cases too. A very basic implementation off the top of my head would be just a few lines of code (ignoring all the exception handling here for simplicity):

public <T> Stream<T> executeQueryStream(@NotNull Class<T> cl, @NotNull SqlQuery query) {
    return executeQueryStream(rowMapperForClass(cl), query);
}

public <T> Stream<T> executeQueryStream(@NotNull RowMapper<T> rowMapper, @NotNull SqlQuery query) {
    Connection connection = transactionManager.getActiveTransaction().getConnection();

    PreparedStatement ps = connection.prepareStatement(query.getSql());
    prepareStatementFromQuery(ps, query);
    ResultSet rs = ps.executeQuery();

    // takeWhile (Java 9+) stops at the first empty value; a plain filter would never terminate
    return Stream.generate(() -> rs.next() ? Optional.of(rowMapper.mapRow(rs)) : Optional.empty())
        .takeWhile(Optional::isPresent)
        .map(Optional::get)
        .onClose(() -> {
            rs.close();
            ps.close();
        });
}

So the problem is not the implementation; that's easy. It's that the API has all sorts of subtle corner cases, and it makes robust error handling ugly. Also, the semantics of the API would be quite different from all other calls. It would be a footgun that makes it easy for the caller to write code that looks correct but is not. So I'm not quite sure that I want to expose and document all that.

Perhaps it would be better to expose rowMapperForClass to users so that they can handle their database objects as they wish, but still benefit from Dalesbred's result-mapping. That would be even more flexible and not much worse for this case. And that way you could write this streaming-function as a utility outside Dalesbred.

There's an old issue (#38) about this, but personally I've never had the actual need for this so I've opted for YAGNI and left it unimplemented since I'm not really happy about what the API should be.

Anyway, this discussion is at most tangential to stored procedures: if there will ever be a mechanism for streaming results, it will work for normal queries and stored procedures.

@Conrad-T-Pino
Contributor

Regarding your build problems, what Java version do you have? What does java -version print? How about $JAVA_HOME/bin/java -version? ... So I'd appreciate it if you could tell me your Java version so I can try to reproduce the problem. ...

cpino@quartz:~$ uname --all; cat /etc/debian_version; which java javac; java -version; javac -version; dpkg --list | fgrep jdk
Linux quartz 5.10.0-31-amd64 #1 SMP Debian 5.10.221-1 (2024-07-14) x86_64 GNU/Linux
11.10
/usr/bin/java
/usr/bin/javac
openjdk version "11.0.23" 2024-04-16
OpenJDK Runtime Environment (build 11.0.23+9-post-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 11.0.23+9-post-Debian-1deb11u1, mixed mode, sharing)
javac 11.0.23
ii  default-jdk-headless                  2:1.11-72                      amd64        Standard Java or Java compatible Development Kit (headless)
ii  openjdk-11-jdk-headless:amd64         11.0.23+9-1~deb11u1            amd64        OpenJDK Development Kit (JDK) (headless)
ii  openjdk-11-jre-headless:amd64         11.0.23+9-1~deb11u1            amd64        OpenJDK Java runtime, using Hotspot JIT (headless)

@Conrad-T-Pino
Contributor

Conrad-T-Pino commented Sep 9, 2024

Perhaps recent commits resolved the reported build issue, as it did not replicate:

cpino@quartz:~$ cd dev/dalesbred
cpino@quartz:~/dev/dalesbred$ git pull
hint: Pulling without specifying how to reconcile divergent branches is
hint: discouraged. You can squelch this message by running one of the following
hint: commands sometime before your next pull:
hint:
hint:   git config pull.rebase false  # merge (the default strategy)
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
remote: Enumerating objects: 26, done.
remote: Counting objects: 100% (26/26), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 26 (delta 10), reused 26 (delta 10), pack-reused 0 (from 0)
Unpacking objects: 100% (26/26), 2.12 KiB | 72.00 KiB/s, done.
From github.com:EvidentSolutions/dalesbred
   4ec4c08..13d1d51  main       -> origin/main
Updating 4ec4c08..13d1d51
Fast-forward
 buildSrc/src/main/kotlin/dalesbred.java-library-conventions.gradle.kts                 | 4 ++--
 dalesbred-junit/src/main/java/org/dalesbred/junit/TestDatabaseProvider.java            | 6 +++---
 dalesbred/src/main/java/org/dalesbred/integration/spring/SpringTransactionManager.java | 5 ++++-
 dalesbred/src/main/java/org/dalesbred/query/SqlQuery.java                              | 2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)
cpino@quartz:~/dev/dalesbred$ ./gradlew build
Starting a Gradle Daemon (subsequent builds will be faster)

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.7/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 1m 26s
32 actionable tasks: 22 executed, 10 up-to-date
cpino@quartz:~/dev/dalesbred$

@Conrad-T-Pino
Contributor

And instead of passing --console=plain on every invocation, you can add the line org.gradle.console=plain to $HOME/.gradle/gradle.properties.

Thank you; so much better and informative.

... And in general, to debug build issues, it's useful to have a link to a build scan (run ./gradlew --scan build) containing the details of the build.

cpino@quartz:~/dev/dalesbred$ ./gradlew --scan build
> Task :buildSrc:checkKotlinGradlePluginConfigurationErrors
> Task :buildSrc:generateExternalPluginSpecBuilders UP-TO-DATE
> Task :buildSrc:extractPrecompiledScriptPluginPlugins UP-TO-DATE
> Task :buildSrc:compilePluginsBlocks UP-TO-DATE
> Task :buildSrc:generatePrecompiledScriptPluginAccessors UP-TO-DATE
> Task :buildSrc:generateScriptPluginAdapters UP-TO-DATE
> Task :buildSrc:compileKotlin UP-TO-DATE
> Task :buildSrc:compileJava NO-SOURCE
> Task :buildSrc:compileGroovy NO-SOURCE
> Task :buildSrc:pluginDescriptors UP-TO-DATE
> Task :buildSrc:processResources UP-TO-DATE
> Task :buildSrc:classes UP-TO-DATE
> Task :buildSrc:jar UP-TO-DATE
> Task :assemble UP-TO-DATE
> Task :check UP-TO-DATE
> Task :build UP-TO-DATE
> Task :dalesbred:checkKotlinGradlePluginConfigurationErrors
> Task :dalesbred:compileKotlin UP-TO-DATE
> Task :dalesbred:compileJava UP-TO-DATE
> Task :dalesbred:processResources NO-SOURCE
> Task :dalesbred:classes UP-TO-DATE
> Task :dalesbred:jar UP-TO-DATE
> Task :dalesbred:javadoc UP-TO-DATE
> Task :dalesbred:javadocJar UP-TO-DATE
> Task :dalesbred:sourcesJar UP-TO-DATE
> Task :dalesbred:assemble UP-TO-DATE
> Task :dalesbred:compileTestKotlin UP-TO-DATE
> Task :dalesbred:compileTestJava NO-SOURCE
> Task :dalesbred:processTestResources UP-TO-DATE
> Task :dalesbred:testClasses UP-TO-DATE
> Task :dalesbred:test UP-TO-DATE
> Task :dalesbred:check UP-TO-DATE
> Task :dalesbred:build UP-TO-DATE
> Task :dalesbred-junit:compileJava UP-TO-DATE
> Task :dalesbred-junit:processResources NO-SOURCE
> Task :dalesbred-junit:classes UP-TO-DATE
> Task :dalesbred-junit:jar UP-TO-DATE
> Task :dalesbred-junit:javadoc UP-TO-DATE
> Task :dalesbred-junit:javadocJar UP-TO-DATE
> Task :dalesbred-junit:sourcesJar UP-TO-DATE
> Task :dalesbred-junit:assemble UP-TO-DATE
> Task :dalesbred-junit:compileTestJava UP-TO-DATE
> Task :dalesbred-junit:processTestResources UP-TO-DATE
> Task :dalesbred-junit:testClasses UP-TO-DATE
> Task :dalesbred-junit:test UP-TO-DATE
> Task :dalesbred-junit:check UP-TO-DATE
> Task :dalesbred-junit:build UP-TO-DATE
> Task :website:copyApi UP-TO-DATE
> Task :asciidoctor UP-TO-DATE
> Task :website:copyReference UP-TO-DATE
> Task :website:copySources UP-TO-DATE
> Task :website:assemble UP-TO-DATE
> Task :website:check UP-TO-DATE
> Task :website:build UP-TO-DATE

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.7/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 4s
32 actionable tasks: 2 executed, 30 up-to-date

Publishing build scan...
https://gradle.com/s/jr3xduws2rop6

cpino@quartz:~/dev/dalesbred$

@Conrad-T-Pino
Contributor

As long as we're not really into implementation mode, my preferred mode for reviewing code examples is this issue, in a similar format that I used above.

I am coding to confirm low-level implementation feasibility, as burdening the team with impractical proposals is counterproductive.

Alternatively, if there are longer examples, a Gist or a fork of the repo with permalinks are probably best. ...

I have never used Gist, and I now recognize it could replace functions I currently perform with Google Docs.

Once we start reviewing actual code to merge, a pull request on this repository (coming from a forked repository) will be fine. But it might be easier if I just write the code. I'm mostly interested in feedback on use cases.

I find that in-depth learning occurs during coding, and I often code even if I will discard the result. Issue 55 alone is sufficient cause to fork.
I shall branch within the fork to support independent tasks.

@Conrad-T-Pino
Contributor

Yes, all of this is correct and pretty obvious. ...

There is knowledge characterized as intuitively obvious, but only if known. For example:

  • Even though you said gradlew downloads Gradle, that was not recognized until I tried it.
  • The JVM heap risk is not obvious until the implementation is read; once known, it's obvious.

Newbies know nothing; if rapid adoption is a project goal then the obvious must be documented prominently.

@Conrad-T-Pino
Contributor

Conrad-T-Pino commented Sep 9, 2024

But of course there are such cases where you really want to stream lots of data and it might be nice if Dalesbred could help you there as well. ...

...

There's an old issue (#38) about this, but personally I've never had the actual need for this so I've opted for YAGNI and left it unimplemented since I'm not really happy about what the API should be.

Anyway, this discussion is at most tangential to stored procedures: if there will ever be a mechanism for streaming results, it will work for normal queries and stored procedures.

Callables (functions or procedures) are closer to java.sql.Statement query batches, which yield 1 to N ResultSets, except that a callable's range is 0 to N, and we must consider the java.sql.CallableStatement interface's language:

For maximum portability, a call's ResultSet objects and update counts should be processed prior to getting the values of output parameters.

An ideal callables upper API returns List<Stream<T>>, but IMO a conservative interpretation of the above is possibly a deal killer.
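
To illustrate what that language forces on an implementation, here's the standard JDBC drain pattern (my_proc and the connection are hypothetical):

try (CallableStatement cs = connection.prepareCall("{call my_proc(?, ?)}")) {
    cs.setInt(1, 42);
    cs.registerOutParameter(2, Types.VARCHAR);

    // Drain every ResultSet and update count first...
    boolean isResultSet = cs.execute();
    while (isResultSet || cs.getUpdateCount() != -1) {
        if (isResultSet) {
            try (ResultSet rs = cs.getResultSet()) {
                while (rs.next()) {
                    // map rows here
                }
            }
        }
        isResultSet = cs.getMoreResults();
    }

    // ...and only then read the OUT parameters.
    String out = cs.getString(2);
}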

@Conrad-T-Pino
Contributor

Conrad-T-Pino commented Sep 9, 2024

Perhaps recent commits resolved the reported build issue, as it did not replicate:

And perhaps not; after:

  • creating full fork Conrad-T-Pino / dalesbred
  • cd ~/dev
  • git clone git@github.com:Conrad-T-Pino/dalesbred.git
  • cd ~/dev/dalesbred
  • ./gradlew build
  • > Task :dalesbred:test long running
  • Control-C no change, short wait
  • Control-C no change, long wait
  • 1st ^C appears after delay
  • 2nd ^C appears after delay
  • Opening a 2nd SSH session was very slow, and succeeded only after the 1st session returned to the shell.

Console output from 1st SSH session follows:

cpino@quartz:~/dev/dalesbred$ ./gradlew build
> Task :buildSrc:checkKotlinGradlePluginConfigurationErrors
> Task :buildSrc:generateExternalPluginSpecBuilders
> Task :buildSrc:extractPrecompiledScriptPluginPlugins
> Task :buildSrc:compilePluginsBlocks
> Task :buildSrc:generatePrecompiledScriptPluginAccessors
> Task :buildSrc:generateScriptPluginAdapters
> Task :buildSrc:pluginDescriptors
> Task :buildSrc:processResources
> Task :buildSrc:compileKotlin
> Task :buildSrc:compileJava NO-SOURCE
> Task :buildSrc:compileGroovy NO-SOURCE
> Task :buildSrc:classes
> Task :buildSrc:jar
> Task :assemble UP-TO-DATE
> Task :check UP-TO-DATE
> Task :build UP-TO-DATE
> Task :dalesbred:checkKotlinGradlePluginConfigurationErrors
> Task :dalesbred:processResources NO-SOURCE
> Task :dalesbred:sourcesJar
> Task :dalesbred:processTestResources
> Task :dalesbred-junit:processResources NO-SOURCE
> Task :dalesbred-junit:sourcesJar
> Task :dalesbred-junit:processTestResources
> Task :asciidoctor
> Task :website:copyReference
> Task :website:copySources
> Task :website:check UP-TO-DATE
> Task :dalesbred:compileKotlin
> Task :dalesbred:compileJava
> Task :dalesbred:classes
> Task :dalesbred:jar
> Task :dalesbred:javadoc
> Task :dalesbred:javadocJar
> Task :dalesbred:assemble
> Task :dalesbred-junit:compileJava
> Task :dalesbred-junit:classes
> Task :dalesbred-junit:jar
> Task :dalesbred-junit:javadoc
> Task :dalesbred-junit:javadocJar
> Task :dalesbred-junit:assemble
> Task :dalesbred-junit:compileTestJava
> Task :dalesbred-junit:testClasses
> Task :dalesbred-junit:test
> Task :dalesbred-junit:check
> Task :dalesbred-junit:build
> Task :website:copyApi
> Task :website:assemble
> Task :website:build
> Task :dalesbred:compileTestKotlin
> Task :dalesbred:compileTestJava NO-SOURCE
> Task :dalesbred:testClasses
> Task :dalesbred:test
^C^Ccpino@quartz:~/dev/dalesbred$
  • Whatever is running is disk I/O intensive, possibly including swap volume.
  • 2nd ./gradlew build same workspace ends with BUILD SUCCESSFUL in 22s
  • Smells like a fresh git clone ... related issue.

@komu
Member Author

komu commented Sep 10, 2024

I am coding to confirm low-level implementation feasibility, as burdening the team with impractical proposals is counterproductive.

Yes, I understand, but rest assured that the actual implementation will pose no problem and is more or less in my head already. So if you'd really like to do that, you're welcome to code, but it's not really necessary.

  • The JVM heap risk is not obvious until the implementation is read; once known, it's obvious.

Yes, there's always a tradeoff. When communicating anything, one must have the target audience in mind and assume some level of shared understanding. As an extreme case, if one were to write documentation that a 5-year-old would understand, no one else would want to read it.

As an example, I consider it sufficiently clear from the signature's result type (List<T>), the javadoc, and the principle of least surprise that findAll will actually produce an in-memory list with all the results.

/**
 * Executes a query and processes each row of the result with given {@link RowMapper}
 * to produce a list of results.
 */
public @NotNull <T> List<T> findAll(@NotNull RowMapper<T> rowMapper, @NotNull SqlQuery query) {

In general, List — to me at least — implies an in-memory collection that is not lazy. There might be cases where it's actually a view of some backing structure, but I'd expect those deviations from the normal expectation to be clearly documented, not to be the normal case. In the case of ResultSets, you can't even fulfil the contract of List without reading everything into memory: you need to know the number of results to implement size.

Now if there were a method that returned Stream<T>, I'd be surprised if it was eager instead of lazy. But of course I'd still want to read what sort of guarantees it makes about laziness. It opens a whole new can of worms, because there are multiple ways to be lazy and no single one of them is right for every situation.

Being curious, I just checked how JPA's TypedQuery.getResultList is documented. It's very similar to Dalesbred, making the same implicit — and in my view, very reasonable — assumptions.

Newbies know nothing; if rapid adoption is a project goal then the obvious must be documented prominently.

Well, it's very much not a project goal. 😄

I extracted a bunch of utility code into a library so that I wouldn't need to write it again and again between different client projects. After a few years, there were a bunch of other people using it, so I got a domain, cleaned up the codebase, wrote basic documentation and released it under the new domain name. 12 years have passed since that release. There are more people using the library now and I'm happy that people find it useful. I try to help people with their problems and accommodate their needs, but in the end I don't really care if anyone uses it or not.

Anyway, I'm open to improving the documentation if there are concrete suggestions, but it feels off-topic on this issue. So feel free to open another issue or pull request if there is something that would benefit from clarification.

Anyway, this discussion is at most tangential to stored procedures: if there will ever be a mechanism for streaming results, it will work for normal queries and stored procedures.

Callables (functions or procedures) are closer to java.sql.Statement query batches, which yield 1 to N ResultSets, except that a callable's range is 0 to N, and we must consider the java.sql.CallableStatement interface's language:

For maximum portability, a call's ResultSet objects and update counts should be processed prior to getting the values of output parameters.

I think there's a misunderstanding here. When I said "if there will ever be a mechanism for streaming results, it will work for normal queries and stored procedures", I didn't mean that the same implementation will automatically work for both. I meant that if there is ever a family of API functions for streaming the results of normal queries, the implementation will make sure that the family includes stored procedures as well.

So purely from an API design standpoint (which is the only thing I'm interested in right now), it's orthogonal to the other concerns.

An ideal callables upper API returns List<Stream<T>>, but IMO a conservative interpretation of the above is possibly a deal killer.

Well this is certainly not something that you'd need every day. We're trying to move complexity away from the caller, not into the caller.

So once again, if we look at the members of Database, there's an established pattern of findXXX-methods all using similar semantics. We could extend the pattern to include support for stored procedures.

At this point — or at least on this issue — I'm not interested in moving towards a different vector in the design space, adding support for more flexible lifecycles etc. I'm not saying it's not useful, but I'm saying it's tangential to the issue that I'd like to solve. So I'd like to return to what I said in my first reply:

Do you have any suggestions for what the API should look like? What would be the simplest thing that would cover your use cases?

So at this point we're not looking at the most general thing that would cover all use cases, but the simplest thing that would cover some of the basic cases. It's always possible to add more stuff later, if it feels necessary.

Regarding your build issues, I'm afraid that I can't help you there. I've built the code myself on several different machines, several contributors have built the code, and it builds on GitHub's runners. The console log does not provide enough information on what's going on. And as much as I'd like to help, I can't spend too much time on troubleshooting other people's environments.

That said, I looked at the build scan you provided earlier and it seems quite normal, although it was not a clean build and therefore skipped most of the tasks. I ran a full build on my own machine, limiting the memory use to the same 512 MB that your Gradle was using, and the peak heap use during the build was 130 MB, so that shouldn't be a problem. The tests should do very limited I/O as well. All I can say is that perhaps you should try monitoring the resource consumption on your system with other tools to figure out what's going on.

@Conrad-T-Pino
Contributor

  • The second and subsequent build attempts complete in a reasonable duration. IMO a reasonable workaround for my priority needs.
  • Only the first build in a fresh workspace fails, excepting ./gradlew --scan build, which works; see https://gradle.com/s/63lqa64bt5yo2
  • I certainly understand we have a replication failure issue and agree this is not a priority.

I will add what I have learned recently for the record:

  1. The primary Gradle JVM has reasonable heap arguments: java -Xmx64m -Xms64m ...
  2. Using ps -f and top -b -c -n 1 -o %MEM, I see Gradle spawns multiple Java processes.
  3. Aggregate virtual memory allocated exceeds physical RAM, and hard-disk swap-volume thrashing ensues.
  4. Gradle's process spawning seems based on virtual size, which is physical RAM plus swap partition size.
  5. The initial build in a fresh workspace overcommits virtual memory, and no process makes reasonable progress.
  6. Three spawned Java processes use -Xms256m -Xmx512m or -Xmx512m; 6498m is used by MariaDB and Tomcat.
  7. Smells like either a Gradle build configuration bug or a Gradle parallelism process-spawning policy bug.
  8. IMO the likely replication issues are competing processes, absence of allocated swap, or swap device type.

If an issue is opened for this build anomaly, I will document problem processes there after we finish here.
Would access to a consistently replicating platform be useful to the project later? Example: AWS EC2 instance

@Conrad-T-Pino
Contributor

I live in an HttpServlet world having two distinct modes:

  1. User-driven transactional web applications, where the Dalesbred upper API fits.
  2. Administrator-driven downloading and reporting, where a stream is a better fit.

Dalesbred as it now stands works; just add a callback class carrying outer context references.

  • IMO SqlQuery fits define once and reuse until database schema changes very nicely.
  • I found :name useful in ColdFusion and find it attractive here; hence my interest in the SqlQuery.namedQuery(...) methods (see the sketch below).
  • Where should I look for the :name-to-value mapping process in the current implementation?
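
For context, the shape I have in mind (assuming the Map-based namedQuery overload; the table and Employee class are illustrative):

Map<String, Object> values = new HashMap<>();
values.put("dept", "sales");

List<Employee> result = db.findAll(Employee.class,
    SqlQuery.namedQuery("select id, name from employee where dept = :dept", values));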

@komu
Member Author

komu commented Sep 11, 2024

Three spawned Java processes use -Xms256m -Xmx512m or -Xmx512m; 6498m is used by MariaDB and Tomcat.

Nothing in Dalesbred references MariaDB or Tomcat in any way, so I believe that either you are measuring the wrong things or I misunderstand what you are saying. It seems to me that you simply have too much stuff already running on your system when trying to build. There's nothing special in the Dalesbred build — it's a straightforward Gradle build, and neither it nor the tests should require an excessive amount of memory.

If you were unable to build Dalesbred on a system that has a reasonable amount of free RAM available (say 2 GiB), I might be concerned, but I fail to see any proof of that. GitHub's build runners are pretty limited as far as development systems go, and they build the code just fine. Looking at your most recent build scan, I can see that Gradle's optional build parallelism is turned off and only 173.8 MiB of the 512 MiB heap is used. That doesn't feel excessive to me.

Now on your build scan we can see that much of the time is spent in the configuration phase. Looking at the timeline, we can see that it takes quite a while before we even get to Dalesbred's own build. Just as Gradle downloads itself from the network when you run the first build, it also does some preprocessing when you execute it for the very first time. That's completely expected and normal.

Now, on my machine a full rebuild without caches (./gradlew --no-build-cache clean build) takes 19s, and ./gradlew test after a small change takes 4s. GitHub's runners seem to take 1m 12s for a full build from a fresh installation. Those feel like reasonable numbers. If you're seeing numbers on a whole different scale, it's probably worth investigating, but in that case I kindly ask you to move to the Gradle Forums, since it seems quite obvious that the problem is not the Dalesbred build per se.

Anyway, all of this is in no way related to the issue "Support for calling stored procedures". I sympathize with your frustrating issues, but I'm not an IT support guy. Even if I were, all of this is completely irrelevant to all the other people who are interested in the state of stored procedure support and happen to read this issue.

I still want to re-emphasize that I'm interested in one thing, and we still haven't reached that point: Do you have any suggestions for what the API should look like? What would be the simplest thing that would cover your use cases?

So I never asked for, and don't need, feasibility studies. And I don't want to spend my time on build support. I want simple use cases. It doesn't even have to be in the form of an imagined API for Dalesbred; it could be some recurring JDBC code that feels like it would benefit from support.

Where should I look for :name to value type mapping process in the current implementation?

https://github.com/EvidentSolutions/dalesbred/blob/main/dalesbred/src/main/java/org/dalesbred/query/NamedParameterSqlParser.java
