channeldb+lnrpc+lncli: listpayments pagination support#3960
Conversation
|
Assigning myself because this falls under #3942. |
bjarnemagnussen
left a comment
There was a problem hiding this comment.
Hi,
Nice work! I was just looking into your PR and found a tiny typo that can trigger an index-out-of-range error. See my comments for more details!
e3d2e5c to
0f8a322
Compare
|
Hey @bjarnemagnussen, thank you very much for your review! |
carlaKC
left a comment
There was a problem hiding this comment.
Thanks for the PR @bitromortac!
Some nits and structural changes but this is looking nice 🐳
0f8a322 to
1fd34f9
Compare
1fd34f9 to
94cc676
Compare
|
Since #4003 is merged, I made the flag for pagination consistent with the listinvoices command. |
94cc676 to
ae30d53
Compare
carlaKC
left a comment
There was a problem hiding this comment.
Had another look at this. I don't think that the current approach of listing all payments and then filtering them for pagination is a good direction to go in. Say we are looking up 10k payments, 1k at a time, you'd end up looking up 100k payments, which is a performance decrease rather than the increase that we expect with pagination.
I think the best approach is to add indexing by sequence number. That's a scope increase, and a db migration. WDYT @joostjager ?
There was a problem hiding this comment.
All of this setup code can be moved inside of the t.Run function so that each test starts with a fresh setup.
There was a problem hiding this comment.
I would leave it that way, if you're ok with it, otherwise I would have to repeat the setup code for the last part of the test and all the tests are read-only on the database anyhow.
joostjager
left a comment
There was a problem hiding this comment.
Yes, it isn't ideal indeed, loading all payments in memory for every call. If the user wants to retrieve all payments in pages, it doesn't seem to make much sense because as @carlaKC says, the performance will be worse. If the caller stores payments and only requests new pages, there is less data to transfer. Could be an advantage depending on implementation.
I understand that adding another index is a much bigger project, but how about filtering the payments further down the call stack, in DB.FetchPayments? Pass in a sequence number range and skip everything not in the range. Memory usage will be lower and also less deserialization needs to happen.
|
Thanks for the feedback! I agree with the total resource usage argument, it's not ideal. In that sense it is good that default behavior in this PR (returning all payments) is not changed and therefore the resource consumption stays the same for that call. I would rather see this PR as a preparatory one, which burdens a little bit more the server side (for subsequent pagination), but adds already benefit for callers not having to transfer as much data as before, improving latency and client side data handling. The code is already prepared for a cursor-style traversal of the data, which would need another bucket which maps from the Concerning the propagation down the call stack, that could be done going down until The ideal way of implementing this would be using a database cursor like it is done in The big question is how to deal with the duplicate payments to the same payment hashes. I don't know what's the policy about deleting user data, but I'd suggest to do a migration, which discards duplicate payments (they were forbidden already some time ago, see #1719) and log the discarded duplicate payments upon migration. That would get rid of the legacy duplicate payments database structure and would simplify things. I'll nevertheless update requested changes. |
4beb8aa to
942cf0d
Compare
|
Some high sending nodes are already running into issues here, so going to go ahead with this for 0.10 😄 As you say, performance not affected for the default use case. Is this ready for another round of review @bitromortac? |
Oh, didn't know that it's already urgent. Yes, I tried to incorporate all requested changes. Second thought on @joostjager's suggestion: we could drop the |
From |
Great! Don't forget to re-request review when something is ready, otherwise we might miss it because the request doesn't show up on Github :) Also needs a rebase ⚾️ |
carlaKC
left a comment
There was a problem hiding this comment.
This is looking really good, thanks for the hard yards @bitromortac!
Only remaining comment is about test coverage, I think adding two more cases would be beneficial. Since this is pretty edge-casey stuff.
should we also export the sequence_index in the RPC to inform the consumer?
I think that exposing sequence index makes sense. I think it isn't exposed because there are gaps, but IMO we can expose it with an explanatory comment that it is unique and strictly increasing is sufficient. A unique id which isn't payment hash (historically non-unique) is useful.
it might be confusing for a consumer of the API if the sequence_index is gapped
On sequence number gaps, I think that we should live with them. When we index payments, we will do so by sequence number, so it makes sense to use sequence number as the index in this workaround.
Also needs another rebase, lnrpc-rebase-life is the realest 😅
There was a problem hiding this comment.
I think a comment mentioning that this index is exclusive with the default 0 meaning all payments is sufficient here.
joostjager
left a comment
There was a problem hiding this comment.
Some thoughts from me, mostly in the category 'what else could work'. Maybe there is something in it and otherwise just ignore.
There was a problem hiding this comment.
Document special case 0 in combined with Reversed? Would almost think that it is more natural to use maxint as the special value. It would then not be a special value anymore. This can also be done while still holding on to the magic 0 on the rpc level.
Also, the query doesn't necessarily start one index higher than IndexOffset. If there are gaps, it may start more than one index higher. I get the point, but it isn't fully correct?
There was a problem hiding this comment.
Right, have corrected both comments. Concerning the maxint, it also works when supplied, but left zero to be the documented special value.
There was a problem hiding this comment.
These changes can be reverted, probably caused by a different version of clang
There was a problem hiding this comment.
Have reverted the changes, which version of clang are you using? Somehow my make rpc-format messes things up.
There was a problem hiding this comment.
Instead of these two fields, how about exposing the sequence number on the payment? The caller can then figure out min and max themselves and it also creates a stronger link between the query and the response
I like the idea of just using the What we could do is exposing this sequence number on the RPC (maybe call it something more descriptive, like |
22e0d0b to
99abf2e
Compare
|
The latest state of the PR includes improved unit testing, where the case of a gap in the indices is covered. Also, the |
carlaKC
left a comment
There was a problem hiding this comment.
I'm happy with the state this is in. Great job on this one @bitromortac 🥇
I think it's fine to leave first/last index in the response, since it's the way we have it for invoices (which also have their index exposed, so it's also redundant there), but no strong feelings on this.
Just a few more comment level-nits, and since this needs a rebase anyway, the json_name tags should go before merge.
99abf2e to
7221077
Compare
|
Thank you already for the patience @carlaKC and @joostjager 🙂. I would also leave the first and last index fields for now, if you're not opposed to this. I'll keep on rebasing. |
0c7c676 to
90c610b
Compare
joostjager
left a comment
There was a problem hiding this comment.
Only comment standing from me is to remove the unused InvoiceQuery return parameter. If we need it (seems unlikely to me), we can add it back in.
90c610b to
6328891
Compare
Removed the initial query parameter form the PaymentsResponse. Thank you again for reviewing! Also rebased. |
Exports sequenceNum in MPPayment for later use in the rpcserver.
Adds a PaymentsQuery struct, which contains parameters to restrict the response of QueryPayments, returning a PaymentsQuerySlice with the payments query result. The behavior of this api is the same as the QueryInvoices one.
Adds tests for the payments query, where different edge cases for the index offsets and normal and reversed orders are tested.
Changes the grpc proto file, generates the protobuf, and enables a queried way to retrieve payments in the rpc, where backward compatibility is enforced by returning all payments in the database by default. Adds a payment index field to the returned payments of the rpc call.
6328891 to
4593cfa
Compare
| // we set our limit to maxint to include all payments from the highest | ||
| // sequence number on. | ||
| if query.Reversed && indexExclusiveLimit == 0 { | ||
| indexExclusiveLimit = math.MaxInt64 |
There was a problem hiding this comment.
Final comment about this is that this non-functional special casing could also be moved into rpcserver.go. For MaxPayments, the special case zero is handled there as well.
This PR tries to address issue #3753, pagination support for the ListPayments api.
The goal of making the ListPayments command paginable is to turn the retrival of conducted
payments more scalable and seekable (e.g. for mobile wallets).
The implementation closely follows the behavior of the ListInvoices api,
modifying the grpc ListPaymentsRequest to also include an index offset, the option
to sepcify the pagination direction, and to give a maximal number of returned payments constraint.
The call returns a ListPaymentsResponse where in addition to the payments, the first and
last index of the payments are included, which can be used to further seek forwards or
backwards. By default, the api call returns the same payments as before.
The old behavior of ListPayments is to return all payments via the FetchPayments method.
In the implementation of queried ListPayments in this PR, still all payments are loaded
into memory first by using the FetchPayments method and then discarding the unwanted entries.
This means, that memory usage is the same as before, while adding more functionality for the caller.
The reason why I decided to do that instead of reproducing the implementaion of ListInvoices
(where the database is seeked with a cursor) is that the the payments database structure
is complicated, with subbuckets for duplicate payments of the same payment hash on the one side and that the payments bucket is indexed by the payment hash instead of the sequence index (which could be used to walk through them in order). This could be addressed in a future PR, making the call more memory effective, by droppping duplicate payments and creating a bucket with a mapping from the squence index to the payment hash.
Pull Request Checklist
the positive and negative (error paths) conditions (if applicable)
go fmt(the tab character should be counted as 8 characters, not 4, as some IDEs do
per default)
make checkdoes not fail any testsgo vetdoes not report any issuesmake lintdoes not report any new issues that did notalready exist
cases it can be justifiable to violate this condition. In that case, the
reason should be stated in the commit message.