channeldb+lnrpc+lncli: listpayments pagination support by bitromortac · Pull Request #3960 · lightningnetwork/lnd

bitromortac · 2020-01-27T07:05:15Z

This PR tries to address issue #3753, pagination support for the ListPayments api.

The goal of making the ListPayments command paginable is to make the retrieval of past
payments more scalable and seekable (e.g. for mobile wallets).

The implementation closely follows the behavior of the ListInvoices api,
modifying the grpc ListPaymentsRequest to also include an index offset, an option
to specify the pagination direction, and a maximum number of payments to return.
The call returns a ListPaymentsResponse which, in addition to the payments, includes the first and
last index of the returned payments; these can be used to seek further forwards or
backwards. By default, the api call returns the same payments as before.
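
To illustrate the request and response semantics described above, here is a minimal Go sketch that paginates an in-memory slice. It is not the code from this PR: the names Payment, Query, Page and paginate are hypothetical, and returning reversed pages in ascending order is an assumption.

```go
package main

import "fmt"

// Payment is a hypothetical stand-in for a stored payment; only the
// sequence index matters for pagination.
type Payment struct {
	SequenceIndex uint64
}

// Query mirrors the request fields described above (illustrative names).
type Query struct {
	IndexOffset uint64 // exclusive start; 0 means "no offset"
	MaxPayments uint64 // 0 means "no limit"
	Reversed    bool   // page backwards from IndexOffset
}

// Page holds the selected payments plus the first and last index returned.
type Page struct {
	Payments         []Payment
	FirstIndexOffset uint64
	LastIndexOffset  uint64
}

// paginate selects one page from payments, which must be sorted by
// ascending SequenceIndex. The offset is exclusive in both directions.
func paginate(payments []Payment, q Query) Page {
	var selected []Payment
	full := func() bool {
		return q.MaxPayments != 0 && uint64(len(selected)) == q.MaxPayments
	}

	if !q.Reversed {
		for _, p := range payments {
			if p.SequenceIndex <= q.IndexOffset {
				continue
			}
			if full() {
				break
			}
			selected = append(selected, p)
		}
	} else {
		for i := len(payments) - 1; i >= 0; i-- {
			p := payments[i]
			if q.IndexOffset != 0 && p.SequenceIndex >= q.IndexOffset {
				continue
			}
			if full() {
				break
			}
			selected = append(selected, p)
		}
		// Restore ascending order (an assumption of this sketch).
		for i, j := 0, len(selected)-1; i < j; i, j = i+1, j-1 {
			selected[i], selected[j] = selected[j], selected[i]
		}
	}

	page := Page{Payments: selected}
	if len(selected) > 0 {
		page.FirstIndexOffset = selected[0].SequenceIndex
		page.LastIndexOffset = selected[len(selected)-1].SequenceIndex
	}
	return page
}

func main() {
	all := []Payment{{1}, {2}, {4}, {5}, {7}}
	page := paginate(all, Query{IndexOffset: 2, MaxPayments: 2})
	fmt.Println(page.FirstIndexOffset, page.LastIndexOffset)
}
```

A caller would pass LastIndexOffset as the next IndexOffset to move forwards, or FirstIndexOffset together with Reversed to move backwards. An empty Query returns everything, matching the default behavior described above.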

The old behavior of ListPayments is to return all payments via the FetchPayments method.
The queried ListPayments in this PR still loads all payments
into memory first using the FetchPayments method and then discards the unwanted entries.

This means that memory usage is the same as before, while adding more functionality for the caller.
The reason why I decided to do that instead of reproducing the implementation of ListInvoices
(where the database is traversed with a cursor) is that the payments database structure
is complicated: on the one hand there are subbuckets for duplicate payments of the same payment hash, and on the other the payments bucket is indexed by the payment hash instead of the sequence index (which could be used to walk through them in order). This could be addressed in a future PR, making the call more memory efficient, by dropping duplicate payments and creating a bucket with a mapping from the sequence index to the payment hash.

carlaKC · 2020-01-27T08:43:45Z

Assigning myself because this falls under #3942.
Thanks for the PR @bitromortac 🎉

bjarnemagnussen

Hi,

Nice work! I was just looking into your PR and found a tiny typo that can trigger an index-out-of-range error. See my comments for more details!

bitromortac · 2020-01-31T06:22:38Z

Hey @bjarnemagnussen, thank you very much for your review!

carlaKC

Thanks for the PR @bitromortac!
Some nits and structural changes but this is looking nice 🐳

bitromortac · 2020-02-27T09:19:38Z

Since #4003 is merged, I made the flag for pagination consistent with the listinvoices command.
Also rebased on master.

carlaKC

Had another look at this. I don't think that the current approach of listing all payments and then filtering them for pagination is a good direction to go in. Say we are looking up 10k payments, 1k at a time, you'd end up looking up 100k payments, which is a performance decrease rather than the increase that we expect with pagination.
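
The arithmetic behind that estimate can be checked with a short Go sketch. The recordsLoaded helper is hypothetical and assumes every call loads the full payment set before filtering, as the current approach does.

```go
package main

import "fmt"

// recordsLoaded returns how many payment records the server reads in total
// when a client pages through `total` payments `pageSize` at a time and
// every call loads the full set before filtering.
func recordsLoaded(total, pageSize int) int {
	calls := (total + pageSize - 1) / pageSize // ceiling division
	return calls * total
}

func main() {
	// 10 calls, each loading 10,000 records.
	fmt.Println(recordsLoaded(10000, 1000))
}
```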

I think the best approach is to add indexing by sequence number. That's a scope increase, and a db migration. WDYT @joostjager ?

carlaKC · 2020-02-27T13:55:15Z

All of this setup code can be moved inside of the t.Run function so that each test starts with a fresh setup.

I would leave it that way, if you're ok with it, otherwise I would have to repeat the setup code for the last part of the test and all the tests are read-only on the database anyhow.

joostjager

Yes, it isn't ideal indeed, loading all payments in memory for every call. If the user wants to retrieve all payments in pages, it doesn't seem to make much sense because as @carlaKC says, the performance will be worse. If the caller stores payments and only requests new pages, there is less data to transfer. Could be an advantage depending on implementation.

I understand that adding another index is a much bigger project, but how about filtering the payments further down the call stack, in DB.FetchPayments? Pass in a sequence number range and skip everything not in the range. Memory usage will be lower and also less deserialization needs to happen.
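
A rough Go sketch of this idea follows. The record layout (an 8-byte big-endian sequence number followed by the payment body) and the names rawPayment, fetchInRange and encode are assumptions made for illustration; they are not lnd's actual on-disk format or API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// rawPayment is a hypothetical serialized record: the first 8 bytes hold the
// big-endian sequence number, the rest is the payment body.
type rawPayment []byte

// payment is the decoded form of a record.
type payment struct {
	Seq  uint64
	Body string
}

// encode builds a record in the assumed layout.
func encode(seq uint64, body string) rawPayment {
	b := make([]byte, 8, 8+len(body))
	binary.BigEndian.PutUint64(b, seq)
	return append(b, body...)
}

// fetchInRange decodes only records whose sequence number lies in the range
// (afterSeq, uptoSeq]. All other records are skipped before the body is
// parsed. It returns the payments and the number of bodies decoded.
func fetchInRange(records []rawPayment, afterSeq, uptoSeq uint64) ([]payment, int) {
	var out []payment
	decoded := 0
	for _, r := range records {
		if len(r) < 8 {
			continue // malformed record, ignored in this sketch
		}
		seq := binary.BigEndian.Uint64(r[:8])
		if seq <= afterSeq || seq > uptoSeq {
			continue // outside the requested range: no deserialization
		}
		out = append(out, payment{Seq: seq, Body: string(r[8:])})
		decoded++
	}
	return out, decoded
}

func main() {
	// Records arrive in hash order, not sequence order.
	records := []rawPayment{
		encode(3, "c"), encode(1, "a"), encode(2, "b"), encode(5, "e"),
	}
	got, n := fetchInRange(records, 1, 3)
	fmt.Println(len(got), n)
}
```

Because the bucket is keyed by payment hash, the loop still has to visit every record and cannot stop early. The saving is in decoding and in the memory held for the result.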

bitromortac · 2020-03-10T12:37:15Z

Thanks for the feedback! I agree with the total resource usage argument, it's not ideal. In that sense it is good that the default behavior in this PR (returning all payments) is not changed, so the resource consumption stays the same for that call. I would rather see this PR as a preparatory one, which puts a little more load on the server side (for subsequent pagination) but already benefits callers, who no longer have to transfer as much data as before, improving latency and client-side data handling. The code is already prepared for a cursor-style traversal of the data, which would need another bucket that maps from the sequenceIndex to the payment hash.

Concerning the propagation down the call stack, that could be done going down until fetchPayment to get parsing and memory benefits. Initially, I determined the first and last index for the range, but then encountered the problem that you don't know beforehand how many payments will be excluded with the IncludeIncomplete flag (in conjunction with the NumMaxPayments). I found the code also easier to understand in the way it's implemented now and was interested to work towards the ideal solution.

The ideal way of implementing this would be using a database cursor like it is done in listinvoices. We could postpone this PR until the payments database is cleaned up and then use the cursor. I would be interested in taking care of the cleanup.

The big question is how to deal with the duplicate payments to the same payment hashes. I don't know what's the policy about deleting user data, but I'd suggest to do a migration, which discards duplicate payments (they were forbidden already some time ago, see #1719) and log the discarded duplicate payments upon migration. That would get rid of the legacy duplicate payments database structure and would simplify things. I'll nevertheless update requested changes.

carlaKC · 2020-03-16T06:38:13Z

Some high sending nodes are already running into issues here, so going to go ahead with this for 0.10 😄 As you say, performance not affected for the default use case.

Is this ready for another round of review @bitromortac?

bitromortac · 2020-03-16T06:57:16Z

Oh, didn't know that it's already urgent. Yes, I tried to incorporate all requested changes.

Second thought on @joostjager's suggestion: we could drop the IncludeIncomplete flag and include all payments by default. In this way we can determine firstIndex and lastIndex deterministically and filter down in the stack, getting a performance improvement.

carlaKC · 2020-03-16T11:16:42Z

we could drop the IncludeIncomplete flag and include all payments by default.

From rpcserver.go#ListPayments: To keep compatibility with the old API, we only return non-succeeded payments if requested. Would say that we still want to maintain backwards compatibility here. People are pretty reliant on ListPayments and could see some unexpected behaviour if they don't notice this in the changelog.

carlaKC · 2020-03-16T11:27:50Z

Yes, I tried to incorporate all requested changes.

Great! Don't forget to re-request review when something is ready, otherwise we might miss it because the request doesn't show up on Github :) Also needs a rebase ⚾️

carlaKC

This is looking really good, thanks for the hard yards @bitromortac!

Only remaining comment is about test coverage, I think adding two more cases would be beneficial. Since this is pretty edge-casey stuff.

should we also export the sequence_index in the RPC to inform the consumer?

I think that exposing sequence index makes sense. I think it isn't exposed because there are gaps, but IMO exposing it with an explanatory comment that it is unique and strictly increasing is sufficient. A unique id which isn't payment hash (historically non-unique) is useful.

it might be confusing for a consumer of the API if the sequence_index is gapped

On sequence number gaps, I think that we should live with them. When we index payments, we will do so by sequence number, so it makes sense to use sequence number as the index in this workaround.

Also needs another rebase, lnrpc-rebase-life is the realest 😅

carlaKC · 2020-03-26T06:57:17Z

I think a comment mentioning that this index is exclusive with the default 0 meaning all payments is sufficient here.

joostjager

Some thoughts from me, mostly in the category 'what else could work'. Maybe there is something in it and otherwise just ignore.

joostjager · 2020-03-26T08:59:01Z

Document the special case 0 in combination with Reversed? Would almost think that it is more natural to use maxint as the special value. It would then not be a special value anymore. This can also be done while still holding on to the magic 0 on the rpc level.

Also, the query doesn't necessarily start one index higher than IndexOffset. If there are gaps, it may start more than one index higher. I get the point, but it isn't fully correct?

Right, have corrected both comments. Concerning the maxint, it also works when supplied, but left zero to be the documented special value.

joostjager · 2020-03-26T09:14:51Z

These changes can be reverted, probably caused by a different version of clang

Have reverted the changes, which version of clang are you using? Somehow my make rpc-format messes things up.

joostjager · 2020-03-26T09:16:41Z

Instead of these two fields, how about exposing the sequence number on the payment? The caller can then figure out min and max themselves and it also creates a stronger link between the query and the response

halseth · 2020-03-26T10:09:52Z

Ok, so it's the question about whether we want to keep the indices constant or make them incremental by one, if there is a gap in the sequence_index. listinvoices keeps the index constant and also exports the equivalent add_index in the RPC call. On the other hand it might be confusing for a consumer of the API if the sequence_index is gapped, should we also export the sequence_index in the RPC to inform the consumer?

I like your approach more (great work!), it is more in line with what exists in listinvoices (also deletion of payments is hypothetical). I updated the commits to use your approach and also the tests (is there a better git way to acknowledge your work?). Haven't worked on the payment exclusion part yet.

I like the idea of just using the sequence_num as a payment identifier! Since some old database actually can have duplicate payments to the same payment hash, the sequence_num is the only unique payment identifier we have. There might be gaps in the list, but that's okay, since we also want to be able to delete payments without changing the identifiers.

What we could do is exposing this sequence number on the RPC (maybe call it something more descriptive, like payment_id) and then use that to paginate (making it clear that they are increasing in order of payment attempt time). We wouldn't even need the first_index_offset and last_index_offset values, as the next offset for pagination could be retrieved from the last list of payments retrieved.
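
As a sketch of how a client could paginate with nothing but the identifier on each payment, consider the Go loop below. The listFunc type, the fakeNode helper and the PaymentIndex field name are hypothetical stand-ins for the RPC, not the actual lnrpc API.

```go
package main

import "fmt"

// Payment carries only the identifier used for pagination.
type Payment struct {
	PaymentIndex uint64
}

// listFunc is a hypothetical stand-in for the paginated RPC: it returns up
// to max payments with PaymentIndex > offset, in ascending order.
type listFunc func(offset, max uint64) []Payment

// fetchAll pages through every payment, taking the next offset from the
// last payment of the previous page. Gaps in the index are harmless
// because the offset is exclusive.
func fetchAll(list listFunc, pageSize uint64) []Payment {
	var all []Payment
	var offset uint64
	for {
		page := list(offset, pageSize)
		if len(page) == 0 {
			return all
		}
		all = append(all, page...)
		offset = page[len(page)-1].PaymentIndex
	}
}

// fakeNode serves payments from memory so the loop can be exercised.
func fakeNode(stored []Payment) listFunc {
	return func(offset, max uint64) []Payment {
		var page []Payment
		for _, p := range stored {
			if p.PaymentIndex > offset && uint64(len(page)) < max {
				page = append(page, p)
			}
		}
		return page
	}
}

func main() {
	node := fakeNode([]Payment{{1}, {2}, {5}, {6}, {9}})
	for _, p := range fetchAll(node, 2) {
		fmt.Println(p.PaymentIndex)
	}
}
```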

bitromortac · 2020-03-30T17:35:31Z

The latest state of the PR includes improved unit testing, where the case of a gap in the indices is covered. Also, the SequenceIndex is now exported in the rpc as the field payment_index (instead of Johan's suggestion, I took index instead of id, because we have index everywhere 🙂).
The last point would be if we should leave away the first and last index in the response. Pro keeping it would be that it's working the same way as in listinvoices and it adds additional explicit documentation to the rpc, but on the other hand it's redundant 🤔.

carlaKC

I'm happy with the state this is in. Great job on this one @bitromortac 🥇
I think it's fine to leave first/last index in the response, since it's the way we have it for invoices (which also have their index exposed, so it's also redundant there), but no strong feelings on this.

Just a few more comment-level nits, and since this needs a rebase anyway, the json_name tags should go before merge.

bitromortac · 2020-04-01T08:00:59Z

Thank you already for the patience @carlaKC and @joostjager 🙂. I would also leave the first and last index fields for now, if you're not opposed to this. I'll keep on rebasing.

joostjager

Only comment standing from me is to remove the unused InvoiceQuery return parameter. If we need it (seems unlikely to me), we can add it back in.

bitromortac · 2020-04-07T05:01:06Z

Removed the initial query parameter from the PaymentsResponse. Thank you again for reviewing! Also rebased.

Exports sequenceNum in MPPayment for later use in the rpcserver.

Adds a PaymentsQuery struct, which contains parameters to restrict the response of QueryPayments, returning a PaymentsQuerySlice with the payments query result. The behavior of this api is the same as the QueryInvoices one.

Adds tests for the payments query, where different edge cases for the index offsets and normal and reversed orders are tested.

Changes the grpc proto file, generates the protobuf, and enables a queried way to retrieve payments in the rpc, where backward compatibility is enforced by returning all payments in the database by default. Adds a payment index field to the returned payments of the rpc call.

joostjager · 2020-04-07T14:22:59Z

+	// we set our limit to maxint to include all payments from the highest
+	// sequence number on.
+	if query.Reversed && indexExclusiveLimit == 0 {
+		indexExclusiveLimit = math.MaxInt64


Final comment about this is that this non-functional special casing could also be moved into rpcserver.go. For MaxPayments, the special case zero is handled there as well.
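
One way to picture that suggestion is a small translation step at the RPC boundary, so that the database-level query carries no magic values. The Go sketch below is an assumption-laden illustration: dbQuery and fromRPC are hypothetical names, and it uses math.MaxUint64 on unsigned fields rather than the math.MaxInt64 shown in the diff.

```go
package main

import (
	"fmt"
	"math"
)

// dbQuery is a hypothetical database-level query with explicit values only.
type dbQuery struct {
	IndexOffset uint64
	MaxPayments uint64
	Reversed    bool
}

// fromRPC translates RPC-level defaults, where 0 means "unset", into
// explicit values before the query reaches the database layer.
func fromRPC(indexOffset, maxPayments uint64, reversed bool) dbQuery {
	q := dbQuery{
		IndexOffset: indexOffset,
		MaxPayments: maxPayments,
		Reversed:    reversed,
	}

	// No limit requested: allow as many payments as exist.
	if q.MaxPayments == 0 {
		q.MaxPayments = math.MaxUint64
	}

	// Reversed with no offset: start from the highest possible index.
	if q.Reversed && q.IndexOffset == 0 {
		q.IndexOffset = math.MaxUint64
	}
	return q
}

func main() {
	q := fromRPC(0, 0, true)
	fmt.Println(q.IndexOffset == math.MaxUint64, q.MaxPayments == math.MaxUint64)
}
```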

bitromortac requested review from halseth and joostjager as code owners January 27, 2020 07:05

carlaKC requested review from carlaKC and removed request for halseth January 27, 2020 08:43

carlaKC self-assigned this Jan 27, 2020

carlaKC added accounting rpc Related to the RPC interface v0.10 labels Jan 27, 2020

bjarnemagnussen reviewed Jan 30, 2020

Comment thread channeldb/payments.go Outdated

Comment thread channeldb/payments_test.go Outdated

bitromortac force-pushed the listpayments-pagination branch from e3d2e5c to 0f8a322 Compare January 31, 2020 06:17

joostjager reviewed Jan 31, 2020

Comment thread channeldb/payments.go Outdated

carlaKC requested changes Feb 7, 2020

bitromortac force-pushed the listpayments-pagination branch from 0f8a322 to 1fd34f9 Compare February 12, 2020 07:03

bitromortac mentioned this pull request Feb 14, 2020

lncli: change listinvoices reversed flag to paginate-forwards #4003

Merged

bitromortac force-pushed the listpayments-pagination branch from 1fd34f9 to 94cc676 Compare February 27, 2020 09:15

bitromortac force-pushed the listpayments-pagination branch from 94cc676 to ae30d53 Compare February 27, 2020 09:34

carlaKC requested changes Feb 27, 2020

carlaKC requested a review from joostjager March 5, 2020 20:19

joostjager reviewed Mar 6, 2020

bitromortac force-pushed the listpayments-pagination branch 2 times, most recently from 4beb8aa to 942cf0d Compare March 11, 2020 11:12

carlaKC self-requested a review March 17, 2020 18:08

carlaKC reviewed Mar 26, 2020

joostjager reviewed Mar 26, 2020

Horndev mentioned this pull request Mar 26, 2020

LND becomes unresponsive and fills RAM/swap after a few 'listpayments' calls #4119

Closed

bitromortac force-pushed the listpayments-pagination branch from 22e0d0b to 99abf2e Compare March 30, 2020 17:15

bitromortac requested review from carlaKC and joostjager March 30, 2020 17:35

carlaKC approved these changes Mar 31, 2020

Comment thread channeldb/payments.go Outdated

Comment thread channeldb/payments_test.go Outdated

Comment thread lnrpc/rpc.proto Outdated

Comment thread lnrpc/rpc.proto Outdated

bitromortac force-pushed the listpayments-pagination branch from 99abf2e to 7221077 Compare April 1, 2020 07:50

joostjager reviewed Apr 1, 2020

Comment thread channeldb/payments.go Outdated

Comment thread channeldb/payments.go Outdated

Comment thread channeldb/payments.go Outdated

Comment thread channeldb/payments_test.go Outdated

Comment thread cmd/lncli/commands.go Outdated

bitromortac force-pushed the listpayments-pagination branch 2 times, most recently from 0c7c676 to 90c610b Compare April 6, 2020 05:11

bitromortac requested a review from joostjager April 6, 2020 05:13

joostjager reviewed Apr 6, 2020

bitromortac force-pushed the listpayments-pagination branch from 90c610b to 6328891 Compare April 6, 2020 17:14

bitromortac added 6 commits April 7, 2020 07:03

channeldb: export sequenceNum in MPPayment

4800a84

Exports sequenceNum in MPPayment for later use in the rpcserver.

channeldb: add payments query

4c5e8ae

Adds a PaymentsQuery struct, which contains parameters to restrict the response of QueryPayments, returning a PaymentsQuerySlice with the payments query result. The behavior of this api is the same as the QueryInvoices one.

channeldb/test: unit tests for payments query

d5dd48f

Adds tests for the payments query, where different edge cases for the index offsets and normal and reversed orders are tested.

itest: fix comment in list_outgoing_payments test

97b7597

lncli: modify listpayments to use queried payments and update cli docs

4593cfa

bitromortac force-pushed the listpayments-pagination branch from 6328891 to 4593cfa Compare April 7, 2020 05:06

joostjager approved these changes Apr 7, 2020

Roasbeef added this to the 0.10.0 milestone Apr 8, 2020

Roasbeef merged commit 7e6f3ec into lightningnetwork:master Apr 8, 2020

Roasbeef mentioned this pull request Apr 8, 2020

ListPayments should offer pagination #3753

Closed

bitromortac deleted the listpayments-pagination branch April 17, 2024 12:44
