channeldb+lnrpc+lncli: listpayments pagination support #3960

Merged
Roasbeef merged 6 commits into lightningnetwork:master from bitromortac:listpayments-pagination on Apr 8, 2020

@bitromortac
Collaborator

This PR tries to address issue #3753, pagination support for the ListPayments API.

The goal of making the ListPayments command paginated is to make the retrieval of past
payments more scalable and seekable (e.g. for mobile wallets).

The implementation closely follows the behavior of the ListInvoices API,
modifying the gRPC ListPaymentsRequest to also include an index offset, an option
to specify the pagination direction, and a maximum number of returned payments.
The call returns a ListPaymentsResponse which, in addition to the payments, includes the first and
last index of the returned payments; these can be used to seek further forwards or
backwards. By default, the API call returns the same payments as before.

The old behavior of ListPayments is to return all payments via the FetchPayments method.
In this PR's implementation of queried ListPayments, all payments are still loaded
into memory first using the FetchPayments method, and the unwanted entries are then discarded.

This means that memory usage is the same as before, while adding more functionality for the caller.
The reason I decided to do that instead of reproducing the implementation of ListInvoices
(where the database is traversed with a cursor) is that the payments database structure
is complicated: there are sub-buckets for duplicate payments to the same payment hash, and the payments bucket is indexed by the payment hash instead of the sequence index (which could be used to walk through the payments in order). This could be addressed in a future PR, making the call more memory efficient, by dropping duplicate payments and creating a bucket with a mapping from the sequence index to the payment hash.
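
As an illustration of the load-then-filter approach described above, here is a minimal Go sketch. It is not the code from this PR: the `Payment`, `Query`, and `Response` types, their field names, and the `queryPayments` helper are hypothetical stand-ins, and returning reversed pages in ascending order is an assumption made for the example.

```go
package main

import "fmt"

// Payment is a hypothetical, stripped-down stand-in for a stored payment.
type Payment struct {
	SequenceNum uint64
	Succeeded   bool
}

// Query holds the pagination parameters (field names are assumptions).
type Query struct {
	IndexOffset       uint64 // exclusive start; 0 means no lower (or upper) bound
	MaxPayments       uint64 // 0 means no limit
	Reversed          bool
	IncludeIncomplete bool
}

// Response carries one page plus the first and last index it contains.
type Response struct {
	Payments         []Payment
	FirstIndexOffset uint64
	LastIndexOffset  uint64
}

// queryPayments filters a fully loaded slice that is sorted by ascending
// sequence number. Every call still receives all payments.
func queryPayments(all []Payment, q Query) Response {
	var page []Payment
	full := func() bool {
		return q.MaxPayments != 0 && uint64(len(page)) >= q.MaxPayments
	}
	keep := func(p Payment) bool { return q.IncludeIncomplete || p.Succeeded }

	if !q.Reversed {
		for _, p := range all {
			if full() {
				break
			}
			if p.SequenceNum > q.IndexOffset && keep(p) {
				page = append(page, p)
			}
		}
	} else {
		for i := len(all) - 1; i >= 0; i-- {
			if full() {
				break
			}
			p := all[i]
			if (q.IndexOffset == 0 || p.SequenceNum < q.IndexOffset) && keep(p) {
				// Prepend so the page stays in ascending order.
				page = append([]Payment{p}, page...)
			}
		}
	}

	resp := Response{Payments: page}
	if len(page) > 0 {
		resp.FirstIndexOffset = page[0].SequenceNum
		resp.LastIndexOffset = page[len(page)-1].SequenceNum
	}
	return resp
}

func main() {
	// Sequence number 4 is missing (a gap) and payment 3 did not succeed.
	all := []Payment{{1, true}, {2, true}, {3, false}, {5, true}, {6, true}}
	resp := queryPayments(all, Query{IndexOffset: 2, MaxPayments: 2})
	fmt.Println(resp.FirstIndexOffset, resp.LastIndexOffset) // 5 6
}
```

Because incomplete payments are filtered while the page is being filled, the number of entries that must be scanned to fill a page is not known in advance.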

Pull Request Checklist

  • All changes are Go version 1.12 compliant
  • The code being submitted is commented according to Code Documentation and Commenting
  • For new code: Code is accompanied by tests which exercise both
    the positive and negative (error paths) conditions (if applicable)
  • Code has been formatted with go fmt
  • For code and documentation: lines are wrapped at 80 characters
    (the tab character should be counted as 8 characters, not 4, as some IDEs do
    per default)
  • Running make check does not fail any tests
  • Running go vet does not report any issues
  • Running make lint does not report any new issues that did not
    already exist
  • All commits build properly and pass tests. Only in exceptional
    cases it can be justifiable to violate this condition. In that case, the
    reason should be stated in the commit message.
  • Commits have a logical structure according to Ideal Git Commit Structure

@carlaKC
Collaborator

carlaKC commented Jan 27, 2020

Assigning myself because this falls under #3942.
Thanks for the PR @bitromortac 🎉

@carlaKC carlaKC requested review from carlaKC and removed request for halseth January 27, 2020 08:43
@carlaKC carlaKC self-assigned this Jan 27, 2020
@carlaKC carlaKC added accounting rpc Related to the RPC interface v0.10 labels Jan 27, 2020
Contributor

@bjarnemagnussen bjarnemagnussen left a comment

Hi,

Nice work! I was just looking into your PR and found a tiny typo that can trigger an index-out-of-range error. See my comments for more details!

Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments_test.go Outdated
@bitromortac bitromortac force-pushed the listpayments-pagination branch from e3d2e5c to 0f8a322 Compare January 31, 2020 06:17
@bitromortac
Collaborator Author

Hey @bjarnemagnussen, thank you very much for your review!

Comment thread channeldb/payments.go Outdated
Collaborator

@carlaKC carlaKC left a comment

Thanks for the PR @bitromortac!
Some nits and structural changes but this is looking nice 🐳

Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments_test.go Outdated
Comment thread rpcserver.go Outdated
Comment thread lnrpc/rpc.proto Outdated
Comment thread cmd/lncli/commands.go Outdated
Comment thread cmd/lncli/commands.go Outdated
Comment thread channeldb/payments.go Outdated
@bitromortac
Collaborator Author

bitromortac commented Feb 27, 2020

Since #4003 is merged, I made the flag for pagination consistent with the listinvoices command.
Also rebased on master.

@bitromortac bitromortac force-pushed the listpayments-pagination branch from 94cc676 to ae30d53 Compare February 27, 2020 09:34
Collaborator

@carlaKC carlaKC left a comment

Had another look at this. I don't think that the current approach of listing all payments and then filtering them for pagination is a good direction to go in. Say we are looking up 10k payments, 1k at a time, you'd end up looking up 100k payments, which is a performance decrease rather than the increase that we expect with pagination.

I think the best approach is to add indexing by sequence number. That's a scope increase, and a db migration. WDYT @joostjager ?
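
The cost argument above can be checked with a few lines of Go. This is only a back-of-the-envelope sketch; `recordsLoaded` is a hypothetical helper that assumes every paginated call loads the full payment set before filtering.

```go
package main

import "fmt"

// recordsLoaded returns how many payment records are read in total when a
// client pages through `total` payments `pageSize` at a time, assuming each
// call loads all payments before discarding the unwanted ones.
func recordsLoaded(total, pageSize int) int {
	if pageSize <= 0 {
		// No page limit: a single call that loads everything once.
		return total
	}
	calls := (total + pageSize - 1) / pageSize // ceiling division
	return calls * total
}

func main() {
	fmt.Println(recordsLoaded(10000, 1000)) // 100000
}
```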

Comment thread channeldb/payments_test.go Outdated
Collaborator

All of this setup code can be moved inside of the t.Run function so that each test starts with a fresh setup.

Collaborator Author

I would leave it that way, if you're ok with it, otherwise I would have to repeat the setup code for the last part of the test and all the tests are read-only on the database anyhow.

Comment thread channeldb/payments_test.go Outdated
Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments.go Outdated
Comment thread lnrpc/rpc.proto Outdated
@carlaKC carlaKC requested a review from joostjager March 5, 2020 20:19
Contributor

@joostjager joostjager left a comment

Yes, it isn't ideal indeed, loading all payments in memory for every call. If the user wants to retrieve all payments in pages, it doesn't seem to make much sense because as @carlaKC says, the performance will be worse. If the caller stores payments and only requests new pages, there is less data to transfer. Could be an advantage depending on implementation.

I understand that adding another index is a much bigger project, but how about filtering the payments further down the call stack, in DB.FetchPayments? Pass in a sequence number range and skip everything not in the range. Memory usage will be lower and also less deserialization needs to happen.
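
A rough Go sketch of this suggestion follows. The record layout (an 8-byte big-endian sequence number followed by the payload) and the `encode` and `fetchRange` helpers are assumptions made for illustration; the actual payment serialization in channeldb is different. The point is only that records outside the requested range are skipped before the payload is deserialized.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Payment is a hypothetical decoded record.
type Payment struct {
	SequenceNum uint64
	Memo        string
}

// encode builds an illustrative record: sequence number prefix plus payload.
func encode(seq uint64, memo string) []byte {
	buf := make([]byte, 8, 8+len(memo))
	binary.BigEndian.PutUint64(buf, seq)
	return append(buf, memo...)
}

// fetchRange reads only the sequence number of each record and fully decodes
// a record only if it lies in [minSeq, maxSeq]. It also reports how many
// records were fully decoded.
func fetchRange(records [][]byte, minSeq, maxSeq uint64) ([]Payment, int) {
	var payments []Payment
	decoded := 0
	for _, rec := range records {
		if len(rec) < 8 {
			continue // malformed record, ignored in this sketch
		}
		seq := binary.BigEndian.Uint64(rec[:8])
		if seq < minSeq || seq > maxSeq {
			continue // skipped before any payload deserialization
		}
		payments = append(payments, Payment{SequenceNum: seq, Memo: string(rec[8:])})
		decoded++
	}
	return payments, decoded
}

func main() {
	records := [][]byte{encode(1, "a"), encode(2, "b"), encode(5, "c"), encode(9, "d")}
	payments, decoded := fetchRange(records, 2, 5)
	fmt.Println(len(payments), decoded) // 2 2
}
```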

@bitromortac
Collaborator Author

bitromortac commented Mar 10, 2020

Thanks for the feedback! I agree with the total resource usage argument, it's not ideal. In that sense it is good that the default behavior in this PR (returning all payments) is not changed and therefore the resource consumption stays the same for that call. I would rather see this PR as a preparatory one, which burdens the server side a little more (for subsequent pagination), but already benefits callers, who no longer have to transfer as much data as before, improving latency and client-side data handling. The code is already prepared for a cursor-style traversal of the data, which would need another bucket that maps from the sequenceIndex to the payment hash.

Concerning the propagation down the call stack, that could be done going down until fetchPayment to get parsing and memory benefits. Initially, I determined the first and last index for the range, but then encountered the problem that you don't know beforehand how many payments will be excluded with the IncludeIncomplete flag (in conjunction with the NumMaxPayments). I found the code also easier to understand in the way it's implemented now and was interested to work towards the ideal solution.

The ideal way of implementing this would be using a database cursor like it is done in listinvoices. We could postpone this PR until the payments database is cleaned up and then use the cursor. I would be interested in taking care of the cleanup.

The big question is how to deal with the duplicate payments to the same payment hashes. I don't know what's the policy about deleting user data, but I'd suggest to do a migration, which discards duplicate payments (they were forbidden already some time ago, see #1719) and log the discarded duplicate payments upon migration. That would get rid of the legacy duplicate payments database structure and would simplify things. I'll nevertheless update requested changes.

@bitromortac bitromortac force-pushed the listpayments-pagination branch 2 times, most recently from 4beb8aa to 942cf0d Compare March 11, 2020 11:12
@carlaKC
Collaborator

carlaKC commented Mar 16, 2020

Some high sending nodes are already running into issues here, so going to go ahead with this for 0.10 😄 As you say, performance not affected for the default use case.

Is this ready for another round of review @bitromortac?

@bitromortac
Collaborator Author

Some high sending nodes are already running into issues here, so going to go ahead with this for 0.10 😄 As you say, performance not affected for the default use case.

Is this ready for another round of review @bitromortac?

Oh, didn't know that it's already urgent. Yes, I tried to incorporate all requested changes.

Second thought on @joostjager's suggestion: we could drop the IncludeIncomplete flag and include all payments by default. In this way we can determine firstIndex and lastIndex deterministically and filter down in the stack, getting a performance improvement.

@carlaKC
Collaborator

carlaKC commented Mar 16, 2020

we could drop the IncludeIncomplete flag and include all payments by default.

From rpcserver.go#ListPayments: To keep compatibility with the old API, we only return non-succeeded payments if requested. Would say that we still want to maintain backwards compatibility here. People are pretty reliant on ListPayments and could see some unexpected behaviour if they don't notice this in the changelog.

@carlaKC
Collaborator

carlaKC commented Mar 16, 2020

Yes, I tried to incorporate all requested changes.

Great! Don't forget to re-request review when something is ready, otherwise we might miss it because the request doesn't show up on Github :) Also needs a rebase ⚾️

@carlaKC carlaKC self-requested a review March 17, 2020 18:08
Collaborator

@carlaKC carlaKC left a comment

This is looking really good, thanks for the hard yards @bitromortac!

Only remaining comment is about test coverage, I think adding two more cases would be beneficial. Since this is pretty edge-casey stuff.

should we also export the sequence_index in the RPC to inform the consumer?

I think that exposing sequence index makes sense. I think it isn't exposed because there are gaps, but IMO exposing it with an explanatory comment that it is unique and strictly increasing is sufficient. A unique id which isn't payment hash (historically non-unique) is useful.

it might be confusing for a consumer of the API if the sequence_index is gapped

On sequence number gaps, I think that we should live with them. When we index payments, we will do so by sequence number, so it makes sense to use sequence number as the index in this workaround.

Also needs another rebase, lnrpc-rebase-life is the realest 😅

Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments_test.go Outdated
Comment thread lnrpc/rpc.proto Outdated
Collaborator

I think a comment mentioning that this index is exclusive with the default 0 meaning all payments is sufficient here.

Contributor

@joostjager joostjager left a comment

Some thoughts from me, mostly in the category 'what else could work'. Maybe there is something in it and otherwise just ignore.

Comment thread channeldb/payments.go Outdated
Contributor

Document special case 0 in combination with Reversed? Would almost think that it is more natural to use maxint as the special value. It would then not be a special value anymore. This can also be done while still holding on to the magic 0 on the rpc level.

Also, the query doesn't necessarily start one index higher than IndexOffset. If there are gaps, it may start more than one index higher. I get the point, but it isn't fully correct?

Collaborator Author

Right, have corrected both comments. Concerning the maxint, it also works when supplied, but left zero to be the documented special value.

Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments.go Outdated
Comment thread lnrpc/rpc.proto Outdated
Comment thread lnrpc/rpc.proto Outdated
Contributor

These changes can be reverted, probably caused by a different version of clang

Collaborator Author

@bitromortac bitromortac Mar 30, 2020

Have reverted the changes, which version of clang are you using? Somehow my make rpc-format messes things up.

Comment thread lnrpc/rpc.proto Outdated
Contributor

Instead of these two fields, how about exposing the sequence number on the payment? The caller can then figure out min and max themselves and it also creates a stronger link between the query and the response

@halseth
Contributor

halseth commented Mar 26, 2020

Ok, so it's the question about whether we want to keep the indices constant or make them incremental by one, if there is a gap in the sequence_index. listinvoices keeps the index constant and also exports the equivalent add_index in the RPC call. On the other hand it might be confusing for a consumer of the API if the sequence_index is gapped, should we also export the sequence_index in the RPC to inform the consumer?

I like your approach more (great work!), it is more in line what exists in listinvoices (also deletion of payments is hypothetical). I updated the commits to use your approach and also the tests (is there a better git way to acknowledge your work?). Haven't worked on the payment exclusion part yet.

I like the idea of just using the sequence_num as a payment identifier! Since some old databases can actually have duplicate payments to the same payment hash, the sequence_num is the only unique payment identifier we have. There might be gaps in the list, but that's okay, since we also want to be able to delete payments without changing the identifiers.

What we could do is exposing this sequence number on the RPC (maybe call it something more descriptive, like payment_id) and then use that to paginate (making it clear that they are increasing in order of payment attempt time). We wouldn't even need the first_index_offset and last_index_offset values, as the next offset for pagination could be retrieved from the last list of payments retrieved.
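
To make the idea concrete, here is a small Go sketch of a client paging loop that takes the next offset from the last payment of the previous page. The `listPayments` function is a hypothetical in-memory stand-in for the RPC, and the `PaymentIndex` field name is an assumption; gaps in the indices do not affect the loop.

```go
package main

import "fmt"

// Payment is a hypothetical payment carrying its unique, increasing index.
type Payment struct {
	PaymentIndex uint64
}

// listPayments stands in for the paginated call: it returns up to max
// payments with an index strictly greater than indexOffset (max <= 0 means
// no limit). The store is assumed to be sorted by ascending index.
func listPayments(store []Payment, indexOffset uint64, max int) []Payment {
	var out []Payment
	for _, p := range store {
		if p.PaymentIndex > indexOffset {
			out = append(out, p)
			if len(out) == max {
				break
			}
		}
	}
	return out
}

// fetchAll pages through the store, using the index of the last payment in
// each page as the offset for the next call. It returns all payments and
// the number of calls made.
func fetchAll(store []Payment, pageSize int) ([]Payment, int) {
	var all []Payment
	var offset uint64
	calls := 0
	for {
		page := listPayments(store, offset, pageSize)
		calls++
		if len(page) == 0 {
			break
		}
		all = append(all, page...)
		offset = page[len(page)-1].PaymentIndex
	}
	return all, calls
}

func main() {
	// Indices 3, 5 and 6 are missing, for example after deletions.
	store := []Payment{{1}, {2}, {4}, {7}, {8}}
	all, calls := fetchAll(store, 2)
	fmt.Println(len(all), calls) // 5 4
}
```

The final call returns an empty page, which is how this client detects the end of the list without needing separate first and last index fields.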

@bitromortac
Collaborator Author

The latest state of the PR includes improved unit testing, where the case of a gap in the indices is covered. Also, the SequenceIndex is now exported in the rpc as the field payment_index (instead of Johan's suggestion, I took index instead of id, because we have index everywhere 🙂).
The last point would be if we should leave away the first and last index in the response. Pro keeping it would be that it's working the same way as in listinvoices and it adds additional explicit documentation to the rpc, but on the other hand it's redundant 🤔.

Collaborator

@carlaKC carlaKC left a comment

I'm happy with the state this is in. Great job on this one @bitromortac 🥇
I think it's fine to leave first/last index in the response, since it's the way we have it for invoices (which also have their index exposed, so it's also redundant there), but no strong feelings on this.

Just a few more comment-level nits, and since this needs a rebase anyway, the json_name tags should go before merge.

Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments_test.go Outdated
Comment thread lnrpc/rpc.proto Outdated
Comment thread lnrpc/rpc.proto Outdated
@bitromortac bitromortac force-pushed the listpayments-pagination branch from 99abf2e to 7221077 Compare April 1, 2020 07:50
@bitromortac
Collaborator Author

Thank you already for the patience @carlaKC and @joostjager 🙂. I would also leave the first and last index fields for now, if you're not opposed to this. I'll keep on rebasing.

Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payments_test.go Outdated
Comment thread cmd/lncli/commands.go Outdated
@bitromortac bitromortac force-pushed the listpayments-pagination branch 2 times, most recently from 0c7c676 to 90c610b Compare April 6, 2020 05:11
@bitromortac bitromortac requested a review from joostjager April 6, 2020 05:13
Contributor

@joostjager joostjager left a comment

Only comment standing from me is to remove the unused InvoiceQuery return parameter. If we need it (seems unlikely to me), we can add it back in.

@bitromortac bitromortac force-pushed the listpayments-pagination branch from 90c610b to 6328891 Compare April 6, 2020 17:14
@bitromortac
Collaborator Author

bitromortac commented Apr 7, 2020

Only comment standing from me is to remove the unused InvoiceQuery return parameter. If we need it (seems unlikely to me), we can add it back in.

Removed the initial query parameter from the PaymentsResponse. Thank you again for reviewing! Also rebased.

Exports sequenceNum in MPPayment for later use
in the rpcserver.

Adds a PaymentsQuery struct, which contains parameters to restrict the
response of QueryPayments, returning a PaymentsQuerySlice with the
payments query result. The behavior of this api is the same as
the QueryInvoices one.

Adds tests for the payments query, where different edge cases for
the index offsets and normal and reversed orders are tested.

Changes the grpc proto file, generates the protobuf, and
enables a queried way to retrieve payments in the rpc, where
backward compatibility is enforced by returning all payments
in the database by default. Adds a payment index field to
the returned payments of the rpc call.

@bitromortac bitromortac force-pushed the listpayments-pagination branch from 6328891 to 4593cfa Compare April 7, 2020 05:06
Comment thread channeldb/payments.go
// we set our limit to maxint to include all payments from the highest
// sequence number on.
if query.Reversed && indexExclusiveLimit == 0 {
	indexExclusiveLimit = math.MaxInt64
Contributor

Final comment about this is that this non-functional special casing could also be moved into rpcserver.go. For MaxPayments, the special case zero is handled there as well.

@Roasbeef Roasbeef added this to the 0.10.0 milestone Apr 8, 2020
@Roasbeef Roasbeef merged commit 7e6f3ec into lightningnetwork:master Apr 8, 2020
@bitromortac bitromortac deleted the listpayments-pagination branch April 17, 2024 12:44

Labels

accounting rpc Related to the RPC interface

6 participants