Skip to content

fix: handling ErrNamespaceNotFound for node v0.10.0#953

Merged
gupadhyaya merged 4 commits into
evstack:mainfrom
distractedm1nd:node_upgrade_v0.10.0
Jun 8, 2023
Merged

fix: handling ErrNamespaceNotFound for node v0.10.0#953
gupadhyaya merged 4 commits into
evstack:mainfrom
distractedm1nd:node_upgrade_v0.10.0

Conversation

@distractedm1nd

Copy link
Copy Markdown
Contributor

We have made a breaking change in node v0.10.0 regarding data requests. celestiaorg/celestia-node@62a0b97

Previously, a "data not found" meant both "the node failed to retrieve that data, please retry" as well as "the node retrieved the data successfully, but your namespace is not included in the data". We have split these errors, so now we have an ErrNamespaceNotFound which indicates a successful request for empty data.

I have no idea how to test rollkit so would love some help on getting that set up

@codecov

codecov Bot commented May 23, 2023

Copy link
Copy Markdown

Codecov Report

Patch coverage: 50.00% and project coverage change: -0.89 ⚠️

Comparison is base (5766a91) 55.75% compared to head (f13c2bc) 54.87%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #953      +/-   ##
==========================================
- Coverage   55.75%   54.87%   -0.89%     
==========================================
  Files          59       60       +1     
  Lines        6268     6380     +112     
==========================================
+ Hits         3495     3501       +6     
- Misses       2418     2525     +107     
+ Partials      355      354       -1     
Impacted Files Coverage Δ
da/celestia/celestia.go 7.14% <50.00%> (ø)

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@gupadhyaya

Copy link
Copy Markdown
Contributor

@distractedm1nd
tested without your PR using local-devnet v0.10.0 and got:

10:37PM ERR failed to retrieve block from DALC daHeight=14 errors="failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data; failed to retrieve block: getter/store: failed to retrieve shares by namespace: share: namespace not found in data" module=BlockManager

which disappeared after your fix. so, it is working.

however, we still have this error:

10:38PM ERR failed to retrieve block from DALC daHeight=27 errors="failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later; failed to retrieve block: current head local chain head: 26 is lower than requested height: 27 give header sync some time and retry later" module=BlockManager

while it is not preventing the block production, i think it is premature retrieve blocks error. any easy way to handle that?

@Wondertan

Copy link
Copy Markdown
Contributor

@gupadhyaya, via API we now always allow for height head+1 to be requested before throwing this error.
@distractedm1nd. I think we should do the same for Gateway or ensure Gateway uses this API codepath

@gupadhyaya

Copy link
Copy Markdown
Contributor

@gupadhyaya, via API we now always allow for height head+1 to be requested before throwing this error. @distractedm1nd. I think we should do the same for Gateway or ensure Gateway uses this API codepath

@Wondertan so where should we make the fix?

@Wondertan

Copy link
Copy Markdown
Contributor

@gupadhyaya, in the node

@distractedm1nd

Copy link
Copy Markdown
Contributor Author

@gupadhyaya Can you check celestiaorg/celestia-node#2291?

@gupadhyaya

Copy link
Copy Markdown
Contributor

@gupadhyaya Can you check celestiaorg/celestia-node#2291?

thanks @distractedm1nd what is the easiest way to test this? I tried to build a local image and then run local-devnet with latest celestia-app and running into panics.

@gupadhyaya

Copy link
Copy Markdown
Contributor

@Wondertan @distractedm1nd just tested rollkit using node 0.10.3 and i confirm that fix is working and no longer have local/current head issue. I think this PR can be merged now.

@gupadhyaya

Copy link
Copy Markdown
Contributor

@nashqueue @tzdybal could you review this PR?

@gupadhyaya gupadhyaya added the T:bug Something isn't working label Jun 2, 2023
@gupadhyaya gupadhyaya force-pushed the node_upgrade_v0.10.0 branch from d3718b1 to f4ff976 Compare June 6, 2023 02:54
gupadhyaya
gupadhyaya previously approved these changes Jun 6, 2023
Comment thread da/celestia/celestia.go

@MSevey MSevey left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need helper function needs a unit test

MSevey pushed a commit that referenced this pull request Jun 6, 2023
This PR fixes the data race issue caused in the header exchange when the
syncer status is checked (in stop) and set (in start , separate go
routine).

Also fixes the golangci-lint errors. 

Blocking #963 and
#953 from merge.

---------

Co-authored-by: Ganesha Upadhyaya <gupadhyaya@Ganeshas-MacBook-Pro-2.local>
@gupadhyaya gupadhyaya force-pushed the node_upgrade_v0.10.0 branch 3 times, most recently from 7a3118e to c23e749 Compare June 8, 2023 00:10
@gupadhyaya gupadhyaya enabled auto-merge June 8, 2023 00:38
@gupadhyaya gupadhyaya force-pushed the node_upgrade_v0.10.0 branch from c23e749 to eea4800 Compare June 8, 2023 12:45
@gupadhyaya gupadhyaya self-requested a review June 8, 2023 13:45
@gupadhyaya gupadhyaya added this pull request to the merge queue Jun 8, 2023
Merged via the queue into evstack:main with commit 784a455 Jun 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

T:bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants