Skip to content

Hbase 22567#3

Closed
wchevreuil wants to merge 17 commits into
apache:masterfrom
wchevreuil:HBASE-22567
Closed

Hbase 22567#3
wchevreuil wants to merge 17 commits into
apache:masterfrom
wchevreuil:HBASE-22567

Conversation

@wchevreuil
Copy link
Copy Markdown
Contributor

First PR for addMissingRegionsToMeta

@asfgit
Copy link
Copy Markdown

asfgit commented Jun 11, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/PreCommit-HBASE-OPERATOR-TOOLS-Build/4/

return EXIT_FAILURE;
}

int addMissingRegionsInMeta(String... tableNames) throws IOException {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we break this apart into two steps?

  1. Report missing regions in meta
  2. Add a region to meta

We would be able to compose this higher-level tool then, in something like:

REGIONS=$(hbase hbck2 report_missing_regions)
for region in REGIONS; do
  hbase hbck2 add_region "region"
done

I think this would help keep the complexity of this method down, making it easier to test as a by-product. It would also let an admin inspect the output from the first step (making sure we want to re-create all of those regions, maybe doing some external validation) before trying to re-create them all in one step.

I think the tricky part would be coming up with the "format" to take the output from step1 and pass it to step2 with minimal "massaging".

WDYT about that, Wellington?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

useful for backgound IMHO is git's "plumbing" vs "porcelain":

https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain

analogous here would be "report missing" and "add region" as plumbing level. The output of either need not be user friendly and should focus on being maintainable. (so like encoded region name for the output of report missing would be fine)

something like the wrapper Josh mentions would be a porcelain command, probably taking the encoded regions from report missing and providing a user friendly info (like table, region keys, etc). it'd be useful for it to have a --dryrun mode that skipped calling add_region so that operators could get the more user friendly version of info about what regions appear to be missing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshelser @busbey , Sorry for the delay. I think these are all good ideas. Had pushed a commit where I'm trying to implement the plumbing/porcelain approach within 3 methods: addMissingRegionsInMeta, addMissingRegionsInMetaForTables and reportTablesWithMissingRegionsInMeta. The original porcelain method is now addMissingRegionsInMetaForTables. It combines addMissingRegionsInMeta and reportTablesWithMissingRegionsInMeta calls to add regions back in meta. reportTablesWithMissingRegionsInMeta can is also exposed as a CLI method, as a mean to provide a reporting only command for operators. Some additional UTs are still pending, but an initial manual testing on a broke cluster did look promising. Let me know what you guys think about these, while I will add more UTs for those.

Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
try {
admin.disableTable(tableName);
} catch (IOException e) {
LOG.warn("Failed to disable table {}, "
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it safe to proceed even failing to disable the table?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to disable mostly to make sure we will not be dealing with regions that are transient, say if split/merge is happening while we scan meta. Assumption here is that if disable fails, the table is already somehow offline. That would be the case when even namespace table has missing regions, for example, so we need to proceed with regions re-insertion even if we are not able to disable table.

Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
List<RegionInfo> regionInfos = MetaTableAccessor.
getTableRegions(this.conn, tableName, false);
for(final FileStatus regionDir : regionsDirs){
if(!regionDir.getPath().getName().equals(".tabledesc")&&!regionDir.getPath().getName().equals(".tmp")) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spotted some minor formatting issues and also can we put ".tabledesc" into a constant? thanks.
nit:
how about moving regionFoundInMeta() into a separate function that we can unit test?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spotted some minor formatting issues and also can we put ".tabledesc" into a constant?

Indeed, moving it together with ".tmp" into constants. Will be available in next commit.

how about moving regionFoundInMeta() into a separate function that we can unit test?

I think it's already getting tested by the available tests for findMissingRegionsInMETA method. Since there's not any other client method requiring such logic, I didn't feel it should deserve its own separate method.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tabledesc not already in a constant over in hbase?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spacing ... need spaces around operators as we have in rest of codebase.

Yeah, is the .tmp defined already too?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not a filitering mechanism elsewhere to rule out these exceptions? Could it be reused?

Copy link
Copy Markdown
Contributor Author

@wchevreuil wchevreuil Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tabledesc not already in a constant over in hbase?

I could only find a package private constant in FSTableDescriptors, so couldn't reuse it here.

Copy link
Copy Markdown
Contributor Author

@wchevreuil wchevreuil Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spacing ... need spaces around operators as we have in rest of codebase.

Addressed on current PR version.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, is the .tmp defined already too?

Yep, found several constants defined for it, decided to use HConstants.HBASE_TEMP_DIRECTORY on current PR version.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not a filitering mechanism elsewhere to rule out these exceptions? Could it be reused?

I'm not aware of any, but maybe there is. Any ideas on where about in the code to to look for? Tried review some of the hbase-server util package classes, but no luck.

@asfgit
Copy link
Copy Markdown

asfgit commented Jul 1, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/PreCommit-HBASE-OPERATOR-TOOLS-Build/5/

Comment thread hbase-hbck2/src/main/java/org/apache/hbase/hbck2/meta/MetaFixer.java Outdated
Copy link
Copy Markdown
Contributor

@saintstack saintstack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work

Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java
return EXIT_FAILURE;
}

Map<String,List<Path>> reportTablesWithMissingRegionsInMeta(String... nameSpaceOrTable)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've done the String that could be one of two things in the past and it proved more trouble than the seeming simplicity it promises. Something to watch out for.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if operator mixes table and namespace in the list passed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would still list missing regions specific for each namespace/table passed as parameter. Even if we pass a namespace and then a table within that namespace, output would show the missing regions grouped. For example, say namespace ns1 has two tables tbl1 and tbl2, and each of these two tables has a missing region r1 and r2, respectively. Then calling reportTablesWithMissingRegionsInMeta ns1 ns1:tbl1, should print:
ns1 -> r1 r2
ns1:tbl1 -> r1

Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/hbck2/meta/MetaFixer.java Outdated
List<RegionInfo> regionInfos = MetaTableAccessor.
getTableRegions(this.conn, tableName, false);
for(final FileStatus regionDir : regionsDirs){
if(!regionDir.getPath().getName().equals(".tabledesc")&&!regionDir.getPath().getName().equals(".tmp")) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tabledesc not already in a constant over in hbase?

List<RegionInfo> regionInfos = MetaTableAccessor.
getTableRegions(this.conn, tableName, false);
for(final FileStatus regionDir : regionsDirs){
if(!regionDir.getPath().getName().equals(".tabledesc")&&!regionDir.getPath().getName().equals(".tmp")) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spacing ... need spaces around operators as we have in rest of codebase.

Yeah, is the .tmp defined already too?

List<RegionInfo> regionInfos = MetaTableAccessor.
getTableRegions(this.conn, tableName, false);
for(final FileStatus regionDir : regionsDirs){
if(!regionDir.getPath().getName().equals(".tabledesc")&&!regionDir.getPath().getName().equals(".tmp")) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not a filitering mechanism elsewhere to rule out these exceptions? Could it be reused?

}

public void putRegionInfoFromHdfsInMeta(Path region) throws IOException {
RegionInfo info = HRegionFileSystem.loadRegionInfoFileContent(fs, region);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a check we don't overlap with an existing region?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. My first idea was to just leave it break, and operator would need to take further actions, such as merge overlapping regions. We could add additional check, as a mean to warn operator that extra merge would be required later.

Any other thoughts?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tabledesc not already in a constant over in hbase?

I could only find a package private constant in FSTableDescriptors, so couldn't reuse it here.

Spacing ... need spaces around operators as we have in rest of codebase.

Addressed on current PR version.

Yeah, is the .tmp defined already too?

Yep, found several constants defined for it, decided to use HConstants.HBASE_TEMP_DIRECTORY on current PR version.

Is there not a filitering mechanism elsewhere to rule out these exceptions? Could it be reused?

I'm not aware of any, but maybe there is. Any ideas on where about in the code to to look for? Tried review some of the hbase-server util package classes, but no luck.

@asfgit
Copy link
Copy Markdown

asfgit commented Jul 4, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/PreCommit-HBASE-OPERATOR-TOOLS-Build/6/

@asfgit
Copy link
Copy Markdown

asfgit commented Jul 5, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/PreCommit-HBASE-OPERATOR-TOOLS-Build/7/

Comment thread hbase-hbck2/pom.xml Outdated
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java
writer.println(" no matches in META, it reads regioninfo metadata file and ");
writer.println(" re-creates given region in META. Regions are re-created in 'CLOSED' ");
writer.println(" state at META table only, but not in Masters' cache, and are not ");
writer.println(" assigned either. A rolling Masters restart, followed by a ");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add one that hbck2 can call?

Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
Comment thread hbase-hbck2/src/main/java/org/apache/hbase/HBCK2.java Outdated
@asf-ci
Copy link
Copy Markdown

asf-ci commented Jul 15, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/PreCommit-HBASE-OPERATOR-TOOLS-Build/29/

@wchevreuil
Copy link
Copy Markdown
Contributor Author

Had pushed a new commit d40cfb1, addressing latest suggestions and adding more UTs.

@asf-ci
Copy link
Copy Markdown

asf-ci commented Jul 17, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/PreCommit-HBASE-OPERATOR-TOOLS-Build/41/

@saintstack
Copy link
Copy Markdown
Contributor

Looking at this again, because I'm looking at how to have hbck2 plug holes. This is a hole plugger but only for the case where there is a region in HDFS that has been dropped. This should run first. Lets get it in (smile). There are outstanding comments. Maybe take a look?

@busbey
Copy link
Copy Markdown
Contributor

busbey commented Aug 6, 2019

I'd also like to get this landed as a part of getting a released artifact out for hbase-operator-tools soon. I'll keep an eye out for the update, but if I miss it and you're looking for reviews please ping me.

Copy link
Copy Markdown
Contributor Author

@wchevreuil wchevreuil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the insights @saintstack @busbey . Had pushed another commit addressing some styling issues. I had now revisit all outstanding comments. Most of those had been addressed, but for some, I had replied either with some explanation or with alternative solutions. Let me know on your thoughts about current state for this.

@wchevreuil
Copy link
Copy Markdown
Contributor Author

Had started a new PR after rebasing these changes with latest master version:
#18

Am closing this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants