Hbase 22567#3
|
Refer to this link for build results (access rights to CI server needed): |
| return EXIT_FAILURE; | ||
| } | ||
|
|
||
| int addMissingRegionsInMeta(String... tableNames) throws IOException { |
There was a problem hiding this comment.
Could we break this apart into two steps?
- Report missing regions in meta
- Add a region to meta
We would be able to compose this higher-level tool then, in something like:
REGIONS=$(hbase hbck2 report_missing_regions)
for region in $REGIONS; do
  hbase hbck2 add_region "$region"
done

I think this would help keep the complexity of this method down, making it easier to test as a by-product. It would also let an admin inspect the output from the first step (making sure we want to re-create all of those regions, maybe doing some external validation) before trying to re-create them all in one step.
I think the tricky part would be coming up with the "format" to take the output from step1 and pass it to step2 with minimal "massaging".
WDYT about that, Wellington?
There was a problem hiding this comment.
useful for background IMHO is git's "plumbing" vs "porcelain":
https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain
analogous here would be "report missing" and "add region" as plumbing level. The output of either need not be user friendly and should focus on being maintainable. (so like encoded region name for the output of report missing would be fine)
something like the wrapper Josh mentions would be a porcelain command, probably taking the encoded regions from report missing and providing user-friendly info (like table, region keys, etc). it'd be useful for it to have a --dryrun mode that skips calling add_region, so that operators could get the more user-friendly version of the info about which regions appear to be missing.
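As a rough illustration of the plumbing/porcelain split discussed above, here is a minimal Java sketch. Everything in it is an assumption made for illustration: the class name `AddMissingRegionsPorcelain`, the `run` method and its parameters are hypothetical and are not part of HBCK2. The two plumbing steps are passed in as plain functions so the porcelain logic, including a dry-run mode, can be tested without a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Sketch of a porcelain command composed from two plumbing steps. All names are hypothetical. */
final class AddMissingRegionsPorcelain {
  private AddMissingRegionsPorcelain() {
  }

  /**
   * @param reportMissing plumbing step 1: returns encoded names of regions missing in meta
   * @param addRegion plumbing step 2: re-creates a single region in meta
   * @param dryRun when true, only report what would be done and never call addRegion
   * @return the encoded region names that were (or, on a dry run, would be) added
   */
  static List<String> run(Supplier<List<String>> reportMissing,
      Consumer<String> addRegion, boolean dryRun) {
    // Copy the report so later changes to the source list cannot affect the result.
    List<String> missing = new ArrayList<>(reportMissing.get());
    for (String encodedName : missing) {
      if (dryRun) {
        System.out.println("Would add region " + encodedName);
      } else {
        addRegion.accept(encodedName);
      }
    }
    return missing;
  }
}
```

With this shape, an operator-facing command could call `run(..., true)` first to review the list, then call it again with `dryRun` set to `false`.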
There was a problem hiding this comment.
@joshelser @busbey , Sorry for the delay. I think these are all good ideas. I have pushed a commit where I'm trying to implement the plumbing/porcelain approach with 3 methods: addMissingRegionsInMeta, addMissingRegionsInMetaForTables and reportTablesWithMissingRegionsInMeta. The original porcelain method is now addMissingRegionsInMetaForTables. It combines addMissingRegionsInMeta and reportTablesWithMissingRegionsInMeta calls to add regions back in meta. reportTablesWithMissingRegionsInMeta is also exposed as a CLI method, as a means to provide a reporting-only command for operators. Some additional UTs are still pending, but initial manual testing on a broken cluster did look promising. Let me know what you guys think about these, while I add more UTs for those.
…in order to retrieve threads execution errors
| try { | ||
| admin.disableTable(tableName); | ||
| } catch (IOException e) { | ||
| LOG.warn("Failed to disable table {}, " |
There was a problem hiding this comment.
why is it safe to proceed even after failing to disable the table?
There was a problem hiding this comment.
We try to disable mostly to make sure we will not be dealing with regions that are transient, say if a split/merge is happening while we scan meta. The assumption here is that if disable fails, the table is already somehow offline. That would be the case when even the namespace table has missing regions, for example, so we need to proceed with re-inserting regions even if we are not able to disable the table.
| List<RegionInfo> regionInfos = MetaTableAccessor. | ||
| getTableRegions(this.conn, tableName, false); | ||
| for(final FileStatus regionDir : regionsDirs){ | ||
| if(!regionDir.getPath().getName().equals(".tabledesc")&&!regionDir.getPath().getName().equals(".tmp")) { |
There was a problem hiding this comment.
I spotted some minor formatting issues and also can we put ".tabledesc" into a constant? thanks.
nit:
how about moving regionFoundInMeta() into a separate function that we can unit test?
There was a problem hiding this comment.
I spotted some minor formatting issues and also can we put ".tabledesc" into a constant?
Indeed, moving it together with ".tmp" into constants. Will be available in next commit.
how about moving regionFoundInMeta() into a separate function that we can unit test?
I think it's already getting tested by the available tests for findMissingRegionsInMETA method. Since there's not any other client method requiring such logic, I didn't feel it should deserve its own separate method.
There was a problem hiding this comment.
tabledesc not already in a constant over in hbase?
There was a problem hiding this comment.
Spacing ... need spaces around operators as we have in rest of codebase.
Yeah, is the .tmp defined already too?
There was a problem hiding this comment.
Is there not a filtering mechanism elsewhere to rule out these exceptions? Could it be reused?
There was a problem hiding this comment.
tabledesc not already in a constant over in hbase?
I could only find a package private constant in FSTableDescriptors, so couldn't reuse it here.
There was a problem hiding this comment.
Spacing ... need spaces around operators as we have in rest of codebase.
Addressed on current PR version.
There was a problem hiding this comment.
Yeah, is the .tmp defined already too?
Yep, found several constants defined for it, decided to use HConstants.HBASE_TEMP_DIRECTORY on current PR version.
There was a problem hiding this comment.
Is there not a filtering mechanism elsewhere to rule out these exceptions? Could it be reused?
I'm not aware of any, but maybe there is. Any ideas on whereabouts in the code to look? I tried reviewing some of the hbase-server util package classes, but no luck.
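To make the suggestions in this thread concrete (named constants, spaces around operators, and a separately testable lookup), here is a small sketch. It is not the code from the PR: `RegionDirFilter`, `isRegionDir` and `findMissing` are hypothetical names, and the two constants are local stand-ins defined only for this example rather than the HBase constants mentioned above. It works on plain directory names and encoded region names so that it runs without HBase on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: filter out non-region directories, then find regions absent from meta. */
final class RegionDirFilter {
  // Local stand-ins for this sketch only; the PR discussion settles on existing HBase constants.
  static final String TABLEDESC_DIR = ".tabledesc";
  static final String TMP_DIR = ".tmp";

  private static final Set<String> NON_REGION_DIRS =
      new HashSet<>(Arrays.asList(TABLEDESC_DIR, TMP_DIR));

  private RegionDirFilter() {
  }

  /** Returns true when the directory name is not one of the known non-region directories. */
  static boolean isRegionDir(String dirName) {
    return !NON_REGION_DIRS.contains(dirName);
  }

  /** Returns the region directory names that have no matching encoded name in meta. */
  static List<String> findMissing(List<String> dirNames, Set<String> encodedNamesInMeta) {
    List<String> missing = new ArrayList<>();
    for (String name : dirNames) {
      if (isRegionDir(name) && !encodedNamesInMeta.contains(name)) {
        missing.add(name);
      }
    }
    return missing;
  }
}
```

Keeping the filter and the lookup as small static methods is one way to unit test them directly, independent of the larger meta-scanning method.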
…to smaller pieces following the plumbing/porcelain approach
| return EXIT_FAILURE; | ||
| } | ||
|
|
||
| Map<String,List<Path>> reportTablesWithMissingRegionsInMeta(String... nameSpaceOrTable) |
There was a problem hiding this comment.
We've used a String parameter that could be one of two things in the past, and it proved more trouble than the simplicity it seems to promise. Something to watch out for.
There was a problem hiding this comment.
What if operator mixes table and namespace in the list passed?
There was a problem hiding this comment.
It would still list missing regions specific for each namespace/table passed as parameter. Even if we pass a namespace and then a table within that namespace, output would show the missing regions grouped. For example, say namespace ns1 has two tables tbl1 and tbl2, and each of these two tables has a missing region r1 and r2, respectively. Then calling reportTablesWithMissingRegionsInMeta ns1 ns1:tbl1, should print:
ns1 -> r1 r2
ns1:tbl1 -> r1
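The grouping described in this reply can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `MissingRegionsReport.group` is a hypothetical helper, it takes an already computed map of table name to missing regions, and it treats any argument containing `:` as a table and anything else as a namespace. That last rule is a simplification and is exactly the kind of ambiguity the reviewer warns about, since it cannot name a table in the default namespace.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: group missing regions under each namespace or table argument, in argument order. */
final class MissingRegionsReport {
  private MissingRegionsReport() {
  }

  /**
   * @param missingByTable missing regions keyed by fully qualified table name ("ns:table")
   * @param namespaceOrTable arguments as passed by the operator
   */
  static Map<String, List<String>> group(Map<String, List<String>> missingByTable,
      String... namespaceOrTable) {
    Map<String, List<String>> report = new LinkedHashMap<>();
    for (String arg : namespaceOrTable) {
      // Assumption of this sketch: "ns:table" is a table, anything without ':' is a namespace.
      boolean isTable = arg.contains(":");
      List<String> regions = new ArrayList<>();
      for (Map.Entry<String, List<String>> entry : missingByTable.entrySet()) {
        String table = entry.getKey();
        boolean matches = isTable ? table.equals(arg) : table.startsWith(arg + ":");
        if (matches) {
          regions.addAll(entry.getValue());
        }
      }
      report.put(arg, regions);
    }
    return report;
  }
}
```

For the example above, passing `ns1` and `ns1:tbl1` yields one entry per argument, with `r1` listed under both.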
| } | ||
|
|
||
| public void putRegionInfoFromHdfsInMeta(Path region) throws IOException { | ||
| RegionInfo info = HRegionFileSystem.loadRegionInfoFileContent(fs, region); |
There was a problem hiding this comment.
Should there be a check we don't overlap with an existing region?
There was a problem hiding this comment.
Good point. My first idea was to just let it break, and the operator would need to take further actions, such as merging overlapping regions. We could add an additional check, as a means to warn the operator that an extra merge would be required later.
Any other thoughts?
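One possible shape for such a warning check is sketched below. It is only an assumption about how the check might look, not something from the PR: `RegionOverlapCheck` and `overlaps` are hypothetical names, and the sketch works on raw start/end keys instead of RegionInfo objects. It assumes region key ranges are half-open, `[start, end)`, compared as unsigned bytes, with an empty end key meaning "up to the end of the table".

```java
/** Sketch: detect whether two region key ranges overlap. Names are hypothetical. */
final class RegionOverlapCheck {
  private RegionOverlapCheck() {
  }

  /** Ranges are half-open [start, end); an empty end key means "up to the end of the table". */
  static boolean overlaps(byte[] startA, byte[] endA, byte[] startB, byte[] endB) {
    // Two ranges overlap when each one starts before the other one ends.
    return startsBefore(startA, endB) && startsBefore(startB, endA);
  }

  private static boolean startsBefore(byte[] start, byte[] end) {
    return end.length == 0 || compare(start, end) < 0;
  }

  /** Lexicographic comparison treating bytes as unsigned values. */
  private static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```

A caller could run this against each existing region of the table before inserting, and log a warning rather than fail, which matches the "warn the operator" option mentioned above.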
| writer.println(" no matches in META, it reads regioninfo metadata file and "); | ||
| writer.println(" re-creates given region in META. Regions are re-created in 'CLOSED' "); | ||
| writer.println(" state at META table only, but not in Masters' cache, and are not "); | ||
| writer.println(" assigned either. A rolling Masters restart, followed by a "); |
There was a problem hiding this comment.
We should add one that hbck2 can call?
|
I have pushed a new commit d40cfb1, addressing the latest suggestions and adding more UTs. |
… to the one from reportMissing)
|
Looking at this again, because I'm looking at how to have hbck2 plug holes. This is a hole plugger, but only for the case where there is a region in HDFS that has been dropped from meta. This should run first. Let's get it in (smile). There are outstanding comments. Maybe take a look? |
|
I'd also like to get this landed as a part of getting a released artifact out for hbase-operator-tools soon. I'll keep an eye out for the update, but if I miss it and you're looking for reviews please ping me. |
wchevreuil
left a comment
There was a problem hiding this comment.
Thanks for all the insights @saintstack @busbey . I have pushed another commit addressing some styling issues. I have now revisited all outstanding comments. Most of those have been addressed, but for some, I have replied either with some explanation or with alternative solutions. Let me know your thoughts about the current state of this.
|
I have started a new PR after rebasing these changes on the latest master version: I am closing this one. |
First PR for addMissingRegionsToMeta