Recovering from an SNN Crash Caused by an EditsLog Replay Exception (Part 2)

阿凡达 2018-07-04 13:00

We then uploaded the repaired edits file to the three JNs and restarted the SNN. The SNN started up normally, but a while later, when it began to redo the edits, the following exception appeared:

2016-08-13 14:22:10,451 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 1559399/2004627 transactions completed. (78%)
2016-08-13 14:22:10,695 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/x/x/x/x/x, penultimateBlock=NULL, lastBlock=blk_1168011668_98411958, RpcClientId=, RpcCallId=-2]
java.lang.IllegalStateException
    at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.addNewBlock(FSEditLogLoader.java:885)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:464)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:228)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:137)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:816)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:676)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:279)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1407)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
2016-08-13 14:22:10,698 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: java.lang.IllegalStateException
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:238)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:137)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:816)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:676)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:279)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1407)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
Caused by: java.lang.IllegalStateException
    at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.addNewBlock(FSEditLogLoader.java:885)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:464)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:228)
    ... 12 more
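
The top frames of the trace point at a Guava precondition inside FSEditLogLoader.addNewBlock. Abridged from the Hadoop 2.x source (block-map bookkeeping and error messages trimmed), the replay logic for an AddBlockOp looks roughly like this:

private void addNewBlock(FSDirectory fsDir, AddBlockOp op, INodeFile file)
    throws IOException {
  BlockInfo[] oldBlocks = file.getBlocks();
  Block pBlock = op.getPenultimateBlock();
  Block newBlock = op.getLastBlock();

  if (pBlock != null) { // the op names a penultimate block
    // The file must already have blocks, and its current last block
    // must match the penultimate block recorded in the op.
    Preconditions.checkState(oldBlocks != null && oldBlocks.length > 0);
    // ... compare pBlock with oldBlocks[oldBlocks.length - 1] ...
  } else { // penultimateBlock=NULL, as in the failing op above
    // The op claims newBlock is the file's FIRST block, so the file
    // must not have any blocks attached yet; if it still does, this
    // checkState throws the bare IllegalStateException seen at line 885.
    Preconditions.checkState(oldBlocks == null || oldBlocks.length == 0);
  }
  // ... append newBlock to the file as an under-construction block ...
}

Neither checkState call carries a message, which is why the exception gives us nothing beyond the stack trace itself.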

The file hitting this exception is exactly the one whose OP record had failed during our earlier binary conversion. We had already repaired that corrupted OP record, but evidently there was still a problem. Going back to the original XML file, we found that after the original OP_UPDATE_BLOCKS, the same file had a further OP_ADD_BLOCK operation:

  <RECORD>
    <OPCODE>OP_ADD_BLOCK</OPCODE>
    <DATA>
      <TXID>603485954</TXID>
      <!-- use x to replace the true path -->
      <PATH>/x/x/x/x/x</PATH>
      <BLOCK>
        <BLOCK_ID>1168011668</BLOCK_ID>
        <NUM_BYTES>0</NUM_BYTES>
        <GENSTAMP>98411958</GENSTAMP>
      </BLOCK>
      <RPC_CLIENTID></RPC_CLIENTID>
      <RPC_CALLID>-2</RPC_CALLID>
    </DATA>
  </RECORD>

Reading the relevant code, we found that the exception is thrown because the block preceding this one had gone missing. (Note that the record above carries only a single BLOCK element, which is why it replays with penultimateBlock=NULL and lands in the second branch of the sketch above.) We also wondered whether the ANN's logs had left behind any clues, so we looked up the records for the corresponding block (blk_1168011667_98411957) and found the following:

2016-08-13 08:22:03,179 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /x/x/x/x/x. BP-1322638497-x.x.x.x-1463652191029 blk_1168011667_98411957{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-40b43e77-e1f9-4627-91c5-6a8cd9a515f7:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-72efca67-31b7-48bf-807e-e8833d6ddaba:NORMAL|RBW], 
ReplicaUnderConstruction[[DISK]DS-d2400b34-94df-42e3-9750-a1e05e76ddc0:NORMAL|RBW]]}
2016-08-13 11:48:09,354 INFO BlockStateChange: BLOCK* processReport: blk_1168011667_98411957 on x.x.x.x:1004 size 0 does not belong to any file
2016-08-13 11:48:09,354 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1168011667_98411957 to x.x.x.x:1004
2016-08-13 11:48:10,762 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* BlockManager: ask x.x.x.x:1004 to delete [blk_1168011667_98411957]

This tells us the block was destroyed right after it was created. We then checked the records for blk_1168011668_98411958:

2016-08-13 08:22:03,201 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /x/x/x/x/x. BP-1322638497-x.x.x.x-1463652191029 blk_1168011668_98411958{blockUCState=UNDER_CONSTRUCTION, 
primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-1d65d1ef-a628-4e83-a38b-b7d1f2ea3a48:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-7cae56a7-0bc2-49ed-aa3f-874e030c699c:NORMAL|RBW], 
ReplicaUnderConstruction[[DISK]DS-7d41e294-f43c-47e2-bce6-2137b8e5fb46:NORMAL|RBW]]}
2016-08-13 08:22:06,295 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: x.x.x.x1:1004 is added to blk_1168011668_98411958{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-7e96702d-e2af-4351-b6b6-c16eef78a92e:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-fd418185-de07-4927-bd8b-290451e66b9a:NORMAL|RBW], 
ReplicaUnderConstruction[[DISK]DS-86154aec-2b9e-471d-bd89-f0f8caeba1c3:NORMAL|RBW]]} size 0
2016-08-13 08:22:06,296 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: x.x.x.x2:1004 is added to blk_1168011668_98411958{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-7e96702d-e2af-4351-b6b6-c16eef78a92e:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-fd418185-de07-4927-bd8b-290451e66b9a:NORMAL|RBW], 
ReplicaUnderConstruction[[DISK]DS-86154aec-2b9e-471d-bd89-f0f8caeba1c3:NORMAL|RBW]]} size 0
2016-08-13 08:22:06,296 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: x.x.x.x3:1004 is added to blk_1168011668_98411958{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-7e96702d-e2af-4351-b6b6-c16eef78a92e:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-fd418185-de07-4927-bd8b-290451e66b9a:NORMAL|RBW], 
ReplicaUnderConstruction[[DISK]DS-86154aec-2b9e-471d-bd89-f0f8caeba1c3:NORMAL|RBW]]} size 0

This shows that blk_1168011668_98411958 is the valid block record. Finally, the OP_CLOSE record in the edits file gave us the confirming evidence:

  <RECORD>
    <OPCODE>OP_CLOSE</OPCODE>
    <DATA>
      <TXID>603486489</TXID>
      <LENGTH>0</LENGTH>
      <INODEID>0</INODEID>
      <PATH>/x/x/x/x/x</PATH>
      <REPLICATION>3</REPLICATION>
      <MTIME>1471047726297</MTIME>
      <ATIME>1471047723138</ATIME>
      <BLOCKSIZE>268435456</BLOCKSIZE>
      <CLIENT_NAME></CLIENT_NAME>
      <CLIENT_MACHINE></CLIENT_MACHINE>
      <BLOCK>
        <BLOCK_ID>1168011668</BLOCK_ID>
        <NUM_BYTES>6168701</NUM_BYTES>
        <GENSTAMP>98411958</GENSTAMP>
      </BLOCK>
      <PERMISSION_STATUS>
        <USERNAME>x</USERNAME>
        <GROUPNAME>x</GROUPNAME>
        <MODE>416</MODE>
      </PERMISSION_STATUS>
    </DATA>
  </RECORD>

This file-close operation shows that the file holds exactly one block (blk_1168011668_98411958); the earlier block (blk_1168011667_98411957) is invalid. So we simply deleted the operations referencing it, once again using OP_TIMES as the replacement, as in Part 1 (OP_TIMES only touches mtime/atime, so it is a harmless way to keep the transaction IDs contiguous). The modified records:

  <RECORD>
    <OPCODE>OP_TIMES</OPCODE>
    <DATA>
      <TXID>603485950</TXID>
      <!-- use x to replace the true path -->
      <PATH>/x</PATH>
      <LENGTH>0</LENGTH>
      <MTIME>1471047726297</MTIME>
      <ATIME>1471047723138</ATIME>
    </DATA>
  </RECORD>

  <RECORD>
    <OPCODE>OP_TIMES</OPCODE>
    <DATA>
      <TXID>603485951</TXID>
      <!-- use x to replace the true path -->
      <PATH>/x</PATH>
      <LENGTH>0</LENGTH>
      <MTIME>1471047726297</MTIME>
      <ATIME>1471047723138</ATIME>
    </DATA>
  </RECORD>

We re-ran the OEV command to convert the XML back into a binary file, pushed it to the edits directory on all three JNs, and restarted the SNN. The exception was gone and the service came up normally.
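
For reference, the reconversion is the same offline edits viewer run with its binary processor; the file names below are placeholders rather than the actual segment names:

# Convert the repaired XML back into a binary edits segment
# (edits_fixed.xml / edits_segment are placeholder names)
hdfs oev -i edits_fixed.xml -o edits_segment -p binary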

Checking the web UI, we found no data loss: the recovery had succeeded.
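
For a stronger confirmation than eyeballing the web UI, fsck can enumerate corrupt or missing blocks directly, e.g.:

# List any files with corrupt/missing blocks under the root
hdfs fsck / -list-corruptfileblocks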

Postscript

Afterwards we tracked down the developers who had triggered the exception, and confirmed that they had been repeatedly reading and writing the file in question (the one named in the exception message) using APPEND. The upstream issue tracker lists two related fixes:

Edit log corruption can happen if append fails with a quota violation (fixed in 2.7.0)

OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag (fixed in 2.8.0)

We are currently on Apache Hadoop 2.5.2, so a version upgrade will be needed to put this problem to rest.

Related reading: Recovering from an SNN Crash Caused by an EditsLog Replay Exception (Part 1)

This article comes from the NetEase Practitioners Community and is published with the authorization of its author, 金川.