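Table of contents
- Performance optimization
- Original approach
- Buffered backup approach
- Advantages
- Disadvantages
- Implementing the buffered backup approach
- How the backup works
- Controller
- Service
- Notes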
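Performance optimization
Original approach
The original approach recursively scans every file in the data source. For each file it decides whether a backup is needed; if so, it copies the file immediately and inserts a record into the database. Because every file costs its own database round trip, the program spends a long time communicating with the database, index maintenance takes longer, the database log is written more often, and I/O efficiency is low. As the figure below shows, a full backup took about an hour (backup directory size: 8.15 GB; file count: 211,470), which is not usable in practice.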
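Buffered backup approach
This approach holds the records to be inserted or updated in an in-memory buffer and writes them to the database with one batch insert or batch update once enough of them have accumulated. As the figure below shows, the optimized program completes the backup in 46 seconds, far faster than the original approach.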
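The buffering idea can be sketched in a few lines. This is a minimal illustration, not the project's code: the class name BatchBuffer, its constructor, and the sink callback are all hypothetical, and the sink stands in for whatever batch insert or batch update the database layer offers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: collects rows and hands them to a sink in batches. */
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // stands in for a batch INSERT or UPDATE
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Adds one row and flushes automatically once the buffer is full. */
    void add(T row) {
        pending.add(row);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Writes whatever is pending; call it once more when the scan ends. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // copy, so the sink may keep the list
        pending.clear();
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        BatchBuffer<String> buffer = new BatchBuffer<>(3, batch -> batchSizes.add(batch.size()));
        for (int i = 0; i < 7; i++) {
            buffer.add("file-" + i);
        }
        buffer.flush(); // the leftover row is written here
        System.out.println(batchSizes); // prints [3, 3, 1]
    }
}
```

Seven rows with a batch size of three reach the sink in three calls instead of seven, which is where the saving in round trips comes from. The final flush matters: without it the last partial batch never reaches the database.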
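Advantages
- High efficiency
Disadvantages
- Weaker real-time guarantees. The original approach inserted a record into the database right after each file was backed up, whereas this approach stores records only once enough have accumulated. If the program is shut down during a backup, the records still in the buffer are lost, so the next run will back up again, and overwrite, some files that this run had already copied. Note that what is lost is not data in the data source, only the records that were about to be written to the database.
- Slightly higher memory use than the original approach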
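Implementing the buffered backup approach
How the backup works
The principle is simple. When a file is backed up for the first time, its size, modification time, and MD5 hash are stored in the database. On the next backup, the file's current state is compared with the stored values:
- If neither the size nor the modification time has changed, the file is treated as unmodified and is not replaced.
- If the size has changed, the file has been modified and must be replaced.
- If the modification time has changed but the size has not, the file's current MD5 hash is compared with the one stored in the database, because an unchanged size does not prove unchanged content. A different hash means the content really changed, since identical input always produces the same MD5 hash; an identical hash means no backup is needed.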
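The comparison rules above can be written as one small function. The sketch below is an assumption-laden illustration: ChangeDetector and Snapshot are hypothetical names, and the hash is computed with the JDK's MessageDigest rather than the DigestUtil helper used in the project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ChangeDetector {
    /** Values recorded at the previous backup (hypothetical record shape). */
    static final class Snapshot {
        final long size;
        final long modifiedMillis;
        final String md5;

        Snapshot(long size, long modifiedMillis, String md5) {
            this.size = size;
            this.modifiedMillis = modifiedMillis;
            this.md5 = md5;
        }
    }

    /** Applies the size, then modification time, then MD5 rules. */
    static boolean needsBackup(Path file, Snapshot last) throws IOException {
        if (last == null) {
            return true; // never backed up
        }
        if (Files.size(file) != last.size) {
            return true; // size changed: content changed
        }
        if (Files.getLastModifiedTime(file).toMillis() == last.modifiedMillis) {
            return false; // same size and same timestamp: treated as unchanged
        }
        // Same size but a new timestamp: only the hash can decide.
        return !md5Hex(file).equals(last.md5);
    }

    /** Streams the file through MD5 and returns the lowercase hex digest. */
    static String md5Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                digest.update(chunk, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Ordering the checks this way keeps the expensive step, reading the whole file to hash it, for the one case where the cheap metadata is inconclusive.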
Controller
/**
 * Back up the specified data source.
 */
@GetMapping("/backupBySourceId/{sourceId}")
public Result backupBySourceId(@PathVariable Long sourceId) throws IOException {
    if (backupingSourceIDSet.contains(sourceId)) {
        throw new ClientException("This backup source is already being backed up; please try again later");
    }
    // Check that the source directory exists and prepare the target directories
    List<Task> taskList = backupService.checkSourceAndTarget(sourceId);
    if (taskList == null || taskList.isEmpty()) {
        removeSourceIdFromBacking(backupingSourceIDSet, sourceId);
        return Results.failure();
    }
    // Start the backup
    backupingSourceIDSet.add(sourceId);
    CompletableFuture.runAsync(() -> {
        try {
            backupService.backupBySourceId(sourceId, taskList);
        } catch (ServerException e) {
            // Wrap so the failure reaches exceptionally() with its cause intact
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }, executor).exceptionally(throwable -> {
        log.error(throwable.getMessage());
        removeSourceIdFromBacking(backupingSourceIDSet, sourceId);
        return null;
    });
    return Results.success();
}

/**
 * Remove a data source id from the set of data sources being backed up.
 *
 * @param backupingSourceIDSet ids of the data sources currently being backed up
 * @param sourceId             id to remove
 */
private void removeSourceIdFromBacking(HashSet<Long> backupingSourceIDSet, Long sourceId) {
    // remove() is a no-op when the id is absent
    backupingSourceIDSet.remove(sourceId);
}
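The main details are:
- Before backing up, the controller checks whether the data source is already being backed up (backupingSourceIDSet can be thought of as a pool of IDs; an ID in the pool means that data source is being backed up). If it is, the request returns at once with a message telling the user to try again later.
- Before the backup actually starts, the controller checks that the data source and the backup target directories exist, since the user may have forgotten to plug in a drive or mistyped a path.
- With a lot of data the backup takes a while, but after clicking the backup button the user should be told promptly whether the backup started. The backup therefore runs as an asynchronous task via CompletableFuture, and the request returns immediately to say that the data source was queued for backup.
- When the backup finishes, the data source ID is removed from the pool.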
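The ID pool plus asynchronous start can be sketched as follows. This is an illustration under assumptions, not the project's controller: BackupGate and tryStart are hypothetical names, and the sketch swaps the HashSet for a concurrent set so that "check and mark" is a single atomic step when two requests arrive together.

```java
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

/** Hypothetical guard: at most one running backup per data source id. */
class BackupGate {
    private final Set<Long> running = ConcurrentHashMap.newKeySet();

    /**
     * Starts the job asynchronously unless the source is already running.
     * An empty result means "already being backed up".
     */
    Optional<CompletableFuture<Void>> tryStart(Long sourceId, Runnable job, Executor executor) {
        if (!running.add(sourceId)) { // add() returns false if the id was already present
            return Optional.empty();
        }
        try {
            return Optional.of(CompletableFuture.runAsync(job, executor)
                    // Runs on success and on failure, so the id is always released
                    .whenComplete((ignored, error) -> running.remove(sourceId)));
        } catch (RuntimeException e) {
            running.remove(sourceId); // e.g. the executor rejected the task
            throw e;
        }
    }

    boolean isRunning(Long sourceId) {
        return running.contains(sourceId);
    }
}
```

Releasing the ID in whenComplete puts the cleanup in one place instead of repeating it on every success and error path.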
Service
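The backup feature uses the following tables:
- backup_source: stores the backup data sources
- backup_target: stores the backup target directories, each linked to a data source; one data source can have many target directories
- backup_task: stores the backup tasks
- backup_file: stores the files that have been backed up
- backup_file_history: stores the backup records of each backed-up file
- sys_param: stores the names of files and directories that the system ignores during backup
The code below implements the actual business logic: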
/**
 * Back up the specified backup source.
 *
 * @param sourceId id of the backup source
 * @param taskList backup tasks prepared by checkSourceAndTarget
 */
@Override
public void backupBySourceId(Long sourceId, List<Task> taskList) throws IOException {
    // Increment the backup count of the data source
    backupSourceService.updateBackupNum(sourceId);
    // Load the ignored file names and directory names
    List<String> ignoreFileList = sysParamService.getIgnoreFileOrIgnoreDir(SystemParamEnum.IGNORE_FILE_NAME.getParamName());
    List<String> ignoreDirectoryList = sysParamService.getIgnoreFileOrIgnoreDir(SystemParamEnum.IGNORE_DIRECTORY_NAME.getParamName());
    // Run every task on its own thread
    CompletableFuture[] futureArr = new CompletableFuture[taskList.size()];
    for (int i = 0; i < taskList.size(); i++) {
        Task task = taskList.get(i);
        futureArr[i] = CompletableFuture.runAsync(() -> {
            try {
                backUpByTask(task, ignoreFileList, ignoreDirectoryList);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }, executor).exceptionally(e -> {
            log.error(e.getMessage());
            // The backup failed with an exception: release the data source id
            backupController.backupingSourceIDSet.remove(sourceId);
            Map<String, Object> dataMap = new HashMap<>();
            dataMap.put("code", WebsocketNoticeEnum.BACKUP_ERROR.getCode());
            dataMap.put("message", e.getMessage());
            webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
            return null;
        });
    }
    CompletableFuture.allOf(futureArr).join();
    // The backup is finished: release the data source id
    backupController.backupingSourceIDSet.remove(sourceId);
}
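The method works as follows:
- Before backing up, increment the data source's backup count in the database.
- Load the ignored file names and directory names from sys_param and skip them during the backup, since some files do not need backing up. One example is a Java project's .idea directory: IDEA generates it automatically when it opens the project, and its contents differ between IDEA versions, so there is no point in backing it up.
- If one data source has to be backed up to several target directories, each backup task runs on its own thread to speed things up. One backup task copies the data source into one target directory.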
/**
 * Run one backup task.
 *
 * @param task                backup task
 * @param ignoreFileList      file names to ignore
 * @param ignoreDirectoryList directory names to ignore
 */
private void backUpByTask(Task task, List<String> ignoreFileList, List<String> ignoreDirectoryList) throws IOException {
    BackupSource backupSource = task.getSource();
    BackupTarget backupTarget = task.getTarget();
    // Collect the statistics (file count, byte count) of the data source
    BackupStatistic sta = new BackupStatistic(0, 0, 0, 0, new Date().getTime() / 1000);
    getStatisticMessage(new File(backupSource.getRootPath()), sta);
    String targetRootPath = getTargetRootPath(task, backupSource, backupTarget);
    // Insert the task into the database
    BackupTask backupTask = new BackupTask(backupSource.getRootPath(), targetRootPath,
            sta.totalBackupFileNum, 0, sta.totalBackupByteNum, 0L,
            0, "0.0", "0.0", 0L, new Date());
    backupTaskService.save(backupTask);
    // Tell the front end that the task was created
    Map<String, Object> dataMap = new HashMap<>();
    dataMap.put("code", WebsocketNoticeEnum.BACKUP_START.getCode());
    dataMap.put("message", WebsocketNoticeEnum.BACKUP_START.getDetail());
    dataMap.put("backupTask", backupTask);
    webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
    log.info("Task created, starting backup");

    // Load the first-level (father_id = 0) backup file records of this data source
    QueryWrapper<BackupFile> backupFileQueryWrapper = new QueryWrapper<BackupFile>()
            .eq("backup_source_id", backupSource.getId())
            .eq("father_id", 0L)
            .select("id", "source_file_path", "target_file_path", "file_name");
    if (backupSource.getBackupType() == 0) {
        // Centralized backup: filter by target id. Scattered backup: the target id is not fixed, so load all records
        backupFileQueryWrapper.eq("backup_target_id", backupTarget.getId());
    }
    List<BackupFile> backupFileList = backupFileService.list(backupFileQueryWrapper);
    sta.second = new Date().getTime() / 1000;

    // Start the backup
    List<BackupFile> backupFileBuffer1 = new ArrayList<>();
    List<BackupFile> backupFileBuffer2 = new ArrayList<>();
    List<BackupFileHistory> backupFileHistoryBuffer1 = new ArrayList<>();
    List<BackupFileHistory> backupFileHistoryBuffer2 = new ArrayList<>();
    backUpAllFilesOfFatherFile(task, new File(backupSource.getRootPath()),
            backupSource, backupTarget, task.getTargetList(), sta,
            "", backupTask.getId(), backupTask.getCreateTime(),
            0L, backupFileList, ignoreFileList, ignoreDirectoryList,
            backupFileBuffer1, backupFileHistoryBuffer1,
            backupFileBuffer2, backupFileHistoryBuffer2);
    // Flush whatever is left in the buffers.
    // The target id follows the rule used in execSingleFileBackUp: the real id for centralized backup, 0 otherwise
    Long targetId = backupSource.getBackupType() == 0 ? backupTarget.getId() : 0L;
    buffer1Process(backupFileBuffer1, backupFileHistoryBuffer1);
    buffer2Process(backupSource.getId(), targetId, backupTask.getId(), backupSource, backupFileBuffer2, backupFileHistoryBuffer2);

    // The backup has ended
    if (Cache.STOP_TASK_ID_SET.contains(backupTask.getId())) {
        // It ended because the task was paused
        Cache.STOP_TASK_ID_SET.remove(backupTask.getId());
    } else {
        // It ran to completion: mark the task as finished
        backupTask.setBackupStatus(2);
        backupTask.setFinishFileNum(sta.getTotalBackupFileNum());
        backupTask.setFinishByteNum(sta.getTotalBackupByteNum());
        backupTask.setEndTime(new Date());
        backupTask.setBackupTime(backupTask.getEndTime().getTime() - backupTask.getCreateTime().getTime());
        backupTaskService.updateById(backupTask);
        setProgress(backupTask);
        log.info("Sending task message: backup finished");
        dataMap = new HashMap<>();
        dataMap.put("code", WebsocketNoticeEnum.BACKUP_SUCCESS.getCode());
        dataMap.put("message", WebsocketNoticeEnum.BACKUP_SUCCESS.getDetail());
        dataMap.put("backupTask", backupTask);
        webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
    }
}

/**
 * Collect the statistics of a directory:
 * 1. the number of files to back up
 * 2. the number of bytes to back up
 *
 * @param file directory to scan
 * @param sta  receives the statistics
 */
private void getStatisticMessage(File file, BackupStatistic sta) {
    File[] fileArr = file.listFiles();
    if (fileArr == null) {
        // listFiles() returns null when the path is not a readable directory
        return;
    }
    for (File f : fileArr) {
        if (f.isDirectory()) {
            // Directory: recurse into it
            getStatisticMessage(f, sta);
        } else {
            // File: add it to the totals
            sta.totalBackupFileNum++;
            sta.totalBackupByteNum += f.length();
        }
    }
}
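This method runs the backup for one task. The flow is:
- Call the recursive method getStatisticMessage to count how many files the data source root contains in total, which later makes it possible to show progress. (With large data sets this method is slow and needs further optimization.)
- Insert the backup task into the database, then use WebSocket, a two-way communication protocol, to tell the front end that the backup has started, along with the task's total file count and total byte count. The effect is similar to the figure below.
- Load all first-level backup file records of the current data source in one query; these records have father_id 0. In practice a directory may contain subdirectories and files, and those subdirectories may in turn contain more of the same, so the structure can be viewed as a file tree, hence the notion of depth.
- Enter the recursive backup method backUpAllFilesOfFatherFile, which checks whether each directory and each file needs to be backed up.
- After the backup finishes, write the records left in the buffers to the database.
- Update the backup task status in the database.
- Notify the front end over WebSocket that the task is complete.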
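Since the statistics pass is described as slow on large trees, one direction worth trying is the JDK's Files.walkFileTree, which hands each file's attributes to the visitor so that no separate length() call is needed per file. The sketch below is only a possible alternative to getStatisticMessage, with hypothetical names (TreeStats, scan); whether it is actually faster depends on the file system and has not been measured here.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical replacement for the recursive statistics pass. */
class TreeStats {
    long fileCount;
    long byteCount;

    /** Walks the tree under root and totals the regular files and their sizes. */
    static TreeStats scan(Path root) throws IOException {
        TreeStats stats = new TreeStats();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    stats.fileCount++;
                    stats.byteCount += attrs.size(); // size comes with the visit
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip unreadable entries instead of aborting the whole scan
                return FileVisitResult.CONTINUE;
            }
        });
        return stats;
    }
}
```

A call such as TreeStats.scan(Paths.get(rootPath)) would return both totals in one pass. Ignored files and directories are not filtered in this sketch, which matches the original statistics method.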
/**
 * Back up every file under a parent directory into the target directory.
 *
 * @param fatherFile      parent directory in the data source
 * @param backupSource    data source
 * @param backupTarget    backup target
 * @param backupStatistic running statistics of the task
 * @param middlePath      path of fatherFile relative to the source root
 */
private void backUpAllFilesOfFatherFile(Task task, File fatherFile,
        BackupSource backupSource, BackupTarget backupTarget, List<BackupTarget> targetList,
        BackupStatistic backupStatistic, String middlePath,
        Long backupTaskId, Date taskBackupStartTime,
        Long fatherId, List<BackupFile> backupFileList,
        List<String> ignoreFileList, List<String> ignoreDirectoryList,
        List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1,
        List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) {
    File[] sonFileArr = fatherFile.listFiles();
    if (sonFileArr == null) {
        // listFiles() returns null when the directory cannot be read
        return;
    }
    // Index the existing backup records by file name for fast lookup.
    // (Removing records whose source file no longer exists is disabled in this version.)
    HashMap<String, BackupFile> fileNameAndBackupFileMap = new HashMap<>();
    if (backupFileList != null) {
        for (BackupFile backupFile : backupFileList) {
            fileNameAndBackupFileMap.put(backupFile.getFileName(), backupFile);
        }
    }
    for (File file : sonFileArr) {
        if (Cache.STOP_TASK_ID_SET.contains(backupTaskId)) {
            // The task was paused: stop and store its current state
            BackupTask backupTask = new BackupTask();
            backupTask.setId(backupTaskId);
            backupTask.setBackupStatus(4);
            backupTask.setFinishFileNum(backupStatistic.getFinishBackupFileNum());
            backupTask.setFinishByteNum(backupStatistic.getFinishBackupByteNum());
            backupTask.setEndTime(new Date());
            backupTask.setBackupTime(backupTask.getEndTime().getTime() - taskBackupStartTime.getTime());
            backupTaskService.updateById(backupTask);
            backupTask.setTotalFileNum(backupStatistic.getTotalBackupFileNum());
            backupTask.setTotalByteNum(backupStatistic.getTotalBackupByteNum());
            setProgress(backupTask);
            backupTask.setBackupSourceRoot(backupSource.getRootPath());
            backupTask.setBackupTargetRoot(backupTarget.getTargetRootPath());
            backupTask.setCreateTime(taskBackupStartTime);
            log.info("Sending task message: task paused");
            Map<String, Object> dataMap = new HashMap<>();
            dataMap.put("code", WebsocketNoticeEnum.BACKUP_STOP.getCode());
            dataMap.put("message", WebsocketNoticeEnum.BACKUP_STOP.getDetail());
            dataMap.put("backupTask", backupTask);
            webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
            break;
        }
        if (file.isDirectory()) {
            // Directory: create it under the target, then recurse
            if (isContainedInIgnoreList(ignoreDirectoryList, file)) {
                continue;
            }
            String targetFilePath = getTargetFilePath(backupSource, backupTarget, targetList, middlePath, file);
            // Does backup_file already hold a record for this directory?
            BackupFile backupFile = fileNameAndBackupFileMap.get(file.getName());
            Long curBackupFileId = backupFile == null ? null : backupFile.getId();
            File targetFile = new File(targetFilePath);
            if (!targetFile.exists() && !targetFile.mkdirs()) {
                throw new ServiceException("Cannot create the directory; permissions may be insufficient");
            }
            // Store a record whenever the database has none, even if the directory already existed
            if (curBackupFileId == null) {
                curBackupFileId = saveBackupFileDir(backupSource, backupTarget, targetFilePath, fatherId, file);
            }
            // Child records can only exist when the directory itself already had a record
            List<BackupFile> children = null;
            if (backupFile != null) {
                children = new ArrayList<>(backupFileService.list(new QueryWrapper<BackupFile>()
                        .eq("backup_source_id", backupSource.getId())
                        .eq("father_id", curBackupFileId)));
            }
            backUpAllFilesOfFatherFile(task, file, backupSource, backupTarget,
                    targetList, backupStatistic,
                    middlePath + file.getName() + File.separator, backupTaskId, taskBackupStartTime,
                    curBackupFileId, children,
                    ignoreFileList, ignoreDirectoryList,
                    backupFileBuffer1, backupFileHistoryBuffer1,
                    backupFileBuffer2, backupFileHistoryBuffer2);
        } else {
            // File: back it up
            if (isContainedInIgnoreList(ignoreFileList, file)) {
                continue;
            }
            if (file.getName().contains(".DS_Store")) {
                // Skip the files created by the macOS Finder
                continue;
            }
            try {
                execSingleFileBackUp(task, backupSource, backupTarget, targetList, file.toString(),
                        backupStatistic, middlePath, backupTaskId, taskBackupStartTime, fatherId,
                        fileNameAndBackupFileMap, backupFileBuffer1, backupFileHistoryBuffer1,
                        backupFileBuffer2, backupFileHistoryBuffer2);
            } catch (SQLException | IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
}
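This method recursively backs up one directory. The logic is:
- Put the directory's backup file records into a map keyed by file name, which speeds up the later lookups that check whether a file was modified.
- While looping over sonFileArr, first check whether the current task has been paused. If the task ID is in the pause pool STOP_TASK_ID_SET, stop the task, update its status in the database, and notify the front end that the pause succeeded.
- Check whether the current child is a directory or a file. For a directory go to step 4; otherwise go to step 5.
- Check whether the directory is ignored; if so, continue with the next child. Otherwise create the corresponding directory under the backup target if it does not exist, store a record in backup_file if there is none yet, query the child backup file records (children) of the directory, and call backUpAllFilesOfFatherFile recursively.
- Check whether the file is ignored; if so, continue with the next child. Otherwise call execSingleFileBackUp to back up the single file.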
/**
 * Back up one file.
 * A file that was never backed up is copied straight away; a file with an
 * existing record is queued in buffer 2, where it is checked for changes.
 *
 * @param source               data source
 * @param target               backup target
 * @param backupSourceFilePath path of the file in the data source
 * @param backupStatistic      running statistics of the task
 * @param middlePath           path of the parent directory relative to the source root
 * @throws SQLException
 * @throws IOException
 */
private void execSingleFileBackUp(Task task, BackupSource source, BackupTarget target,
        List<BackupTarget> targetList, String backupSourceFilePath,
        BackupStatistic backupStatistic, String middlePath,
        Long backupTaskId, Date taskBackupStartTime,
        Long fatherId, HashMap<String, BackupFile> fileNameAndBackupFileMap,
        List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1,
        List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) throws SQLException, IOException {
    File backupSourceFile = new File(backupSourceFilePath);
    Long targetId = source.getBackupType() == 0 ? target.getId() : 0L;
    BackupFile backupFileInDatabase = fileNameAndBackupFileMap.get(backupSourceFile.getName());
    if (backupFileInDatabase == null) {
        // The file has never been backed up: build its records and copy it
        String targetFilePath = getTargetFilePath(source, target, targetList, middlePath, backupSourceFile);
        int isCompress = 0;
        if (isNeedCompress(source, backupSourceFile)) {
            // The data source has compression enabled and the file meets the 10 MB size rule in isNeedCompress
            isCompress = 1;
            targetFilePath = updateTargetFilePath(targetFilePath);
        }
        BackupFile backupFile = constructBackupFile(source, backupSourceFilePath, targetFilePath, targetId,
                fatherId, isCompress, backupSourceFile);
        String md5str;
        try (FileInputStream sourceFileInputStream = new FileInputStream(backupSourceFilePath)) {
            md5str = DigestUtil.md5Hex(sourceFileInputStream);
        }
        // The backupFileId (0L) is a placeholder; the real id is filled in when buffer 1 is flushed
        BackupFileHistory backupFileHistory = constructBackupFileHistory(backupSourceFilePath, source.getId(),
                targetId, targetFilePath, 0L, backupTaskId, new Date(), backupSourceFile, md5str);
        addToBuffer1(backupFile, backupFileHistory, backupFileBuffer1, backupFileHistoryBuffer1,
                isCompress, backupSourceFile, targetFilePath);
    } else {
        // The file has a record: queue it for the change check
        addToBuffer2(source.getId(), targetId, backupTaskId, source, backupFileInDatabase,
                backupFileBuffer2, backupFileHistoryBuffer2);
    }
    // Report the copy progress once per second
    backupStatistic.finishBackupFileNum++;
    backupStatistic.finishBackupByteNum += backupSourceFile.length();
    long curTime = System.currentTimeMillis();
    if ((curTime / 1000) != backupStatistic.second) {
        backupStatistic.second = curTime / 1000;
        BackupTask backupTask = new BackupTask();
        backupTask.setId(backupTaskId);
        backupTask.setBackupStatus(1);
        backupTask.setFinishFileNum(backupStatistic.finishBackupFileNum);
        backupTask.setFinishByteNum(backupStatistic.finishBackupByteNum);
        backupTask.setBackupTime(curTime - taskBackupStartTime.getTime());
        backupTaskService.updateById(backupTask);
        // The remaining fields are only shown by the front end and are not written to the database
        backupTask.setBackupSourceRoot(source.getRootPath());
        backupTask.setBackupTargetRoot(getTargetRootPath(task, source, target));
        backupTask.setTotalFileNum(backupStatistic.totalBackupFileNum);
        backupTask.setTotalByteNum(backupStatistic.totalBackupByteNum);
        backupTask.setCreateTime(taskBackupStartTime);
        setProgress(backupTask);
        log.info("Sending task message: backup progress changed");
        Map<String, Object> dataMap = new HashMap<>();
        dataMap.put("code", WebsocketNoticeEnum.BACKUP_PROCESS.getCode());
        dataMap.put("message", WebsocketNoticeEnum.BACKUP_PROCESS.getDetail());
        dataMap.put("backupTask", backupTask);
        webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
    }
}

/**
 * Handle a file that has no record in the database yet; such a file has
 * definitely never been backed up.
 * 1. Copy the file.
 * 2. Queue its backup file record and backup history record in buffer 1.
 *
 * @param backupFile        record of the file
 * @param backupFileBuffer1 buffer 1
 */
private void addToBuffer1(BackupFile backupFile, BackupFileHistory backupFileHistory,
        List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1,
        int isCompress, File backupSourceFile, String targetFilePath) {
    // Copy the file
    try {
        if (execBackupSingleFile(isCompress, backupSourceFile, targetFilePath)) {
            backupFileBuffer1.add(backupFile);
            backupFileHistoryBuffer1.add(backupFileHistory);
        } else {
            log.error("Backup failed");
        }
    } catch (Exception e) {
        log.error("File backup failed");
        throw new RuntimeException(e);
    }
    if (backupFileBuffer1.size() > this.BATCH_SIZE) {
        buffer1Process(backupFileBuffer1, backupFileHistoryBuffer1);
    }
}

private void addToBuffer2(Long backupSourceId, Long backupTargetId, Long backupTaskId,
        BackupSource backupSource, BackupFile backupFileInDatabase,
        List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) throws IOException {
    backupFileBuffer2.add(backupFileInDatabase);
    if (backupFileBuffer2.size() >= this.BATCH_SIZE) {
        buffer2Process(backupSourceId, backupTargetId, backupTaskId, backupSource,
                backupFileBuffer2, backupFileHistoryBuffer2);
    }
}
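This method decides whether a file has been backed up before or has been modified since its last backup; a file that was never backed up, or that has changed, needs to be backed up. The flow is:
- Check whether fileNameAndBackupFileMap contains the current file name. If it does not, the file has never been backed up: go to step 2. If it does, the file was backed up before: go to step 3.
- Build the backupFile and backupFileHistory objects, copy the file, and add the two objects to buffer 1.
- Take the backupFile out of fileNameAndBackupFileMap and add it to buffer 2.
- In addition to the steps above, notify the front end of the current backup progress once per second.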
private void buffer1Process(List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1) {
    // Batch-insert the backup file records
    backupFileService.saveBatch(backupFileBuffer1);
    // Copy each saved file's id into its history record (the two buffers are index-aligned)
    for (int i = 0; i < backupFileHistoryBuffer1.size(); i++) {
        backupFileHistoryBuffer1.get(i).setBackupFileId(backupFileBuffer1.get(i).getId());
    }
    // Batch-insert the backup history records
    backupFileHistoryService.saveBatch(backupFileHistoryBuffer1);
    backupFileHistoryBuffer1.clear();
    backupFileBuffer1.clear();
}
This method handles buffer 1 once it is full: it simply batch-inserts the backup file records and the backup history records, then clears the buffer.
private void buffer2Process(Long backupSourceId, Long backupTargetId, Long backupTaskId, BackupSource backupSource,
        List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) throws IOException {
    if (backupFileBuffer2.isEmpty()) {
        // Nothing to check; also avoids a history query with an empty id list
        return;
    }
    List<BackupFile> updateBackupFileBuffer = new ArrayList<>();
    List<Long> backupFileIdList = backupFileBuffer2.stream().map(BackupFile::getId).collect(Collectors.toList());
    // Load the latest backup history of every buffered file in one query
    Map<Long, BackupFileHistory> fileIdAndFileHistoryMap = new HashMap<>();
    for (BackupFileHistory fileHistory : backupFileHistoryService.listLastBackupHistoryByBackupFileIdList(backupFileIdList)) {
        fileIdAndFileHistoryMap.put(fileHistory.getBackupFileId(), fileHistory);
    }
    for (BackupFile backupFile : backupFileBuffer2) {
        boolean isNeedBackup = true;
        String md5str = null; // computed lazily, at most once per file
        BackupFileHistory fileHistory = fileIdAndFileHistoryMap.get(backupFile.getId());
        File backupSourceFile = new File(backupFile.getSourceFilePath());
        String targetFilePath = backupFile.getTargetFilePath();
        int isCompress = 0;
        if (isNeedCompress(backupSource, backupSourceFile)) {
            // The data source has compression enabled and the file meets the 10 MB size rule in isNeedCompress
            isCompress = 1;
            targetFilePath = updateTargetFilePath(targetFilePath);
        }
        if (fileHistory != null) {
            long lastModify = fileHistory.getModifyTime();
            long fileSize = fileHistory.getFileSize();
            boolean sameSize = fileSize == backupSourceFile.length();
            if (sameSize && lastModify == backupSourceFile.lastModified()) {
                // Same modification time and same size: treat the file as unmodified
                isNeedBackup = false;
            } else if (sameSize) {
                // Same size, different modification time: compare MD5 hashes.
                // Identical input always gives the same MD5, so equal hashes mean no backup is needed
                try (FileInputStream in = new FileInputStream(backupSourceFile)) {
                    md5str = DigestUtil.md5Hex(in);
                }
                if (md5str.equals(fileHistory.getMd5())) {
                    isNeedBackup = false;
                }
            }
        }
        if (!isNeedBackup && !new File(targetFilePath).exists()) {
            // The file is missing from the backup target directory: back it up again
            isNeedBackup = true;
        }
        if (!isNeedBackup) {
            continue;
        }
        Date startDate = new Date();
        try {
            // The parent directory in the target may have been deleted after an earlier backup; recreate it
            File dir = new File(targetFilePath).getParentFile();
            if (dir != null && !dir.exists()) {
                dir.mkdirs();
            }
            if (!execBackupSingleFile(isCompress, backupSourceFile, targetFilePath)) {
                log.error("Backup failed");
                continue;
            }
            if (md5str == null) {
                try (FileInputStream in = new FileInputStream(backupSourceFile)) {
                    md5str = DigestUtil.md5Hex(in);
                }
            }
            // Save the backup history of the file
            BackupFileHistory history = constructBackupFileHistory(backupFile.getSourceFilePath(), backupSourceId, backupTargetId,
                    targetFilePath, backupFile.getId(), backupTaskId, startDate, backupSourceFile, md5str);
            if (fileHistory != null) {
                // Reuse the existing history id; fileHistory is null when no history was found
                history.setId(fileHistory.getId());
            }
            updateBackupFileHistory(history, backupFileHistoryBuffer2);
            // Update the backup file record
            BackupFile newBackupFile = new BackupFile();
            // updateBatchById matches rows by id, so the id has to be carried over
            newBackupFile.setId(backupFile.getId());
            // The file size may have changed
            newBackupFile.setFileLength(backupSourceFile.length());
            // If so, the compressed size changes as well
            if (isCompress == 1) {
                newBackupFile.setFileLengthAfterCompress(new File(targetFilePath).length());
            }
            // The compression decision can change after a modification, because the file size may have changed
            newBackupFile.setIsCompress(isCompress);
            // Increment the backup count of the file
            newBackupFile.setBackupNum(backupFile.getBackupNum() + 1);
            // Record the time of this backup
            newBackupFile.setLastBackupTime(new Date());
            updateBackupFileBuffer.add(newBackupFile);
        } catch (Exception e) {
            log.error("File backup failed");
            throw new RuntimeException(e);
        }
    }
    // Batch-update the backup file records
    if (!updateBackupFileBuffer.isEmpty()) {
        backupFileService.updateBatchById(updateBackupFileBuffer);
    }
    backupFileBuffer2.clear();
}
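This method handles buffer 2 once it is full. It works as follows:
- Use the buffered backup files to query, in one batch, the latest backup history record of each file, and put the results into the map fileIdAndFileHistoryMap for later use.
- Loop over every backupFile in the buffer, look up its fileHistory in fileIdAndFileHistoryMap, and use the fileHistory to decide whether the file needs to be backed up again.
- If it does, call execBackupSingleFile to copy it, and after a successful copy update the backup history and the backup file record. Note that these updates are batched as well: they are written once enough of them have accumulated.
Note that the following code acts as a safety net against accidental deletion in the backup target directory: if the target directory has no corresponding file, the file was deleted by mistake and needs to be backed up again.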
if (!isNeedBackup) {
    // If the file is missing from the backup target directory, back it up again
    File file = new File(targetFilePath);
    if (!file.exists()) {
        isNeedBackup = true;
    }
}
/**
 * Copy a single file.
 *
 * @param isCompress       whether to compress (1 = compress)
 * @param backupSourceFile source file
 * @param targetFilePath   path of the backup target file
 * @return true if the file was backed up, false on any failure
 * @throws IOException
 */
private boolean execBackupSingleFile(int isCompress, File backupSourceFile, String targetFilePath) throws IOException {
    try {
        if (isCompress == 1) {
            // Compress the file into the target path
            GzipCompressUtil.compressFile(backupSourceFile, targetFilePath);
        } else {
            // Plain copy
            backupWithFileChannel(backupSourceFile, new File(targetFilePath));
        }
    } catch (Exception e) {
        log.error("Failed to back up {} to {}", backupSourceFile, targetFilePath, e);
        return false;
    }
    return true;
}

/**
 * Copy source to target.
 *
 * @param source source file
 * @param target target file
 * @throws IOException if the source is missing or the copy fails
 */
private static void backupWithFileChannel(File source, File target) throws IOException {
    if (!source.exists()) {
        // Throw instead of returning silently, so the caller does not count this as a success
        throw new IOException("Backup source file does not exist: " + source);
    }
    try (FileChannel inputChannel = new FileInputStream(source).getChannel();
         FileChannel outputChannel = new FileOutputStream(target).getChannel()) {
        long size = inputChannel.size();
        long position = 0;
        // transferFrom may move fewer bytes than requested, so loop until everything is copied
        while (position < size) {
            long transferred = outputChannel.transferFrom(inputChannel, position, size - position);
            if (transferred <= 0) {
                throw new IOException("Copy stopped early at byte " + position + " of " + source);
            }
            position += transferred;
        }
    }
}
This method copies files with NIO (FileChannel). If compression is selected, the file is instead compressed and written straight to the target path.
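Both paths can be shown with JDK classes only. The sketch below is an illustration with hypothetical names (FileCopy, copy, gzip): the plain copy uses FileChannel.transferTo in a loop, and the compressed path uses java.util.zip.GZIPOutputStream as a stand-in for the project's GzipCompressUtil, whose internals are not shown in this article.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helpers for the plain and the compressed backup path. */
class FileCopy {
    /** Copies source to target with NIO channels, overwriting the target. */
    static void copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop
            while (position < size) {
                long transferred = in.transferTo(position, size - position, out);
                if (transferred <= 0) {
                    throw new IOException("Copy stopped early at byte " + position);
                }
                position += transferred;
            }
        }
    }

    /** Writes a gzip-compressed copy of source to target. */
    static void gzip(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
    }
}
```

The try-with-resources blocks close both ends even when the copy fails, and errors propagate to the caller, so a failed copy cannot be recorded as a successful backup.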
/**
 * Check that the backup source directory exists and prepare the backup target directories.
 *
 * @param sourceId id of the backup source
 * @return the backup tasks to run
 */
@Override
public List<Task> checkSourceAndTarget(Long sourceId) {
    BackupSource source = backupSourceService.getById(sourceId);
    if (source == null) {
        throw new ClientException("No backup source with this id exists in the database");
    }
    File sourceFile = new File(source.getRootPath());
    if (!sourceFile.exists()) {
        throw new ServiceException("The backup source directory does not exist; please check whether it was deleted");
    }
    // Load every backup target directory of the source and prepare the directories
    List<BackupTarget> backupTargetList = backupTargetService.list(
            new QueryWrapper<BackupTarget>().eq("backup_source_id", source.getId()));
    if (backupTargetList.size() == 0) {
        throw new ClientException("No backup target directory is configured for this backup source; please configure one first");
    }
    // Collects the target directories that are not usable
    List<BackupTarget> unNormalTargetList = new ArrayList<>();
    for (BackupTarget backupTarget : backupTargetList) {
        File file = new File(backupTarget.getTargetRootPath());
        if (!file.exists()) {
            boolean mkdir = file.mkdir();
            if (!mkdir) {
                unNormalTargetList.add(backupTarget);
                throw new ServiceException("Failed to create the target directory; please check that the backup disk is connected");
            }
        }
    }
    backupTargetList.removeAll(unNormalTargetList);
    if (backupTargetList.size() == 0) {
        // No target directory of this data source is usable: release the data source id
        backupController.backupingSourceIDSet.remove(sourceId);
        return new ArrayList<>();
    }
    List<Task> taskList = null;
    if (source.getBackupType() == 0) {
        // Centralized backup: one task per target directory
        taskList = backupTargetList.stream()
                .map(item -> new Task(source, item, null))
                .collect(Collectors.toList());
    } else if (source.getBackupType() == 1) {
        // Scattered backup: one task that covers all target directories
        taskList = new ArrayList<>();
        taskList.add(new Task(source, null, backupTargetList));
    }
    return taskList;
}
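This method checks that the data source and the backup target directories are ready, and builds the backup tasks.
Notes
The backup logic is fairly complex and the code may be optimized at any time, so the code in this article is for reference only. If you are interested in the latest code, please see the Git repository.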